from Optics to Neuroscience and Statistical Learning
The PSF of this nice human eye can be measured in-vivo [Opth.Phys.Opt.97].
Vision is the ability to interpret the surrounding environment by analyzing the measurements drawn by imaging systems.
This ability is particularly impressive in humans when compared to the current state of the art in computers.
The study of all the phenomena related to vision in biological systems (and particularly in humans) is usually referred to as Vision Science. It addresses a variety of issues ranging from the formation of the visual signal (e.g. the Physics of the imaging process, which includes Radiometry and Physiological Optics) to the analysis of the visual signal (of interest for Neuroscience and Psychology). This analysis involves the extraction of visual primitives through basic computations in the retina-cortex neural pathway, and the information processing that leads to scene descriptors of a higher abstraction level (see elsewhere). These problems may be addressed from a mechanistic perspective, focused on describing the empirical behavior of the system, or from a normative perspective, which looks for the functional reasons (organization principles) that explain the behavior. While the mechanistic perspective is based on experimental recordings from Psychophysics and Neurophysiology, the normative perspective is based on the study of Image Statistics and on concepts from Information Theory and Statistical Learning. The normative approach is known as the Efficient Coding Hypothesis.
Over the years we have done original work in all the above subdisciplines related to (low-level) Vision Science. Now we are shifting to more abstract visual functions.
We made experimental contributions in three areas: Physiological Optics, Psychophysics and Image Statistics. (i) In Physiological Optics, we measured the optical transfer function of the lens+cornea system in-vivo [Opth.Phys.Opt.97]. This work received the European Vistakon Research Award '94. (ii) In Psychophysics, we proposed simplified methods to measure the Contrast Sensitivity Function over the whole frequency domain [J.Opt.94], and a fast and accurate method to measure the parameters of multi-stage linear+nonlinear vision models [Proc.SPIE15]. Finally, (iii) in Image Statistics, we gathered spatially and spectrally calibrated image samples to determine the properties of these signals and their variation under changes in illumination, contrast and motion [Im.Vis.Comp.00, Neur.Comp.12, IEEE-TGRS14, PLoS-ONE14, Rem.Sens.Im.Proc.11, Front.Neurosci.15].
Illustrative experimental equipment: Left: double-pass setting for the measurement of the Modulation Transfer Function of the human eye [Opth.Phys.Opt.97]. Right: spectrally calibrated light sources, image colorimeter and spectroradiometer to gather accurate color image statistics; see the available color image database [Neur.Comp.12, PLoS-ONE14], and the texture and motion datasets [Front.Neurosci.15], with samples ready to be processed.
We proposed mathematical descriptions of different visual dimensions: Texture, Color, and Motion. (i) We used wavelet representations to propose nonstationary Texture Vision models [J.Mod.Opt.97, MScThesis95], (ii) we developed Color Vision models with illumination invariance that allow the reproduction of chromatic anomalies, adaptation and aftereffects [Vis.Res.97, J.Opt.96, J.Opt.98, JOSA04, Neur.Comp.12], and (iii) we developed Motion Vision models [Alheteia08] that focus the optical flow computation on perceptually relevant moving regions [J.Vis.01, PhDThesis99], and explain the static motion aftereffect [Front.Neurosci.15]. All these psychophysical and physiological models share a parallel linear+nonlinear structure in which receptive fields and surround-dependent normalization play an important role.
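As a rough illustration of the linear+nonlinear structure mentioned above, the following sketch applies a bank of linear receptive fields followed by a divisive normalization in which each response is divided by a pooled measure of the activity of its neighbors. It is a generic textbook form and not the fitted models of the cited papers: the function name, the parameters `b` and `gamma`, the random filters and the uniform interaction kernel are all placeholders assumed for illustration.

```python
import numpy as np

def linear_nonlinear_response(image_patch, filters, kernel, b=0.1, gamma=2.0):
    """Linear receptive fields followed by divisive normalization (illustrative)."""
    # Linear stage: project the stimulus onto each receptive field.
    linear = filters @ image_patch.ravel()
    # Nonlinear stage: the rectified, exponentiated response of each sensor
    # is divided by a pooled measure of the activity of its neighbors.
    energy = np.abs(linear) ** gamma
    pool = b + kernel @ energy
    return np.sign(linear) * energy / pool

rng = np.random.default_rng(0)
n_sensors, side = 8, 4
filters = rng.standard_normal((n_sensors, side * side))    # placeholder receptive fields
kernel = np.full((n_sensors, n_sensors), 1.0 / n_sensors)  # placeholder uniform pool
patch = rng.standard_normal((side, side))

low = linear_nonlinear_response(0.1 * patch, filters, kernel)    # low-contrast stimulus
high = linear_nonlinear_response(10.0 * patch, filters, kernel)  # high-contrast stimulus
print(np.abs(low).sum(), np.abs(high).sum())
```

With this kind of pooling the total response stays bounded as contrast grows, which is the saturating behavior that surround-dependent normalization is meant to capture.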
Empirical motion model at work: Waving hands sequence recorded at my lab (just a remake of the original movie from Watson & Ahumada), linear filter model of MT neurons as an aggregate of spatio-temporal wavelet-like filters (bottom left) tuned to a certain speed for optical flow computation (example at bottom right for a later frame). Note that our remake improved the original by including a striped costume for Fourier-obsessed freaks!
See how it feels to be color blind!: We proposed a way to simulate the perception of color-blind observers (here with Picasso's Dora Maar). As you see, dichromats are not color blind at all: they simply see different colors. Moreover, this simulation can be used to discriminate between color theories just by asking your dichromat friend which image looks more similar to him.
See additional CODE for Texture, Color and Motion perception
Theory: principled models in Vision Science
This category refers to the proposal of organization laws of sensory systems that explain the empirical phenomena. These principles show that neural function has been adapted to (or is determined by) the statistics of visual stimuli. In this regard, (i) we worked on the derivation of the linear properties of the sensors, and we found that their spatio-chromatic sensitivity, the way the receptive fields change, and their phase properties come from optimal solutions to the adaptation problem under noise constraints and manifold matching [PLoS-ONE14, IEEE-TGRS13], from statistical independence requirements [LNCS11, NeuroImag.Meeting11], and from optimal estimation of object reflectance [IEEE-TGRS14]. (ii) We also worked on the derivation of the nonlinear behavior for a variety of visual sensors (chromatic, texture, and motion sensors). We found that in all cases the nonlinearities are related to optimal information transmission (entropy maximization) and/or to error minimization in noisy systems (optimal vector quantization). We studied this relation in the classical statistics-to-perception direction (deriving the nonlinearity from the regularities in the scene) [Network06, Neur.Comp.12, Front.Neurosci.15], as well as in the (more novel) perception-to-statistics direction, i.e. by looking at the statistical effect of perceptually motivated nonlinearities [J.Opt.95, Im.Vis.Comp.00, LNCS00, Patt.Recog.03, Neur.Comp.10, LNCS10, NeuroImag.Meeting11].
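The statistics-to-perception direction can be illustrated with the simplest textbook case: for a single noise-free sensor with a bounded response, the nonlinearity that maximizes the entropy of the output is the cumulative distribution function of the input. The sketch below estimates that nonlinearity from samples. It is only a one-dimensional toy, not the multidimensional methods of the cited papers; the exponential "contrast" distribution and the function name are assumptions made for illustration.

```python
import numpy as np

def infomax_nonlinearity(samples):
    """Return the empirical CDF of the samples as a response function in [0, 1]."""
    sorted_samples = np.sort(np.asarray(samples, dtype=float))
    n = sorted_samples.size

    def response(x):
        # Fraction of training samples that are less than or equal to x.
        return np.searchsorted(sorted_samples, x, side="right") / n

    return response

rng = np.random.default_rng(0)
contrasts = rng.exponential(scale=0.2, size=10000)  # hypothetical skewed stimulus statistics
response = infomax_nonlinearity(contrasts)

# Responses to the same stimuli fill the output range roughly uniformly,
# i.e. all response levels are used about equally often.
outputs = response(contrasts)
hist, _ = np.histogram(outputs, bins=10, range=(0.0, 1.0))
print(hist)
```

The resulting response is steep where stimuli are frequent and shallow where they are rare, so the sensor spends its limited dynamic range where the scene statistics demand it.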
Illustrative organization principle: Optimal adaptation and information transmission with noise constraints (Higher Order Canonical Correlation) predicts shifts in oscillatory responses of Gabor-like opponent spatio-chromatic receptive fields when adapted to visual scenes under different illumination (similarly to V1 neurons). See CODE here.
In theoretical neuroscience, the derivation of properties of biological sensors from the regularities of visual scenes requires novel tools for statistical learning. In this field, we developed new techniques for unsupervised manifold learning, feature extraction (or symmetry detection in datasets), dimensionality reduction, probability density estimation, multi-information estimation, distance learning, and automatic adaptation from optimal dataset matching. Given my interest in applicability to Vision Science problems, I focused on techniques that can be explicitly represented in the image domain, so that they can be compared with receptive fields of visual neurons, as opposed to the usual practice in the Machine Learning community. Techniques include Rotation-based Iterative Gaussianization -RBIG- [IEEE-TNN11], Sequential Principal Curves Analysis -SPCA- [Network06, Neur.Comp.12, Front.Neurosci.15], Principal Polynomial Analysis -PPA- [Int.J.Neur.Syst.14], Dimensionality Reduction based on Regression -DRR- [IEEE-JSTSP15], and Graph Matching for Adaptation [IEEE-TGRS13].
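To give a flavor of the idea behind Rotation-based Iterative Gaussianization, the sketch below alternates a marginal Gaussianization of each dimension with an orthogonal rotation. It is a simplified illustration and not the published implementation: the rank-based marginal transform (which only applies to the training samples), the choice of a PCA rotation, the fixed number of iterations and all function names are assumptions made here.

```python
import numpy as np
from statistics import NormalDist

def marginal_gaussianization(x):
    """Map each column to standard normal scores using the ranks of its samples."""
    n = x.shape[0]
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)  # ranks 0..n-1 per column
    u = (ranks + 0.5) / n                              # uniform scores in (0, 1)
    inverse_normal_cdf = np.vectorize(NormalDist().inv_cdf)
    return inverse_normal_cdf(u)

def iterative_gaussianization(x, n_iter=10):
    """Alternate marginal Gaussianization and PCA rotation (illustrative sketch)."""
    z = np.array(x, dtype=float)
    for _ in range(n_iter):
        z = marginal_gaussianization(z)
        z = z - z.mean(axis=0)
        # Orthogonal rotation onto the principal axes of the current data.
        _, _, vt = np.linalg.svd(z, full_matrices=False)
        z = z @ vt.T
    return z

rng = np.random.default_rng(0)
x1 = rng.standard_normal(2000)
x2 = x1 ** 2 + 0.3 * rng.standard_normal(2000)  # curved, clearly non-Gaussian toy data
data = np.column_stack([x1, x2])

z = iterative_gaussianization(data)
print(np.cov(z, rowvar=False))
```

Each marginal step removes non-Gaussianity along the current axes, and each rotation exposes remaining structure to the next marginal step, so the data is progressively pushed toward a multivariate Gaussian.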
Illustrative learning technique: data-induced metric and identified features using nonlinear feature extraction (in this case, the Principal Polynomial Analysis, PPA). The scatter plot (at the top) shows the training set; the 2nd row shows the features (curves in black) and the discrimination ellipsoids (in yellow) in different representation domains: the original domain (left), the data-unfolded domain after PPA (center), and the whitened PPA domain (right). Principal Polynomial Analysis looks for the intrinsic curvilinear coordinates, generalizing the linear directions of PCA (principal polynomials instead of eigendirections). Similarly to the Mahalanobis distance when using PCA, the Jacobian of PPA can be used to define data-dependent distance measures. Similarly to what is done in (linear) Independent Component Analysis, the Jacobian of PPA can be used to identify intrinsic features in the input domain. As opposed to their linear counterparts, the metric and the features are local. Different sets of local features (local basis vectors) in the input domain (3rd row) or in the PPA-transformed domain (4th row) can be obtained from the Jacobian or the Jacobian-related metric matrix. See available CODE for this and other feature extraction techniques here.
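The linear case of the Jacobian-induced metric can be checked numerically. For PCA followed by whitening, the transform is linear, its Jacobian J is the same everywhere, and the metric J^T J is the inverse covariance, so the induced distance is the Mahalanobis distance. The sketch below verifies this on toy data; the dataset and variable names are assumptions for illustration, and the nonlinear PPA case (where the Jacobian, and hence the metric, changes from point to point) is not implemented here.

```python
import numpy as np

rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.0], [1.5, 0.5]])          # hypothetical correlated toy data
data = rng.standard_normal((500, 2)) @ mixing
cov = np.cov(data, rowvar=False)

# PCA + whitening: y = J (x - mean), with J = Lambda^(-1/2) U^T.
eigvals, eigvecs = np.linalg.eigh(cov)
jacobian = np.diag(eigvals ** -0.5) @ eigvecs.T
metric = jacobian.T @ jacobian                        # metric induced in the input domain

d = np.array([1.0, -0.5])                             # a displacement in the input domain
dist_metric = np.sqrt(d @ metric @ d)                 # distance using the induced metric
dist_whitened = np.linalg.norm(jacobian @ d)          # Euclidean distance after the transform
dist_mahalanobis = np.sqrt(d @ np.linalg.inv(cov) @ d)
print(dist_metric, dist_whitened, dist_mahalanobis)
```

The three distances coincide, which is the sense in which a Jacobian-based metric generalizes the Mahalanobis distance once the transform is allowed to be nonlinear.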