Remote sensing


We have a solid background and experience in developing and applying machine learning and signal/image processing methods in remote sensing applications: from biophysical parameter retrieval and model inversion to image classification/segmentation, change and target detection, spectral unmixing and source separation, multitemporal image analysis, multimodal analysis, multiresolution fusion, quality assessment, and image restoration. Other important contributions in this context relate to ground data acquisition campaigns, involvement in the backbone of future developments in hyperspectral imaging research, and the development of algorithms for satellite sensor calibration. Lately, we have focused our attention on very challenging applications, such as cloud screening from optical data and nonlinear retrieval of atmospheric profiles using infrared sounders. These applications are complemented by our expertise in database development and in the management of calibrated images and remote sensing data.


In the last decade, our group has focused on the application and development of specific machine learning and signal/image processing algorithms in the field of remote sensing. Our expertise has followed different approaches and tackled diverse problems, but in all cases the inclusion of physical knowledge as well as the physical analysis of the obtained results and models have been a common factor. The main action lines and objectives follow.

Objectives and research lines

Feature extraction, selection, dimensionality reduction and domain adaptation

The high dimensionality of remote sensing data reinforces the idea of extracting the most relevant features for each particular problem. For example, we analyze the joint spatio-spectral higher order image statistics and apply advanced feature extraction techniques such as kernel PCA (KPCA). In image classification, kernel partial least squares (KPLS) efficiently exploits the nonlinear dependency between the input features and the class labels. When no supervised information is available, kernel entropy component analysis (KECA) provides features that preserve the input data distribution while maximizing the divergence among clusters. Also, biophysical parameter retrieval from hundreds to thousands of spectral bands benefits from noise-robust feature extraction techniques (KSNR) and from methods that maximize the information content of the descriptors (KPLS). As an alternative to the implicit mappings performed by kernel machines, we have also done research on explicit mappings, proposing novel parametric, invertible and nonlinear versions of PCA. For example, we have studied the characteristics of remote sensing images with our principal polynomial analysis (PPA), Gaussianization transforms (RBIG), and dimensionality reduction via regression (DRR). More recently, we have turned our attention to dimensionality reduction of remote sensing data with deep learning: on the one hand, we have shown the representation power of (unsupervised) convolutional deep neural networks compared to kernel machines; on the other, we have proposed a cascaded deep structured kernel machine that alleviates the information bottleneck issue. Complementary to feature extraction, feature selection may be appropriate in cases of collinearity and to address the issue of interpretability. Recently, kernel methods have been proposed to tackle feature selection, either by estimating dependence in remote sensing images using Hilbert-Schmidt norms or by selecting features via multiple kernel learning (MKL).
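As an illustration of dependence estimation with Hilbert-Schmidt norms, the following is a minimal, dependency-free sketch of the (biased) empirical Hilbert-Schmidt Independence Criterion between one feature and a target variable. It is not the group's implementation: the function names, the fixed RBF kernel width and the toy data are assumptions made for illustration only.

```python
import math
import random

def rbf_gram(x, sigma=1.0):
    """Gram matrix of a 1-D sample under an RBF kernel (sigma is an assumed width)."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for b in x] for a in x]

def center(K):
    """Double-center a Gram matrix, i.e. compute H K H with H = I - (1/n) 1 1^T."""
    n = len(K)
    row = [sum(r) / n for r in K]
    col = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    tot = sum(row) / n
    return [[K[i][j] - row[i] - col[j] + tot for j in range(n)] for i in range(n)]

def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    n = len(x)
    Kc = center(rbf_gram(x, sigma))
    Lc = center(rbf_gram(y, sigma))
    # For symmetric matrices, trace(Kc Lc) is the sum of element-wise products.
    return sum(Kc[i][j] * Lc[i][j] for i in range(n) for j in range(n)) / (n - 1) ** 2

# Toy data (hypothetical): one band depends nonlinearly on the target, one is pure noise.
rng = random.Random(0)
target = [rng.uniform(-2.0, 2.0) for _ in range(100)]
band_related = [t ** 2 + 0.1 * rng.gauss(0.0, 1.0) for t in target]
band_noise = [rng.gauss(0.0, 1.0) for _ in range(100)]

score_related = hsic(band_related, target)
score_noise = hsic(band_noise, target)
```

Ranking features by such a score is one simple way to use a dependence estimate for feature selection; note that the quadratic relation in the toy data has almost no linear correlation with the target, which is the kind of case a kernel dependence measure is meant to capture.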

Very often in Earth observation applications, image processing is challenged by changing acquisition conditions (illumination, atmosphere, observation, land cover, etc.). Models developed for images (or alternatively EO observations) acquired at a given date are generally not valid for images acquired later under different conditions. Two main approaches are being explored in this context. The first consists of adapting the models to the different domains. The second transforms the data from each domain using nonlinear correlations and dependencies between images, and then matches their manifolds. We believe that the second strategy is more adequate and fits our philosophy: learn the proper representation space and signal characteristics. Here we deploy our manifold learning algorithms, either parametric (like PPA or DRR) or nonparametric (like graphMatching and KEMA). In fact, domain adaptation and manifold alignment are two sides of the same coin.

Remote sensing data classification

In the last decade we have been very active in remote sensing image classification/segmentation, and have developed several strategies following supervised, semi-supervised or totally unsupervised learning. A separate concern is that a complete and representative training set is essential for successful classification, yet this is often not attainable in EO applications. In fact, little attention has been paid to the case of incomplete (or uncertainly defined) knowledge of the classes. This may be critical since, in many applications, acquiring ground truth information for all classes is very difficult, especially when complex and heterogeneous geographical areas must be analyzed. One-class, target detection and one-shot learning classifiers can efficiently deal with incomplete training datasets. Intimately related to this field are the problems of (1) change detection and multitemporal image classification (we proposed a kernel-based framework to tackle both problems simultaneously), and (2) anomaly change detection, which has to do with the philosophically ill-defined problem of detecting the anomaly buried in the pervasive change (an elusive question for which we recently proposed another kernel anomaly change detection framework).

Following the same line of thought, we believe that the key issue for remote sensing data classification is how to encode prior knowledge about the problem at hand. For example, we have exploited the joint spatio-spectral-temporal features and higher order image statistics for classification in different ways: via spatially local image models and representations, composite kernels and deep convolutional networks. Note that encoding prior (physical) knowledge is intimately related to the issue of regularizer design and invariance encoding/learning. In remote sensing, it is essential to make classifiers invariant to noise, scale, rotation, illumination, or shadows. This can be partially done using specifically designed features or by embedding physically-based invariances in the classifiers through virtualization; in both cases we have recently presented advances. Learning in EO data classification settings most often involves few labeled samples or high-uncertainty regimes. On the one hand, such situations led us to propose new semi-supervised methods, in which we introduced graph and hypergraph kernels, graph Laplacian formulations, generative models, and cluster kernels. On the other hand, we often encounter the problem of obtaining representative labeled samples, and here active learning helped us develop supervised, semi-supervised and unsupervised strategies.
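One common way to build a composite kernel is a weighted sum of a spectral kernel and a spatial kernel. The sketch below assumes that form; the weight `mu`, the RBF widths, the function names and the choice of spatial feature (a hypothetical neighbourhood mean spectrum) are illustrative assumptions rather than the specific formulation used in our work.

```python
import math

def rbf(u, v, sigma):
    """RBF kernel between two feature vectors."""
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def composite_gram(spectral, spatial, mu=0.5, sigma_spec=1.0, sigma_spat=1.0):
    """Weighted-sum composite kernel; mu in [0, 1] balances spatial vs. spectral."""
    n = len(spectral)
    return [[mu * rbf(spatial[i], spatial[j], sigma_spat)
             + (1.0 - mu) * rbf(spectral[i], spectral[j], sigma_spec)
             for j in range(n)] for i in range(n)]

# Toy pixels (hypothetical): `spectral` holds each pixel's own 3-band spectrum,
# `spatial` holds the mean spectrum of its neighbourhood.
spectral = [[0.1, 0.4, 0.8], [0.1, 0.5, 0.7], [0.9, 0.2, 0.1]]
spatial = [[0.1, 0.4, 0.7], [0.2, 0.4, 0.7], [0.8, 0.3, 0.1]]

K = composite_gram(spectral, spatial, mu=0.4)
```

Because a convex combination of positive semidefinite kernels is itself positive semidefinite, the resulting Gram matrix can be passed to any kernel classifier that accepts a precomputed kernel.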

Biophysical information retrieval and model inversion

Remote sensing is typically used to estimate biophysical parameters (e.g., temperature, humidity, canopy density) from the acquired spectra. These problems can be tackled as model inversion and regression problems. Neural networks, SVMs, and adaptive Gaussian processes, as well as kernel feature extraction, have been used to this end. In order to solve these ill-conditioned problems, it is necessary to encode some prior knowledge about the associated physics in the statistical models and to include some realistic constraints in the spatial-spectral-temporal domains.

In this research line, related to the geosciences, we aim to improve prediction models by adapting them to the characteristics of Earth observation data. We typically rely on the framework of kernel learning, which has emerged as the most appropriate framework for remote sensing data analysis in the last decade. The new retrieval models are adapted to the particular signal characteristics, such as unevenly sampled time series and missing data, non-Gaussianity, presence of heteroscedastic and non-stationary processes, and non-i.i.d. (spatial and temporal) relations. Models based on kernels and Gaussian processes (GPs) allow us to advance in uncertainty quantification using predictive variances under biophysical constraints. Advances in sparse, reduced-rank and divide-and-conquer schemes address the computational cost problem. The proposed kernel framework aims to improve results in terms of accuracy, reduced uncertainty, consistency of the estimations and computational efficiency.
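To make the role of predictive variances concrete, here is a minimal sketch of standard Gaussian process regression with an RBF kernel, written without external dependencies. It is a textbook formulation, not one of the adapted retrieval models described above; the hyperparameters (`length`, `noise`), the helper names and the toy data are assumptions.

```python
import math

def rbf(a, b, length=1.0):
    """RBF covariance between two scalar inputs."""
    return math.exp(-((a - b) ** 2) / (2.0 * length ** 2))

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Solve L y = b by forward substitution."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(x_train, y_train, x_test, length=1.0, noise=0.1):
    """Posterior mean and variance of the latent function at each test input."""
    K = [[rbf(a, b, length) + (noise ** 2 if i == j else 0.0)
          for j, b in enumerate(x_train)] for i, a in enumerate(x_train)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, y_train))  # (K + noise^2 I)^-1 y
    means, variances = [], []
    for xs in x_test:
        k_star = [rbf(xs, a, length) for a in x_train]
        v = solve_lower(L, k_star)
        means.append(sum(k * a for k, a in zip(k_star, alpha)))
        variances.append(rbf(xs, xs, length) - sum(vi * vi for vi in v))
    return means, variances

# Toy retrieval problem (hypothetical): observations of sin(x) on [0, 2].
x_train = [0.0, 0.5, 1.0, 1.5, 2.0]
y_train = [math.sin(x) for x in x_train]
means, variances = gp_predict(x_train, y_train, [1.0, 6.0])
```

The predictive variance stays small near the training inputs and reverts to the prior variance far from them, which is what makes it usable as a per-estimate uncertainty measure.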

Multiresolution fusion and quality assessment

More and more remote sensing satellites are being launched every year, carrying sensors with increasing spatial, temporal and spectral resolution. However, in most cases the low effective temporal sampling of the area does not allow proper monitoring. This situation encourages us to explore the use of all available data to increase the temporal resolution over the study area. Achieving this objective implies combining different sensors with different spectral bands and spatial resolutions through image fusion techniques. In this application scenario, we also aim to measure image quality by adapting visual metrics from visual neuroscience to the remote sensing case; to this end we have developed nonlinear (kernel) versions of the SSIM index.
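For reference, the standard (linear) SSIM index on which such kernel versions build can be sketched as follows. This is the usual single-window formula applied to a whole flattened patch, not the kernel variant; the function name, the `dynamic_range` parameter and the toy patches are assumptions, while the constants 0.01 and 0.03 are the conventional defaults.

```python
def ssim_global(x, y, dynamic_range=1.0):
    """Standard SSIM computed over one window (two flattened patches of equal size)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / (n - 1)
    vy = sum((b - my) ** 2 for b in y) / (n - 1)
    cxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    c1 = (0.01 * dynamic_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilizes the contrast/structure term
    return ((2 * mx * my + c1) * (2 * cxy + c2)) / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

# Toy patches (hypothetical reflectance values in [0, 1]).
reference = [0.2, 0.4, 0.6, 0.8, 0.5, 0.3]
distorted = [0.3, 0.3, 0.7, 0.7, 0.6, 0.2]

ssim_same = ssim_global(reference, reference)
ssim_diff = ssim_global(reference, distorted)
```

In practice the index is computed over local sliding windows and averaged across the image; the single-window form above only shows how means, variances and covariance enter the score.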

Image denoising, restoration and enhancement

Radiometric consistency between bands must be ensured in order to accurately retrieve the shape of the spectral signature of the surface. Different noise sources and amounts are present in the data, scattered either across the spatial domain or in specific spectral bands. This makes it necessary to design appropriate image restoration and enhancement procedures. In addition to random noise, multispectral images may be affected by non-periodic, partially deterministic patterns due to the acquisition process, which need specific approaches. For example, adjacent pixels in an image line should provide the same spectrum over a homogeneous surface; otherwise an artificial variation is introduced in the form of vertical stripes. A full general-purpose toolbox of image denoising methods (based on image statistics and kernel machines) has been introduced as well.
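The vertical striping effect can be illustrated with a deliberately simple correction: equalizing column means. This is a basic moment-matching sketch, not one of the methods in our toolbox, and it holds only under strong assumptions stated here explicitly: a homogeneous scene and purely additive, per-column offsets. The function name and the toy image are hypothetical.

```python
def destripe_columns(image):
    """Remove additive per-column offsets by matching each column mean to the global mean."""
    rows, cols = len(image), len(image[0])
    col_means = [sum(image[r][c] for r in range(rows)) / rows for c in range(cols)]
    global_mean = sum(col_means) / cols
    return [[image[r][c] - col_means[c] + global_mean for c in range(cols)]
            for r in range(rows)]

# Toy single-band image (hypothetical): homogeneous surface of value 10.0,
# with a different additive offset in each detector column.
offsets = [0.0, 0.5, -0.3, 0.2]
striped = [[10.0 + off for off in offsets] for _ in range(5)]

corrected = destripe_columns(striped)
```

On real scenes the column statistics also carry true surface variability, so a correction of this kind would have to separate the stripe pattern from the signal rather than flatten all column differences.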

Related projects

Cloud detection in the cloud
Google Earth Engine Research Award, L. Gomez-Chova, 01/16 - 12/17

SEDAL: Statistical Learning for Earth Observation Data Analysis
ERC Consolidator Grant, G. Camps-Valls, 01/15 - 12/19

Mapping and the citizen sensor
ICT COST Action, 01/13 - 12/16

LIFE-VISION: Learning Image Features to Encode Visual Information
Spanish Ministry of Economy and Competitiveness, 2012. TIN2012-38102-C03-01, 01/13 - 12/15

SenSyF: Sentinels Synergy Framework
EU 7th Framework Programme on research, technological development and demonstration (FP7-Space). Collaborative Projects: Preparing take-up of GMES Sentinel, 01/13 - 12/15

Study on pattern recognition based cloud detection over landmarks
EUMETSAT European Organisation for the Exploitation of Meteorological Satellites, 01/15 - 11/15

Selected references