The simple Regression toolbox, simpleR, contains a set of Matlab functions that illustrate the capabilities of several statistical regression algorithms. simpleR contains simple educational code for linear regression (LR), decision trees (TREE), neural networks (NN), support vector regression (SVR), kernel ridge regression (KRR, also known as least squares SVM), Gaussian process regression (GPR), and variational heteroscedastic Gaussian process regression (VHGPR). We also include a dataset of collected spectra and associated chlorophyll content to illustrate the training and testing procedures. This is just a demo that provides a default initialization; training is not optimized at all. Other initializations, optimization techniques, and training strategies may of course be better suited to achieve improved results in this or other problems. We simply followed the standard approach for illustration and educational purposes, and to disseminate these models.
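
To give a flavor of one of the listed methods, here is a minimal sketch of kernel ridge regression in plain Python. It is not the simpleR code: the function names (`rbf_kernel`, `solve_linear_system`, `krr_fit`, `krr_predict`), the Gaussian kernel, the toy sine data, and the values of `sigma` and `lam` are all assumptions chosen for illustration. The dual weights solve (K + lam I) alpha = y, and a prediction is a kernel-weighted sum over the training samples.

```python
import math

def rbf_kernel(a, b, sigma):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [A | b], copied so the inputs are left untouched.
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, n):
            factor = M[row][col] / M[col][col]
            for k in range(col, n + 1):
                M[row][k] -= factor * M[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        residual = M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = residual / M[i][i]
    return x

def krr_fit(X, y, sigma, lam):
    """Return the dual weights alpha that solve (K + lam * I) alpha = y."""
    n = len(X)
    K = [[rbf_kernel(X[i], X[j], sigma) for j in range(n)] for i in range(n)]
    for i in range(n):
        K[i][i] += lam
    return solve_linear_system(K, y)

def krr_predict(X_train, alpha, x_new, sigma):
    """Predict one output as a kernel-weighted sum over training samples."""
    return sum(a * rbf_kernel(xi, x_new, sigma) for a, xi in zip(alpha, X_train))

# Toy problem (hypothetical data): learn y = sin(2x) from 30 samples.
X_train = [[0.1 * i] for i in range(30)]
y_train = [math.sin(2.0 * x[0]) for x in X_train]
alpha = krr_fit(X_train, y_train, sigma=0.5, lam=1e-3)

# Evaluate on points halfway between the training samples.
X_test = [[0.1 * i + 0.05] for i in range(29)]
y_pred = [krr_predict(X_train, alpha, x, sigma=0.5) for x in X_test]
rmse = math.sqrt(
    sum((p - math.sin(2.0 * x[0])) ** 2 for p, x in zip(y_pred, X_test)) / len(X_test)
)
print(f"test RMSE: {rmse:.4f}")
```

As in the toolbox demo, the hyperparameters here are fixed defaults; in practice `sigma` and `lam` would typically be selected by cross-validation.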

The standard SVR formulation only considers the single-output problem. When there are several output variables, other methods (neural networks, kernel ridge regression) must be deployed, but the good properties of SVR are then lost: the ε-insensitive loss function and sparsity. The proposed model, M-SVR, extends the single-output SVR by taking into account the nonlinear relations not only between features but also among the output variables, which are typically interdependent.

Combining Vapnik's classical ε-insensitive loss function with the Huber cost function leads to enhanced performance when different noise sources are present in the data. This cost function has been applied to system identification, gamma-filtering, and SVR.

This toolbox contains two kernel-based methods for semi-supervised regression. The methods rely on building a graph or hypergraph Laplacian with both the available labeled and unlabeled data, which is then used to deform the training kernel matrix. The deformed kernel is in turn used for support vector regression (SVR). Given the high computational burden involved, we present two alternative formulations, based on the Nyström method and the incomplete Cholesky factorization, to achieve operational processing times. The semi-supervised SVR algorithms are successfully tested in multiplatform leaf area index (LAI) estimation and oceanic chlorophyll concentration prediction. Experiments are carried out with both multispectral and hyperspectral data, demonstrating good generalization capabilities when only a small number of labeled samples is available, which is usually the case in biophysical parameter estimation.
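
The kernel deformation step can be sketched as follows. We assume the widely used form K_def = K - K (I + M K)^(-1) M K with M = gamma * L, where L is an ordinary graph Laplacian built from Gaussian edge weights over labeled and unlabeled points together. The helper names, the toy points, and the parameter values are hypothetical, the hypergraph variant is not shown, and the direct solve used here is exactly the costly step that the Nyström and incomplete Cholesky formulations are meant to avoid.

```python
import math

def rbf(a, b, sigma):
    """Gaussian similarity between two points."""
    return math.exp(-sum((p - q) ** 2 for p, q in zip(a, b)) / (2.0 * sigma ** 2))

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def solve(A, B):
    """Solve A X = B (matrix right-hand side) by Gauss-Jordan elimination."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for row in range(n):
            if row != col:
                f = M[row][col]
                M[row] = [v - f * w for v, w in zip(M[row], M[col])]
    return [row[n:] for row in M]

def graph_laplacian(X, sigma):
    """Unnormalized Laplacian L = D - W with Gaussian edge weights."""
    n = len(X)
    W = [[rbf(X[i], X[j], sigma) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    return [[(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]

def deformed_kernel(X, sigma, gamma):
    """Return K - K (I + gamma L K)^(-1) gamma L K over all points in X."""
    n = len(X)
    K = [[rbf(X[i], X[j], sigma) for j in range(n)] for i in range(n)]
    L = graph_laplacian(X, sigma)
    M = [[gamma * v for v in row] for row in L]
    MK = matmul(M, K)
    A = [[(1.0 if i == j else 0.0) + MK[i][j] for j in range(n)] for i in range(n)]
    correction = matmul(K, solve(A, MK))
    return [[K[i][j] - correction[i][j] for j in range(n)] for i in range(n)]

# Labeled and unlabeled samples are pooled to build the graph (toy 1-D points).
X_all = [[0.0], [0.2], [0.4], [1.0], [1.2], [1.4]]
K_def = deformed_kernel(X_all, sigma=0.5, gamma=1.0)
print([round(v, 4) for v in K_def[0]])
```

With `gamma = 0` the deformation vanishes and the original kernel is recovered. The rows and columns of the deformed matrix that correspond to labeled samples would then be passed to a standard SVR solver that accepts a precomputed kernel.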

The in-house developed Automated Radiative Transfer Models Operator (ARTMO) Graphic User Interface (GUI) is a software package that provides essential tools for running and inverting a suite of plant RTMs, both at the leaf and at the canopy level. ARTMO facilitates consistent and intuitive user interaction, thereby streamlining model setup, running, storage, and plotting of output spectra for any kind of optical sensor operating in the visible, near-infrared and shortwave infrared range (400-2500 nm). The ARTMO package includes physical, statistical and hybrid inversion, as well as model emulation. Some modules are pure machine learning techniques for regression, active learning, dimensionality reduction and feature ranking.

Nonlinear system identification based on support vector machines (SVM) has usually been addressed by means of standard SVM regression (SVR), which can be seen as an implicit nonlinear autoregressive moving average (ARMA) model in a reproducing kernel Hilbert space (RKHS). The proposal here is twofold. First, we propose the explicit consideration of an ARMA model in RKHS (SVM-ARMA2K). Second, we present a general class of SVM-based nonlinear system identification models, based on the use of composite Mercer's kernels.
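
To make the composite-kernel idea concrete, the sketch below builds a simple summation kernel: one Gaussian kernel on lagged inputs plus one on lagged outputs, instead of a single kernel on the stacked regressor. This is only one member of the composite family, and the function names, lag orders `P` and `Q`, kernel widths, and the toy system are hypothetical choices of ours, not the toolbox code.

```python
import math

def rbf(a, b, sigma):
    """Gaussian kernel between two lag vectors."""
    return math.exp(-sum((p - q) ** 2 for p, q in zip(a, b)) / (2.0 * sigma ** 2))

def build_regressors(x, y, P, Q):
    """Return input lags, output lags, and targets for each usable time index.

    Input lags at time n : [x[n], x[n-1], ..., x[n-Q+1]]
    Output lags at time n: [y[n-1], ..., y[n-P]]
    """
    start = max(P, Q - 1)  # first index where all lags are available
    x_lags = [[x[n - q] for q in range(Q)] for n in range(start, len(y))]
    y_lags = [[y[n - p] for p in range(1, P + 1)] for n in range(start, len(y))]
    targets = [y[n] for n in range(start, len(y))]
    return x_lags, y_lags, targets

def composite_gram(x_lags, y_lags, sigma_x, sigma_y):
    """Summation kernel: K = K_x(input lags) + K_y(output lags)."""
    n = len(x_lags)
    return [[rbf(x_lags[i], x_lags[j], sigma_x) + rbf(y_lags[i], y_lags[j], sigma_y)
             for j in range(n)] for i in range(n)]

# Toy nonlinear system (hypothetical): y[n] = 0.5*y[n-1] + tanh(x[n]).
x = [math.sin(0.3 * n) for n in range(20)]
y = [0.0] * 20
for n in range(1, 20):
    y[n] = 0.5 * y[n - 1] + math.tanh(x[n])

x_lags, y_lags, targets = build_regressors(x, y, P=2, Q=2)
K = composite_gram(x_lags, y_lags, sigma_x=1.0, sigma_y=1.0)
print(len(K), K[0][0])
```

The resulting Gram matrix can be handed to any SVR solver that accepts a precomputed kernel. Because a sum of Mercer kernels is itself a Mercer kernel, the input and output parts can use different kernel types or widths.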

Nonlinear system identification based on relevance vector machines (RVMs) has been traditionally addressed by stacking the input and/or output regressors and then performing standard RVM regression. Here we introduce a full family of composite kernels to integrate the input and output information in the mapping function. An improved trade-off between accuracy and sparsity is obtained in several benchmark problems. Also, the ARX-RVM yields confidence intervals for the predictions, and it is less sensitive to free parameter selection.

The kernel signal-to-noise ratio (KSNR) considers a least squares regression model that maximizes the signal variance while minimizing the estimated noise variance in a reproducing kernel Hilbert space (RKHS). The KSNR can be used in any kernel method to deal with correlated (possibly non-Gaussian) noise. Compared with standard approaches, KSNR yields better-fitted solutions and extracts more noise-free features.