RBIG4IT

Paper

"Information Theory Measures via Multidimensional Gaussianization"
Valero Laparra, Emmanuel Johnson, Gustau Camps-Valls, Raul Santos-Rodríguez, Jesús Malo
2020

arXiv: https://arxiv.org/abs/2010.03807


Abstract

Information theory is an outstanding framework to measure uncertainty, dependence and relevance in data and systems. It has several desirable properties for real world applications: it naturally deals with multivariate data, it can handle heterogeneous data types, and the measures can be interpreted in physical units. However, it has not been adopted by a wider audience because obtaining information from multidimensional data is a challenging problem due to the curse of dimensionality. Here we propose an indirect way of computing information based on a multivariate Gaussianization transform. Our proposal mitigates the difficulty of multivariate density estimation by reducing it to a composition of tractable (marginal) operations and simple linear transformations, which can be interpreted as a particular deep neural network. We introduce specific Gaussianization-based methodologies to estimate total correlation, entropy, mutual information and Kullback-Leibler divergence. We compare them to recent estimators showing the accuracy on synthetic data generated from different multivariate distributions. We made the tools and datasets publicly available to provide a test-bed to analyze future methodologies. Results show that our proposal is superior to previous estimators particularly in high-dimensional scenarios; and that it leads to interesting insights in neuroscience, geoscience, computer vision, and machine learning.


Software

RBIG Python toolbox
Includes tools to compute Information Theory measures used in the current paper [RBIG4IT2020]

Demo in Google Colab



RBIG Matlab toolbox
Includes tools to compute Information Theory measures and scripts to generate the synthetic data for the experiments in the current paper [RBIG4IT2020]


Paper Summary

  • The measures defined in this paper that can be computed using RBIG are the ones shown in the following figure, plus the Kullback-Leibler divergence. The main point is that RBIG gives accurate estimates of these measures even for multidimensional datasets.

    [Figure: information-theoretic measures that can be computed with RBIG]
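
The abstract describes the Gaussianization transform as a composition of marginal operations and simple linear transformations. The following is a minimal sketch of one such layer in plain Python, restricted to two dimensions for brevity. The function names, the rank-based empirical CDF, the toy data, and the random rotation angles are illustrative assumptions; this is not the API of the RBIG toolboxes.

```python
import math
import random
from statistics import NormalDist, fmean, pvariance


def marginal_gaussianize(column):
    """Map one feature to an approximately standard normal variable.

    Assumption: rank-based empirical CDF followed by the inverse normal CDF.
    """
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    inv_cdf = NormalDist().inv_cdf
    out = [0.0] * n
    for rank, i in enumerate(order):
        # (rank + 0.5) / n keeps the CDF value strictly inside (0, 1).
        out[i] = inv_cdf((rank + 0.5) / n)
    return out


def rotate_2d(xs, ys, angle):
    """Apply a 2-D rotation, the simplest orthogonal linear transform."""
    c, s = math.cos(angle), math.sin(angle)
    new_x = [c * x - s * y for x, y in zip(xs, ys)]
    new_y = [s * x + c * y for x, y in zip(xs, ys)]
    return new_x, new_y


def gaussianization_layer(xs, ys, angle):
    """One layer: marginal Gaussianization of each feature, then a rotation."""
    return rotate_2d(marginal_gaussianize(xs), marginal_gaussianize(ys), angle)


rng = random.Random(0)
# Hypothetical toy data with a nonlinear dependence between the two features.
xs = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
ys = [x * x + 0.1 * rng.gauss(0.0, 1.0) for x in xs]

# Compose several layers, each with its own random rotation angle.
for _ in range(20):
    xs, ys = gaussianization_layer(xs, ys, rng.uniform(0.0, 2.0 * math.pi))

print(f"mean = {fmean(xs):.3f}, variance = {pvariance(xs):.3f}")
```

Each layer is invertible in principle (the marginal step is monotonic and the rotation is orthogonal), which is what allows the composition to be read as a deep network.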


Extended results

This section shows extra results for the paper, mainly figures for results on synthetic data that would have taken too much space in the original paper.

Total Correlation

  • Results for Total Correlation.

    FIGURE: Total correlation estimation results, reported as relative mean absolute error. Results are given for different distributions: Gaussian, uniform, and Student PDFs (μ = 3, 5, 20, one row each). Each column corresponds to an experiment with a particular number of dimensions D. Mean and standard deviation are given over five trials.
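
For the Gaussian case, the total correlation has a closed form, T = ½ (Σᵢ log Σᵢᵢ − log det Σ), which can serve as ground truth when scoring an estimator. The sketch below computes that value and then reports the mean and standard deviation of the relative absolute error over five trials. It assumes the error is defined as |estimate − truth| / |truth|, and it uses a simple Gaussian plug-in estimator (sample covariance) as a stand-in for the estimator under test; it is not the RBIG estimator. The covariance matrix, sample size, and function names are hypothetical.

```python
import math
import random
from statistics import fmean, stdev


def cholesky(cov):
    """Lower-triangular L with cov = L L^T (cov must be positive definite)."""
    d = len(cov)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = cov[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def gaussian_total_correlation(cov):
    """Closed-form total correlation (nats) of a Gaussian with covariance cov."""
    d = len(cov)
    L = cholesky(cov)
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(d))
    sum_log_var = sum(math.log(cov[i][i]) for i in range(d))
    return 0.5 * (sum_log_var - log_det)


def sample_covariance(data):
    """Unbiased sample covariance of a list of d-dimensional rows."""
    n, d = len(data), len(data[0])
    mean = [fmean(row[i] for row in data) for i in range(d)]
    return [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in data) / (n - 1)
             for j in range(d)] for i in range(d)]


def relative_abs_error(estimate, truth):
    """Assumed error definition: |estimate - truth| / |truth|."""
    return abs(estimate - truth) / abs(truth)


rng = random.Random(0)
# Hypothetical 3-D Gaussian with pairwise correlation 0.5.
true_cov = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
true_tc = gaussian_total_correlation(true_cov)
chol = cholesky(true_cov)

errors = []
for trial in range(5):
    data = []
    for sample in range(2000):
        z = [rng.gauss(0.0, 1.0) for _ in range(3)]
        # Correlate the independent draws: x = L z has covariance L L^T.
        data.append([sum(chol[i][k] * z[k] for k in range(i + 1)) for i in range(3)])
    estimate = gaussian_total_correlation(sample_covariance(data))
    errors.append(relative_abs_error(estimate, true_tc))

print(f"relative error: mean = {fmean(errors):.4f}, std = {stdev(errors):.4f}")
```

Replacing the plug-in estimator with any other total correlation estimator leaves the scoring loop unchanged.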

Entropy

  • Results for Entropy.

    [Figures: entropy estimation results]
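
Joint entropy and total correlation are linked by the identity H(x) = Σᵢ H(xᵢ) − T(x), so a total correlation estimate plus one-dimensional marginal entropy estimates yields a joint entropy estimate. The sketch below checks the identity against closed forms for a two-dimensional Gaussian. The variances, the correlation, and the function names are hypothetical, and whether the toolboxes compute entropy exactly this way is not stated on this page.

```python
import math


def gaussian_marginal_entropy(variance):
    """Differential entropy (nats) of a 1-D Gaussian with the given variance."""
    return 0.5 * math.log(2.0 * math.pi * math.e * variance)


def entropy_from_total_correlation(marginal_entropies, total_correlation):
    """Joint entropy via H(x) = sum_i H(x_i) - T(x)."""
    return sum(marginal_entropies) - total_correlation


# Hypothetical 2-D Gaussian: variances 1 and 4, correlation 0.8.
var1, var2, rho = 1.0, 4.0, 0.8

# Closed-form total correlation of a 2-D Gaussian.
tc = -0.5 * math.log(1.0 - rho ** 2)

# Closed-form joint entropy: 0.5 * log((2 pi e)^2 * det(cov)).
det_cov = var1 * var2 * (1.0 - rho ** 2)
joint_closed_form = 0.5 * math.log((2.0 * math.pi * math.e) ** 2 * det_cov)

joint_from_tc = entropy_from_total_correlation(
    [gaussian_marginal_entropy(var1), gaussian_marginal_entropy(var2)], tc)

print(f"closed form = {joint_closed_form:.6f}, from T = {joint_from_tc:.6f}")
```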

KLD

  • Results for KLD.

    [Figures: KLD estimation results]
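
When both distributions are Gaussian, the Kullback-Leibler divergence has a closed form that can be used as a reference value for an estimator. The sketch below shows the univariate case and cross-checks it with a Monte Carlo average of log p(x) − log q(x) over samples from p. The distribution parameters, sample size, and function names are hypothetical; this illustrates how a ground-truth value can be obtained, not the Gaussianization-based estimator from the paper.

```python
import math
import random
from statistics import NormalDist, fmean


def gaussian_kld(mu_p, sigma_p, mu_q, sigma_q):
    """Closed-form KL(p || q) in nats for univariate Gaussians."""
    return (math.log(sigma_q / sigma_p)
            + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * sigma_q ** 2)
            - 0.5)


def monte_carlo_kld(p, q, n, rng):
    """Estimate KL(p || q) as the mean of log p(x) - log q(x) with x ~ p."""
    samples = [rng.gauss(p.mean, p.stdev) for _ in range(n)]
    return fmean(math.log(p.pdf(x)) - math.log(q.pdf(x)) for x in samples)


rng = random.Random(0)
p = NormalDist(mu=0.0, sigma=1.0)  # hypothetical source distribution
q = NormalDist(mu=1.0, sigma=2.0)  # hypothetical reference distribution

exact = gaussian_kld(p.mean, p.stdev, q.mean, q.stdev)
approx = monte_carlo_kld(p, q, 20000, rng)
reverse = gaussian_kld(q.mean, q.stdev, p.mean, p.stdev)

print(f"KL(p||q): exact = {exact:.4f}, Monte Carlo = {approx:.4f}")
print(f"KL(q||p): exact = {reverse:.4f}  (the divergence is not symmetric)")
```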

Mutual Information

  • Results for mutual information.

    [Figures: mutual information estimation results]
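
Mutual information between two multivariate variables is tied to total correlation by the identity T([X, Y]) = T(X) + T(Y) + I(X; Y), so any total correlation estimator can in principle be turned into a mutual information estimator. The sketch below verifies the identity on a Gaussian, where every term has a closed form. The covariance matrix, the split into X and Y, and the function names are hypothetical, and the exact procedure used by the toolboxes is not described on this page.

```python
import math


def log_det(cov):
    """Log-determinant of a positive definite matrix via Cholesky."""
    d = len(cov)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = cov[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return 2.0 * sum(math.log(L[i][i]) for i in range(d))


def sub_block(cov, idx):
    """Covariance of the variables listed in idx."""
    return [[cov[i][j] for j in idx] for i in idx]


def gaussian_tc(cov):
    """Closed-form total correlation (nats) of a Gaussian."""
    return 0.5 * (sum(math.log(cov[i][i]) for i in range(len(cov))) - log_det(cov))


def gaussian_mi(cov, x_idx, y_idx):
    """Closed-form I(X; Y) = 0.5 * (log det S_X + log det S_Y - log det S)."""
    joint = sub_block(cov, x_idx + y_idx)
    return 0.5 * (log_det(sub_block(cov, x_idx))
                  + log_det(sub_block(cov, y_idx))
                  - log_det(joint))


# Hypothetical 3-D Gaussian; X is the first two variables, Y the third.
cov = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
x_idx, y_idx = [0, 1], [2]

mi_direct = gaussian_mi(cov, x_idx, y_idx)
mi_from_tc = (gaussian_tc(cov)
              - gaussian_tc(sub_block(cov, x_idx))
              - gaussian_tc(sub_block(cov, y_idx)))

print(f"I(X;Y) direct = {mi_direct:.6f}, from total correlation = {mi_from_tc:.6f}")
```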


References

[RBIG4IT2020]: "Information Theory Measures via Multidimensional Gaussianization". V. Laparra, E. Johnson, G. Camps-Valls, R. Santos-Rodríguez, J. Malo. arXiv:2010.03807, 2020.
[TNN2011]: "Iterative Gaussianization: from ICA to Random Rotations". V. Laparra, G. Camps-Valls, J. Malo. IEEE Transactions on Neural Networks, 2011.


Copyright & Disclaimer

The programs are granted free of charge for research and education purposes only. Scientific results produced using the software provided shall acknowledge the use of the RBIG implementation provided by us. If you plan to use it for non-scientific purposes, don't hesitate to contact us.

Because the programs are licensed free of charge, there is no warranty for the program, to the extent permitted by applicable law. Except when otherwise stated in writing, the copyright holders and/or other parties provide the program "as is" without warranty of any kind, either expressed or implied, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. The entire risk as to the quality and performance of the program is with you. Should the program prove defective, you assume the cost of all necessary servicing, repair or correction.

In no event unless required by applicable law or agreed to in writing will any copyright holder, or any other party who may modify and/or redistribute the program, be liable to you for damages, including any general, special, incidental or consequential damages arising out of the use or inability to use the program (including but not limited to loss of data or data being rendered inaccurate or losses sustained by you or third parties or a failure of the program to operate with any other programs), even if such holder or other party has been advised of the possibility of such damages.