MIMS Consortium (2022-2023)

Cross Methodological Insights for Multi-source Data Integration

In biology, as in other scientific fields, the integration of multi-source data is more relevant than ever. The data collected are increasingly complex and their volume is growing, driven by the development of analytical platforms and imaging techniques, the rise of omics data, and so on.

Background and challenges

This context has stimulated the search for new methods allowing the joint analysis of several data sets (structured data, multi-block, multi-channel) in many fields, such as:

  • Machine Learning, where several approaches are considered for processing multi-source data (matrix factorisation, probabilistic approaches).
  • Chemometrics, where different methods are proposed to establish a chemical mapping of samples using several analytical techniques (generalisations of canonical analysis, the NIPALS algorithm and tensor decompositions).
  • Bioinformatics, where integrative methodological approaches give the most complete picture possible of the dynamics of molecular systems.

To help meet the challenge of analysing and exploiting multi-source data from both an exploratory and a predictive perspective, it is essential to bring together different viewpoints, practices and paradigms and to reconcile these approaches. It is also necessary to encourage collaboration between "method generators" and "data generators" in the various application fields.

This is the challenge that the MIMS consortium proposes to take up, by bringing together an interdisciplinary community working on approaches to the analysis and integration of multi-source data.

Goals

MIMS is a multidisciplinary consortium of more than 60 researchers, whose objective is to examine the analysis and exploitation of multi-source data from both an exploratory and a predictive perspective.

The consortium brings together skills in information processing, biological sciences and analytics. This multidisciplinarity will be implemented and managed through the sharing of data, practices and methods between the partners, with the aim of formalising a scientific project that meets a common challenge: the optimal analysis of multi-source data for exploratory and predictive purposes.

Contacts:

Units involved and partners

INRAE participants and their expertise

Food, bioproducts and waste division

  • USC StatSC: Sensometry, Chemometrics, Statistics, Multispectral imaging
  • UR BIA: Chemometrics, Computer science
  • UR QuaPA: Volatolomics, MRI Chemometrics, Data Analysis, Image Analysis, System & Data Management
  • UMR SPO: Chemometrics
  • LBE: Biostatistics, Machine learning

Mathematics and digital technologies division

  • UMR TAP: Chemometrics
  • UMR MAIAGE: Mathematical statistics, Applied statistics, Bioinformatics

Nutrition, Chemical Food Safety and Consumer Behaviour division

  • CSGA (Centre des Sciences du Goût et de l'Alimentation): Chemometrics
  • UNH (Unité Nutrition Humaine): Bioinformatics, Metabolomics, Chemometrics
  • PhAN: Perinatal nutrition and metabolic diseases, Bioinformatics, Data analysis, Metagenomics and metabolomics
  • LABERCA: Metabolomics, Chemometrics, Expology, Epidemiology

Microbiology and the food chain division

  • Micalis: Biologist, Microbiota, Data Analysis
  • Prose

Ecology and Biodiversity division

  • BioForA: Quantitative Genetics, Modelling
  • LBLGC: Physiology

Plant Biology and Breeding division

  • AGAP Institut: Quantitative genetics, Genomics, Biochemistry, Evolutionary genetics, Selection, Ecophysiologist, Biostatistics, Bioinformatics

Animal Physiology and Livestock system division

  • UMR SELMET: Biometrics, Chemometrics, Machine Learning, Agronomy

Partners and their expertise

  • Faculté des Sciences, Paris (Centre Borelli): Unsupervised learning, Statistics, Graph networks, Bioinformatics
  • INRIA (Equipe projet LORIA): Knowledge Discovery, Life Sciences
  • Université de Genève (Sciences Analytiques): Metabolomics, Chemometrics
  • Université de Toulouse (Institut de Mathématiques de Toulouse): Statistics, Multi-omics data analysis and integration
  • ANSES (Laboratoire de Ploufragan-Plouzané): Statistics, Multi-block methods, Epidemiology
  • CNAM (EPN6 - Mathématiques et Statistique): Analysis of complex heterogeneous data, Clusterwise methods, High dimensional classification
  • Université de Paris-Saclay (Signaux et Statistique): Multi-block data analysis, Tensor analysis (high dimensional), Structural equation models
  • Université de Montpellier (Institut Montpelliérain Alexander Grothendieck): Supervised component models, Classification
  • ADLIN (private partner): Finance, Strategy, Multi-omics, Bioinformatics, Transcriptomics, Visualisation
  • Institut du vin et de la vigne (IFV): Chemometrics, Analytical Chemistry

See also

 

  • Bersanelli M, Mosca E, Remondini D, et al. (2016). Methods for the integration of multi-omics data: mathematical aspects. BMC Bioinformatics 17, S15. https://doi.org/10.1186/s12859-015-0857-9
  • Boccard J, Schvartz D, Codesido S, Hanafi M, Gagnebin Y, Ponte B, Jourdan F, Rudaz S. (2021). Gaining insights into metabolic networks using chemometrics and bioinformatics: chronic kidney disease as a clinical model. Frontiers in Molecular Biosciences 8, 682559.
  • Boutalbi R, Labiod L, Nadif M. (2021). Implicit consensus clustering from multiple graphs. Data Mining and Knowledge Discovery 35(6), 2313-2340.
  • Hanafi M, Kiers HAL. (2006). Analysis of K sets of data, with differential emphasis on agreement between and within sets. Computational Statistics and Data Analysis 51(3), 1491-1508.
  • Eicher T, Kinnebrew G, Patt A, Spencer K, Ying K, Ma Q, Machiraju R, Mathé AEA. (2020). Metabolomics and Multi-Omics Integration: A Survey of Computational Methods and Resources. Metabolites 10(5), 202. https://doi.org/10.3390/metabo10050202. PMID: 32429287; PMCID: PMC7281435.