Exploratory project GenIALearn (2021 - 2023)

Application of machine learning and deep learning to improve animal genomic selection

The development of genomic selection - and other "omics" analyses such as metagenomics, transcriptomics, metabolomics and proteomics - now makes it possible to characterise animals using thousands of measurements. These large datasets are integrated into models to predict production traits as accurately as possible.

Background and challenges

The most commonly used models in genomic prediction (additive genetic models such as GBLUP) are very effective at predicting the genetic value of animals for a few genetically correlated traits. However, this type of model cannot integrate a very large number of heterogeneous measurements, nor can it predict many output traits without knowing their genetic correlations. Moreover, it is limited in its ability to account for the many non-linear interactions that occur between regions of the genome or environmental factors.
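
For readers unfamiliar with GBLUP, the additive model is commonly written as below. This is a sketch in standard textbook notation, not taken from the project's own documents: the symbols (y, Z, g, G, W, M, p_j) and the choice of VanRaden's first genomic relationship matrix are assumptions made for illustration.

```latex
% Single-trait GBLUP (assumed standard form, for illustration only)
\begin{align}
  \mathbf{y} &= \mathbf{1}\mu + \mathbf{Z}\mathbf{g} + \mathbf{e},
  \qquad \mathbf{g} \sim \mathcal{N}(\mathbf{0}, \mathbf{G}\sigma_g^2),
  \qquad \mathbf{e} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}\sigma_e^2) \\
  % Genomic relationship matrix built from centred genotypes (coded 0/1/2)
  \mathbf{G} &= \frac{\mathbf{W}\mathbf{W}^{\top}}{2\sum_{j} p_j (1 - p_j)},
  \qquad W_{ij} = M_{ij} - 2 p_j \\
  % Predicted genetic values given the variance components
  \hat{\mathbf{g}} &= \sigma_g^2\, \mathbf{G}\mathbf{Z}^{\top}\mathbf{V}^{-1}
  (\mathbf{y} - \mathbf{1}\hat{\mu}),
  \qquad \mathbf{V} = \mathbf{Z}\mathbf{G}\mathbf{Z}^{\top}\sigma_g^2 + \mathbf{I}\sigma_e^2
\end{align}
```

Written this way, the genetic value is a linear function of the marker genotypes (g = Wa, with independent marker effects a), which is why interaction terms between genome regions are not represented unless they are added explicitly.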

In order to overcome these obstacles, we propose statistical learning (machine learning) and deep learning methods, derived from AI, to process both additive genetic information and non-linear genetic information present in massive genotyping data.

Goals

The GenIALearn project proposes to evaluate the performance of statistical and deep learning methods for the joint prediction of multiple complex traits, by integrating massive genotyping data. Two main families of methods will be compared with each other and against the reference method, GBLUP:

  • On the one hand, ensemble learning methods (random forests, gradient boosting), coupled with a learning step to represent the input data, in order to establish reference prediction levels;
  • On the other hand, deep learning methods with different neural network architectures, coupled with a learning step on a massive database, which should produce predictive models suited to animal genomic selection.

 

Contacts/coordination:

Eric Barrey, UMR GABI

Didier Boichard, UMR GABI

Partnerships

INRAE participants

  • Animal Genetics division, UMR GABI. Expertise: fine phenotyping of complex traits; multi-omics (genotyping, transcriptomics, metagenomics, metabolomics); evaluation of genetic values and complex multi-trait predictions.
  • Mathematics and Digital Technologies division, MIA - Paris. Expertise: modelling; statistical learning; machine learning; large and heterogeneous data; application to life sciences.

Partners

  • UEVE, Université Paris-Saclay, IBISC. Expertise: neural network construction methods and deep learning; applications to transcriptomic and image analysis.

Publications

Fatima Shokor, Pascal Croiseau, Hugo Gangloff, Romain Saintilan, Thierry Tribout, Tristan Mary-Huard, Beatriz C.D. Cuyabano. Deep Learning and GBLUP Integration: An Approach that Identifies Nonlinear Genetic Relationships Between Traits. bioRxiv 2024.03.23.585208; doi: https://doi.org/10.1101/2024.03.23.585208

Communications  

  • Eric Barrey, Blaise Hanczar, Julien Chiquet, Didier Boichard, Jocelyn de Goër de Herve, et al. Benchmarking predictive models: evaluating parametric, ensemble, and deep learning approaches for animal phenotype prediction from genotypes. AI and biology Symposium, EMBO EMBL, Mar 2024, Heidelberg, Germany. ⟨hal-04510253⟩
  • Eric Barrey 1, Pierre Fumeron 1, Anne Ricard 1, 2, Blaise Hanczar 3 (1 Université Paris-Saclay, AgroParisTech, INRAE, GABI UMR1313, Jouy-en-Josas, France. 2 IFCE, Recherche et Innovation, 61310 Exmes, France. 3 IBISC, UEVE, Université Paris-Saclay, France). Deep Learning Application for Predicting Endurance Horse Racing Performance via High-Density Genotyping. 14th International Havemeyer Foundation Horse Genome Workshop, May 12-15, 2024, Caen, France. Abstracts book 25. https://horse-genome.workshop.inrae.fr/content/download/723/7227?version=2
  • F Shokor, P Croiseau, R Saintilan, T Mary-Huard, H Gangloff, et al. Exploring Non-Linear Genetic Relationships Between Correlated Traits. 74th Annual Meeting of the European Association for Animal Production, INRAE, Aug 2023, Lyon, France. ⟨hal-04247381⟩