Document Type : Research Paper

Authors

Department of Animal Science, Faculty of Agriculture and Natural Resources, Arak University, Arak, 38156-8-8349, Iran.

Abstract

The aim of this research was to compare the efficiency and performance of an artificial neural network method with that of principal component analysis (PCA) in discriminating between horse breeds. Two methods, a perceptron neural network (Olden) and PCA, were used to identify a subset of SNP markers with the highest breed discrimination potential and to investigate how animals are assigned to their breed groups. The results showed that the network method (Olden) was able to separate all 37 horse breeds using a small subset of SNP markers (8,000 markers), with the same capability as the full set of genomic markers (98% accuracy). The PCA selection method was only able to identify and separate breeds of diverse geographical origin. According to these results, the PCA method is not error-free and requires changes and modifications before it can be run on genomic data. The results of this study provide practical approaches for designing cost-effective arrays for discriminating between horse breeds.
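As a rough illustration of how a neural network can rank markers, the sketch below computes a connection-weight importance score for each input and keeps the highest-ranked ones. It assumes that "Olden" refers to the connection-weight approach, as in the `olden` function of the NeuralNetTools package listed in the references, and that the network has one hidden layer and a single output. The function names, the toy weights, and the cut-off `k` are hypothetical and are not taken from the study.

```python
def connection_weight_importance(input_hidden, hidden_output):
    """Connection-weight importance for a one-hidden-layer network.

    input_hidden[i][h] is the weight from input (SNP) i to hidden unit h.
    hidden_output[h] is the weight from hidden unit h to the single output.
    The score of input i is the sum over h of input_hidden[i][h] * hidden_output[h].
    """
    return [
        sum(w * v for w, v in zip(row, hidden_output))
        for row in input_hidden
    ]


def top_markers(importance, k):
    """Return the indices of the k inputs with the largest absolute score."""
    ranked = sorted(
        range(len(importance)),
        key=lambda i: abs(importance[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy weights (hypothetical): 4 SNP inputs, 2 hidden units, 1 output.
input_hidden = [
    [0.5, -1.0],
    [2.0, 1.0],
    [-0.1, 0.2],
    [1.0, -1.5],
]
hidden_output = [1.0, 2.0]

importance = connection_weight_importance(input_hidden, hidden_output)
selected = top_markers(importance, 2)
print(importance)
print(selected)
```

In a multi-breed setting the network would have one output per breed, so a score of this kind would need to be computed per output and then combined. How that was done in the study is not stated in the abstract.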
 

Keywords

References

  1. Howe KL, Achuthan P, Allen J, Allen J, Alvarez-Jarreta J, Amode MR, et al. (2020) Ensembl 2021. Nucleic Acids Research 49(D1): D884-D91.
  2. Paschou P, Ziv E, Burchard EG, Choudhry S, Rodriguez-Cintron W, Mahoney MW, et al. (2007) PCA-Correlated SNPs for Structure Identification in Worldwide Human Populations. PLoS Genetics 3(9): e160.
  3. Menhaj M (2009) Fundamentals of Neural Networks of Computational Intelligence. Tafresh Unit, Professor Hesabi Publication Center: Tehran University of Technology (Polytechnic of Tehran). (In Persian)
  4. Wilkinson S, Wiener P, Archibald AL, Law A, Schnabel RD, McKay SD, et al. (2011) Evaluation of approaches for identifying population informative markers from high density SNP Chips. BMC Genetics 12(1): 45.
  5. Bertolini F, Galimberti G, Schiavo G, Mastrangelo S, Di Gerlando R, Strillacci M, et al. (2018) Preselection statistics and Random Forest classification identify population informative single nucleotide polymorphisms in cosmopolitan and autochthonous cattle breeds. Animal 12(1): 12-9.
  6. Tu JV (1996) Advantages and disadvantages of using artificial neural networks versus logistic regression for predicting medical outcomes. Journal of Clinical Epidemiology 49(11): 1225-31.
  7. Petersen JL, Mickelson JR, Cothran EG, Andersson LS, Axelsson J, Bailey E, et al. (2013) Genetic Diversity in the Modern Horse Illustrated from Genome-Wide SNP Data. PLoS ONE 8(1): e54997.
  8. Olden JD and Jackson DA (2002) Illuminating the “black box”: a randomization approach for understanding variable contributions in artificial neural networks. Ecological Modelling 154(1): 135-50.
  9. Paetkau D, Calvert W, Stirling I and Strobeck C (1995) Microsatellite analysis of population structure in Canadian polar bears. Molecular Ecology 4(3): 347-54.
  10. R Core Team (2017) R: A Language and Environment for Statistical Computing. Available from: https://www.R-project.org/
  11. Fritsch S and Guenther F (2016) neuralnet: Training of Neural Networks. Available from: https://CRAN.R-project.org/package=neuralnet
  12. Beck M (2016) NeuralNetTools: Visualization and Analysis Tools for Neural Networks. Available from: https://CRAN.R-project.org/package=NeuralNetTools
  13. Hinrichs AL, Larkin EK and Suarez BK (2009) Population stratification and patterns of linkage disequilibrium. Genetic Epidemiology 33(S1): S88-S92.
  14. Reich D, Price AL and Patterson N (2008) Principal component analysis of genetic data. Nature Genetics 40: 491.
  15. Lewis J, Abas Z, Dadousis C, Lykidis D, Paschou P and Drineas P (2011) Tracing Cattle Breeds with Principal Components Analysis Ancestry Informative SNPs. PLoS ONE 6(4): e18007.
  16. Dimauro C, Cellesi M, Steri R, Gaspa G, Sorbolini S, Stella A, et al. (2013) Use of the canonical discriminant analysis to select SNP markers for bovine breed assignment and traceability purposes. Animal Genetics 44(4): 377-382.
  17. Dimauro C, Nicoloso L, Cellesi M, Macciotta NPP, Ciani E, Moioli B, et al. (2015) Selection of discriminant SNP markers for breed and geographic assignment of Italian sheep. Small Ruminant Research 128: 27-33.
  18. Biffani S, Dimauro C, Macciotta N, Rossoni A, Stella A and Biscarini F (2015) Predicting haplotype carriers from SNP genotypes in Bos taurus through linear discriminant analysis. Genetics Selection Evolution: GSE 47(1): 4.
  19. Sorbolini S, Gaspa G, Steri R, Dimauro C, Cellesi M, Stella A, et al. (2016) Use of canonical discriminant analysis to study signatures of selection in cattle. Genetics Selection Evolution: GSE 48(1): 58.
  20. Moradi MH, Khaltabadi-Farahani AH, Khodaei-Motlagh M, Kazemi-Bonchenari M and McEwan J (2021) Genome-wide selection of discriminant SNP markers for breed assignment in indigenous sheep breeds. Annals of Animal Science 21(3): 807-831.
  21. Kumar H, Panigrahi M, Saravanan KA, Parida S, Bhushan B, Gaur GK, et al. (2021) SNPs with intermediate minor allele frequencies facilitate accurate breed assignment of Indian Tharparkar cattle. Gene 777: 145473.
  22. Novembre J and Stephens M (2008) Interpreting principal component analyses of spatial population genetic variation. Nature Genetics 40(5): 646-649.
  23. Azizi Z, Rafat A, Shoja J, Moradi Shahrbabak H and Moradi Shahrbabak M (2016) Study of population structure and stratification of two buffalo ecotypes with dense single nucleotide polymorphism markers using Admixture, MDS, PCA and GC methods. Journal of Agricultural Biotechnology 8(2): 53-67. (In Persian)
  24. Azizi Z, Moradi Shahrbabak H and Moradi Shahrbabak M (2017) Comparison of PCA and DAPC methods for analysis of Iranian buffalo population structure using SNPchip90k data. Iranian Journal of Animal Science (Iranian Journal of Agricultural Sciences) 48(2): 153-161. (In Persian)
  25. Ringnér M (2008) What is principal component analysis? Nature Biotechnology 26: 303-304.