4th international edition and 13th Iranian Conference on Bioinformatics
Machine-learning based biomarker discovery for Striga resistance in sorghum
Authors:
Leyla Nazari (1), Afshar Estakhr (2)
1- Fars Agricultural and Natural Resources Research and Education Center, Agricultural Research, Education and Extension Organization (AREEO), Shiraz, Iran
2- Fars Agricultural and Natural Resources Research and Education Center, Agricultural Research, Education and Extension Organization (AREEO), Shiraz, Iran
Keywords:
Striga hermonthica, ReliefF, Information gain, Gain Ratio
Abstract:
Sorghum bicolor is the fifth most important cereal crop in the world, cultivated in almost 110 countries, predominantly in Asia and Africa but also in Europe, America, and Oceania. Despite its outstanding resilience to abiotic stresses, approximately 20% of sorghum yield loss each year is due to infestation with the parasitic weed Striga hermonthica. Identifying sources of Striga resistance genes within sorghum is imperative for developing resistant cultivars. Feature selection algorithms are frequently employed as a preprocessing step in machine learning pipelines applied to biological data to identify relevant features. The objective of this study is to select the important features while improving prediction accuracy. We therefore propose an integrated strategy combining Information gain, Gain Ratio, and ReliefF to filter important genes involved in Striga resistance in sorghum. To this end, we searched the public database National Center for Biotechnology Information Gene Expression Omnibus (NCBI GEO) and found a study with accession number GSE216351. The DESeq2 package (v 1.34.0) was employed, and genes with fewer than 10 counts across all samples were filtered out. The resulting matrix of 25037 genes and 31 samples was submitted to the feature selection algorithms, and the top 50 genes ranked by each of Information gain, Gain Ratio, and ReliefF were used as input for a Venn diagram. Seven genes (Sobic.001G429700, Sobic.002G087200, Sobic.010G134900, Sobic.005G063800, Sobic.005G192400, Sobic.006G053500, and Sobic.002G307700) were identified in common by all three methods. To validate the accuracy of our feature selection methods, we tested different algorithms from the classifier families bayes, functions, lazy, meta, rules, and trees, and selected the best-performing algorithm from each family. Modeling was performed using 10-fold cross-validation.
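The filtering, ranking, and intersection steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the gene names and counts are made up, the low-count filter is read as "total count across all samples below 10", expression is binarized at the median before scoring (the abstract does not say how values were discretized), and only Information gain and Gain Ratio are shown, so the ReliefF ranking that forms the third set is omitted.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def info_gain_and_ratio(values, labels):
    """Information gain and gain ratio of one gene's expression profile.

    Assumption for illustration: expression is split into two bins at the
    median before scoring.
    """
    cut = sorted(values)[len(values) // 2]
    bins = [v >= cut for v in values]
    n = len(labels)
    conditional = 0.0
    for b in set(bins):
        subset = [y for x, y in zip(bins, labels) if x == b]
        conditional += len(subset) / n * entropy(subset)
    gain = entropy(labels) - conditional
    split_info = entropy(bins)
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio


def filter_low_counts(counts, min_total=10):
    """Drop genes whose counts summed over all samples are below min_total."""
    return {g: row for g, row in counts.items() if sum(row) >= min_total}


def top_k(scores, k):
    """Names of the k highest-scoring genes."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


# Hypothetical toy data: 4 genes x 6 samples (3 control, 3 infected).
counts = {
    "geneA": [5, 7, 6, 50, 60, 55],     # higher in infected samples
    "geneB": [30, 31, 29, 30, 32, 28],  # no relation to the class
    "geneC": [90, 80, 85, 10, 12, 9],   # lower in infected samples
    "geneD": [0, 1, 0, 2, 0, 1],        # total count 4, removed by the filter
}
labels = ["control"] * 3 + ["infected"] * 3

filtered = filter_low_counts(counts)
gain_scores, ratio_scores = {}, {}
for gene, row in filtered.items():
    gain_scores[gene], ratio_scores[gene] = info_gain_and_ratio(row, labels)

# Genes ranked in the top k by both criteria (the overlap in a Venn diagram).
shared = top_k(gain_scores, 2) & top_k(ratio_scores, 2)
print(sorted(shared))
```

In the study itself the same idea is applied with k = 50 and three rankings, and the overlap of the three top-50 sets gives the seven genes listed.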
NaiveBayes, SGD, IBk, AdaBoostM1, PART, and J48 were the best-performing prediction models from their respective families for discriminating control from infected crops. The highest performance was obtained by NaiveBayes, with 96.7742% accuracy. Given the high efficiency of these seven genes in classifying control and infected crops, they could be suggested as biomarkers for Striga resistance in sorghum.
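To make the reported figure concrete: with 31 samples, 96.7742% accuracy corresponds to 30 of 31 samples classified correctly when predictions are pooled over the folds. The sketch below shows how such a pooled 10-fold cross-validation accuracy can be computed. It rests on several assumptions: the classifier is a small hand-written Gaussian Naive Bayes rather than the tool the authors used (the abstract does not name the software), folds are assigned by a simple `i % k` rule rather than stratified sampling, and the data are synthetic, with one deliberately atypical infected sample planted at index 7.

```python
import math


def fit_gaussian_nb(X, y):
    """Estimate per-class prior, feature means, and feature variances."""
    model = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small constant keeps variances positive for constant features.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[label] = (n / len(y), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest Gaussian log-posterior for x."""
    def log_posterior(params):
        prior, means, variances = params
        total = math.log(prior)
        for v, m, s2 in zip(x, means, variances):
            total += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return total
    return max(model, key=lambda label: log_posterior(model[label]))


def cross_val_accuracy(X, y, k=10):
    """Pooled accuracy over k folds; sample i is held out in fold i % k."""
    correct = 0
    for fold in range(k):
        train = [i for i in range(len(y)) if i % k != fold]
        test = [i for i in range(len(y)) if i % k == fold]
        model = fit_gaussian_nb([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(model, X[i]) == y[i] for i in test)
    return correct / len(y)


# Synthetic data: 31 samples, 2 features. Odd indices are "infected" with
# high values, except index 7, which is given control-like values.
X, y = [], []
for i in range(31):
    infected = i % 2 == 1
    base = 5.0 if infected and i != 7 else 1.0
    X.append([base + 0.1 * (i % 5), base + 0.1 * (i % 3)])
    y.append("infected" if infected else "control")

accuracy = cross_val_accuracy(X, y, k=10)
print(f"{accuracy * 100:.4f}%")
```

Because the planted sample is the only one misclassified, the pooled accuracy equals 30/31, which rounds to the 96.7742% reported in the abstract.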