4th international edition and 13th Iranian Conference on Bioinformatics
Machine-learning based biomarker discovery for Striga resistance in sorghum
Authors:
Leyla Nazari (1)
Afshar Estakhr (2)
1- Fars Agricultural and Natural Resources Research and Education Center, Agricultural Research, Education and Extension Organization, Shiraz, Iran
2- Fars Agricultural and Natural Resources Research and Education Center, Agricultural Research, Education and Extension Organization, Shiraz, Iran
Keywords:
Striga hermonthica, ReliefF, Information gain, Gain Ratio
Abstract:
Sorghum bicolor is the fifth most important cereal crop in the world, cultivated in almost 110 countries, predominantly in Asia and Africa but also in Europe, America, and Oceania. Despite its outstanding resilience to abiotic stresses, approximately 20% of sorghum yield is lost annually to infestation with the parasitic weed Striga hermonthica. Identifying sources of Striga resistance genes within sorghum is imperative for developing resistant sorghum cultivars. Feature selection algorithms are frequently employed as a preprocessing step in machine learning pipelines applied to biological data to identify relevant features. The objective of this study was to select the important features while improving prediction accuracy. We therefore propose an integrated strategy combining Information gain, Gain Ratio, and ReliefF to filter important genes involved in Striga resistance in sorghum. To this end, we searched the public database National Center for Biotechnology Information Gene Expression Omnibus (NCBI GEO) and found a study with accession number GSE216351. The DESeq2 package (v 1.34.0) was employed, and genes with fewer than 10 counts across all samples were filtered out. The resulting matrix of 25037 genes and 31 samples was submitted to the feature selection algorithms, and the top 50 genes ranked by each of Information gain, Gain Ratio, and ReliefF were used as input for a Venn diagram. Seven genes (Sobic.001G429700, Sobic.002G087200, Sobic.010G134900, Sobic.005G063800, Sobic.005G192400, Sobic.006G053500, and Sobic.002G307700) were identified by all three methods. To validate our feature selection, we tested different algorithms from the bayes, functions, lazy, meta, rules, and trees classifier families, and the best-performing algorithm from each family was selected. Modeling was performed using 10-fold cross-validation.
NaiveBayes, SGD, IBk, AdaBoostM1, PART (rules), and J48 were the best-performing prediction models from their respective classifier families for discriminating control from infected crops. The highest performance was obtained by NaiveBayes, with 96.7742% accuracy. Given how effectively these seven genes classify control and infected crops, they could be suggested as biomarkers for Striga resistance in sorghum.
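The ranking-and-intersection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the gene names and expression values are made up, the equal-width discretization into three bins is an assumption (the abstract does not say how expression values were discretized), and ReliefF is left out for brevity. The sketch scores each gene by Information gain and Gain Ratio, takes the top k from each ranking, and keeps the genes common to both, which is what the Venn diagram does for the three top-50 lists.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def discretize(values, bins=3):
    """Equal-width binning (an assumed choice, not taken from the abstract)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]

def info_gain_and_ratio(values, labels, bins=3):
    """Return (information gain, gain ratio) of one feature w.r.t. the class."""
    binned = discretize(values, bins)
    n = len(labels)
    conditional = 0.0
    for b in set(binned):
        subset = [y for x, y in zip(binned, labels) if x == b]
        conditional += len(subset) / n * entropy(subset)
    gain = entropy(labels) - conditional
    split_info = entropy(binned)  # entropy of the feature's own bins
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio

def top_k(scores, k):
    """Names of the k highest-scoring features."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])

# Hypothetical toy data: 4 control and 4 infected samples, 4 made-up genes.
labels = ["control"] * 4 + ["infected"] * 4
expression = {
    "geneA": [1, 2, 1, 2, 9, 10, 9, 10],  # separates the two classes
    "geneB": [5, 5, 6, 5, 5, 6, 5, 5],    # uninformative
    "geneC": [1, 1, 2, 2, 8, 9, 9, 8],    # separates the two classes
    "geneD": [3, 7, 3, 7, 3, 7, 3, 7],    # uninformative
}

gains, ratios = {}, {}
for gene, values in expression.items():
    gains[gene], ratios[gene] = info_gain_and_ratio(values, labels)

# Genes ranked in the top k by both criteria (the Venn-diagram intersection).
common = top_k(gains, 2) & top_k(ratios, 2)
print(sorted(common))
```

In the study itself the intersection was taken over three top-50 lists from a 25037-gene matrix; the same set intersection extends directly to a third ranking once ReliefF scores are available.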
List of papers
List of archived papers
Using interpretable deep learning models and multi-objective data for computational discovery of new drugs
Masoud Ahmadlou
Molecular Docking Study of Pyrazole Interaction with Bovine Serum Albumin (BSA): Insights from Drug-Protein Binding
Parisa Zallou - Yaghub Pazhang - Ebrahim Nemati-Kande
Age-Related Gene Expression Changes in Microglia: Insights from the Nygen platform and Brain-Aging Atlas
Anahita Esmaeili-Mehr - Aygin Zabtkar - Seyed Abolhassan Shahzade Fazeli - Amir Amiri-Yekta - Yaser Tahmtani
Multi-Target Drug Discovery for Rheumatoid Arthritis: A Comprehensive Computational Approach Using Bioactive Compounds
Pegah Mansouri - Pardis Mansouri - Sohrab Najafipour - Seyed Amin Kouhpayeh - Akbar Farjadfar - Esmaeil Behmard
Comprehensive Analysis of EEG Signals for Machine Learning-Based Depression Detection
Mikaeil Tabarraei - Sepideh Jabbari
Unlocking the Hidden Potential of Leuconostoc: Insights from Genomic Analysis
Bahram Bassami - Niloufar Zamanpour - Najmeh Salehi - Javad Hamedi
Discovering Moonlighting Proteins with AI and Explainability
Masoud Mahdavifar - Milad Besharatifard - Fateme Zaremirakabad
Prediction of E8 mpox virus protein structure: a potential to design inhibitor
Mahsa Kazemi - Saeide Karimi - Maryam Kheirani nasab - Maryam Kazemi - Mahboobeh Nazari
Identification of Essential Genes and Suitable Drug Combinations for Colorectal Cancer Treatment Based on Systems Biology Approaches
Yasna Kazemghamsari
Predicting Anticancer Drug Repurposing Candidates using Knowledge Graphs
Marzieh Khodadadi AghGhaleh - Rooholah Abedian - Reza Zarghami - Sajjad Gharaghani