4th international edition and 13th Iranian Conference on Bioinformatics
A novel approach to find Biomarkers affecting Autism Spectrum Disorder (ASD) Using Machine Learning
Authors:
Amir Zarghami (1)
Milad Besharatifard (2)
Fatemeh Zare-Mirakabad (3)
1- Computational Biology Research Center (CBRC), Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran
2- Computational Biology Research Center (CBRC), Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran
3- Computational Biology Research Center (CBRC), Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran
Keywords:
Neurodevelopmental, Random Forest, Gene Expression, Feature Importance, Biological Pathway
Abstract:
Autism Spectrum Disorder (ASD) is a neurodevelopmental condition that presents with challenges in social interaction, communication, and repetitive behaviors. Early detection is essential but remains difficult due to the variability of symptoms and their overlap with other developmental disorders. Advances in diagnostic techniques increasingly employ machine learning (ML) models to analyze biological and behavioral data, aiming to improve early detection and supplement traditional diagnostic methods with efficient, cost-effective tools. In this study, we analyzed gene expression data (GSE18123) comprising samples obtained from two distinct platforms, which were used to construct two separate datasets. To minimize the risk of data leakage, one dataset was designated for feature reduction, while the other was used for training and testing the machine learning model. From the original gene expression data, we derived a pathway expression dataset. To address the common problem of having far more pathways than available samples, we developed a secondary dataset by creating paired and concatenated samples. The ML model was trained on this secondary dataset. During testing, each test sample was paired with all training samples, and predictions were made for each pairing. The predicted labels were then aggregated to derive the final label for each test sample. The primary ML method used in this study was Random Forest, although the approach is adaptable to other machine learning techniques. For feature importance analysis, we identified pathways with high importance scores in both of their appearances in the secondary dataset (each pathway occurs once for each member of a concatenated pair). Pathways with consistently high scores were selected, and genes frequently appearing within these pathways were prioritized as potential biomarkers. The model achieved an accuracy of 0.85, precision of 0.81, recall (sensitivity) of 1.00, and an F1-score of 0.90. These results suggest that the model is effective at distinguishing between ASD and non-ASD cases. Additionally, we identified over 50 candidate genes for ASD, several of which have been reported in previous studies. Notably, we also identified a novel candidate gene, ATP6V1F, which scored highly and may represent a new potential biomarker associated with ASD.
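The pairing scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions, since the abstract does not specify them: pairs are ordered and each pair carries the label of its first member, the per-pair predictions are aggregated by majority vote, and a pathway counts as consistently important when its score reaches a hypothetical threshold in both halves of the concatenated vector. The names `build_pairs`, `predict_by_pairing`, `consistent_pathways`, and the stand-in `CentroidClassifier` are hypothetical, and the data are toy values, not GSE18123.

```python
from collections import Counter
from itertools import permutations


def build_pairs(X, y):
    """Concatenate every ordered pair of distinct samples.

    Assumption: a pair takes the label of its first member.
    """
    pairs, labels = [], []
    for i, j in permutations(range(len(X)), 2):
        pairs.append(list(X[i]) + list(X[j]))
        labels.append(y[i])
    return pairs, labels


def predict_by_pairing(model, X_train, x_test):
    """Pair one test sample with every training sample, then majority-vote."""
    test_pairs = [list(x_test) + list(x_tr) for x_tr in X_train]
    votes = Counter(model.predict(test_pairs))
    return votes.most_common(1)[0][0]


def consistent_pathways(importances, n_pathways, threshold):
    """Indices of pathways whose importance reaches `threshold` in both halves."""
    first, second = importances[:n_pathways], importances[n_pathways:]
    return [k for k in range(n_pathways)
            if first[k] >= threshold and second[k] >= threshold]


class CentroidClassifier:
    """Tiny stand-in classifier (nearest class centroid), used only for the demo."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, lab in zip(X, y) if lab == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, X):
        def dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))
        return [min(self.centroids, key=lambda lab: dist(x, self.centroids[lab]))
                for x in X]


# Toy pathway-expression matrix: 4 samples x 2 pathways (1 = ASD, 0 = control).
X_train = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
y_train = [0, 0, 1, 1]

pairs, pair_labels = build_pairs(X_train, y_train)   # 4 * 3 = 12 paired samples
model = CentroidClassifier().fit(pairs, pair_labels)

pred_asd = predict_by_pairing(model, X_train, [0.85, 0.95])
pred_ctrl = predict_by_pairing(model, X_train, [0.15, 0.05])

# Made-up importance scores for a 2-pathway pair vector (length 4).
selected = consistent_pathways([0.4, 0.05, 0.3, 0.25], n_pathways=2, threshold=0.2)
```

Any classifier exposing `fit` and `predict` could replace the stand-in here, for example scikit-learn's `RandomForestClassifier`, whose `feature_importances_` attribute would supply the scores passed to `consistent_pathways`. Pairing turns n samples into n(n-1) training rows, which is how the secondary dataset eases the imbalance between pathways and samples.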