4th international edition and 13th Iranian Conference on Bioinformatics
A novel approach to find Biomarkers affecting Autism Spectrum Disorder (ASD) Using Machine Learning
Authors:
Amir Zarghami (1)
Milad Besharatifard (2)
Fatemeh Zare-Mirakabad (3)
1- Computational Biology Research Center (CBRC), Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran
2- Computational Biology Research Center (CBRC), Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran
3- Computational Biology Research Center (CBRC), Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran
Keywords:
Neurodevelopmental, Random Forest, Gene Expression, Feature Importance, Biological Pathway
Abstract:
Autism Spectrum Disorder (ASD) is a neurodevelopmental condition characterized by difficulties in social interaction and communication and by repetitive behaviors. Early detection is essential but remains difficult because symptoms vary widely and overlap with those of other developmental disorders. Recent diagnostic approaches increasingly use machine learning (ML) models to analyze biological and behavioral data, with the aim of improving early detection and supplementing traditional diagnostic methods with efficient, cost-effective tools. In this study, we analyzed gene expression data (GSE18123) whose samples were obtained from two distinct platforms, and we used them to construct two separate datasets. To minimize the risk of data leakage, one dataset was reserved for feature reduction, while the other was used for training and testing the ML model. From the original gene expression data, we derived a pathway expression dataset. Because the number of pathways far exceeds the number of available samples, we built a secondary dataset by pairing samples and concatenating the pathway expression vectors of each pair. The ML model was trained on this secondary dataset. During testing, each test sample was paired with every training sample and a prediction was made for each pair; the predicted labels were then aggregated to derive the final label for that test sample. The primary ML method used in this study was Random Forest, although the approach can be adapted to other ML techniques. Because every pathway appears twice in a concatenated sample, the feature importance analysis selected only pathways with high importance scores in both of their appearances in the secondary dataset. Genes that appeared frequently within the selected pathways were prioritized as potential biomarkers. Our model achieved an accuracy of 0.85, precision of 0.81, recall (sensitivity) of 1.00, and an F1-score of 0.90. These results suggest that the model is effective at distinguishing between ASD and non-ASD cases. Additionally, we identified over 50 candidate genes for ASD, several of which have been reported in previous studies. Notably, we also identified a novel gene, ATP6V1F, which scored highly and may represent a new potential biomarker associated with ASD.
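The pairing scheme described in the abstract can be illustrated with a minimal sketch. The abstract does not say how a paired sample is labeled, how predictions are aggregated, or how the two importance scores of a pathway are combined, so the following choices are assumptions made only for illustration: a pair is labeled 1 when both samples share a class and 0 otherwise, the final label is a majority vote, and a pathway's score is the minimum of its two importances. All function names are hypothetical, and `predict_pair` stands in for any trained classifier.

```python
from collections import Counter
from itertools import permutations
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def make_pairs(samples: Sequence[Vector],
               labels: Sequence[int]) -> Tuple[List[List[float]], List[int]]:
    """Build the secondary dataset from ordered pairs of distinct samples.

    Each paired sample is the concatenation of two pathway expression
    vectors. ASSUMPTION: the pair label is 1 if both samples belong to
    the same class and 0 otherwise.
    """
    pairs, pair_labels = [], []
    for i, j in permutations(range(len(samples)), 2):
        pairs.append(list(samples[i]) + list(samples[j]))
        pair_labels.append(int(labels[i] == labels[j]))
    return pairs, pair_labels


def predict_by_pairing(test_sample: Vector,
                       train_samples: Sequence[Vector],
                       train_labels: Sequence[int],
                       predict_pair: Callable[[List[float]], int]) -> int:
    """Pair one test sample with every training sample and aggregate.

    Labels are assumed binary (0 or 1). A "same class" prediction votes
    for the training sample's label, a "different class" prediction votes
    for the opposite label. ASSUMPTION: aggregation is a majority vote.
    """
    votes = Counter()
    for x, y in zip(train_samples, train_labels):
        same = predict_pair(list(test_sample) + list(x))
        votes[y if same == 1 else 1 - y] += 1
    return votes.most_common(1)[0][0]


def fold_importances(importances: Sequence[float],
                     n_pathways: int) -> List[float]:
    """Combine the two importance scores each pathway receives.

    In a concatenated sample, pathway p sits at positions p and
    p + n_pathways. ASSUMPTION: taking the minimum keeps a pathway's
    score high only when both appearances score high.
    """
    return [min(importances[p], importances[p + n_pathways])
            for p in range(n_pathways)]
```

In practice, `predict_pair` could wrap a Random Forest trained on the output of `make_pairs`, and `importances` could be that model's per-feature importance scores. With n training samples, the secondary dataset in this sketch holds n(n-1) paired samples, which is how pairing increases the number of samples relative to the number of features.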
List of papers
List of archived papers
A Novel Approach to Antimicrobial Susceptibility Testing: Automated Disk Diffusion Analysis with Smartphone Cameras and Deep Learning Techniques
Mahdiar Mansouri - Mohammadreza Najafi Disfani - Danial Ghofrani - Arash Pournaji
Distribution and Allelic Diversity of Outer Membrane Proteins in Helicobacter pylori: Implications for Vaccine Development and Therapeutic Approach
Mohammadreza Najafi Disfani - Mahyar Abdi - Sarvin Rezvani Bafroyeh - SeyedehRomina Lavasani - Parastoo Saniee
Accelerating Diffusion-Based Graph Generative Models for De Novo Drug Design via Hessian Trace Approximation
Negin Bagherpour - AmirHossein Heidari - Alireza Fotouhi Siahpirani
Novel lncRNA‐miRNA‐mRNA competing endogenous RNA regulatory networks in glioma
Asoo Khani - Amir-Reza Javanmard
Exploring Genetic Variability: A Bioinformatics Approach to Analyzing Reported SNPs in the cagA Gene of Helicobacter pylori
Aria Soltani
Diabetes nephropathy indicators for early diagnosis
Sedigheh Momenzadeh - Masumeh Jalalvand - Reza Nedaeinia
Enhancing Predictive Accuracy of CRISPR-Cas9 on-target efficiency using Deep Learning and Active Learning Optimization for Small Datasets
Masoud Mahdavifar - Fateme Zaremirakabad
3D-QSAR Modeling on 2-Pyrimidine Carbohydrazides as Utrophin Modulators for the Treatment of Duchenne Muscular Dystrophy by Combining CoMFA, CoMSIA, and Molecular Docking Studies
Reza Mahmoudzadeh Laki - Eslam Pourbasheer
An efficient method based on transformers for antimicrobial peptide prediction
Alireza Khorramfard - Jamshid Pirgazi - Ali Ghanbari Sorkhi
Bioinformatics studies on S35K mutation on Mnemiopsin 2 photoprotein
AmirReza Mohammadi - Vahab Jafarian - Fatemeh Khatami