4th international edition and 13th Iranian Conference on Bioinformatics
A New LDA-based Genetic Algorithm for Feature Selection and Classification in Gene Expression Data Analysis
Authors:
Parisa Pouyamanesh (1), Mahdi Vasighi (2)
1- Institute for Advanced Studies in Basic Sciences (IASBS)
2- Institute for Advanced Studies in Basic Sciences (IASBS)
Keywords:
Genetic Algorithm, Feature Selection, Linear Discriminant Analysis, Gene Expression Data
Abstract:
Recent advances in cancer diagnosis have driven the need for more efficient methods of genetic data analysis that identify influential genes more accurately and quickly. Genetic algorithms (GAs), inspired by evolutionary processes in nature, have emerged as powerful tools for optimization and for identifying cancer-related genes. However, a significant challenge in these algorithms is genetic drift: the gradual loss of population diversity, which leads to premature convergence on suboptimal solutions. As a result, certain regions of the search space are overlooked, which ultimately reduces the accuracy of the results. This study introduces a repair operation into the genetic algorithm to address these limitations. The repair operation selectively removes irrelevant genes from candidate solutions while retaining the essential ones, improving the quality of feature selection and preventing premature convergence. Linear Discriminant Analysis (LDA) guides this process. The LDA coefficients, which reflect each feature's contribution to class discrimination, drive a repair mechanism applied to the individuals in the population. In this way, the mechanism removes noisy and irrelevant features and preserves only those most influential in distinguishing between healthy and cancerous samples. The approach not only enhances diagnostic accuracy but also simplifies the model by finding sparser candidate subsets, which reduces the risk of overfitting. The proposed repair operation was applied to several gene expression benchmark datasets, and its convergence speed and performance were compared with those of conventional GA-based feature selection. The results show that the proposed approach consistently produces better subsets, with fewer genes and higher classification accuracies.
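The abstract does not give the details of the repair step, so the following is only a minimal sketch of one way an LDA-guided repair could look. It assumes a binary mask per individual, a two-class Fisher LDA computed with NumPy, and a rule that keeps the fraction of selected features with the largest absolute coefficients. The names `lda_coefficients`, `repair`, `keep_fraction`, and `ridge`, the standardization step, and the synthetic data are all hypothetical and are not taken from the paper.

```python
import numpy as np

def lda_coefficients(X, y, ridge=1e-3):
    """Two-class Fisher LDA weights: w = (Sw + ridge*I)^-1 (mu1 - mu0)."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class scatter matrix.
    Sw = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
    # Ridge term (an assumption) keeps Sw invertible when genes outnumber samples.
    Sw = Sw + ridge * np.eye(X.shape[1])
    return np.linalg.solve(Sw, mu1 - mu0)

def repair(individual, X, y, keep_fraction=0.5):
    """Return a copy of the mask keeping only the selected features
    with the largest absolute LDA coefficients (hypothetical rule)."""
    selected = np.flatnonzero(individual)
    if selected.size <= 1:
        return individual.copy()
    # Standardize so coefficient magnitudes are comparable across features.
    Xs = X[:, selected]
    Xs = (Xs - Xs.mean(axis=0)) / (Xs.std(axis=0) + 1e-12)
    weights = np.abs(lda_coefficients(Xs, y))
    n_keep = max(1, int(np.ceil(keep_fraction * selected.size)))
    keep = selected[np.argsort(weights)[::-1][:n_keep]]
    repaired = np.zeros_like(individual)
    repaired[keep] = 1
    return repaired

# Synthetic stand-in for expression data: 60 samples, 14 features,
# where only features 3 and 7 separate the two classes.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 30)
X = rng.normal(size=(60, 14))
X[y == 1, 3] += 4.0
X[y == 1, 7] += 4.0

individual = np.ones(14, dtype=int)
individual[12:] = 0  # this candidate does not select the last two features
repaired = repair(individual, X, y, keep_fraction=0.25)
print("selected before:", np.flatnonzero(individual))
print("selected after: ", np.flatnonzero(repaired))
```

In a GA loop, a step like this would typically run on each offspring after crossover and mutation and before fitness evaluation. Whether the authors repair every individual, use a threshold instead of a fraction, or use a multi-class LDA is not stated in the abstract.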
List of articles
Targeting Protein in Neurodegenerative Diseases: A Computational Approach
Reyhaneh Ebrahimi - Seyed Hassan Alavi - Fayaz Soleymani - Fatemeh Zare-Mirakabad
BIRC5: The Silent Architect of Tumor Persistence and Senescence in Hepatocellular Carcinoma
Amirhosein Farrokhzad - Maryam Kaboli - Elahe Hoseinnia - Elham Rismani - Massoud Vosough
A Fuzzy Bayesian Network Model for Personalized Diabetes Risk Prediction: Integrating Lifestyle, Genetic, and Environmental Factors
Lida Hooshyar - Nadia Tahiri
Exploring Genetic Variability: A Bioinformatics Approach to Analyzing Reported SNPs in the cagA Gene of Helicobacter pylori
Aria Soltani
Solving Diffusion Equations Using Physics-Informed Neural Networks: A Biological Application
Yasaman Razzaghi - Ali Shokri - Ahmad Aliyari Boroujeni
Single-Cell Transcriptomic Analysis Reveals Cellular Heterogeneity and Molecular Markers in Acute Leukemia Subtypes
Fatemeh Mohagheghian - Zahra Salehi - Najmeh Salehi
Drug repurposing using bulk RNA-seq based on key genes involved in inflammatory bowel disease
Nayereh Abdali - Shahram Tahmasebian - Atena Vaghf
Bioinformatics Analysis of Prostate Cancer by the Construction of circRNA-miRNA-mRNA Regulatory Network
Fatemeh Zamani - Ali Taravati
A computational approach to identify the biomarker based on the RNA sequencing data analysis for Alzheimer’s disease
Atena Vaghf - Shahram Tahmasebian - Nayereh Abdali
Curcumin’s Journey Through Cellulose: Binding Dynamics Across Cellulose-derived Bio-Nanofibers
Ayla Esmaeilzadeh - Maryam Azimzadeh Irani - Mehdi Jahanfar - Naser Farrokhi