4th international edition and 13th Iranian Conference on Bioinformatics
A Contrastive Learning Framework for Single-Cell Multi-Omics Data Integration
Authors:
Amir Ebrahimi (1), Alireza Fotuhi Siahpirani (2), Hesam Montazeri (3)
1- University of Tehran
2- University of Tehran
3- University of Tehran
Keywords:
single-cell, omics integration, representation learning, neural networks, contrastive learning
Abstract:
Advances in single-cell omics technologies have changed our understanding of the function and heterogeneity of biological systems. Methods such as SHARE-seq and SNARE-seq capture gene expression and chromatin accessibility, while CITE-seq measures gene expression and cell-surface protein abundance. Analyzing each modality independently, however, can yield only partial insights. Integrating the modalities offers a solution but is challenging because their distributions and feature spaces differ. Considerable effort has gone into developing efficient computational frameworks for this problem, and most of these approaches learn low-dimensional joint embeddings of the omics modalities. Some methods, such as Principal Component Analysis (PCA) and Canonical Correlation Analysis (CCA), apply linear transformations to the input data, and tools such as Seurat combine PCA, CCA, and Mutual Nearest Neighbors for alignment. MOFA uses matrix factorization to derive shared and modality-specific representations. More recent multi-modal deep learning approaches, such as scglue, employ variational autoencoders to capture hierarchical, non-linear patterns and align multi-omics representations end-to-end. While these methods show promising results and strong performance, they often suffer from low signal-to-noise ratios (Wang et al., 2023) and overly complicated architectures. Here, we present a neural network architecture for paired single-cell multi-omics integration, inspired by the CLIP model developed by OpenAI (Radford et al., 2021). The framework consists of two encoders, each learning a low-dimensional representation of its input modality. These representations are then aligned using a contrastive loss function.
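The alignment step described above can be illustrated with a minimal NumPy sketch of a CLIP-style symmetric contrastive loss. This is not the authors' implementation: the function names, the temperature value of 0.07, and the use of cosine similarity are assumptions made for illustration, and the encoders are omitted (the inputs `z_a` and `z_b` stand in for their outputs on a batch of paired cells).

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def clip_style_loss(z_a, z_b, temperature=0.07):
    """Symmetric contrastive loss for paired embeddings (illustrative).

    z_a, z_b: arrays of shape (n_cells, dim); row i of z_a and row i of z_b
    are assumed to come from the same cell measured in two modalities.
    The temperature is an assumed hyperparameter, not a value from the abstract.
    """
    z_a, z_b = l2_normalize(z_a), l2_normalize(z_b)
    logits = z_a @ z_b.T / temperature          # (n_cells, n_cells) similarities
    idx = np.arange(len(z_a))                   # true pairs lie on the diagonal
    loss_a_to_b = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_b_to_a = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_a_to_b + loss_b_to_a)
```

In this sketch, each cell's embedding in one modality is pushed toward its own counterpart in the other modality and away from every other cell in the batch, which is what makes correctly paired embeddings score a lower loss than mismatched ones.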
We benchmarked this model against two baselines (PCA and an autoencoder with reconstruction loss) and three state-of-the-art models (MOFA (Argelaguet et al., 2018), Harmony (Korsunsky et al., 2019), and Con-AAE (Wang et al., 2023)) on three real-world datasets: SHARE-seq (Ma et al., 2020), PBMC (10x Genomics, 2020), and CITE-seq (Stoeckius et al., 2017). All evaluations were performed on unseen test datasets with 10 replications. Benchmarks were based on four measures: Average Silhouette Width (ASW) for the clustering quality of latent representations with respect to cell types, and Recall at k, cell type accuracy, and Median Rank for the quality of integration. Results show that our framework outperforms the other models on most metrics. Moreover, it achieved high ASW values compared to the original datasets, which reflects the model's ability to denoise single-cell data and extract biological signals. In addition, we assessed the model's ability to handle unpaired multi-omics data, where it scored highly on most metrics compared to other frameworks. These findings position our framework as a high-potential platform that can be extended to downstream applications such as cell-type annotation and disease subtyping.
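Two of the integration measures named above, Recall at k and Median Rank, can be sketched as follows. The abstract does not give their exact definitions, so this assumes a common retrieval-style formulation: for each cell, rank all cells of the other modality by Euclidean distance in the joint embedding and record where the true partner falls. The function names and the choice of distance are hypothetical.

```python
import numpy as np

def match_ranks(z_a, z_b):
    """Rank of each cell's true partner among all candidates (1 = best).

    z_a, z_b: arrays of shape (n_cells, dim); row i in both is the same cell.
    Euclidean distance is an assumption made for this sketch.
    """
    # Pairwise distances: d[i, j] = distance from cell i in A to cell j in B.
    d = np.linalg.norm(z_a[:, None, :] - z_b[None, :, :], axis=2)
    true_d = np.diag(d)[:, None]
    # Rank = 1 + number of candidates strictly closer than the true partner.
    return 1 + (d < true_d).sum(axis=1)

def recall_at_k(ranks, k):
    # Fraction of cells whose true partner is among the k nearest candidates.
    return float(np.mean(ranks <= k))

def median_rank(ranks):
    # Median position of the true partner; lower is better.
    return float(np.median(ranks))
```

Under this formulation, a perfectly integrated embedding gives a Recall at 1 of 1.0 and a Median Rank of 1.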