4th international edition and 13th Iranian Conference on Bioinformatics
A Contrastive Learning Framework for Single-Cell Multi-Omics Data Integration
Authors:
Amir Ebrahimi (1), Alireza Fotuhi Siahpirani (2), Hesam Montazeri (3)
1- University of Tehran
2- University of Tehran
3- University of Tehran
Keywords:
single-cell, omics integration, representation learning, neural networks, contrastive learning
Abstract:
Advances in single-cell omics technologies have changed our understanding of the functions and heterogeneity of biological systems. Methods such as SHARE-seq and SNARE-seq capture gene expression and chromatin accessibility, while CITE-seq measures gene expression and cell-surface protein abundance. Analyzing each modality independently, however, can yield only partial insights. Integrating the modalities offers a solution but is challenging because their distributions and feature spaces differ. Considerable effort has gone into developing efficient computational frameworks for this problem, and most of these approaches learn low-dimensional joint embeddings of the omics modalities. Some methods, such as Principal Component Analysis (PCA) and Canonical Correlation Analysis (CCA), apply linear transformations to the input data, and tools such as Seurat combine PCA, CCA, and Mutual Nearest Neighbors for alignment. MOFA uses matrix factorization to derive shared and modality-specific representations. More recent multi-modal deep learning approaches, such as scglue, employ variational autoencoders to capture hierarchical, non-linear patterns and align multi-omics representations end-to-end. Although these methods show promising results and strong performance, they often suffer from low signal-to-noise ratios (Wang et al., 2023) and overcomplicated architectures. Here, we present a neural network architecture for paired single-cell multi-omics integration, inspired by the CLIP model developed by OpenAI (Radford et al., 2021). The framework consists of two encoders, each learning a low-dimensional representation of its input modality. These representations are then aligned using a contrastive loss function.
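To make the alignment step concrete, the following is a minimal NumPy sketch of a symmetric CLIP-style contrastive loss over a batch of paired cells. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the exact loss, so the symmetric cross-entropy form, the L2 normalization, the `temperature` value, and all function names here are hypothetical choices. The inputs `z_a` and `z_b` stand in for the outputs of the two encoders, where row `i` of each matrix comes from the same cell.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def clip_style_loss(z_a, z_b, temperature=0.07):
    # z_a, z_b: (n_cells, dim) embeddings from two encoders (assumed paired by row).
    # temperature is an assumed hyperparameter, not a value from the abstract.
    z_a, z_b = l2_normalize(z_a), l2_normalize(z_b)
    logits = z_a @ z_b.T / temperature       # pairwise similarity matrix
    idx = np.arange(len(z_a))                # true pairs lie on the diagonal
    loss_a_to_b = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_b_to_a = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_a_to_b + loss_b_to_a)

# Toy check with synthetic embeddings: correct pairing should give a lower
# loss than a deliberately wrong pairing.
rng = np.random.default_rng(0)
z_rna = rng.normal(size=(8, 16))
aligned = clip_style_loss(z_rna, z_rna + 0.01 * rng.normal(size=(8, 16)))
shuffled = clip_style_loss(z_rna, z_rna[::-1])
print(aligned < shuffled)
```

In a real training loop the same loss would be computed on encoder outputs and minimized with a gradient-based optimizer; the toy data above only shows that the loss rewards matching each cell's two views.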
We benchmarked this model against two baselines (PCA and an autoencoder with reconstruction loss) and three state-of-the-art models (MOFA (Argelaguet et al., 2018), Harmony (Korsunsky et al., 2019), and Con-AAE (Wang et al., 2023)) on three real-world datasets: SHARE-seq (Ma et al., 2020), PBMC (10x Genomics, 2020), and CITE-seq (Stoeckius et al., 2017). All evaluations were performed on unseen test datasets with 10 replications. Benchmarks were based on four measures: Average Silhouette Width (ASW), which assesses the clustering quality of the latent representations with respect to cell types, and Recall at k, cell type accuracy, and Median Rank, which assess the quality of integration. Results show that our framework outperforms the other models on most metrics. Moreover, it achieved high ASW values compared to the original datasets, which reflects the model’s ability to denoise single-cell data and extract biological signals. In addition, we assessed the model’s ability to handle unpaired multi-omics data, where it shows high values for most metrics compared to other frameworks. These findings position our framework as a high-potential platform that can be extended to downstream applications such as cell-type annotation and disease subtyping.
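Two of the integration measures named above, Recall at k and Median Rank, can be sketched as follows. The abstract does not define them, so this assumes a common retrieval-style reading: for each cell, the embeddings of the other modality are ranked by squared Euclidean distance, and the rank of the cell's true partner is recorded. The function names and the choice of distance are hypothetical.

```python
import numpy as np

def match_ranks(z_a, z_b):
    # Rank (1 = best) of each cell's true partner in z_b among all z_b rows,
    # ordered by squared Euclidean distance from the cell's z_a embedding.
    d = ((z_a[:, None, :] - z_b[None, :, :]) ** 2).sum(axis=2)
    true_d = np.diag(d)[:, None]             # distance to the true partner
    return 1 + (d < true_d).sum(axis=1)      # count strictly closer candidates

def recall_at_k(ranks, k):
    # Fraction of cells whose true partner is among the k nearest candidates.
    return float(np.mean(ranks <= k))

def median_rank(ranks):
    # Median position of the true partner; lower is better.
    return float(np.median(ranks))

# Synthetic example: the second modality is a lightly perturbed copy of the first.
rng = np.random.default_rng(0)
z_a = rng.normal(size=(50, 8))
z_b = z_a + 0.05 * rng.normal(size=(50, 8))
ranks = match_ranks(z_a, z_b)
print(recall_at_k(ranks, 5), median_rank(ranks))
```

Under this reading, a well-integrated embedding gives Recall at k close to 1 and a Median Rank close to 1, while a poorly aligned one pushes the median toward half the number of test cells.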