Cell-Level Pathway Scoring Comparison with a Biologically Constrained Variational Autoencoder

Gundogdu, Pelin; Payá-Milans, Miriam; Alamo-Alvarez, Inmaculada; Nepomuceno-Chamorro, Isabel A.; Dopazo, Joaquin; Loucera, Carlos

doi:10.1007/978-3-031-42697-1_5

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 14137)

Included in the following conference series:

International Conference on Computational Methods in Systems Biology

Abstract

Unsupervised techniques are widely used to study and understand the complex patterns that arise when analyzing genomic data at single-cell resolution. In particular, unsupervised deep learning models provide state-of-the-art solutions for the most common tasks that arise when dealing with scRNA-seq data. However, the biological usefulness of these complex models is limited by their black-box nature. To address such limitations, several lines of research have emerged, from post hoc approximations to ante hoc modeling. In this work, we study the behavior of two biologically constrained variational autoencoders (ante hoc modeling). On the one hand, we use a one-layer architecture where the constraints come from the signaling pathways; on the other hand, we propose a two-layer architecture following the recent trends in mechanistic models of signal transduction. We use the representations learned by the model as proxies of the signaling activity at the single-cell level. We check the performance of the scoring model using a known public scRNA-seq dataset with a clearly established ground truth. Although both models capture the relevant signals, the most pronounced differences are better captured by the one-layer architecture, while the two-layer design is able to learn more fine-grained features that can expose less prominent aspects of the data.
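
To make the idea of a pathway-constrained encoder concrete, the following is a minimal sketch, not the authors' implementation. It assumes the constraint is a binary gene-by-pathway mask applied to the encoder weights, and that the latent mean of each pathway node is read as the per-cell score. The gene names, pathway names, weights and helper functions (`build_mask`, `masked_linear`, `encode`) are all hypothetical, and plain Python stands in for a deep learning framework.

```python
import math
import random

# Hypothetical toy annotation: which genes belong to which pathway.
GENES = ["g1", "g2", "g3", "g4"]
PATHWAYS = {"pathway_A": {"g1", "g2"}, "pathway_B": {"g3", "g4"}}


def build_mask(genes, pathways):
    """Binary gene-by-pathway mask: 1.0 if the gene is annotated to the pathway."""
    return [[1.0 if g in members else 0.0 for members in pathways.values()]
            for g in genes]


def masked_linear(x, weights, mask):
    """Linear map in which every weight outside the mask is forced to zero."""
    n_out = len(mask[0])
    return [sum(x[i] * weights[i][j] * mask[i][j] for i in range(len(x)))
            for j in range(n_out)]


def encode(x, w_mu, w_logvar, mask, rng):
    """Constrained encoder: per-pathway mean, log-variance and a sampled latent."""
    mu = masked_linear(x, w_mu, mask)
    logvar = masked_linear(x, w_logvar, mask)
    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1).
    z = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
         for m, lv in zip(mu, logvar)]
    return mu, logvar, z


rng = random.Random(0)
mask = build_mask(GENES, PATHWAYS)
# Random weights stand in for trained parameters (assumption for illustration).
w_mu = [[rng.uniform(-1, 1) for _ in PATHWAYS] for _ in GENES]
w_logvar = [[rng.uniform(-1, 1) for _ in PATHWAYS] for _ in GENES]

cell = [1.0, 0.5, 0.0, 2.0]  # toy expression vector for one cell
mu, logvar, z = encode(cell, w_mu, w_logvar, mask, rng)
scores = dict(zip(PATHWAYS, mu))  # latent means used as pathway scores
```

Because of the mask, each latent node only receives input from the genes annotated to its pathway, so changing the expression of `g3` cannot alter the score of `pathway_A`. Under the same assumption, a two-layer variant would stack a second masked map on top of the first.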

Acknowledgements

This work has been partially supported by grants PID2020-117979RB-I00 and PID2020-117954RB-C22 from the Spanish Ministry of Science and Innovation, IMP/00019 from the Instituto de Salud Carlos III (ISCIII), and PIP-0087-2021 from Junta de Andalucía, co-funded with European Regional Development Funds (ERDF); and by the H2020 Programme of the European Union through the Marie Curie Innovative Training Network “Machine Learning Frontiers in Precision Medicine” (MLFPM) (GA 813533). The authors also acknowledge Junta de Andalucía for the postdoctoral contract of Carlos Loucera (PAIDI2020-DOC_00350), co-funded by the European Social Fund (FSE) 2014-2020.

Author information

Authors and Affiliations

Andalusian Platform for Computational Medicine, Andalusian Public Foundation Progress and Health-FPS, Sevilla, Spain
Pelin Gundogdu, Miriam Payá-Milans, Inmaculada Alamo-Alvarez, Joaquin Dopazo & Carlos Loucera
Computational Systems Medicine, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocio, Sevilla, Spain
Pelin Gundogdu, Miriam Payá-Milans, Joaquin Dopazo & Carlos Loucera
Department of Immunology, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocio, Sevilla, Spain
Inmaculada Alamo-Alvarez
Dpto. de Lenguajes y Sistemas Informaticos, University of Seville, Sevilla, Spain
Isabel A. Nepomuceno-Chamorro
Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), FPS, Hospital Virgen del Rocío, Sevilla, Spain
Joaquin Dopazo
FPS/ELIXIR-es, Hospital Virgen del Rocío, Sevilla, Spain
Joaquin Dopazo

Corresponding authors

Correspondence to Isabel A. Nepomuceno-Chamorro, Joaquin Dopazo or Carlos Loucera.

Editor information

Editors and Affiliations

University of Luxembourg, Esch-sur-Alzette, Luxembourg
Jun Pang
Inria Lille, Villeneuve d’Ascq, France
Joachim Niehren

About this paper

Cite this paper

Gundogdu, P., Payá-Milans, M., Alamo-Alvarez, I., Nepomuceno-Chamorro, I.A., Dopazo, J., Loucera, C. (2023). Cell-Level Pathway Scoring Comparison with a Biologically Constrained Variational Autoencoder. In: Pang, J., Niehren, J. (eds) Computational Methods in Systems Biology. CMSB 2023. Lecture Notes in Computer Science, vol 14137. Springer, Cham. https://doi.org/10.1007/978-3-031-42697-1_5

DOI: https://doi.org/10.1007/978-3-031-42697-1_5
Published: 09 September 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-42696-4
Online ISBN: 978-3-031-42697-1
eBook Packages: Computer Science, Computer Science (R0)
