mb-PHENIX: Diffusion and supervised uniform manifold approximation for denoizing microbiota data
By:
Padron-Manrique C., Vázquez-Jiménez A., Esquivel-Hernandez D.A., Martinez Lopez Y.E., Neri-Rosario D., Sánchez-Castañeda J.P., Giron-Villalobos D., Resendis-Antonio O.
Published:
1 Jan 2023
Ahead of Print:
1 Dec 2023
Abstract:
Motivation: Microbiota data are affected by technical noise and the curse of dimensionality, both of which undermine the reliability of scientific findings. Furthermore, abundance matrices exhibit a zero-inflated distribution due to biological and technical influences. Consequently, there is a growing demand for advanced algorithms that can effectively recover missing taxa while preserving the structure of the data. Results: We present mb-PHENIX, an open-source algorithm developed in Python that recovers taxa abundances from noisy and sparse microbiota data. Our method infers the missing information in the count matrix (in 16S microbiota and shotgun studies) by applying imputation via diffusion, with a supervised Uniform Manifold Approximation and Projection (sUMAP) space as initialization. Our hybrid machine learning approach denoises microbiota data, revealing differentially abundant microbes among study groups where traditional abundance analysis fails. © 2023 The Author(s). Published by Oxford University Press.
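The idea of diffusion-based imputation over a low-dimensional embedding can be illustrated with a minimal sketch. This is not the mb-PHENIX implementation: it is a generic diffusion smoother in NumPy, and the function name `diffusion_impute`, the parameters `knn` and `t`, and the synthetic two-group data are all assumptions made for illustration. The `embedding` array stands in for the sUMAP coordinates, which in practice would be computed from the abundance matrix and the group labels.

```python
import numpy as np

def diffusion_impute(counts, embedding, knn=5, t=3):
    """Smooth `counts` (samples x taxa) over a kNN graph built on `embedding`.

    Hypothetical helper for illustration; not the mb-PHENIX API.
    """
    # Pairwise Euclidean distances between samples in the embedding space.
    diff = embedding[:, None, :] - embedding[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Adaptive bandwidth: distance to each sample's knn-th nearest neighbour.
    sigma = np.maximum(np.sort(dist, axis=1)[:, knn], 1e-12)
    kernel = np.exp(-(dist / sigma[:, None]) ** 2)
    # Zero out everything beyond each sample's knn nearest neighbours.
    far = np.argsort(dist, axis=1)[:, knn + 1:]
    np.put_along_axis(kernel, far, 0.0, axis=1)
    # Symmetrize, then row-normalize into a Markov transition matrix.
    kernel = (kernel + kernel.T) / 2
    markov = kernel / kernel.sum(axis=1, keepdims=True)
    # Diffuse for t steps and apply the result to the abundance matrix.
    return np.linalg.matrix_power(markov, t) @ counts

# Synthetic stand-in for an sUMAP space: two well-separated study groups.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 10)
embedding = rng.normal(scale=0.1, size=(20, 2)) + labels[:, None] * 5.0

# Taxon 0 is specific to group 0, taxon 1 to group 1.
counts = np.zeros((20, 2))
counts[:10, 0] = 10.0
counts[10:, 1] = 10.0
counts[[1, 4, 7], 0] = 0.0  # simulated technical dropout in group 0

imputed = diffusion_impute(counts, embedding)
```

In this toy setting the dropped-out entries of taxon 0 are filled in from neighbouring samples of the same group, while samples in the other group keep a zero abundance because the graph has no edges between the two clusters. The choice of `knn` and `t` controls how aggressively values are smoothed.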
Affiliations:
Padron-Manrique C.:
Human Systems Biology Laboratory, Instituto Nacional de Medicina Genómica (INMEGEN), Mexico City, 14610, Mexico
Programa de Doctorado en Ciencias Biomédicas, Universidad Nacional Autónoma de México (UNAM), Mexico City, 04510, Mexico
Vázquez-Jiménez A.:
Human Systems Biology Laboratory, Instituto Nacional de Medicina Genómica (INMEGEN), Mexico City, 14610, Mexico
Esquivel-Hernandez D.A.:
Human Systems Biology Laboratory, Instituto Nacional de Medicina Genómica (INMEGEN), Mexico City, 14610, Mexico
Martinez Lopez Y.E.:
Human Systems Biology Laboratory, Instituto Nacional de Medicina Genómica (INMEGEN), Mexico City, 14610, Mexico
Programa de Doctorado en Ciencias Médicas, Odontológicas y de la Salud, Universidad Nacional Autónoma de México (UNAM), Mexico City, 04510, Mexico
Neri-Rosario D.:
Human Systems Biology Laboratory, Instituto Nacional de Medicina Genómica (INMEGEN), Mexico City, 14610, Mexico
Programa de Maestría en Ciencias Bioquímicas, Universidad Nacional Autónoma de México (UNAM), Mexico City, 04510, Mexico
Sánchez-Castañeda J.P.:
Human Systems Biology Laboratory, Instituto Nacional de Medicina Genómica (INMEGEN), Mexico City, 14610, Mexico
Programa de Maestría en Ciencias Bioquímicas, Universidad Nacional Autónoma de México (UNAM), Mexico City, 04510, Mexico
Giron-Villalobos D.:
Human Systems Biology Laboratory, Instituto Nacional de Medicina Genómica (INMEGEN), Mexico City, 14610, Mexico
Programa de Maestría en Ciencias Bioquímicas, Universidad Nacional Autónoma de México (UNAM), Mexico City, 04510, Mexico
Resendis-Antonio O.:
Human Systems Biology Laboratory, Instituto Nacional de Medicina Genómica (INMEGEN), Mexico City, 14610, Mexico
Coord. de la Invest. Cie. - Red de Apoyo A la Investigacion - Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de México (UNAM), Mexico City, 04510, Mexico
All Open Access: Gold, Green Published, Green Submitted