Mb-PHENIX: Diffusion and supervised uniform manifold approximation for denoizing microbiota data


By: Padron-Manrique C., Vázquez-Jiménez A., Esquivel-Hernandez D.A., Martinez Lopez Y.E., Neri-Rosario D., Sánchez-Castañeda J.P., Giron-Villalobos D., Resendis-Antonio O.

Published: 1 Jan 2023 Ahead of Print: 1 Dec 2023
Abstract:
Motivation: Microbiota data are affected by technical noise and the curse of dimensionality, which undermine the reliability of scientific findings. Furthermore, abundance matrices exhibit a zero-inflated distribution due to biological and technical influences. Consequently, there is a growing demand for advanced algorithms that can recover missing taxa while preserving the structure of the data. Results: We present mb-PHENIX, an open-source algorithm developed in Python that recovers taxa abundances from noisy and sparse microbiota data. Our method infers the missing information in the count matrix (in 16S microbiota and shotgun studies) by applying imputation via diffusion, with the supervised Uniform Manifold Approximation and Projection (sUMAP) space as initialization. Our hybrid machine learning approach denoises microbiota data, revealing differentially abundant microbes among study groups where traditional abundance analysis fails. © 2023 The Author(s). Published by Oxford University Press.
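The idea of imputation via diffusion over a low-dimensional space can be illustrated with a minimal sketch. This is not the mb-PHENIX implementation: the function name `diffusion_impute`, the parameters `knn` and `t`, and the adaptive Gaussian kernel are assumptions chosen for illustration, and the embedding is assumed to have been computed beforehand (for example, by a supervised UMAP) and is passed in as a plain array.

```python
import numpy as np

def diffusion_impute(counts, embedding, knn=5, t=3):
    """Smooth a samples-by-taxa matrix along a graph built on `embedding`.

    Hypothetical helper for illustration only; not the mb-PHENIX API.
    """
    n = embedding.shape[0]
    # Pairwise Euclidean distances between samples in the embedding space.
    diff = embedding[:, None, :] - embedding[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Adaptive bandwidth: distance to each sample's knn-th nearest neighbor
    # (index 0 of each sorted row is the sample itself).
    sigma = np.sort(dist, axis=1)[:, min(knn, n - 1)]
    sigma = np.maximum(sigma, 1e-12)
    # Gaussian affinities, symmetrized so the graph is undirected.
    affinity = np.exp(-(dist / sigma[:, None]) ** 2)
    affinity = (affinity + affinity.T) / 2.0
    # Row-normalize into a Markov transition matrix.
    markov = affinity / affinity.sum(axis=1, keepdims=True)
    # Diffuse for t steps and apply the operator to the abundance matrix.
    operator = np.linalg.matrix_power(markov, t)
    return operator @ counts

# Toy usage with synthetic data (assumed values, not from the paper).
rng = np.random.default_rng(0)
embedding = rng.normal(size=(12, 2))                     # stand-in for an sUMAP embedding
counts = rng.poisson(2.0, size=(12, 4)).astype(float)    # sparse-ish toy abundances
imputed = diffusion_impute(counts, embedding)
```

Because each row of the diffusion operator sums to one, every imputed value is a weighted average of the observed values for that taxon, so zeros are filled in from neighboring samples while values stay within the observed range.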

Affiliations:
Padron-Manrique C.:
 Human Systems Biology Laboratory, Instituto Nacional de Medicina Genómica (INMEGEN), Mexico City, 14610, Mexico

 Programa de Doctorado en Ciencias Biomédicas, Universidad Nacional Autónoma de México (UNAM), Mexico City, 04510, Mexico

Vázquez-Jiménez A.:
 Human Systems Biology Laboratory, Instituto Nacional de Medicina Genómica (INMEGEN), Mexico City, 14610, Mexico

Esquivel-Hernandez D.A.:
 Human Systems Biology Laboratory, Instituto Nacional de Medicina Genómica (INMEGEN), Mexico City, 14610, Mexico

Martinez Lopez Y.E.:
 Human Systems Biology Laboratory, Instituto Nacional de Medicina Genómica (INMEGEN), Mexico City, 14610, Mexico

 Programa de Doctorado en Ciencias Médicas, Odontológicas y de la Salud, Universidad Nacional Autónoma de México (UNAM), Mexico City, 04510, Mexico

Neri-Rosario D.:
 Human Systems Biology Laboratory, Instituto Nacional de Medicina Genómica (INMEGEN), Mexico City, 14610, Mexico

 Programa de Maestría en Ciencias Bioquímicas, Universidad Nacional Autónoma de México (UNAM), Mexico City, 04510, Mexico

Sánchez-Castañeda J.P.:
 Human Systems Biology Laboratory, Instituto Nacional de Medicina Genómica (INMEGEN), Mexico City, 14610, Mexico

 Programa de Maestría en Ciencias Bioquímicas, Universidad Nacional Autónoma de México (UNAM), Mexico City, 04510, Mexico

Giron-Villalobos D.:
 Human Systems Biology Laboratory, Instituto Nacional de Medicina Genómica (INMEGEN), Mexico City, 14610, Mexico

 Programa de Maestría en Ciencias Bioquímicas, Universidad Nacional Autónoma de México (UNAM), Mexico City, 04510, Mexico

Resendis-Antonio O.:
 Human Systems Biology Laboratory, Instituto Nacional de Medicina Genómica (INMEGEN), Mexico City, 14610, Mexico

 Coord. de la Invest. Cie. - Red de Apoyo A la Investigacion - Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de México (UNAM), Mexico City, 04510, Mexico
ISSN: 1367-4803
Publisher
OXFORD UNIV PRESS, GREAT CLARENDON ST, OXFORD OX2 6DP, ENGLAND, United Kingdom
Document type: Article
Volume: 39 Issue: 12
Pages:
WOS Id: 001117573600005
PubMed ID: 38015858
Open access: Gold, Green Published, Green Submitted
