Reverse Engineering of Renal Tubule Networks in the High-Dimensional Regime
Pagliarini R.
2024-01-01
Abstract
The high-dimensional regime is a condition that arises in many large-scale data applications, such as omics, economics, and ecoinformatics. In this scenario, the number n of acquired samples is far smaller than the number p of observed variables, so classical statistical approaches cannot be applied. In this work, with the aim of inferring direct interactions among all the proteins in the kidney tubule segments, we focus on the reverse engineering of conditional networks in this regime. In particular, the mammalian renal tubule is made up of at least 14 segments, containing at least 16 distinct epithelial cell types. Each cell type has its own characteristic set of cellular functions, which have been elucidated largely over the past 50 years, since the development of single-tubule microdissection approaches and the extension of the micropuncture technique to mammalian physiology. New developments in protein mass spectrometry have markedly increased the sensitivity of protein detection and quantification. To address this problem, we developed an algorithm rooted in Graphical Gaussian Models to infer a network, combined with a simple method to establish which interactions differ from zero. We found that the proposed approach outperforms other methods in terms of diagnostic test accuracy, particularly in the high-dimensional regime. Moreover, we suggest that it can be applied to several types of data.
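To make the setting concrete, the following is a minimal sketch of how a Graphical Gaussian Model can be estimated when n is much smaller than p. It is not the algorithm proposed in this work, whose details are not given in the abstract. It assumes a fixed-intensity shrinkage of the sample covariance toward its diagonal (the parameter `lam`) and a plain magnitude cutoff on the partial correlations (the parameter `threshold`). Both function names and both default values are hypothetical choices made for illustration.

```python
import numpy as np


def shrinkage_partial_correlations(X, lam=0.2):
    """Estimate partial correlations from an (n, p) data matrix with n << p.

    Assumption: the covariance is regularized by shrinking it toward its
    diagonal with a fixed intensity `lam` in (0, 1]; this is an illustrative
    choice, not the estimator used by the author.
    """
    X = np.asarray(X, dtype=float)
    S = np.cov(X, rowvar=False)            # p x p sample covariance, singular when n < p
    target = np.diag(np.diag(S))           # diagonal target keeps only the variances
    S_shrunk = (1.0 - lam) * S + lam * target  # positive definite if all variances > 0
    omega = np.linalg.inv(S_shrunk)        # estimated precision (inverse covariance) matrix
    d = np.sqrt(np.diag(omega))
    pcor = -omega / np.outer(d, d)         # partial correlation: -w_ij / sqrt(w_ii * w_jj)
    np.fill_diagonal(pcor, 1.0)
    return pcor


def select_edges(pcor, threshold=0.1):
    """Keep the pairs (i, j), i < j, whose partial correlation exceeds
    `threshold` in absolute value (hypothetical cutoff, for illustration)."""
    i, j = np.triu_indices(pcor.shape[0], k=1)
    keep = np.abs(pcor[i, j]) > threshold
    return list(zip(i[keep].tolist(), j[keep].tolist()))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((10, 30))      # n = 10 samples, p = 30 variables
    pcor = shrinkage_partial_correlations(X)
    edges = select_edges(pcor)
    print(pcor.shape, len(edges))
```

In a Graphical Gaussian Model, a zero partial correlation between two variables corresponds to conditional independence given all the others, so the retained pairs can be read as candidate direct interactions. In practice the shrinkage intensity and the cutoff would be chosen from the data rather than fixed in advance.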