Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
1180211 | 1491524 | 2016 | 8-page PDF | Free download |
• Two new methods are introduced for replacing values below the detection limit (rounded zeros) in high-dimensional compositional data.
• The proposed methods are compared to existing imputation tools and outperform them under reasonable simulation designs.
• In simulations and real data from metabolomics, the proposed methods show excellent performance compared to existing imputation tools.
High-dimensional compositional data, multivariate observations carrying relative information, frequently contain values below a detection limit (rounded zeros). We introduce new model-based procedures for replacing these values with reasonable numbers, so that the completed data set is ready for use with statistical analysis methods that rely on complete data, such as regression or classification with high-dimensional explanatory variables. The procedures respect the geometry of compositional data and can be considered as alternatives to existing methods. Simulations show that, especially in high dimensions, the proposed methods outperform existing methods. Moreover, even for a large number of rounded zeros, the new methods lead to an improved quality of the data, which is important for further analyses. The usefulness of the procedures is demonstrated using a data example from metabolomics.
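To make the notion of replacing rounded zeros concrete, the following is a minimal sketch of simple multiplicative replacement, a common non-model-based baseline. It is not the model-based procedure proposed in the paper. The function name `multiplicative_replacement`, the example values, and the choice of imputing 65% of the detection limit are assumptions made for illustration only. The sketch shows what respecting the compositional geometry means in the simplest case: the ratios among the observed parts and the total of the composition are left unchanged.

```python
from typing import List, Sequence


def multiplicative_replacement(
    x: Sequence[float],
    detection_limits: Sequence[float],
    fraction: float = 0.65,  # assumed fraction of the detection limit to impute
) -> List[float]:
    """Replace zeros in one composition by a fraction of the detection limit.

    Non-zero parts are shrunk by a common factor, so their pairwise ratios
    and the total of the composition are preserved.
    """
    if len(x) != len(detection_limits):
        raise ValueError("x and detection_limits must have the same length")
    total = sum(x)
    if total <= 0:
        raise ValueError("composition must have a positive total")

    # Imputed value for each zero part; 0.0 for parts that were observed.
    deltas = [fraction * dl if v == 0 else 0.0
              for v, dl in zip(x, detection_limits)]

    # Common shrink factor applied to the observed parts.
    shrink = 1.0 - sum(deltas) / total
    if shrink <= 0:
        raise ValueError("imputed values exceed the total of the composition")

    return [d if v == 0 else v * shrink for v, d in zip(x, deltas)]


# Hypothetical composition in percent, with one value below the detection limit.
composition = [0.0, 20.0, 30.0, 50.0]
limits = [2.0, 2.0, 2.0, 2.0]
completed = multiplicative_replacement(composition, limits)
print(completed)
```

A baseline of this kind treats every part separately and ignores the relationships between variables. Model-based procedures such as those described in the abstract instead estimate the replacement values from the other parts of the composition.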
Journal: Chemometrics and Intelligent Laboratory Systems - Volume 155, 15 July 2016, Pages 183–190