Nearest Neighbor Methods for the Imputation of Missing Values in Low and High-Dimensional Data
Owing to rapid advances in technology, collecting high-dimensional data is no longer a tedious task. Despite these considerable advances over the last few decades, the analysis of high-dimensional data faces new challenges concerning i...
Table of Contents:
- Intro; Introduction; Methodological Concepts for Missing Data; Missing Values; Missing Data Mechanism; An Overview of Traditional Missing Data Techniques; Deletion Methods; Substitution Methods; Imputation; Nearest Neighbors Methods; Traditional k Nearest Neighbors Imputation; Modification of the Traditional kNN Imputation; Nearest Neighbors for High-Dimensional Data; Practical Issues in High-Dimensional Data (n<<p); Extensions to Binary and Multi-Categorical Data; Selection of Attributes by Weighted Distances; Using Nearest Neighbors to Impute Missing Values; A Pearson Correlation Strategy
- Extension to Mixed Type Data; Available Distances for Mixed Type Data; Weighted Distance for Mixed Type Data; Weighted Imputation by Nearest Neighbors; Bootstrap Inference; Missing Data and Classification; Multiple Imputation in High-Dimensional Data; MI using Nearest Neighbors; Improved Methods for the Imputation of Missing Data by Nearest Neighbor Methods; Introduction; Weighted Neighbors; Distances and Computation of Nearest Neighbors; Imputation Procedure; Choice of Tuning Parameters by Cross-Validation; Performance Measures; Evaluation of Weighted Neighbors
- Weighted Neighbors Including the Selection of Predictors; Selection of Dimensions; Evaluation of the Method with Selected Weighted Neighbors; Case Studies; Gene Expression Data; Non-gene Expression Data; Concluding Remarks; Missing Value Imputation for Gene Expression Data by Tailored Nearest Neighbors; Introduction; Materials and Methods; Nearest Neighbors Approaches to Imputation; Nearest Neighbor Based on Selected Genes; Choice of Tuning Parameters; Overview of Competing Methods; Results and Discussion; Data; Simulation and Evaluation; Real Data Sets; Concluding Remarks
- Nearest Neighbor Imputation for Categorical Data by Weighting of Attributes; Introduction; Methods; Distances for Categorical Variables; Selection of Attributes by Weighted Distances; Measuring Association Among Attributes; Using Nearest Neighbors to Impute Missing Values; A Pearson Correlation Strategy; Cross Validation; Evaluation of Performance; Existing Methods; Simulation Studies; Binary Variables; Multi-Categorical Variables; Mixed (Binary and Multi-Categorical) Variables; Applications; Concluding Remarks; Imputation Methods for High-Dimensional Mixed-Type Datasets by Nearest Neighbors
- Introduction; Distances for Mixed-Type Data; Available Methods; Weighted Distance; Weighted Distance With Selection of Variables; Weighted Imputation by Nearest Neighbors; Choice of Tuning Parameters by Cross-Validation; Measuring Association Among Mixed Variables; A Pearson Correlation Strategy; Existing Approaches for Comparison; Performance Measures; Simulation Studies; Real Data Applications; Concluding Remarks; Bootstrap Inference for Weighted Nearest Neighbors Imputation; Introduction; Nearest Neighbors Imputation of Missing Values; Bootstrap Sampling and Missing Data