Exploring copulas for the imputation of missing nonlinearly dependent data
GIANNERINI, SIMONE;
2009-01-01
Abstract
We propose a method for imputing missing data using conditional copula functions. Copulas are a powerful tool for multivariate analysis, especially because they make it possible to (i) fit any combination of marginal distribution functions, (ii) model the marginal distributions and the dependence structure separately, and (iii) account for complex dependence relationships. We present the method and carry out a simulation study to compare it with two well-known imputation techniques: regression imputation via the EM algorithm and nearest-neighbour donor imputation. We evaluate the performance of our proposal by varying different parameters. Finally, we propose a generalization of our method that uses nonparametric estimation and inversion algorithms to generate random variates from conditional distributions.
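To make the idea of imputation through a conditional copula concrete, the following is a minimal sketch and not the authors' implementation. It assumes a bivariate Clayton copula (chosen here only because its conditional distribution can be inverted in closed form), normal marginals, and a dependence parameter `theta` that is taken as already estimated. The function names `clayton_conditional_cdf`, `clayton_conditional_inverse`, and `impute_missing` are hypothetical. A missing value of Y is imputed by mapping the observed X to the unit interval, drawing from the conditional copula by inversion, and mapping the draw back through the marginal quantile function of Y.

```python
import random
from statistics import NormalDist

EPS = 1e-12  # keeps probabilities strictly inside (0, 1)


def _clip(p):
    """Clamp a probability to the open unit interval."""
    return min(max(p, EPS), 1.0 - EPS)


def clayton_conditional_cdf(v, u, theta):
    """C(v | u) = dC/du for a Clayton copula with theta > 0."""
    s = u ** (-theta) + v ** (-theta) - 1.0
    return u ** (-theta - 1.0) * s ** (-(1.0 + theta) / theta)


def clayton_conditional_inverse(w, u, theta):
    """Solve C(v | u) = w for v (closed form for the Clayton family)."""
    w, u = _clip(w), _clip(u)
    a = w ** (-theta / (1.0 + theta)) - 1.0
    return (a * u ** (-theta) + 1.0) ** (-1.0 / theta)


def impute_missing(x_obs, marginal_x, marginal_y, theta, rng):
    """Draw one imputed value of Y given an observed X.

    marginal_x and marginal_y are assumed to expose cdf() and inv_cdf(),
    as statistics.NormalDist does; other marginals would need the same
    two methods.
    """
    u = _clip(marginal_x.cdf(x_obs))   # observed value on the copula scale
    w = rng.random()                   # uniform draw for the inversion step
    v = _clip(clayton_conditional_inverse(w, u, theta))
    return marginal_y.inv_cdf(v)       # back to the scale of Y


if __name__ == "__main__":
    rng = random.Random(0)
    fx, fy = NormalDist(0.0, 1.0), NormalDist(5.0, 2.0)  # assumed marginals
    draws = [impute_missing(1.0, fx, fy, theta=2.0, rng=rng) for _ in range(5)]
    print(draws)
```

Because each call draws from the conditional distribution rather than returning its mean, repeated calls for the same observed value give different imputations, which is how this kind of sketch could feed a multiple imputation scheme. Swapping in another copula family would generally require numerical inversion of the conditional distribution.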