DECONT TVA 2011 FORMAT PDF
June 20, 2020
Data from many scientific areas often come with measurement error. Density or distribution function estimation from contaminated data and nonparametric regression with errors-in-variables are two important topics in measurement error models.
In this paper, we present a new software package decon for R, which contains a collection of functions that use the deconvolution kernel methods to deal with the measurement error problems. The functions allow the errors to be either homoscedastic or heteroscedastic. To make the deconvolution estimators computationally more efficient in R, we adapt the fast Fourier transform algorithm for density estimation with error-free data to the deconvolution kernel estimation. We discuss the practical selection of the smoothing parameter in deconvolution methods and illustrate the use of the package through both simulated and real examples.
Data measured with errors occur frequently in many scientific fields. Ignoring measurement error can bring forth biased estimates and lead to erroneous conclusions to various degrees in a data analysis.
One could think of several examples in which measurement error can be a concern. Here we use two simulated examples to illustrate the effects of ignoring errors. The first example is the density estimation of a variable X, where X is from 0.
We generate simulated observations from such a model. The left panel of Figure 1 presents the kernel density estimate from the uncontaminated sample (dashed line), the kernel density estimate from the contaminated sample (dotted line), and the true density function of X (solid line).
We see that even if the true density is bimodal, the kernel density estimate from the contaminated data may be unimodal. The second example is a regression of a response Y on a predictor X. The right panel of Figure 1 displays the kernel regression estimates from the simulated case.
The regression estimate from the uncontaminated sample (dashed line) gives an accurate estimate of the true curve (solid line), while the estimate from the contaminated sample (dotted line) is far from the target function. Thus, correcting the bias of naive estimators is critical in measurement error problems.
Simulation examples to illustrate the effects of measurement error: The solid lines denote the true curves; the dashed lines denote the kernel estimates from the uncontaminated sample; the dotted lines denote the kernel estimates from the contaminated sample.
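The bias caused by ignoring measurement error can be reproduced with a few lines of code. The sketch below is not the simulation used for the figure: it assumes a simple linear response, a standard normal predictor, and normal errors (all hypothetical settings chosen for illustration), and shows that regressing on the contaminated predictor W = X + U shrinks the fitted slope toward zero.

```python
import random
import statistics

def simulate_attenuation(n=5000, sigma_u=1.0, seed=1):
    """Fit a least-squares slope on the true and on the contaminated predictor."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]        # true predictor X (assumed N(0, 1))
    w = [xi + rng.gauss(0.0, sigma_u) for xi in x]     # contaminated predictor W = X + U
    y = [2.0 * xi + rng.gauss(0.0, 0.5) for xi in x]   # response with true slope 2 (assumed)

    def slope(pred, resp):
        mp, mr = statistics.fmean(pred), statistics.fmean(resp)
        sxy = sum((p - mp) * (r - mr) for p, r in zip(pred, resp))
        sxx = sum((p - mp) ** 2 for p in pred)
        return sxy / sxx

    return slope(x, y), slope(w, y)

clean_slope, naive_slope = simulate_attenuation()
print(f"slope using X: {clean_slope:.3f}, slope using W: {naive_slope:.3f}")
```

With equal variances for X and U, the naive slope is expected to be close to half of the true slope, which is the same kind of distortion as the flattened dotted curve described above.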
Statistical models for addressing measurement error problems can be classified as parametric or nonparametric (Carroll et al.). Our interest here focuses on two nonparametric models and their estimation. The density function of U is denoted by $f_U$, assumed known. Closely related to the density estimation is the problem of estimating the conditional density of X given W, $f_{X|W}(x|w)$.
Suppose that the observations are a sample of i. Each of the two models is the subject of ongoing research in statistics.
The first model is an additive measurement error model. The problem of estimating $f_X$ is also known as the deconvolution problem. It arises in applications such as image deblurring, microarray background correction, and bump hunting with measurement error. The second model is known as regression with errors-in-variables, which often occurs in bioscience, astronomy, and econometrics. Methods for correcting the effects of measurement error based on the above two models have been widely investigated in the past two decades.
In the additive measurement error model, Carroll and Hall, and Stefanski and Carroll, proposed the deconvolution kernel density estimator to recover the unknown density function from contaminated data, where the kernel idea and the Fourier inverse were employed in the construction of the estimator. Since then, the deconvolution kernel approach has been extensively studied. See, for instance, Zhang, Fan, Efromovich, Delaigle and Gijbels (a, b), Meister, and van Es and Uh, among others. The idea in deconvolution kernel density estimation was also generalized to nonparametric regression with errors-in-variables by Fan and Truong. Recent contributions to the two measurement error problems include the consideration of heteroscedastic errors.
Delaigle and Meister proposed a generalized deconvolution kernel estimator for density estimation with heteroscedastic errors.
They also applied this idea to nonparametric regression estimation in the heteroscedastic errors-in-variables problem (Delaigle and Meister). Hall and Lahiri studied estimation of distributions, moments and quantiles in the deconvolution problems. Wang, Fan, and Wang explored smooth distribution estimators with heteroscedastic error.
Carroll, Delaigle, and Hall discussed the nonparametric prediction in measurement error models when the covariate is measured with heteroscedastic errors.
A comprehensive discussion of nonparametric deconvolution techniques can be found in the recent monograph by Meister. We call all these kernel-type methods that require an inverse Fourier transform deconvolution kernel methods (DKM). Although DKM have been shown to be powerful tools in measurement error problems, there is no existing software that implements the methods systematically. We propose and apply a fast Fourier transform (FFT) algorithm for deconvolution estimation, adapted from the algorithm for kernel density estimation with error-free data by Silverman. The resulting R functions are computationally very fast.
Our R functions allow both homoscedastic errors and heteroscedastic errors. Several bandwidth selection functions are also available in the package. The rest of the paper is organized as follows. Section 2 gives a summary of the DKM that are implemented in our package. Section 3 discusses the practical selection of the smoothing parameter in the measurement error problems. Section 4 addresses the FFT algorithm in the estimating procedures. Section 5 demonstrates our package through both simulated and real data examples.
Finally, the paper ends with discussion.
In this section, we review DKM in the two measurement error models and discuss some computational details that have been implemented in the software package. An inverse Fourier transform leads to a naive estimator of $f_X$. However, in practice this naive estimate is unstable because the sample characteristic function has large fluctuations at its tails. The difficulty of deconvolution depends heavily on the smoothness of the error density $f_U$: the smoother the error density, the harder deconvolution is.
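For orientation, the naive estimator and its kernel-regularized version are commonly written as follows. The notation is assumed here rather than taken from this text: $\hat{\varphi}_W$ is the empirical characteristic function of the observations $W_1, \dots, W_n$, $\varphi_U$ and $\varphi_K$ are the characteristic functions of the error and of the kernel $K$, and $h$ is the bandwidth.

```latex
\[
\hat{f}_{X,\text{naive}}(x)
  = \frac{1}{2\pi} \int e^{-itx}\, \frac{\hat{\varphi}_W(t)}{\varphi_U(t)}\, dt,
\qquad
\hat{\varphi}_W(t) = \frac{1}{n} \sum_{j=1}^{n} e^{itW_j},
\]
\[
\hat{f}_X(x) = \frac{1}{nh} \sum_{j=1}^{n} L\!\left(\frac{x - W_j}{h}\right),
\qquad
L(z) = \frac{1}{2\pi} \int e^{-itz}\, \frac{\varphi_K(t)}{\varphi_U(t/h)}\, dt .
\]
```

The second form damps the high-frequency fluctuations of $\hat{\varphi}_W$ through $\varphi_K$, which is why the decay rate of $\varphi_U$ in the denominator governs how hard the problem is.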
In the classical deconvolution literature, the error distributions are classified into two classes: ordinary smooth distributions and supersmooth distributions (Fan). Examples of ordinary smooth distributions include Laplacian, gamma, and symmetric gamma; examples of supersmooth distributions are normal, mixture normal and Cauchy.
Generally speaking, a supersmooth distribution is smoother than an ordinary smooth distribution, so $f_X$ is more difficult to deconvolve when X is contaminated by supersmooth errors.
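The difference between the two classes shows up in how quickly the error characteristic function decays. The following sketch compares the normal and Laplace cases; the function names and the unit scale parameters are illustrative assumptions, and the reciprocal of the characteristic function is printed because that is the factor by which noise at frequency t is amplified in deconvolution.

```python
import math

def cf_normal(t, sigma):
    """Characteristic function of N(0, sigma^2): exponential decay (supersmooth)."""
    return math.exp(-0.5 * (sigma * t) ** 2)

def cf_laplace(t, b):
    """Characteristic function of Laplace(0, b): polynomial decay (ordinary smooth)."""
    return 1.0 / (1.0 + (b * t) ** 2)

for t in (1.0, 5.0, 10.0):
    # 1 / cf is the amplification applied to the empirical characteristic function
    print(t, 1.0 / cf_normal(t, 1.0), 1.0 / cf_laplace(t, 1.0))
```

The amplification grows polynomially in t for Laplace errors but exponentially for normal errors, which is the sense in which supersmooth errors are harder to undo.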
In the decon package, two important cases of measurement error distributions are allowed: normal (supersmooth) and Laplacian (ordinary smooth).
In kernel density estimation for error-free data, the choice of the kernel function K does not have a big influence on the quality of the estimator. In deconvolution, by contrast, the kernel is usually required to have a characteristic function with compact support. This requirement can be relaxed in the case of ordinary smooth errors or when the variance of measurement errors is small.
We consider the following kernels in the package. There are two typical choices of the kernel functions for normal errors. The first one is the following second-order kernel whose characteristic function has a compact and symmetric support (Fan; Delaigle and Gijbels a). Hence, the resulting deconvoluting kernel with normal error is.
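A numerical sketch of such a deconvoluting kernel is given below. It assumes the widely used support kernel with characteristic function $(1 - t^2)^3$ on $[-1, 1]$ and N(0, sigma^2) errors, so that $L(z) = \pi^{-1} \int_0^1 \cos(tz)(1 - t^2)^3 \exp\{\sigma^2 t^2 / (2h^2)\}\, dt$; the function names and the Simpson-rule integration are illustrative choices, not the package's implementation.

```python
import math

def decon_kernel_normal(z, sigma, h, steps=2000):
    """Deconvoluting kernel for normal errors and the (1 - t^2)^3 support kernel.

    Evaluates (1/pi) * int_0^1 cos(t z) (1 - t^2)^3 exp(sigma^2 t^2 / (2 h^2)) dt
    with the composite Simpson rule (steps must be even).
    """
    def integrand(t):
        return math.cos(t * z) * (1.0 - t * t) ** 3 * math.exp(0.5 * (sigma * t / h) ** 2)

    width = 1.0 / steps
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * width)
    return total * width / 3.0 / math.pi

def decon_density_normal(x, w, sigma, h):
    """Deconvolution kernel density estimate at x from contaminated data w."""
    return sum(decon_kernel_normal((x - wj) / h, sigma, h) for wj in w) / (len(w) * h)

# With sigma = 0 the kernel reduces to the ordinary support kernel.
print(decon_kernel_normal(0.0, 0.0, 1.0))
```

Because the integration range is bounded, the exponential factor stays finite for any bandwidth, which is the practical reason for requiring compact support with normal errors.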
The requirement for this support kernel can be relaxed when the error variance is small in Gaussian deconvolution. Fan gave comprehensive discussions about the effects of error magnitude on the DKM. In the package, a user can select the standard normal density as the kernel function if the magnitude of error variance is small, where the corresponding deconvoluting kernel becomes. When could one use the normal kernel in a data analysis?
If a user is not sure about error magnitude in a study, the support kernel is recommended. We consider the standard normal kernel function, so the resulting deconvoluting kernel for the case of Laplacian errors is.
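For Laplacian errors with a standard normal kernel, the deconvoluting kernel has a closed form, which the sketch below implements. It assumes the error characteristic function $1 / (1 + b^2 t^2)$, where b is the Laplace scale parameter (error variance $2b^2$); this parametrization and the function names are assumptions made for illustration and may differ from the one used in the package.

```python
import math

def decon_kernel_laplace(z, b, h):
    """Deconvoluting kernel for a standard normal kernel and Laplace(0, b) errors.

    Since 1 / cf_U(t / h) = 1 + (b t / h)^2, the Fourier inversion gives
    L(z) = phi(z) * (1 + (b / h)^2 * (1 - z^2)), with phi the N(0, 1) density.
    """
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return phi * (1.0 + (b / h) ** 2 * (1.0 - z * z))

def decon_density_laplace(x, w, b, h):
    """Deconvolution kernel density estimate at x from contaminated data w."""
    return sum(decon_kernel_laplace((x - wj) / h, b, h) for wj in w) / (len(w) * h)

# The kernel integrates to one, so the estimate is a (possibly signed) density.
grid_step = 0.01
mass = sum(decon_kernel_laplace(-10.0 + i * grid_step, 0.3, 0.5) for i in range(2001)) * grid_step
print(round(mass, 6))
```

The kernel takes negative values for large |z| when b / h is large, so the resulting estimate can dip below zero in the tails; this is a known feature of deconvolution kernels rather than a bug in the sketch.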
The R functions DeconPdf and DeconCdf in the decon package perform the deconvolution kernel density and distribution estimation from contaminated data, respectively.
In deconvolution problems, it is common to assume an explicit form of the density function $f_U$ of U, because $f_X$ is not identifiable if $f_U$ is unknown. There are two common ways to estimate the parameters of $f_U$ in real data analysis. One can also estimate $f_U$ when replicated measurements of W are available.
The Framingham study that we will present in Section 5 is such a case.
Deconvolution estimation in measurement error models: The R package decon
In many real applications, the distributions of measurement errors could vary with each subject or even with each observation, so the errors are heteroscedastic. Hence, consideration of heteroscedastic errors is very important. The estimator is given by. The R functions DeconPdf and DeconCdf also allow us to estimate density and distribution functions with heteroscedastic errors.
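One form of the heteroscedastic estimator that appears in the literature (following Delaigle and Meister) is sketched below. The notation is assumed: $\varphi_{U_j}$ is the characteristic function of the error attached to the j-th observation, $\varphi_K$ is the characteristic function of the kernel, and $h$ is the bandwidth.

```latex
\[
\hat{f}_X(x)
  = \frac{1}{2\pi} \int e^{-itx}\, \varphi_K(ht)\,
    \frac{\sum_{j=1}^{n} \varphi_{U_j}(-t)\, e^{itW_j}}
         {\sum_{k=1}^{n} \lvert \varphi_{U_k}(t) \rvert^{2}}\, dt .
\]
```

When all errors share the same distribution, $\varphi_{U_j}(-t) / \lvert \varphi_U(t) \rvert^2 = 1 / \varphi_U(t)$ and the expression reduces to the homoscedastic deconvolution kernel estimator.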
In the current version, only the case of heteroscedastic normal errors is considered. Under Model I, closely related to the density estimation is the problem of estimating the conditional density of X given W, $f_{X|W}(x|w)$. Conditional density estimation has an important application to microarray background correction. Wang and Ye proposed a re-weighted deconvolution kernel estimator, which is defined by. The function DeconCPdf allows us to estimate the conditional density function with homoscedastic errors.
The ideas of the deconvolution kernel density estimators can be generalized to nonparametric regression with errors-in-variables. One wishes to estimate the conditional mean curve.
The denominator in (10) can be estimated by the standard kernel density estimator.
Since the joint density $f(x, y)$ can be estimated using a multiplicative kernel, one could work out an estimator of the numerator in (10) by replacing the joint density $f(x, y)$ with its kernel estimate, which leads to.
A natural estimate of $m(x)$ is now the combination of the estimates of the denominator and the numerator. Back to Model II, extending the kernel idea becomes natural in the errors-in-variables setting. The denominator of the Nadaraya-Watson estimator may be replaced by the deconvolution kernel density estimator (4), which is an empirical version of $f_X(x)$ as in the error-free case. In the spirit of the deconvolution kernel density estimator, Fan and Truong suggest to estimate $r(x)$ with.
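The resulting regression estimator has the same ratio form as the Nadaraya-Watson estimator, with the ordinary kernel replaced by a deconvoluting kernel. The sketch below assumes Laplace(0, b) errors and a standard normal kernel so that the kernel has a closed form; the function names and the parametrization by the Laplace scale b are illustrative assumptions, not the package's interface.

```python
import math

def decon_kernel_laplace(z, b, h):
    """Deconvoluting kernel for a standard normal kernel and Laplace(0, b) errors."""
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return phi * (1.0 + (b / h) ** 2 * (1.0 - z * z))

def decon_regression(x, w, y, b, h):
    """Errors-in-variables regression estimate at x.

    Computes sum_j Y_j L((x - W_j) / h) / sum_j L((x - W_j) / h), i.e. the
    Nadaraya-Watson ratio with deconvoluting-kernel weights.
    """
    weights = [decon_kernel_laplace((x - wj) / h, b, h) for wj in w]
    denom = sum(weights)
    if abs(denom) < 1e-12:
        raise ValueError("density estimate is numerically zero at x")
    return sum(wt * yj for wt, yj in zip(weights, y)) / denom

# Setting b = 0 recovers the ordinary Nadaraya-Watson estimator.
print(decon_regression(0.5, [0.0, 1.0], [0.0, 2.0], 0.0, 1.0))
```

Because deconvoluting-kernel weights can be negative, the denominator may be close to zero in sparse regions, so the guard on `denom` is worth keeping in any practical variant.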