DECONT TVA 2011 FORMAT PDF
June 26, 2020
Source: Wang, X.F. and Wang, B., Deconvolution estimation in measurement error models: The R package decon. Author manuscript; available in PMC.
Data from many scientific areas often come with measurement error. Density or distribution function estimation from contaminated data and nonparametric regression with errors-in-variables are two important topics in measurement error models. In this paper, we present a new software package decon for R, which contains a collection of functions that use the deconvolution kernel methods to deal with the measurement error problems.
The functions allow the errors to be either homoscedastic or heteroscedastic. To make the deconvolution estimators computationally more efficient in R, we adapt the fast Fourier transform algorithm for density estimation with error-free data to the deconvolution kernel estimation.
We discuss the practical selection of the smoothing parameter in deconvolution methods and illustrate the use of the package through both simulated and real examples.
Data measured with errors occur frequently in many scientific fields. Ignoring measurement error can bring forth biased estimates and lead to erroneous conclusions to various degrees in a data analysis.
One could think of several examples in which measurement error can be a concern. Here we use two simulated examples to illustrate the effects of ignoring errors.
The first example is the density estimation of a variable X, where X has a bimodal density and is observed with measurement error. We generate simulated data from such a model.
The left panel of Figure 1 presents the kernel density estimate from the uncontaminated sample (dashed line), the kernel density estimate from the contaminated sample (dotted line), and the true density function of X (solid line). We notice that even though the true density is bimodal, the kernel density estimate from the contaminated data may be unimodal.
The second example is a regression of a response Y on a predictor X. The right panel of Figure 1 displays the kernel regression estimates from the simulated case. The regression estimate from the uncontaminated sample (dashed line) gives an accurate estimate of the true curve (solid line), while the estimate from the contaminated sample (dotted line) is far from the target function. Thus, correcting the bias of naive estimators is critical in measurement error problems.
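The first simulated example can be mimicked with a short, self-contained sketch. The mixture 0.5 N(-3, 1) + 0.5 N(3, 1), the error standard deviation of 3, the sample size and the rule-of-thumb bandwidth are all illustrative assumptions, not the settings used for Figure 1. The sketch compares the naive kernel density estimate at the trough (x = 0) with the estimate at a mode (x = 3), with and without contamination.

```python
import math
import random

def gauss_kde(sample, x, h):
    """Ordinary (error-free) Gaussian kernel density estimate at point x."""
    c = 1.0 / (len(sample) * h * math.sqrt(2.0 * math.pi))
    return c * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in sample)

def rule_of_thumb_bw(sample):
    """Normal-reference bandwidth 1.06 * sd * n^(-1/5)."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in sample) / (n - 1))
    return 1.06 * sd * n ** (-0.2)

rng = random.Random(1)
n = 2000
# Assumed truth: X ~ 0.5 N(-3, 1) + 0.5 N(3, 1), a bimodal density.
x_clean = [rng.gauss(rng.choice((-3.0, 3.0)), 1.0) for _ in range(n)]
# Assumed error: U ~ N(0, 3^2), independent of X; only W = X + U is observed.
w_noisy = [x + rng.gauss(0.0, 3.0) for x in x_clean]

h_x = rule_of_thumb_bw(x_clean)
h_w = rule_of_thumb_bw(w_noisy)
# Ratio of the estimated density at the trough (0) to that at a mode (3).
ratio_clean = gauss_kde(x_clean, 0.0, h_x) / gauss_kde(x_clean, 3.0, h_x)
ratio_noisy = gauss_kde(w_noisy, 0.0, h_w) / gauss_kde(w_noisy, 3.0, h_w)
print(f"trough/mode ratio, uncontaminated sample: {ratio_clean:.2f}")
print(f"trough/mode ratio, contaminated sample:   {ratio_noisy:.2f}")
```

A ratio well below 1 signals a clear dip between two modes; a ratio near 1 means the naive estimate has smeared the two modes together.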
Figure 1: Simulation examples to illustrate the effects of measurement error. The solid lines denote the true curves; the dashed lines denote the kernel estimates from the uncontaminated sample; the dotted lines denote the kernel estimates from the contaminated sample.
Statistical models for addressing measurement error problems can be classified as parametric or nonparametric (Carroll et al.).
Our interest here focuses on two nonparametric models and their estimation. The density function of U is denoted by f_U and is assumed known. Closely related to the density estimation is the problem of estimating the conditional density of X given W, f_{X|W}(x|w). Suppose that the observations are an i.i.d. sample. Each of the two models is the subject of ongoing research in statistics.
The first model is an additive measurement error model. The problem of estimating f_X is also known as the deconvolution problem. It is often related to image deblurring, microarray background correction, and bump hunting with measurement error.
The second model is known as regression with errors-in-variables, which often occurs in bioscience, astronomy, and econometrics. Methods for correcting the effects of measurement error based on the above two models have been widely investigated in the past two decades. In the additive measurement error model, Carroll and Hall, and Stefanski and Carroll, proposed the deconvolution kernel density estimator to recover the unknown density function from contaminated data, where the kernel idea and the Fourier inverse were employed in the construction of the estimator.
Since then, the deconvolution kernel approach has been extensively studied. See, for instance, Zhang, Fan, Efromovich, Delaigle and Gijbels (a, b), Meister, and van Es and Uh, among others. The idea in deconvolution kernel density estimation was also generalized to nonparametric regression with errors-in-variables by Fan and Truong. Recent contributions to the two measurement error problems include the consideration of heteroscedastic errors.
Delaigle and Meister proposed a generalized deconvolution kernel estimator for density estimation with heteroscedastic errors. They also applied this idea to nonparametric regression estimation in the heteroscedastic errors-in-variables problem (Delaigle and Meister). Hall and Lahiri studied estimation of distributions, moments and quantiles in deconvolution problems.
Wang, Fan, and Wang explored smooth distribution estimators with heteroscedastic error. Carroll, Delaigle, and Hall discussed nonparametric prediction in measurement error models when the covariate is measured with heteroscedastic errors.
A comprehensive discussion of nonparametric deconvolution techniques can be found in the recent monograph by Meister. We hereafter refer to all these kernel-type methods that require an inverse Fourier transform as deconvolution kernel methods (DKM). Although DKM have been shown to be powerful tools in measurement error problems, there is no existing software that implements them systematically.
We propose and apply a fast Fourier transform (FFT) algorithm in the deconvolution estimation, adapted from the algorithm for kernel density estimation with error-free data by Silverman. The resulting R functions are computationally very fast. Our R functions allow both homoscedastic errors and heteroscedastic errors. Several bandwidth selection functions are also available in the package.
The rest of the paper is organized as follows.
Section 2 gives a summary of the DKM that are used in our package. Section 3 discusses the practical selection of the smoothing parameter in the measurement error problems. Section 4 addresses the FFT algorithm in the estimating procedures. Section 5 demonstrates our package through both simulated and real data examples.
Finally, the paper ends with a discussion. In this section, we review DKM in the two measurement error models and discuss some technical details, which have been implemented in the software package.
An inverse Fourier transform leads to a naive estimate of f_X. However, in practice this naive estimate is unstable because the sample characteristic function has large fluctuations at its tails. The difficulty of deconvolution depends heavily on the smoothness of the error density f_U: the smoother the error density, the harder deconvolution is.
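For orientation, the standard construction from the deconvolution literature can be written as follows. This is a sketch in commonly used notation, which is an assumption here and may differ from the paper's own symbols and equation numbering: W = X + U with X and U independent, \varphi denotes a characteristic function, K is a kernel with characteristic function \varphi_K, and h > 0 is the bandwidth.

```latex
% Independence of X and U gives a product of characteristic functions:
\varphi_W(t) = \varphi_X(t)\,\varphi_U(t),
\qquad
\hat\varphi_W(t) = \frac{1}{n}\sum_{j=1}^{n} e^{\,itW_j}.

% Naive estimate by Fourier inversion (unstable in the tails of \hat\varphi_W):
\hat f_{\mathrm{naive}}(x)
  = \frac{1}{2\pi}\int e^{-itx}\,\frac{\hat\varphi_W(t)}{\varphi_U(t)}\,dt .

% Damping the tails with \varphi_K(ht) gives the deconvolution kernel estimator:
\hat f_X(x) = \frac{1}{nh}\sum_{j=1}^{n} K_U\!\left(\frac{x-W_j}{h}\right),
\qquad
K_U(z) = \frac{1}{2\pi}\int e^{-itz}\,\frac{\varphi_K(t)}{\varphi_U(t/h)}\,dt .
```

The factor \varphi_K(ht) suppresses the high-frequency region where \hat\varphi_W(t) is dominated by noise and \varphi_U(t) is close to zero.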
In the classical deconvolution literature, the error distributions are classified into two classes: ordinary smooth distributions and supersmooth distributions (Fan). Examples of ordinary smooth distributions include Laplacian, gamma, and symmetric gamma; examples of supersmooth distributions are normal, mixture normal and Cauchy.
Generally speaking, a supersmooth distribution is smoother than an ordinary smooth distribution, so f_X is more difficult to deconvolve when X is contaminated by supersmooth errors. In the decon package, two important types of measurement error distributions are allowed: normal (supersmooth) and Laplacian (ordinary smooth). In kernel density estimation for error-free data, the choice of the kernel function K does not have a big influence on the quality of the estimator.
In deconvolution, by contrast, the kernel is usually required to have a characteristic function with compact support. This requirement can be relaxed in the case of ordinary smooth errors or when the variance of the measurement errors is small. We consider the following kernels in the package.
There are two typical choices of the kernel functions for normal errors. The first one is the following second-order kernel whose characteristic function has a compact and symmetric support (Fan; Delaigle and Gijbels a).
Hence, the resulting deconvoluting kernel with normal error is.
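A numerical sketch of this support-kernel case follows. It assumes the kernel's characteristic function is (1 - t^2)^3 on [-1, 1], a common choice in the literature cited above, and N(0, sigma^2) errors, so that the deconvoluting kernel is K_U(x) = (1/pi) * integral over [0, 1] of cos(t x) (1 - t^2)^3 exp(sigma^2 t^2 / (2 h^2)) dt. The function names are hypothetical and this is not the decon package's implementation.

```python
import math

def phi_k(t):
    """Assumed kernel characteristic function: (1 - t^2)^3 on [-1, 1], else 0."""
    return (1.0 - t * t) ** 3 if abs(t) <= 1.0 else 0.0

def decon_kernel_normal(x, sigma, h, steps=200):
    """Deconvoluting kernel for N(0, sigma^2) errors, by composite Simpson's rule.

    Evaluates (1/pi) * int_0^1 cos(t x) phi_k(t) exp(sigma^2 t^2 / (2 h^2)) dt.
    `steps` must be even.
    """
    def integrand(t):
        return math.cos(t * x) * phi_k(t) * math.exp(0.5 * (sigma * t / h) ** 2)

    dt = 1.0 / steps
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * dt)
    return total * dt / (3.0 * math.pi)

def decon_density(w, x, sigma, h):
    """Deconvolution kernel density estimate at x from contaminated data w."""
    return sum(decon_kernel_normal((x - wj) / h, sigma, h) for wj in w) / (len(w) * h)

# Tiny illustrative call on made-up observations.
w_obs = [-1.2, -0.4, 0.1, 0.9, 1.5]
print(decon_density(w_obs, 0.0, sigma=0.5, h=1.0))
```

Because the integral runs over a compact interval, the exponential factor stays bounded for any h > 0, which is why the support kernel is safe even when the error variance is large relative to the bandwidth.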
Deconvolution estimation in measurement error models: The R package decon
The requirement for this support kernel can be relaxed when the error variance is small in Gaussian deconvolution. Fan gave comprehensive discussions about the effects of error magnitude on the DKM. In the package, a user can select the standard normal density as the kernel function if the magnitude of error variance is small, where the corresponding deconvoluting kernel becomes.
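When both the kernel and the error are normal, the deconvoluting kernel has a closed form: dividing exp(-t^2/2) by the N(0, sigma^2) characteristic function evaluated at t/h gives exp(-t^2 (1 - sigma^2/h^2) / 2), which is the characteristic function of a normal density with variance 1 - sigma^2/h^2. The sketch below encodes this standard calculation; the function name is hypothetical and the code is not taken from the package.

```python
import math

def decon_kernel_gauss_gauss(x, sigma, h):
    """Deconvoluting kernel for a standard normal kernel and N(0, sigma^2) errors.

    It is the N(0, a) density with a = 1 - (sigma / h)^2, so it only exists
    when the bandwidth h is strictly larger than sigma.
    """
    if h <= sigma:
        raise ValueError("bandwidth h must exceed the error standard deviation sigma")
    a = 1.0 - (sigma / h) ** 2
    return math.exp(-0.5 * x * x / a) / math.sqrt(2.0 * math.pi * a)

# Illustrative evaluation with made-up values.
print(decon_kernel_gauss_gauss(0.0, sigma=0.6, h=1.0))
```

One consequence of this form is that the resulting density estimate coincides with an ordinary Gaussian kernel estimate using bandwidth sqrt(h^2 - sigma^2), which makes clear why the approach is only sensible when the error variance is small relative to h^2.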
When could one use the normal kernel in a data analysis? If a user is not sure about error magnitude in a study, the support kernel is recommended. We consider the standard normal kernel function, so the resulting deconvoluting kernel for the case of Laplacian errors is.
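For Laplacian errors the calculation is also explicit. The sketch below assumes a Laplace error with scale b (variance 2 b^2), whose characteristic function is 1 / (1 + b^2 t^2); the package may parameterize the error differently, so the scale convention is an assumption. Dividing by this characteristic function multiplies the kernel's transform by 1 + b^2 t^2 / h^2, which gives K_U(x) = K(x) - (b/h)^2 K''(x) for the standard normal kernel K.

```python
import math

def decon_kernel_gauss_laplace(x, b, h):
    """Deconvoluting kernel for a standard normal kernel and Laplace(scale b) errors.

    With K the standard normal density, K''(x) = (x^2 - 1) K(x), so
    K_U(x) = K(x) * (1 + (b / h)^2 * (1 - x^2)).
    """
    k = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    return k * (1.0 + (b / h) ** 2 * (1.0 - x * x))

# Illustrative evaluation with made-up values.
print(decon_kernel_gauss_laplace(0.0, b=0.5, h=1.0))
```

The kernel integrates to one but takes negative values for large |x|, so the resulting density estimate is not guaranteed to be nonnegative everywhere.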
The R functions DeconPdf and DeconCdf in the decon package perform the deconvolution kernel density and distribution estimation from contaminated data, respectively. In deconvolution problems, it is common to assume an explicit form of the density function f_U of U, because f_X is not identifiable if f_U is unknown. There are two common ways to estimate the parameters of f_U in real data analysis. One can also estimate f_U when replicated measurements of W are available.
The Framingham study that we will present in Section 5 is such a case. In many real applications, the distributions of measurement errors could vary with each subject or even with each observation, so the errors are heteroscedastic.
Hence, consideration of heteroscedastic errors is very important. The R functions DeconPdf and DeconCdf also allow us to estimate density and distribution functions with heteroscedastic errors. In the current version, only the case of heteroscedastic normal errors is considered.
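As a hedged sketch, the generalized estimator of Delaigle and Meister mentioned earlier is usually written in the following form. The notation is an assumption: W_j = X_j + U_j, where U_j has characteristic function \varphi_{U_j}, and \varphi_K is the characteristic function of the kernel.

```latex
\hat f_X(x)
  = \frac{1}{2\pi}\int e^{-itx}\,\varphi_K(ht)\,
    \frac{\sum_{j=1}^{n} \varphi_{U_j}(-t)\, e^{\,itW_j}}
         {\sum_{k=1}^{n} \lvert \varphi_{U_k}(t) \rvert^{2}}\,dt .

% Check: if all errors share one distribution, \varphi_{U_j} = \varphi_U, then
% \varphi_U(-t) / \lvert \varphi_U(t) \rvert^{2} = 1 / \varphi_U(t), and the ratio reduces to
% \hat\varphi_W(t) / \varphi_U(t), i.e. the homoscedastic deconvolution kernel estimator.
```

For normal errors with standard deviations \sigma_j, each \varphi_{U_j}(t) = \exp(-\sigma_j^2 t^2 / 2) is real and even, so the weights in the numerator are simply \exp(-\sigma_j^2 t^2 / 2).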
Under Model I, closely related to the density estimation is the problem of estimating the conditional density of X given W, f_{X|W}(x|w). Conditional density estimation has an important application to microarray background correction. Wang and Ye proposed a re-weighted deconvolution kernel estimator for this problem.
The function DeconCPdf allows us to estimate the conditional density function with homoscedastic errors. The ideas of the deconvolution kernel density estimators can be generalized to nonparametric regression with errors-in-variables.
One tries to estimate the conditional mean curve m(x) = E(Y | X = x). The denominator in (10) can be estimated by the standard kernel density estimator. Since the joint density f(x, y) can be estimated using a multiplicative kernel, one could work out an estimator of the numerator in (10) by replacing the joint density f(x, y) with its kernel estimate. A natural estimate of m(x) is now the combination of the estimates of the denominator and the numerator.
Back to Model II, extending the kernel idea becomes natural in the errors-in-variables setting. The denominator of the Nadaraya-Watson estimator may be replaced by the deconvolution kernel density estimator (4), which is an empirical version of f_X(x) as in the error-free case.
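In the spirit of the deconvolution kernel density estimator, Fan and Truong suggested estimating m(x) with a Nadaraya-Watson type estimator in which the ordinary kernel is replaced by the deconvoluting kernel.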
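The usual form of such an estimator is a ratio of deconvoluting-kernel-weighted sums: sum_j K_U((x - W_j)/h) Y_j divided by sum_j K_U((x - W_j)/h). The sketch below illustrates it with the closed-form deconvoluting kernel for a standard normal kernel and N(0, sigma^2) errors; that kernel choice, the function names and the toy data are assumptions for illustration, not the package's code.

```python
import math

def k_u(z, sigma, h):
    """Deconvoluting kernel: standard normal kernel, N(0, sigma^2) errors.

    Equals the N(0, a) density with a = 1 - (sigma / h)^2; requires h > sigma.
    """
    a = 1.0 - (sigma / h) ** 2
    if a <= 0.0:
        raise ValueError("bandwidth h must exceed the error standard deviation sigma")
    return math.exp(-0.5 * z * z / a) / math.sqrt(2.0 * math.pi * a)

def decon_regression(w, y, x, sigma, h):
    """Errors-in-variables regression estimate at x from pairs (W_j, Y_j)."""
    weights = [k_u((x - wj) / h, sigma, h) for wj in w]
    return sum(wt * yj for wt, yj in zip(weights, y)) / sum(weights)

# Made-up toy data: contaminated predictor values and responses.
w_obs = [0.1, 0.8, 1.4, 2.2, 2.9]
y_obs = [0.2, 0.9, 1.1, 0.7, 0.1]
print(decon_regression(w_obs, y_obs, 1.5, sigma=0.3, h=0.8))
```

With sigma = 0 the weights are ordinary Gaussian kernel weights and the function reduces to the Nadaraya-Watson estimator for error-free data.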