Detects local outliers in vector data with a Hampel filter based on the median
absolute deviation (MAD), and replaces them with NA or the local median value.
Usage
replace_outliers(x, width, t0 = 3, return = c("NA", "median"))
Details
The "median absolute deviation" is computed over the [-width...width] window
around each point that lies at least width steps away from either end of the
series. Points closer than width steps to the lower or upper end are returned
unchanged.
A high threshold makes the filter more forgiving; a low one declares more
points to be outliers. t0 = 3 (the default) corresponds to Pearson's 3 sigma
edit rule, t0 = 0 to Tukey's median filter.
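The windowed MAD test described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper name hampel_sketch is hypothetical, and the scale factor 1.4826 (which makes the MAD comparable to a standard deviation for normally distributed data) is an assumption borrowed from the usual formulation of the Hampel filter.

```r
# Hypothetical helper for illustration only; not the package's code.
hampel_sketch <- function(x, width, t0 = 3) {
  n <- length(x)
  y <- x
  is_outlier <- logical(n)
  # Too short for a full window anywhere: return the input unchanged.
  if (n < 2 * width + 1) {
    return(list(y = y, is_outlier = is_outlier))
  }
  # Only points at least `width` steps from both ends are tested.
  for (i in (width + 1):(n - width)) {
    window <- x[(i - width):(i + width)]
    local_median <- median(window)
    # Assumed scale factor 1.4826 turns the MAD into a sigma estimate.
    local_sigma <- 1.4826 * median(abs(window - local_median))
    if (abs(x[i] - local_median) > t0 * local_sigma) {
      y[i] <- local_median
      is_outlier[i] <- TRUE
    }
  }
  list(y = y, is_outlier = is_outlier)
}

x_demo <- c(1, 2, 3, 2, 1, 50, 2, 3, 2, 1, 2)
res_default <- hampel_sketch(x_demo, width = 2)          # t0 = 3
res_median  <- hampel_sketch(x_demo, width = 2, t0 = 0)  # median filter
which(res_default$is_outlier)
```

With t0 = 3 only the spike at position 6 is flagged. With t0 = 0 every interior point that differs from its local median is replaced, which is why that setting behaves like a running median filter; the first and last width points stay untouched in both cases.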
Missing (NA) values in x are removed before processing and restored in the
returned vector.
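One way to picture this remove-then-restore step is the base R sketch below. It is an assumption about the general pattern, not the package's code, and stats::runmed() is used only as a stand-in for the actual filter.

```r
x_na <- c(1, 2, NA, 3, 2, 50, NA, 2, 3, 2, 1)

keep <- !is.na(x_na)      # positions holding observed values
x_clean <- x_na[keep]     # NA values dropped before filtering

# Stand-in filter (assumption): a running median of width 3.
processed <- as.numeric(stats::runmed(x_clean, k = 3))

# Write the results back to their original positions; NA slots stay NA.
y_na <- rep(NA_real_, length(x_na))
y_na[keep] <- processed
```

Under this pattern the window is counted in non-missing points, so a window next to a gap can reach across it to values on the other side.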
return = "median" replaces outliers with the local median value, as in
pracma::hampel(). Otherwise, the default return = "NA" replaces outliers with
NA, to be filled in later by a method of your choice (see replace_missing()).
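As an illustration of filling flagged values afterwards, the sketch below uses linear interpolation with stats::approx() from base R. This is only one possible choice and is not meant to represent how replace_missing() works; the vector flagged is made-up example data.

```r
# Made-up example: position 3 was flagged as an outlier and set to NA.
flagged <- c(1, 2, NA, 4, 5)

idx <- seq_along(flagged)
missing <- is.na(flagged)

# Linearly interpolate the missing positions from the observed neighbours.
filled <- flagged
filled[missing] <- stats::approx(
  x = idx[!missing],
  y = flagged[!missing],
  xout = idx[missing]
)$y
```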
See also
replace_invalid()
replace_missing()
pracma::hampel()
Examples
set.seed(8421)
x <- numeric(1024)
z <- rnorm(1024)
x[1] <- z[1]
for (i in 2:1024) {
  x[i] <- 0.4 * x[i-1] + 0.8 * x[i-1] * z[i-1] + z[i]
}
x[150:200] <- NA ## generate NA values
y <- replace_outliers(x, width = 20, return = "median")
ind <- which(x != y) ## identify outlier indices
outliers <- x[ind] ## identify outlier values
if (FALSE) { # \dontrun{
plot(1:1024, x, type = "l")
points(ind, outliers, pch = 21, col = "darkred")
lines(y, col = "blue")
} # }