Reduce noise in a high-dimensional dataset by averaging each point with its nearby neighbors.
Arguments
- X
A matrix of numeric data, or something that can be cast to a matrix. Each row represents a point.
- block
Optional. A block label for each row of X, given as a factor or something that can be cast to a factor. Denoising is performed independently within each block.
- k
Number of nearest neighbors to find around each point (including itself).
- steps
Number of steps to take along the directed k-nearest neighbor graph. steps=1 uses the k-nearest neighbors, steps=2 uses the k-nearest neighbors and their k-nearest neighbors, etc.
Details
knnDenoise first finds the k-nearest neighbors of each point (including the point itself). Then, for each point, it averages all points reachable along the directed k-nearest neighbor graph in the number of steps given by the steps argument.
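The procedure described above can be sketched in base R. This is only an illustration of one reading of the description, not the package's implementation: the function name knnDenoiseSketch is hypothetical, it assumes Euclidean distance and an unweighted average over the reachable points, it ignores the block argument, and it uses a dense distance matrix, so it only suits small inputs.

```r
# Hypothetical sketch of the described procedure; not the package's code.
# Assumes Euclidean distance and an unweighted mean of reachable points.
knnDenoiseSketch <- function(X, k = 3, steps = 1) {
    X <- as.matrix(X)
    n <- nrow(X)
    stopifnot(k >= 1, k <= n, steps >= 1)

    # Pairwise Euclidean distances between rows.
    D <- as.matrix(dist(X))
    # Make sure each point ranks first among its own neighbors,
    # even when the data contains exact duplicates.
    diag(D) <- -1

    # A[i, j] is 1 when j is one of the k nearest neighbors of i.
    # The graph is directed: A[i, j] = 1 does not imply A[j, i] = 1.
    A <- matrix(0, n, n)
    for (i in seq_len(n)) {
        A[i, order(D[i, ])[seq_len(k)]] <- 1
    }

    # reach[i, j] is 1 when j can be reached from i in `steps` steps.
    # Every point is its own neighbor, so this also covers shorter paths.
    reach <- A
    for (s in seq_len(steps - 1)) {
        reach <- ((reach %*% A) > 0) * 1
    }

    # Replace each point with the mean of the points it can reach.
    W <- reach / rowSums(reach)
    W %*% X
}

# Four points on a line, two neighbors each (the point plus one other).
X <- matrix(c(0, 1, 3, 6), ncol = 1)
oneStep <- knnDenoiseSketch(X, k = 2, steps = 1)
twoSteps <- knnDenoiseSketch(X, k = 2, steps = 2)
```

With steps = 2, the third point is averaged with its nearest neighbor and that neighbor's nearest neighbor, so more points contribute to each average than with steps = 1.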
Examples
library(langevitour)
library(palmerpenguins)
# Keep species and the four numeric measurements, dropping incomplete rows
completePenguins <- na.omit(penguins[,c(1,3,4,5,6)])
# Dimensions need to be on comparable scales to apply knnDenoise
scaled <- scale(completePenguins[,-1])
denoised <- knnDenoise(scaled)
langevitour(denoised, completePenguins$species, pointSize=2)