Issue
This content is from Stack Overflow. Question asked by Joseph Kim.
This is my first time performing KDE (kernel density estimation) in R for anomaly detection on data with 5 or more variables.
As far as I know, KDE can be applied to multidimensional data, but I couldn’t find any examples that use data with 5 or more dimensions.
My data has 5 variables, ‘age’, ‘trestbps’, ‘chol’, ‘thalach’, and ‘oldpeak’, as shown below.
'data.frame': 176 obs. of 5 variables:
$ age : int 30 50 50 50 50 60 50 40 50 40 ...
$ trestbps: int 130 130 130 130 130 130 130 130 130 130 ...
$ chol : int 198 245 221 288 205 309 240 243 289 250 ...
$ thalach : int 130 166 164 159 184 131 154 152 124 179 ...
$ oldpeak : num 1.6 2.4 0 0.2 0 1.8 0.6 0 1 0 ...
I ran KDE on this data with the approach below, but I’m not sure whether the approach is correct or whether the result is valid.
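library(ks)  # assumed: kde() with eval.points comes from the ks package
# Evaluation grid: every combination of five quantiles per variable (5^5 = 3125 points)
evpts <- do.call(expand.grid, lapply(df3, quantile, probs = c(0.1, 0.25, 0.5, 0.75, 0.9)))
hat2 <- kde(df3, eval.points = evpts)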
> str(hat2)
List of 9
$ x : num [1:176, 1:5] 30 50 50 50 50 60 50 40 50 40 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:5] "age" "trestbps" "chol" "thalach" ...
$ eval.points:'data.frame': 3125 obs. of 5 variables:
..$ age : Named num [1:3125] 40 40 50 60 60 40 40 50 60 60 ...
.. ..- attr(*, "names")= chr [1:3125] "10%" "25%" "50%" "75%" ...
..$ trestbps: Named num [1:3125] 108 108 108 108 108 112 112 112 112 112 ...
.. ..- attr(*, "names")= chr [1:3125] "10%" "10%" "10%" "10%" ...
..$ chol : Named num [1:3125] 194 194 194 194 194 194 194 194 194 194 ...
.. ..- attr(*, "names")= chr [1:3125] "10%" "10%" "10%" "10%" ...
..$ thalach : Named num [1:3125] 114 114 114 114 114 ...
.. ..- attr(*, "names")= chr [1:3125] "10%" "10%" "10%" "10%" ...
..$ oldpeak : Named num [1:3125] 0 0 0 0 0 0 0 0 0 0 ...
.. ..- attr(*, "names")= chr [1:3125] "10%" "10%" "10%" "10%" ...
..- attr(*, "out.attrs")=List of 2
.. ..$ dim : Named int [1:5] 5 5 5 5 5
.. .. ..- attr(*, "names")= chr [1:5] "age" "trestbps" "chol" "thalach" ...
.. ..$ dimnames:List of 5
.. .. ..$ age : chr [1:5] "age=40" "age=40" "age=50" "age=60" ...
.. .. ..$ trestbps: chr [1:5] "trestbps=108" "trestbps=112" "trestbps=120" "trestbps=128" ...
.. .. ..$ chol : chr [1:5] "chol=194.00" "chol=211.00" "chol=244.00" "chol=283.75" ...
.. .. ..$ thalach : chr [1:5] "thalach=113.50" "thalach=128.25" "thalach=150.00" "thalach=164.00" ...
.. .. ..$ oldpeak : chr [1:5] "oldpeak=0.0" "oldpeak=0.0" "oldpeak=0.8" "oldpeak=1.8" ...
$ estimate : Named num [1:3125] 5.64e-12 5.64e-12 2.85e-09 7.76e-10 7.76e-10 ...
..- attr(*, "names")= chr [1:3125] "1" "2" "3" "4" ...
$ H : num [1:5, 1:5] 6.972 0.866 5.065 -6.541 0.189 ...
$ gridded : logi FALSE
$ binned : logi FALSE
$ names : chr [1:5] "age" "trestbps" "chol" "thalach" ...
$ w : num [1:176] 1 1 1 1 1 1 1 1 1 1 ...
$ type : chr "kde"
- attr(*, "class")= chr "kde"
If this is not the right approach, could you please help me find the correct one?
Thank you for your support.
Solution
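A possible direction, offered as a sketch under stated assumptions rather than a confirmed answer: for anomaly detection, the density is typically needed at the observations themselves, not only on a grid of quantiles, so that each row gets a score and low-density rows can be flagged. The example assumes kde() comes from the ks package, uses simulated data as a stand-in for df3, and uses an arbitrary 5% cutoff; none of these come from the original post.

```r
library(ks)  # assumed source of kde()

set.seed(1)
# Hypothetical stand-in for df3: 176 rows, 5 numeric columns
X <- matrix(rnorm(176 * 5), ncol = 5,
            dimnames = list(NULL, c("age", "trestbps", "chol", "thalach", "oldpeak")))

# Evaluate the density estimate at each observation
# (the bandwidth matrix H is left to the package default selector)
fhat <- kde(x = X, eval.points = X)
dens <- as.numeric(fhat$estimate)

# Flag the rows with the lowest estimated density; the 5% cutoff is an arbitrary choice
cutoff <- quantile(dens, probs = 0.05)
anomalies <- which(dens < cutoff)
```

Here anomalies holds the indices of the rows whose estimated density falls below the cutoff. With only 176 observations in 5 dimensions the estimate can be noisy, so the cutoff is best treated as a tuning choice rather than a fixed rule.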
This question and answer were collected from Stack Overflow and tested by the JTuto community, and are licensed under the terms of CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.