Clustering genomes from an identity matrix with hclust in R


This content is from Stack Overflow. Question asked by Agata.

I have an identity matrix between genomes, with values ranging from 0.005 to 0.13. I would like to group them in reverse order: genomes with low identity should each end up in their own cluster, and genomes with the highest identity should be grouped together.

What I’ve done:

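# Euclidean distance between the rows of the identity matrix
distmat <- dist(ident_mtx2, method = "euclidean")
# Single-linkage hierarchical clustering on those distances
clust <- hclust(distmat, method = "single")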

The problem is that using Euclidean distance causes all genomes with low identity to fall into one cluster, and I would like the opposite: genomes with low identity should form separate clusters. Can this result be reversed somehow? If not, what method should I use to cluster the genomes?
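One possible direction, offered only as a sketch: dist() compares the rows of the matrix as if they were feature vectors, so genomes whose rows are uniformly low look alike and are merged. If ident_mtx2 is a square, symmetric genome-by-genome matrix (an assumption, since the question does not show it), the identities can instead be turned into dissimilarities directly with as.dist(). The toy values, the genome names g1 to g4, the average linkage and the cut height below are all made up for illustration.

```r
# Toy symmetric identity matrix (hypothetical values in the range from the question)
ident_mtx2 <- matrix(
  c(1.000, 0.130, 0.120, 0.005,
    0.130, 1.000, 0.110, 0.006,
    0.120, 0.110, 1.000, 0.007,
    0.005, 0.006, 0.007, 1.000),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("g", 1:4), paste0("g", 1:4))
)

# Convert identity to dissimilarity: high identity becomes a small distance.
# as.dist() uses the lower triangle as-is and ignores the diagonal.
distmat <- as.dist(1 - ident_mtx2)

# Hierarchical clustering on the dissimilarities (linkage choice is an assumption)
clust <- hclust(distmat, method = "average")

# Cut the tree at an arbitrary height; low-identity genomes stay on their own
groups <- cutree(clust, h = 0.9)
print(groups)
```

With this toy data, g1, g2 and g3 share a cluster while g4, which has low identity to everything, stays separate. If the identities are on a different scale, max(ident_mtx2) - ident_mtx2 could be used in place of 1 - ident_mtx2; the cut height would need to be chosen for the real data.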

Many thanks for any suggestions



This question and answer are collected from Stack Overflow and tested by the JTuto community, and are licensed under the terms of CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
