Interpreting Cluster Means/Centers for High-Dimensional Data Using K-means Clustering
I'm trying to perform k-means clustering on a data set with 8 variables, but the result is hard to visualize because of the number of dimensions.
Would the cluster means/centers returned by the kmeans() function be helpful for visualizing and interpreting the clusters?
Thank you!
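A minimal sketch of how the centers could be inspected, not a definitive recipe. The data here is simulated as a stand-in for the real 8-variable data set, and the names x, x_scaled, km and pca are made up for illustration. It prints the centers (one row per cluster, one column per variable), draws them as a profile plot, and projects the points and centers onto the first two principal components.

```r
set.seed(42)

# Hypothetical stand-in data: 150 rows, 8 numeric variables (V1..V8),
# with two groups shifted so that some structure exists.
x <- matrix(rnorm(150 * 8), ncol = 8,
            dimnames = list(NULL, paste0("V", 1:8)))
x[1:50, 1:4]   <- x[1:50, 1:4] + 3
x[51:100, 5:8] <- x[51:100, 5:8] - 3

# Standardize so that every variable contributes on the same scale.
x_scaled <- scale(x)

km <- kmeans(x_scaled, centers = 3, nstart = 25)

# One row per cluster, one column per variable: large positive or
# negative values show which variables characterize each cluster.
print(round(km$centers, 2))

# Profile plot: one line per cluster across the 8 variables.
matplot(t(km$centers), type = "b", xaxt = "n",
        xlab = "Variable", ylab = "Cluster mean (standardized)")
axis(1, at = 1:8, labels = colnames(x))

# 2-D view: project points and centers onto the first two principal components.
pca <- prcomp(x_scaled)
plot(pca$x[, 1:2], col = km$cluster, pch = 19)
points(predict(pca, km$centers)[, 1:2], pch = 8, cex = 2)
```

The centers table answers "which variables are high or low in each cluster", while the PCA plot only shows as much separation as the first two components happen to capture.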
See also questions close to this topic
-
Rscript: problems with inner_join
I am trying to run my R script from the command line, but it returns an error. There are no problems when I run the same code from R or RStudio. Previously I ran into a problem with merge (Rscript: Error in if (nx >= 2^31 || ny >= 2^31) stop("long vectors are not supported")), so I replaced it with inner_join from dplyr. Now I get:
Error in UseMethod("tbl_vars") : no applicable method for 'tbl_vars' applied to an object of class "function" Calls: inner_join ... tbl_vars -> new_sel_vars -> structure -> tbl_vars_dispatch
The problematic part of R-script:
clonotypes_tables = function(name, cell, mode){
  sub = subset(metadata, metadata$donor == as.character(name))
  sub = subset(sub, sub$cell_type == as.character(cell))
  if (nrow(sub) > 1){
    sub = sub[order(sub$time_point), ]
    if (file.exists(paste(getwd(), sub$file_name[1], sep="/")) & file.exists(paste(getwd(), sub$file_name[2], sep="/"))){
      point1 = read.table(sub$file_name[1], header = T)
      #cat("check1")
      point2 = read.table(sub$file_name[2], header = T)
      same_src(point1, point2)
      if (nrow(point1) >= 1000 & nrow(point2) >= 1000){
        if (mode == "CDR3_V"){
          common.clonotype = inner_join(point1[,c(1,2,4,5)], point2[,c(1,2,4,5)], by = c("cdr3aa", "v"), copy = T)
          common.clonotype$clon = paste(common.clonotype$cdr3aa, common.clonotype$v, sep = "~")
        } else{
          common.clonotype = inner_join(point1[,c(1,2,4)], point2[,c(1,2,4)], by = c("cdr3aa"), copy = T)
          common.clonotype$clon = common.clonotype$cdr3aa
        }
        common.clonotype = common.clonotype[,c("clon", "freq.x", "freq.y")]
        colnames(common.clonotype) = c("Clonotypes", "0.5", "1")
        dim(common.clonotype)
        common.clonotype = common.clonotype[order(common.clonotype[2], decreasing = T), ]
        common.clonotype
      }
      #return(common.clonotype)
    } else{
      print(paste(name, cell, "hasn't two time points", sep = " "))
    }
  }
}
I have also tried
inner_join(point1[,c(1,2,4)], point2[,c(1,2,4)], by = c("cdr3aa"))
but then there was a problem with src when copy = T was omitted.
-
Mclust (GMM) clustering in R: how can I add a label to each point in the clusters?
I would like to perform model-based clustering based on parameterized finite Gaussian mixture models. I used the Mclust function from the mclust package in R. I have a data frame like this (only the first 10 rows are reported here):
df = data.frame(
  X1 = c(-0.9749422, -0.4062771, 1.0974262, -2.297264, -1.6022011, 2.144254, 2.2012879, 2.810878, 0.6728063, -0.7042836),
  X2 = c(0.1740782, -1.4475989, 0.8575626, 1.8466605, -2.0279622, 2.9748541, 0.8820755, 3.0898032, 1.3757168, -1.9729475),
  label = as.character(c("abundant", "accelerating", "anyone", "approach", "association", "ban", "blog", "commission", "complete", "congratulations")),
  stringsAsFactors = FALSE)
i.e.
> df
           X1         X2           label
1  -0.9749422  0.1740782        abundant
2  -0.4062771 -1.4475989    accelerating
3   1.0974262  0.8575626          anyone
4  -2.297264   1.8466605        approach
5  -1.6022011 -2.0279622     association
6   2.144254   2.9748541             ban
7   2.2012879  0.8820755            blog
8   2.810878   3.0898032      commission
9   0.6728063  1.3757168        complete
10 -0.7042836 -1.9729475 congratulations
I would like to plot the clusters obtained by the Mclust function in the X1-X2 plane, showing the corresponding label next to each point. I started from this (with 3 clusters, for example):
mod2 <- Mclust(plot_df[,1:2], G = 3)
plot(mod2,"classification")
but, as expected, the plot has no labels. How can I add a label to each point in the clusters?
Thank you!
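One possible approach, sketched under assumptions: it uses the 10-row df from the question instead of the undefined plot_df, draws the points with base graphics colored by the fitted classification, and overlays the labels with text(). The object name mod is made up here, and whether G = 3 gives a sensible fit on only 10 points is not guaranteed.

```r
library(mclust)

# The 10 example rows from the question.
df <- data.frame(
  X1 = c(-0.9749422, -0.4062771, 1.0974262, -2.297264, -1.6022011,
         2.144254, 2.2012879, 2.810878, 0.6728063, -0.7042836),
  X2 = c(0.1740782, -1.4475989, 0.8575626, 1.8466605, -2.0279622,
         2.9748541, 0.8820755, 3.0898032, 1.3757168, -1.9729475),
  label = c("abundant", "accelerating", "anyone", "approach", "association",
            "ban", "blog", "commission", "complete", "congratulations"),
  stringsAsFactors = FALSE)

# Fit a 3-component Gaussian mixture on the two coordinates.
mod <- Mclust(df[, c("X1", "X2")], G = 3)

# Scatter plot colored by the assigned cluster.
plot(df$X1, df$X2, col = mod$classification, pch = 19,
     xlab = "X1", ylab = "X2")

# Write each point's label just above it (pos = 3), in the cluster color.
text(df$X1, df$X2, labels = df$label, pos = 3, cex = 0.8,
     col = mod$classification)
```

With many points the labels will overlap; reducing cex or labeling only a subset are simple ways to keep the plot readable.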
-
Make columns on the basis of one column
I have a data set in R as follows:
a <- data.frame(
  name = c("AFG", "AFG", "AFG", "AFG", "GER", "GER", "GER", "GER", "GFR", "GFR", "GFR", "GFR"),
  Typ = c("One", "Two", "Three", "Four", "One", "Two", "Three", "Four", "One", "Two", "Three", "Four"),
  Yr1 = c(10, 11, 12, 14, 15, 17, 18, 19, 88, 1, 39, 1),
  Yr2 = c(1:12),
  Yr3 = c(8:19))
I want to reshape this data so that the columns are based on the values in the Typ column. That is, I want to get the following data.frame:
b <- data.frame(
  name = c("AFG", "AFG", "AFG", "GER", "GER", "GER", "GFR", "GFR", "GFR"),
  Yr = c("Yr1", "Yr2", "Yr3", "Yr1", "Yr2", "Yr3", "Yr1", "Yr2", "Yr3"),
  One = c(10, 1, 8, 15, 5, 12, 88, 9, 16),
  Two = c(11, 2, 9, 17, 6, 13, 1, 10, 17),
  Three = c(12, 3, 10, 18, 7, 14, 39, 11, 18),
  Four = c(14, 4, 11, 19, 8, 15, 1, 12, 19))
Thanks in advance
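A sketch of one way to get from a to b using only base R. The intermediate names yrs, long and wide are made up for illustration. The idea is to stack the three year columns into a long format first, then spread the Typ values into columns.

```r
a <- data.frame(
  name = c("AFG", "AFG", "AFG", "AFG", "GER", "GER", "GER", "GER",
           "GFR", "GFR", "GFR", "GFR"),
  Typ = c("One", "Two", "Three", "Four", "One", "Two", "Three", "Four",
          "One", "Two", "Three", "Four"),
  Yr1 = c(10, 11, 12, 14, 15, 17, 18, 19, 88, 1, 39, 1),
  Yr2 = 1:12,
  Yr3 = 8:19)

yrs <- c("Yr1", "Yr2", "Yr3")

# Step 1: long format, one row per (name, Typ, Yr) combination.
long <- data.frame(
  name  = rep(a$name, times = length(yrs)),
  Typ   = rep(a$Typ, times = length(yrs)),
  Yr    = rep(yrs, each = nrow(a)),
  value = unlist(a[yrs], use.names = FALSE))

# Step 2: wide format, one column per Typ value.
wide <- reshape(long, idvar = c("name", "Yr"), timevar = "Typ",
                direction = "wide")

# reshape() names the new columns value.One, value.Two, ...; drop the prefix.
names(wide) <- sub("^value\\.", "", names(wide))

# Sort rows by name, then year, and reset the row names.
wide <- wide[order(wide$name, wide$Yr), ]
rownames(wide) <- NULL

print(wide)
```

If adding a dependency is acceptable, tidyr's pivot_longer() followed by pivot_wider() expresses the same two steps more compactly.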