cor function with NA values due to 0 variance

Beginner R user here. I am using the cor function to get Kendall's tau-b rank correlation coefficient between two columns of a data frame. An example of such columns is as follows:

A    B
1    1
1    2
1    3

When I use cor(d, method="kendall"), the result is NA for the correlation between A and B. Shouldn't it be 0? And if not, is there a way to replace this NA result with 0 using a parameter in the cor function?

1 answer

  • answered 2019-09-22 19:04 G. Grothendieck

    Consider what would happen if we slightly perturb the constant column. We get vastly different correlations depending on the particular perturbation used: a tiny increasing perturbation gives 1 and a tiny decreasing one gives -1. As a result, no single value for the correlation is justified, and it is best left as NA. (The example uses Spearman; Kendall's tau behaves the same way here.)

    x <- c(1, 1, 1)
    y <- 1:3
    
    cor(x + (1:3) * 1e-10, y, method = "spearman")
    ## [1] 1
    
    cor(x - (1:3) * 1e-10, y, method = "spearman")
    ## [1] -1
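
    The same check can be repeated with method = "kendall", which is what the question uses. The sketch below also shows one way to substitute 0 after the fact if that is still what you need. As far as I know, cor has no argument that does this substitution for you (its use argument concerns missing values in the input, not constant columns), so the replacement is done by hand. The names d, tau_up, tau_down and tau are made up for illustration.

```r
# Data from the question: A is constant, B varies.
d <- data.frame(A = c(1, 1, 1), B = 1:3)

# Tiny perturbations of the constant column flip Kendall's tau
# between its two extremes.
tau_up   <- cor(d$A + (1:3) * 1e-10, d$B, method = "kendall")
tau_down <- cor(d$A - (1:3) * 1e-10, d$B, method = "kendall")

# Unperturbed: A has zero variance, so the result is NA.
# suppressWarnings() hides any warning cor may raise for this case.
tau <- suppressWarnings(cor(d$A, d$B, method = "kendall"))

# Manual substitution (an assumption about what you want, not a cor feature).
if (is.na(tau)) tau <- 0
```

    Keep in mind that the 0 here is a convention you are choosing, not something the data support, for the reason given above.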