Specifying a numeric list within a function

I am writing the following function that creates a correlation matrix and finds highly correlated (>0.75) values:

var.cor <- function(data, cols){
  cor.mat <- cor(data [, cols])
  cor.mat <- round(cor.mat, 2)
  high.corr <- findCorrelation(cor.mat, cutoff = 0.75)
  print(cor.mat)
  print(high.corr)
} 

I want to give the function a range of column numbers, e.g., var.cor(data = dat, 10:20) should run the function on columns 10:20. What is the correct way to specify cols in the second line of the function? When I run var.cor("dat1", 10:20) I get an error message: Error in data[, cols] : incorrect number of dimensions

2 answers

  • answered 2020-05-22 12:38 jay.sf

    Your data argument is just a string holding the object's name, like "mtcars" in the example below, so data[, cols] tries to index a character vector, which has no dimensions. Use get() to retrieve the object with that name from .GlobalEnv. Example:

    var.cor <- function(data, cols){
      cor.mat <- cor(get(data, envir=.GlobalEnv)[, cols])
      cor.mat <- round(cor.mat, 2)
      high.corr <- caret::findCorrelation(cor.mat, cutoff = 0.75)
      print(cor.mat)
      print(high.corr)
    } 
    var.cor("mtcars", cols=1:2)
    #       mpg   cyl
    # mpg  1.00 -0.85
    # cyl -0.85  1.00
    # [1] 2
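
    To make the cause of the error concrete, here is a minimal sketch using only base R and the built-in mtcars dataset. The variable names (name, msg, by_object, by_name) are illustrative and not part of the question. It compares passing the object itself with looking it up by name:

    ```r
    # A character string has no dimensions, so two-index subsetting fails.
    name <- "mtcars"
    is.null(dim(name))  # TRUE: there is no row/column structure to index

    # Capture the error message instead of stopping the script.
    msg <- tryCatch(name[, 1:2], error = function(e) conditionMessage(e))

    # Passing the data frame itself (no quotes) works with the original indexing.
    by_object <- round(cor(mtcars[, 1:2]), 2)

    # Looking the object up by name, as in the answer above, gives the same matrix.
    by_name <- round(cor(get(name, envir = .GlobalEnv)[, 1:2]), 2)

    identical(by_object, by_name)
    ```

    If you control the call site, passing the unquoted object, as in var.cor(mtcars, 1:2), is probably the simpler choice, since the original function body then needs no changes.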
    

  • answered 2020-05-22 12:41 randr

    You should give us some information on your variable 'data' (it seems not to have 10 columns). With one alteration (see below) your code works fine for me using the built-in iris dataset:

    var.cor <- function(data, cols){
      cor.mat <- cor(data [, cols])
      cor.mat <- round(cor.mat, 2)
      high.corr <- cor.mat > 0.75
      print(cor.mat)
      print(high.corr)
    } 
    
    > var.cor(iris, 1:3)
                 Sepal.Length Sepal.Width Petal.Length
    Sepal.Length         1.00       -0.12         0.87
    Sepal.Width         -0.12        1.00        -0.43
    Petal.Length         0.87       -0.43         1.00
                 Sepal.Length Sepal.Width Petal.Length
    Sepal.Length         TRUE       FALSE         TRUE
    Sepal.Width         FALSE        TRUE        FALSE
    Petal.Length         TRUE       FALSE         TRUE
    
    

    You don't share your findCorrelation function, but it seems to just be a filter (which I have added as high.corr <- cor.mat > 0.75 above).

    In short, the actual answer to your question seems to be that your 'data' variable is not the shape you think it is.
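
    One way to catch a wrongly shaped 'data' early is to check it before subsetting. The following is only a sketch: var.cor.checked is a hypothetical name, it assumes cols holds positive column numbers, and it uses the simple cor.mat > 0.75 filter from above rather than findCorrelation:

    ```r
    # Hypothetical variant of var.cor that validates its inputs first.
    var.cor.checked <- function(data, cols) {
      # A string such as "dat1" has no dim attribute, unlike a data frame or matrix.
      if (is.null(dim(data))) {
        stop("'data' must be a data frame or matrix, not a ", class(data)[1])
      }
      # Assumes cols are positive column numbers.
      if (max(cols) > ncol(data)) {
        stop("'data' has only ", ncol(data), " columns")
      }
      cor.mat <- round(cor(data[, cols]), 2)
      high.corr <- cor.mat > 0.75  # simple filter, as in the code above
      list(cor.mat = cor.mat, high.corr = high.corr)
    }

    res <- var.cor.checked(iris, 1:3)
    print(res$cor.mat)
    print(res$high.corr)

    # A string argument now fails with a descriptive message.
    bad <- tryCatch(var.cor.checked("dat1", 10:20),
                    error = function(e) conditionMessage(e))
    ```

    Returning the two matrices in a list, instead of only printing them, also lets the caller reuse the results.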