Values are changing while matching levels in R.

I have two variables, x and y. x holds a single value with one level. I want x to have the same levels as y. After I assign the levels, they match, but the value of x changes. Why is this so?

x = as.factor(c(3))
> x
[1] 3
Levels: 3

y = as.factor(c(2,3,4))
> y
[1] 2 3 4
Levels: 2 3 4

Output -

levels(x) = levels(y)
> x
[1] 2
Levels: 2 3 4

The initial value of x was 3; now it's 2.

2 answers

  • answered 2018-04-17 06:05 DJV

    I think this occurs because R displays the level's label and not the stored value. For example, as.numeric(x) returns 1 and not 3.

    x <- as.factor(c(3))
    as.numeric(x)
    [1] 1

    However, if you unfactor the variable using varhandle::unfactor(), it will present the "real" value.

    varhandle::unfactor(x)
    [1] 3
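
    If varhandle is not installed, the same conversion can be sketched in base R. This is only an illustration of the codes-versus-labels distinction described above; the variable names codes, values and values_alt are made up for the example.

    ```r
    x <- as.factor(c(3))

    # as.numeric() on a factor returns the internal integer code, not the label.
    codes <- as.numeric(x)                   # 1

    # Going through the character label recovers the original number.
    values <- as.numeric(as.character(x))    # 3

    # Equivalent lookup: convert the levels once, then index by the codes.
    values_alt <- as.numeric(levels(x))[x]   # 3
    ```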

    Thus, when you do levels(x) <- levels(y), you don't relevel/refactor x to be like y; you overwrite the level labels, so the displayed value changes as well.

    x <- as.factor(c(3))  
    y <- as.factor(c(2,3,4))
    levels(x) <- levels(y)

    [1] 2

    Doing this instead will solve your problem: x <- factor(x, levels = union(levels(x), levels(y)))

    x <- as.factor(c(3))
    y <- as.factor(c(2,3,4))
    x
    [1] 3
    Levels: 3

    x <- factor(x, levels = union(levels(x), levels(y)))
    x
    [1] 3
    Levels: 3 2 4


    [1] 3

    Thank you @pieca for the comment.

  • answered 2018-04-17 06:41 42-

    R factors are really positive integer vectors with a levels attribute that is used as a lookup "table". What happened in your example is that the stored value of x was 1 (since there was only one item in levels(x), which happened to be the character "3"). When you replaced the levels attribute with the character vector c('2', '3', '4'), that 1 was used as an index into the new vector and returned its first item, which was now the character "2".
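
    The lookup behaviour described above can be checked directly. A minimal sketch, using the x and y from the question; the names code_before, label_before, code_after and label_after are only for illustration:

    ```r
    x <- as.factor(c(3))
    y <- as.factor(c(2, 3, 4))

    code_before  <- as.integer(x)     # 1: position of "3" in levels(x)
    label_before <- as.character(x)   # "3"

    # Overwriting the levels attribute keeps the code (1) but swaps the lookup table.
    levels(x) <- levels(y)

    code_after  <- as.integer(x)      # still 1
    label_after <- as.character(x)    # "2": first entry of c("2", "3", "4")
    ```

    The integer code never changes; only the table it points into does.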

    It's really fairly dangerous to go around changing levels of factors. If you wanted to expand the levels, the safe way to do it would be something along these lines:

    x <- factor( as.character(x), levels = union(levels(x), levels(y) ) )
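
    A quick check of that approach, as a sketch under the same x and y as in the question. Note that union() puts the levels of x first, so the order comes out as 3, 2, 4; if the levels should be in exactly y's order, passing levels(y) directly is one option (x3 below is a hypothetical variation, not part of the line above).

    ```r
    x <- as.factor(c(3))
    y <- as.factor(c(2, 3, 4))

    # Rebuild from the labels so the integer codes are recomputed against the new levels.
    x2 <- factor(as.character(x), levels = union(levels(x), levels(y)))
    as.character(x2)   # "3": the label is preserved
    levels(x2)         # "3" "2" "4"

    # Variation: take y's levels as-is to get y's ordering.
    x3 <- factor(as.character(x), levels = levels(y))
    as.character(x3)   # "3"
    as.integer(x3)     # 2: "3" is now the second level
    ```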