R: make subfactors based on consecutively occurring values
Hi, does anyone know how to make a subfactor or unique marker for groups of data that share the same value or factor consecutively?
My data looks like this:
value group subgrouping
1 a a.1
5 a a.1
2 a a.1
3 b b.1
2 b b.1
5 b b.1
2 b b.1
1 b b.1
3 b b.1
2 a a.2
5 a a.2
5 a a.2
6 a a.2
6 a a.2
2 a a.2
1 a a.2
0 c c.1
3 c c.1
3 c c.1
2 b b.2
1 b b.2
3 a a.3
2 b b.3
3 b b.3
This way I can find, say, the average for a.2 and not for all of a.
3 answers

I've found this trick to work well in this situation. As written, it numbers the runs with a single counter (0, 1, 2, ...) rather than restarting the count for each group, but it might be sufficient:
library(dplyr)
df %>% mutate(subgroup_id = cumsum(lag(group, default = group[1]) != group))
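
Building on the run counter above, one way to get per-group labels such as a.1 and a.2 is to renumber the runs inside each group. This is only a minimal sketch, assuming `group` is a character column; the names `run_id`, `subgrouping2` and `out` are made up for illustration, and the toy data just mirrors the `group` column from the question.

```r
library(dplyr)

# Toy data: same run structure as the 'group' column in the question
df <- data.frame(
  group = rep(c("a", "b", "a", "c", "b", "a", "b"),
              times = c(3, 6, 7, 3, 2, 1, 2)),
  stringsAsFactors = FALSE
)

out <- df %>%
  # global run counter: goes up by one every time 'group' changes
  mutate(run_id = cumsum(lag(group, default = group[1]) != group)) %>%
  group_by(group) %>%
  # renumber the runs 1, 2, 3, ... in order of appearance within each group
  mutate(subgrouping2 = paste0(group, ".", match(run_id, unique(run_id)))) %>%
  ungroup()

unique(out$subgrouping2)
# "a.1" "b.1" "a.2" "c.1" "b.2" "a.3" "b.3"
```

The `match(run_id, unique(run_id))` step is what restarts the numbering per group; without the `group_by()` it would simply reproduce the global counter.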

Try rle:

x <- rle(df$group)  # if 'group' is a factor, use rle(as.character(df$group))
# number the runs within each distinct value: a.1, a.2, ...
x$values <- with(x, ave(values, values, FUN = function(x) paste0(x, '.', seq_along(x))))
df$subgrouping2 <- inverse.rle(x)
df
#     value group subgrouping subgrouping2
#  1:     1     a         a.1          a.1
#  2:     5     a         a.1          a.1
#  3:     2     a         a.1          a.1
#  4:     3     b         b.1          b.1
#  5:     2     b         b.1          b.1
#  6:     5     b         b.1          b.1
#  7:     2     b         b.1          b.1
#  8:     1     b         b.1          b.1
#  9:     3     b         b.1          b.1
# 10:     2     a         a.2          a.2
# 11:     5     a         a.2          a.2
# 12:     5     a         a.2          a.2
# 13:     6     a         a.2          a.2
# 14:     6     a         a.2          a.2
# 15:     2     a         a.2          a.2
# 16:     1     a         a.2          a.2
# 17:     0     c         c.1          c.1
# 18:     3     c         c.1          c.1
# 19:     3     c         c.1          c.1
# 20:     2     b         b.2          b.2
# 21:     1     b         b.2          b.2
# 22:     3     a         a.3          a.3
# 23:     2     b         b.3          b.3
# 24:     3     b         b.3          b.3
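
Once the labels exist, the per-subgroup average the question asks about is a short step further. Here is a small base R sketch that rebuilds the question's data so it runs on its own; `r` and `means` are hypothetical names, and `group` is assumed to be a character vector.

```r
# Data from the question
df <- data.frame(
  value = c(1, 5, 2, 3, 2, 5, 2, 1, 3, 2, 5, 5,
            6, 6, 2, 1, 0, 3, 3, 2, 1, 3, 2, 3),
  group = rep(c("a", "b", "a", "c", "b", "a", "b"),
              times = c(3, 6, 7, 3, 2, 1, 2)),
  stringsAsFactors = FALSE
)

# Label each run as <group>.<k>, where k counts that group's runs so far
r <- rle(df$group)
r$values <- ave(r$values, r$values,
                FUN = function(v) paste0(v, ".", seq_along(v)))
df$subgrouping2 <- inverse.rle(r)

# Mean of 'value' within each subgroup (a named vector)
means <- tapply(df$value, df$subgrouping2, mean)
means[["a.2"]]  # mean of the second run of "a" only, i.e. 27/7
```

`aggregate(value ~ subgrouping2, data = df, FUN = mean)` would give the same numbers as a data frame instead of a named vector.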

With data.table: grouped by the run-length id of 'group' (rleid(group)), get the first 'group' value and the number of observations (.N). Then, grouped by 'group', paste 'group' with the sequence of its runs (seq_len(.N)), replicate each label by the number of observations after ordering by 'ind', and assign the result to create 'subgroup2'.

library(data.table)
sgrp <- setDT(df1)[, .(group = first(group), n = .N), .(ind = rleid(group))][
  , .(paste(group, seq_len(.N), sep = "."), n, ind), group][
  order(ind), rep(V1, n)]
df1[, subgroup2 := sgrp]
df1
#     value group subgrouping subgroup2
#  1:     1     a         a.1       a.1
#  2:     5     a         a.1       a.1
#  3:     2     a         a.1       a.1
#  4:     3     b         b.1       b.1
#  5:     2     b         b.1       b.1
#  6:     5     b         b.1       b.1
#  7:     2     b         b.1       b.1
#  8:     1     b         b.1       b.1
#  9:     3     b         b.1       b.1
# 10:     2     a         a.2       a.2
# 11:     5     a         a.2       a.2
# 12:     5     a         a.2       a.2
# 13:     6     a         a.2       a.2
# 14:     6     a         a.2       a.2
# 15:     2     a         a.2       a.2
# 16:     1     a         a.2       a.2
# 17:     0     c         c.1       c.1
# 18:     3     c         c.1       c.1
# 19:     3     c         c.1       c.1
# 20:     2     b         b.2       b.2
# 21:     1     b         b.2       b.2
# 22:     3     a         a.3       a.3
# 23:     2     b         b.3       b.3
# 24:     3     b         b.3       b.3
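
The same idea can probably be written more compactly by calling rleid twice: once over the whole column, and once within each group. This is a sketch rather than the answer's own method; the temporary column `run` is a made-up name, and the toy data only mirrors the `group` column from the question.

```r
library(data.table)

# Toy data: same run structure as the 'group' column in the question
df1 <- data.table(
  group = rep(c("a", "b", "a", "c", "b", "a", "b"),
              times = c(3, 6, 7, 3, 2, 1, 2))
)

# Run id over the whole column: 1, 1, 1, 2, 2, ...
df1[, run := rleid(group)]

# Within one group the run ids only ever increase, so rleid(run)
# renumbers that group's runs as 1, 2, 3, ...
df1[, subgroup2 := paste(group, rleid(run), sep = "."), by = group]

# Drop the helper column
df1[, run := NULL]

unique(df1$subgroup2)
# "a.1" "b.1" "a.2" "c.1" "b.2" "a.3" "b.3"
```

Because both steps assign by reference with `:=`, the row order of `df1` is left untouched and no separate reordering step is needed.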