I have a data frame that contains list-columns.
df <- data.frame(
id=c(1:4),
a=I(list(c(1,"a1"),2,c("a31","a32","a33"),"a4")),
b=I(list(2,c("b1","b2",3),c("b3","b4"),4))
); print(df)
  id            a         b
1  1        1, a1         2
2  2            2 b1, b2, 3
3  3 a31, a32....    b3, b4
4  4           a4         4
Now I need to unnest the lists to obtain a data frame like this:
df2 <- data.frame(
id=c(1,1,2,2,2,3,3,3,3,3,3,4),
a=c(1,"a1",2,2,2,"a31","a31","a32","a32","a33","a33","a4"),
b=c(2,2,"b1","b2",3,"b3","b3","b3","b4","b4","b4",4)
) ; print(df2)
   id   a  b
1   1   1  2
2   1  a1  2
3   2   2 b1
4   2   2 b2
5   2   2  3
6   3 a31 b3
7   3 a31 b3
8   3 a32 b3
9   3 a32 b4
10  3 a33 b4
11  3 a33 b4
12  4  a4  4
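In the past I used unnest() on data where the list-columns held the same number of elements within each row, but the current data frame has different numbers of elements in some rows/columns. At the moment I am facing the following error.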
> target <- c("id","a","b")
> df %>% unnest(cols=target)
Error in `unnest()`:
! In row 3, can't recycle input of size 3 to size 2.
Run `rlang::last_trace()` to see where the error occurred.
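To show where the size mismatch comes from, here is a minimal base R sketch that counts the elements in each cell of the two list-columns. The names `len_a`, `len_b`, and `conflict` are only illustrative. Size-1 cells can be recycled, so only rows where both cells hold more than one element and the sizes differ should be a problem.

```r
# Rebuild the example data frame from the question
df <- data.frame(
  id = 1:4,
  a = I(list(c(1, "a1"), 2, c("a31", "a32", "a33"), "a4")),
  b = I(list(2, c("b1", "b2", 3), c("b3", "b4"), 4))
)

# Number of elements held in each cell of the list-columns
len_a <- lengths(df$a)
len_b <- lengths(df$b)
print(data.frame(id = df$id, len_a, len_b))

# Rows where both cells have more than one element and the sizes differ
conflict <- which(len_a != len_b & len_a > 1 & len_b > 1)
print(conflict)
```

Row 3 is the only row flagged, which matches the row named in the error message (size 3 in `a` against size 2 in `b`).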
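Because it is unpredictable where this happens (which row/column) and how many elements a cell will contain, I cannot find a proper way to handle it. The number of columns and their names cannot be determined in advance.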
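I would appreciate your suggestions, especially simple ones that can be integrated into my current dplyr pipeline. Base R and other approaches are also welcome.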
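To make the target concrete, one possible reading of the expected output `df2` shown earlier is that, within each row, every list is stretched with `rep(each = )` to a common length, taken here as the least common multiple of the list lengths in that row. The following is only a base R sketch under that assumption, not a confirmed solution. The helpers `gcd`, `lcm`, and `unnest_lcm` are hypothetical names. Values are coerced to character, and turning empty (NULL) cells into NA is a guess, since the question does not show the desired result for them.

```r
# Hypothetical helpers: greatest common divisor and least common multiple
gcd <- function(x, y) if (y == 0) x else gcd(y, x %% y)
lcm <- function(x, y) x * y / gcd(x, y)

# Sketch: stretch every list cell in a row to the row's common length
unnest_lcm <- function(data) {
  # Detect list-columns by type, so their names need not be known up front
  is_list <- vapply(data, is.list, logical(1))
  pieces <- lapply(seq_len(nrow(data)), function(i) {
    # Cells of row i as character vectors; empty cells become NA (assumption)
    cells <- lapply(data[is_list], function(col) {
      v <- col[[i]]
      if (length(v) == 0) NA_character_ else as.character(v)
    })
    # Common length for this row
    n <- Reduce(lcm, lengths(cells))
    # Repeat each element so that every cell reaches length n
    stretched <- lapply(cells, function(v) rep(v, each = n / length(v)))
    # Repeat the ordinary (non-list) columns to the same length
    keys <- lapply(data[!is_list], function(col) rep(col[i], n))
    data.frame(c(keys, stretched), stringsAsFactors = FALSE)
  })
  do.call(rbind, pieces)
}

# Example data from the question
df <- data.frame(
  id = 1:4,
  a = I(list(c(1, "a1"), 2, c("a31", "a32", "a33"), "a4")),
  b = I(list(2, c("b1", "b2", 3), c("b3", "b4"), 4))
)

df2_sketch <- unnest_lcm(df)
print(df2_sketch)
```

Because it takes a data frame and returns a data frame, the function can be called at the end of a pipe, for example `df %>% unnest_lcm()`. In the result, the non-list columns are placed before the list-columns.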
====
Reproducible example data
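Let me share a reproducible example of the actual data frame I am working with. It does not work with @JumasIsCoding's solution. I suspected that this was caused by the NULL values, but that is not the cause. Blank/empty cells do not necessarily have to be repeated.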
structure(list(cluster = c("1", "2", "3", "4", "5", "6"), st_sub_main_th = list(
"hira", NULL, "tsuma", "tsuma", NULL, c("other", "hira")),
roo_main = list("2", "4", "3", "2", c("1", "3"), c("6", "7",
"2", "1")), st_con_rt = list("sub-room", "main-room", "sub-room",
"sub-room", "main-room", "sub-room"), st_con_tr = list(
"terrace", c("terrace", "direct"), "terrace", "terrace",
"terrace", "direct"), st_adsb = list("add", "add", "add",
"sub", "add", "sub"), st_th = list(NULL, "tsuma", NULL,
NULL, "hira", NULL), st_sub2_main_th = list(NULL, NULL,
NULL, "hira", "hira", "tsuma"), isstilt = list(NULL,
NULL, NULL, NULL, NULL, "0")), class = "data.frame", row.names = c(NA,
-6L))