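I have a data frame, lexicon, containing 650 words, and I want to create a series of random word lists for 5 speakers by randomly selecting words from lexicon. I want to do this across a 24-month data collection period, drawing a vocabulary sample of a different size each month. The base data frame that specifies the months and vocabulary sizes is df1: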
df1 <- data.frame(months=rep(1:24, times=5, each=1),
vocab_size=(sample(c(0:25), 120, replace=TRUE)),
Speaker=rep(c("A", "B", "C", "D", "E"), times=1, each=24))
list1 <- split(df1, f=df1$Speaker)
lexicon looks something like this:
lexicon <- data.frame(c("a", "about", "above", "ain't", "all", "am", "an", "and",
"animal", "ankle", "ant" ,"any", "apple","applesauce",
"asleep", "at", "ate", "aunt", "auntie", "aunty's",
"awake", "away", "baa", "baby" , "baby+doll", "bad" ,
"ball", "balloon", "banana", "basket", "bat", "bath",
"bathing", "bathtub", "be", "beach", "bead", "bean",
"because", "bed", "beddy", "bee", "been", "behind",
"being", "belt", "bench", "bib", "bicycle", "big"))
I have been trying to generate the output I want with the following code:
vocab_data <- lapply(list1, FUN=function(element) {
all_vocab <- slice_sample(lexicon, n=element$vocab_size, replace=TRUE)
})
But I get the following error message:
Error in `slice_sample()`:
! `n` must be a constant.
Caused by error in `element$vocab_size`:
! $ operator is invalid for atomic vectors
Is there a way to draw samples of different sizes from the data frame like this, so that each speaker gets a vocabulary list for every month?
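For reference, the error seems to arise because n has to be a single number, while element$vocab_size is a vector of 24 values (one per month). A minimal sketch of one possible approach is to loop over the months within each speaker and draw one sample per month. It uses base sample() rather than slice_sample(), and assumes a shortened stand-in lexicon with a hypothetical column name word and a fixed seed for reproducibility, none of which are in the original code:

```r
set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible

# Hypothetical stand-in for the 650-word lexicon; the column name `word`
# is an assumption (the original data frame has no explicit column name).
lexicon <- data.frame(word = c("a", "about", "above", "ain't", "all",
                               "am", "an", "and", "animal", "ankle"))

# Same structure as df1 in the question: 24 months for each of 5 speakers.
df1 <- data.frame(months = rep(1:24, times = 5),
                  vocab_size = sample(0:25, 120, replace = TRUE),
                  Speaker = rep(c("A", "B", "C", "D", "E"), each = 24))
list1 <- split(df1, f = df1$Speaker)

vocab_data <- lapply(list1, function(element) {
  # Inner loop: each month's vocab_size is passed as a single number,
  # so every draw gets a constant sample size.
  words <- lapply(element$vocab_size, function(k) {
    sample(lexicon$word, size = k, replace = TRUE)
  })
  # Label each word list with its month.
  names(words) <- paste0("month_", element$months)
  words
})
```

With this layout, vocab_data$A$month_3 would hold speaker A's word list for month 3, and a month with a vocab_size of 0 yields an empty character vector. The same idea should carry over to slice_sample() by passing n = k inside the inner loop.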