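As my question title says, I want to convert a vector of strings into a new vector that holds whichever of two values appears in each string. Here is a very simple example data frame: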
library(dplyr)  # provides %>% and mutate(), both used below

data <- tibble::tibble(
  w = c("Strongly disagree", "Somewhat disagree", "Disagree", "Somewhat agree", "Strongly agree", "Agree"),
  x = c("Definitely true", "Probably true", "Somewhat false", "Definitely false", "Definitely true", "Definitely false"),
  y = c("Definitely not doing enough", "Definitely doing enough", "Possibly not doing enough", "Possibly doing enough", "Definitely not doing enough", "Somewhat doing enough"),
  z = c("Very comfortable", "Comfortable", "Somewhat comfortable", "Very uncomfortable", "Somewhat uncomfortable", "Comfortable")
)
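As we can see, every string in `w` is either agree or disagree; every string in `x` is either true or false; every string in `y` is either doing enough or not doing enough; and every string in `z` is either comfortable or uncomfortable. Is there a function that would let me create a new vector based on which of the two values appears in each element of a column? Let me illustrate what I mean.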
# write up a function
some_function <- function(arguments) {
  "function text goes here"
}
# use new function to create a vector based on `w` from `data`
data %>% some_function(w)
# resulting vector would be:
[1] "Disagree" "Disagree" "Disagree" "Agree" "Agree" "Agree"
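The closest I have gotten is the function below. However, it works by removing the first word of each string. That would be fine if the first word of every string were an adjective modifying the rest of the string, but when the string is only a single word, it gives me NA.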
# write function
make_dicho <- function(df = data, var) {
  df %>%
    # pick out the column (equivalent to df[[var]])
    dplyr::pull({{ var }}) %>%
    # convert to a factor
    haven::as_factor() %>%
    # remove the first part of the factor
    stringr::str_extract("(?<=\\s).+") %>%
    # make the first letter uppercase
    stringr::str_to_sentence()
}
# test this on the fake data
data %>% make_dicho(., w)
[1] "Disagree" "Disagree" NA "Agree" "Agree" NA
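A minimal sketch of why the single-word strings turn into NA, assuming the cause is the lookbehind in the pattern: `(?<=\\s).+` can only match text that is preceded by a whitespace character, so a one-word string has nothing to match. The vector `responses` is a hypothetical two-element example, and base R's `regexpr()` is used here instead of `stringr::str_extract()` only so the snippet runs without extra packages.

```r
# hypothetical example: one multi-word string and one single-word string
responses <- c("Strongly disagree", "Disagree")

# same pattern as in make_dicho(): everything that follows a whitespace character
match_start <- regexpr("(?<=\\s).+", responses, perl = TRUE)

# regexpr() returns -1 for elements where the pattern does not match at all
no_match <- as.integer(match_start) == -1L

# which strings have no whitespace-preceded part to extract
responses[no_match]
```

Only the single-word element is flagged, which lines up with the NA values in positions 3 and 6 of the output above.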
The reason I have the `df` argument in there is that I want to use this function inside `dplyr::mutate()`, as in `data %>% mutate(new_a = make_dicho(., w))`.