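I have a data frame df containing foods and their corresponding ingredients (df is pasted at the end).

I am interested in which foods contain "Flour", "Water", or "Salt".

Using str_detect, you can determine whether a food contains one or more of these ingredients: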

library(tidyverse)

strings_to_check <- c("Water", "Salt", "Flour")

df2 <- df %>%
  mutate(Key_Ingredient = str_detect(Ingredients, paste(strings_to_check, collapse = "|")))

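How can I go one step further and obtain the count of key ingredients used, and return which of the key ingredients were used? In other words, how do I count how many of the strings in the list were detected, and return the detected strings in a separate column?

The expected output is: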

Food Ingredients Key_Ingredient Key_Count Key_Used
Appleberry Muffins Flour, Vanilla Extract, Olive Oil, Milk, Garlic, Carrots, Chicken TRUE 2 Flour, Salt
Blue Moon Pancakes Baking Powder, Garlic, Eggs, Ice, Sugar, Tofu, Rice FALSE 0 NA
Crystalized Starfruit Milk, Beef, Tofu, Rice, Salt, Garlic, Mushrooms TRUE 1 Salt
Dragonfruit Delight Rice, Milk, Pork, Yeast, Carrots, Tofu, Mushrooms FALSE 0 NA
Ethereal Eclairs Pasta, Flour, Water, Mushrooms, Chicken, Vanilla Extract, Yeast TRUE 2 Flour, Water
Flaming Firefruit Pepper, Yeast, Vanilla Extract, Sugar, Wheat, Olive Oil, Pork FALSE 0 NA
Glowing Grapes Garlic, Nutmeg, Beef, Salt, Tofu, Onions, Baking Powder TRUE 1 Salt
Honeydew Haze Salt, Water, Rice, Yeast, Flour, Honey, Mushrooms TRUE 2 Water, Salt
Iridescent Ice Cream Water, Salt, Onions, Pasta, Spinach, Pork, Carrots TRUE 2 Water, Salt
Jellybean Jamboree Salt, Eggs, Flour, Baking Powder, Water, Potatoes, Yeast TRUE 2 Water, Salt
Kiwi Kaleidoscope Water, Honey, Salt, Potatoes, Vanilla Extract, Pork, Pasta TRUE 1 Water
Lunar Lemons Salt, Tofu, Olive Oil, Baking Powder, Pork, Vanilla Extract, Cinnamon TRUE 1 Salt
Mystic Marshmallows Salt, Flour, Onions, Water, Chicken, Eggs, Milk TRUE 2 Flour, Water
Nebula Noodles Honey, Flour, Pork, Beef, Potatoes, Spinach, Chicken TRUE 1 Flour
Omega Oranges Mushrooms, Water, Salt, Olive Oil, Spinach, Tofu, Potatoes TRUE 2 Water, Salt
Phantom Peaches Wheat, Carrots, Baking Powder, Tofu, Eggs, Nutmeg, Potatoes FALSE 0 NA
Quasar Quince Honey, Tomatoes, Vanilla Extract, Flour, Garlic, Butter, Salt TRUE 2 Flour, Salt
Radiant Raspberries Salt, Yeast, Garlic, Rice, Sugar, Spinach, Baking Powder TRUE 1 Salt
Stellar Strawberries Flour, Onions, Spinach, Pork, Yeast, Water, Potatoes TRUE 2 Flour, Water
Twilight Tangerines Potatoes, Eggs, Kale, Beef, Spinach, Vanilla Extract, Milk FALSE 0 NA
Universal Ugli Fruit Cinnamon, Yeast, Potatoes, Flour, Salt, Water, Garlic TRUE 2 Water, Salt
Vortex Veggies Milk, Salt, Flour, Olive Oil, Garlic, Water, Spinach TRUE 2 Water, Salt
Whirlwind Walnuts Salt, Flour, Beef, Garlic, Milk, Potatoes, Olive Oil TRUE 2 Water, Salt
Xenon Xacuti Water, Salt, Yeast, Rice, Garlic, Vanilla Extract, Eggs TRUE 2 Water, Salt
Yellow Yams of Yore Vanilla Extract, Garlic, Chestnuts, Baking Powder, Tofu, Carrots, Sugar FALSE 0 NA
Zephyr Zucchini Pork, Honey, Baking Powder, Onions, Sugar, Yeast, Water TRUE 2 Water, Salt

The full data for df is:

df <- data.frame(
  Food = c("Appleberry Muffins", "Blue Moon Pancakes", "Crystalized Starfruit", 
           "Dragonfruit Delight", "Ethereal Eclairs", "Flaming Firefruit", 
           "Glowing Grapes", "Honeydew Haze", "Iridescent Ice Cream", 
           "Jellybean Jamboree", "Kiwi Kaleidoscope", "Lunar Lemons", 
           "Mystic Marshmallows", "Nebula Noodles", "Omega Oranges", 
           "Phantom Peaches", "Quasar Quince", "Radiant Raspberries", 
           "Stellar Strawberries", "Twilight Tangerines", "Universal Ugli Fruit", 
           "Vortex Veggies", "Whirlwind Walnuts", "Xenon Xacuti", 
           "Yellow Yams of Yore", "Zephyr Zucchini"),
  Ingredients = c("Flour, Vanilla Extract, Olive Oil, Milk, Garlic, Carrots, Chicken", 
                  "Baking Powder, Garlic, Eggs, Ice, Sugar, Tofu, Rice", 
                  "Milk, Beef, Tofu, Rice, Salt, Garlic, Mushrooms", 
                  "Rice, Milk, Pork, Yeast, Carrots, Tofu, Mushrooms", 
                  "Pasta, Flour, Water, Mushrooms, Chicken, Vanilla Extract, Yeast", 
                  "Pepper, Yeast, Vanilla Extract, Sugar, Wheat, Olive Oil, Pork", 
                  "Garlic, Nutmeg, Beef, Salt, Tofu, Onions, Baking Powder", 
                  "Salt, Water, Rice, Yeast, Flour, Honey, Mushrooms", 
                  "Water, Salt, Onions, Pasta, Spinach, Pork, Carrots", 
                  "Salt, Eggs, Flour, Baking Powder, Water, Potatoes, Yeast", 
                  "Water, Honey, Salt, Potatoes, Vanilla Extract, Pork, Pasta", 
                  "Salt, Tofu, Olive Oil, Baking Powder, Pork, Vanilla Extract, Cinnamon", 
                  "Salt, Flour, Onions, Water, Chicken, Eggs, Milk", 
                  "Honey, Flour, Pork, Beef, Potatoes, Spinach, Chicken", 
                  "Mushrooms, Water, Salt, Olive Oil, Spinach, Tofu, Potatoes", 
                  "Wheat, Carrots, Baking Powder, Tofu, Eggs, Nutmeg, Potatoes", 
                  "Honey, Tomatoes, Vanilla Extract, Flour, Garlic, Butter, Salt", 
                  "Salt, Yeast, Garlic, Rice, Sugar, Spinach, Baking Powder", 
                  "Flour, Onions, Spinach, Pork, Yeast, Water, Potatoes", 
                  "Potatoes, Eggs, Kale, Beef, Spinach, Vanilla Extract, Milk", 
                  "Cinnamon, Yeast, Potatoes, Flour, Salt, Water, Garlic", 
                  "Milk, Salt, Flour, Olive Oil, Garlic, Water, Spinach", 
                  "Salt, Flour, Beef, Garlic, Milk, Potatoes, Olive Oil", 
                  "Water, Salt, Yeast, Rice, Garlic, Vanilla Extract, Eggs", 
                  "Vanilla Extract, Garlic, Chestnuts, Baking Powder, Tofu, Carrots, Sugar", 
                  "Pork, Honey, Baking Powder, Onions, Sugar, Yeast, Water")
)

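Recommended answer

You can use the str_count and str_match_all functions. str_count returns the number of pattern matches in each string, and str_match_all returns the matches themselves (as a list with one character matrix per row).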

# Build the alternation pattern once: "Water|Salt|Flour"
pattern <- paste(strings_to_check, collapse = "|")

df2 <- df %>%
    mutate(Key_Ingredient = str_detect(Ingredients, pattern),   # TRUE if any key ingredient appears
           Key_Count = str_count(Ingredients, pattern),         # number of matches in the string
           Key_Used = str_match_all(Ingredients, pattern))      # list column: one match matrix per row
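
Note that str_match_all leaves Key_Used as a list column rather than the comma-separated text shown in the expected output. One possible way to get closer to that shape is sketched below. It swaps in str_extract_all (which returns plain character vectors), collapses each row's matches with purrr's map_chr, and adds \\b word boundaries to the pattern. The three-row sample_df and the helper column found are hypothetical stand-ins for illustration, not part of the original answer.

```r
library(dplyr)
library(stringr)
library(purrr)

# Hypothetical three-row sample in the same shape as df (not the full data)
sample_df <- data.frame(
  Food = c("Ethereal Eclairs", "Blue Moon Pancakes", "Honeydew Haze"),
  Ingredients = c("Pasta, Flour, Water, Mushrooms",
                  "Baking Powder, Garlic, Eggs",
                  "Salt, Water, Rice, Yeast, Flour")
)

strings_to_check <- c("Water", "Salt", "Flour")

# Assumption: whole-word matching is wanted, so wrap the alternation in \\b
pattern <- paste0("\\b(", paste(strings_to_check, collapse = "|"), ")\\b")

result <- sample_df %>%
  mutate(
    # found is a temporary list column: one character vector of matches per row
    found = str_extract_all(Ingredients, pattern),
    Key_Ingredient = lengths(found) > 0,
    Key_Count = lengths(found),
    # Join each row's matches into one string; rows with no match become NA
    Key_Used = map_chr(found, ~ if (length(.x) == 0) NA_character_
                                else paste(.x, collapse = ", "))
  ) %>%
  select(-found)

print(result)
```

Matches are listed in the order they appear in Ingredients, not in the order of strings_to_check. If an ingredient could be listed twice in one row, wrapping .x in unique() before counting and pasting would avoid double counting.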
