I'm trying to convert some R dplyr code to R polars. The code replaces the entire string with another string whenever a partial match is found.
library(polars)
library(dplyr)
df <- data.frame(category = c('Cats A','Cats B','kittens','street cats','dogs A','dogs B'))
# replace any string that contains 'cats' or 'kittens' (case-insensitive) with 'CATS'
df %>%
  mutate(replaced = replace(category,
                            grepl(paste0(c('cats','kittens'), collapse = '|'), category, ignore.case = TRUE),
                            'CATS')
  )
#Output
category     replaced
Cats A       CATS
Cats B       CATS
kittens      CATS
street cats  CATS
dogs A       dogs A
dogs B       dogs B
I'd like to replicate this in polars, and tried the following:
p_df <- pl$DataFrame(df) #to polars dataframe
p_df$with_columns(replaced = pl$col('category')$str$replace_many(c("Cats","kittens"),"CATS"))
and
...$str$replace(r"{cats}",'Cats'))
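Both attempts replace only the matched part, not the whole string, and I'm not sure how to replace the whole string instead. A Python implementation would also help.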
#output
┌─────────────┬─────────────┐
│ category ┆ replaced │
│ --- ┆ --- │
│ str ┆ str │
╞═════════════╪═════════════╡
│ Cats A ┆ CATS A │
│ Cats B ┆ CATS B │
│ kittens ┆ CATS │
│ street cats ┆ street cats │
│ dogs A ┆ dogs A │
│ dogs B ┆ dogs B │
└─────────────┴─────────────┘