Replacing the first group of digits in a string in the R tidyverse

Posted on August 16

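I'm using the R tidyverse and trying to parse out the first group of digits that follows some characters in a column name. I want to keep the characters and the second group of digits, but drop the first group of digits. For example, the df below has the columns "var1_1975" and "var1_1976". After the transformation, these variables should be named "var_75" and "var_76". This is what I'm trying: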

library(tidyverse)

df <- data.frame("var1_1975" = c(1:5), 
          "var1_1976" = c(3,2,1,1,1),
          "age" = c(25,41,39,60,36) ,
          "satisfaction" = c(5,3,2,5,4)
          )

#  Output
# var1_1975 var1_1976 age satisfaction
# 1         1         3  25            5
# 2         2         2  41            3
# 3         3         1  39            2
# 4         4         1  60            5
# 5         5         1  36            4


cols <- df%>% 
select(c(1:2)) %>% #select some cols by index
names() #retain only the col names

df <- df %>% 
  rename_with(.fn = ~ gsub("\\d+", "", .x, fixed = F),
          .cols=contains("var") & ( contains("1975") | contains("1976")   ) ) %>%
  rename_with( .fn = function(.x){paste0(.x,  "_",
                                     parse_number(gsub("var1_","",cols)) -1900)},
           .cols=(contains("var")  ))  #add year as suffix

Recommended answer

library(tidyverse)

df <- data.frame("var1_1975" = c(1:5),
                 "var1_1976" = c(3,2,1,1,1),
                 "age" = c(25,41,39,60,36),
                 "satisfaction" = c(5,3,2,5,4)
                 )

df %>%
  rename_with(~gsub("([^\\d]+)\\d+_(\\d{2})(.*)", "\\1_\\3", .x))
#>   var_75 var_76 age satisfaction
#> 1      1      3  25            5
#> 2      2      2  41            3
#> 3      3      1  39            2
#> 4      4      1  60            5
#> 5      5      1  36            4

pattern = "([^\\d]+)\\d+_(\\d{2})(.*)"
1st capture group: "([^\\d]+)" match anything that's not a digit (i.e. "var")
don't capture: "\\d+_" match any digits before the underscore, plus the underscore itself
2nd capture group: "(\\d{2})" match exactly two digits (i.e. 19)
3rd capture group: "(.*)" match whatever remains (i.e. 75 or 76)
replacement = "\\1_\\3" print the first and third captured groups, separated by an underscore
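To try the pattern outside a data frame, here is a minimal base-R sketch that applies the same regular expression to a plain character vector of column names. The vector `old_names` and the shorter alternative pattern are illustrative assumptions and are not part of the original answer.

```r
# Hypothetical vector standing in for names(df)
old_names <- c("var1_1975", "var1_1976", "age", "satisfaction")

# Same pattern as in the answer: keep capture groups 1 and 3, drop the rest
new_names <- gsub("([^\\d]+)\\d+_(\\d{2})(.*)", "\\1_\\3", old_names)
print(new_names)

# Shorter alternative (an assumption, not from the answer): replace the first
# "digits, underscore, two digits" run with a single underscore
alt_names <- sub("\\d+_\\d{2}", "_", old_names)
print(alt_names)
```

Both calls are expected to give "var_75", "var_76", "age" and "satisfaction". Names that contain no digits are left alone because neither pattern matches them.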
