R: Extract the number in parentheses (YYYY) at the end of a string

Posted on August 21

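I am trying to create two separate columns in R. The problem I am running into is that, depending on the type of observation, the year does not separate cleanly from the rest of the column.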

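Some names in the data frame are a first name only; others have a first and a last name. I am trying to have Name hold either the first name or the first and last name, and to split Year out of the current Name column.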

Fake data: employees and the year they started working

Create the data frame

library(tidyverse)

dat <- tibble(Name = c("Percy Vere (2020)", "Ginger Plant (2017)", "Perry (2019)",
                    "Pat Thettick (2020)", "Samuel (2022)", "Fay Daway (2008)",
                    "Greg (2022)", "Simon Sais (2011)"))

# A tibble: 8 x 1
  Name               
  <chr>              
1 Percy Vere (2020)  
2 Ginger Plant (2017)
3 Perry (2019)       
4 Pat Thettick (2020)
5 Samuel (2022)      
6 Fay Daway (2008)   
7 Greg (2022)        
8 Simon Sais (2011)

Attempt to split the column into two columns: Name and Year

dat %>% 
  select_all() %>% 
  separate(col = Name, into = c('Name', 'Year')) %>%    # sep = ',' and ';' does not create a fix 
  tibble()

# A tibble: 8 x 2
  Name   Year    
  <chr>  <chr>   
1 Percy  Vere    
2 Ginger Plant   
3 Perry  2019    
4 Pat    Thettick
5 Samuel 2022    
6 Fay    Daway   
7 Greg   2022    
8 Simon  Sais    
Warning message:
Expected 2 pieces. Additional pieces discarded in 8 rows [1, 2, 3, 4, 5, 6, 7, 8].
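
A likely explanation for the output above: when `sep` is not supplied, `separate()` splits on any run of non-alphanumeric characters (its documented default is `sep = "[^[:alnum:]]+"`), so the space between a first and last name counts as a delimiter just like the parentheses do. A minimal base R sketch, applying that same pattern with `strsplit()` to two of the values, shows where the pieces come from:

```r
# The pattern that separate() documents as its default `sep`
default_sep <- "[^[:alnum:]]+"

# Split two sample values the same way (base R only, no packages needed)
pieces <- strsplit(c("Percy Vere (2020)", "Perry (2019)"), default_sep)

pieces[[1]]  # "Percy" "Vere" "2020"  -> three pieces, so the year is dropped
pieces[[2]]  # "Perry" "2019"         -> two pieces, so the year lands in Year
```

With only two target columns, the third piece of a two-word name is discarded, which is what the warning reports.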

dat %>% 
  extract(Name, c("Name", "Year"), "(.*) .*?(\\d+)")

# A tibble: 8 × 2
  Name         Year
  <chr>        <chr>
1 Percy Vere   2020
2 Ginger Plant 2017
3 Perry        2019
4 Pat Thettick 2020
5 Samuel       2022
6 Fay Daway    2008
7 Greg         2022
8 Simon Sais   2011
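
If the year should be numeric and the match should be limited to a four-digit number in parentheses at the very end of the string, a stricter variant of the `extract()` call may help. This is only a sketch: the anchored regex and the shortened three-row `dat` are assumptions for illustration, not part of the original answer. `convert = TRUE` is an existing `extract()` argument that type-converts the new columns.

```r
library(tidyr)
library(tibble)

# Shortened sample data (illustrative subset of the question's data)
dat <- tibble(Name = c("Percy Vere (2020)", "Perry (2019)", "Simon Sais (2011)"))

# ^(.*?)      lazily capture the name
# \\s*        optional whitespace before the parenthesis
# \\((\\d{4})\\)$  capture exactly four digits in parentheses at the end
res <- extract(dat, Name, into = c("Name", "Year"),
               regex = "^(.*?)\\s*\\((\\d{4})\\)$",
               convert = TRUE)  # Year becomes integer instead of character

res
```

Rows that do not match the pattern come back as `NA` in both columns, which makes malformed entries easy to spot.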

dat %>% 
  separate(Name, c("Name", "Year"), " \\(|\\)", extra = 'drop')

# A tibble: 8 × 2
  Name         Year
  <chr>        <chr>
1 Percy Vere   2020
2 Ginger Plant 2017
3 Perry        2019
4 Pat Thettick 2020
5 Samuel       2022
6 Fay Daway    2008
7 Greg         2022
8 Simon Sais   2011

dat %>% 
  mutate(Year = str_extract(Name, '\\d+'), 
         Name = str_extract(Name, "\\D+(?= )"))

# A tibble: 8 × 2
  Name         Year
  <chr>        <chr>
1 Percy Vere   2020
2 Ginger Plant 2017
3 Perry        2019
4 Pat Thettick 2020
5 Samuel       2022
6 Fay Daway    2008
7 Greg         2022
8 Simon Sais   2011

dat %>% 
  mutate(Year = str_remove_all(Name, '\\D+'), 
         Name = str_remove(Name, " [(]\\d+[)]"))

# A tibble: 8 × 2
  Name         Year
  <chr>        <chr>
1 Percy Vere   2020
2 Ginger Plant 2017
3 Perry        2019
4 Pat Thettick 2020
5 Samuel       2022
6 Fay Daway    2008
7 Greg         2022
8 Simon Sais   2011
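
For completeness, the same split can be sketched without any packages. The code below is an illustrative base R alternative, not one of the original answers; the vector `x` and the pattern are assumptions chosen to match the "name followed by (YYYY)" layout in the question.

```r
# Illustrative input: a plain character vector in the question's format
x <- c("Percy Vere (2020)", "Perry (2019)", "Simon Sais (2011)")

# Group 1 = name, group 2 = four-digit year in trailing parentheses
pattern <- "^(.*?)\\s*\\((\\d{4})\\)$"

out <- data.frame(
  Name = sub(pattern, "\\1", x, perl = TRUE),              # keep the name
  Year = as.integer(sub(pattern, "\\2", x, perl = TRUE)),  # keep the year
  stringsAsFactors = FALSE
)

out
```

Note that `sub()` returns the input unchanged when the pattern does not match, so a malformed entry would keep its full text in Name and yield `NA` (with a coercion warning) in Year.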
