I have the following data:

     grp REGIONNAME RegionName `Año 2004_1` `Año 2004_2` `Año 2004_3`
   <int> <chr>      <chr>             <dbl>        <dbl>        <dbl>
 1     1 ANDALUCÍA  ANDALUCÍA         32143        37962        32374
 2     1 ANDALUCÍA  Almería              NA           NA           NA
 3     1 ANDALUCÍA  Abla                 58           61           54
 4     1 ANDALUCÍA  Abrucena              6            2            1
 5     1 ANDALUCÍA  Adra                146          211          101
 6     1 ANDALUCÍA  Albánchez            12            3            3
 7     1 ANDALUCÍA  Alboloduy             2            2            2
 8     1 ANDALUCÍA  Albox                33           66           35
 9     1 ANDALUCÍA  Alcolea               0            1            1
10     1 ANDALUCÍA  Alcóntar              1            1            2

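This sample contains two all-NA rows, one for Almería and one for Balanegra.

I would like to create a new column, say a second RegionName, that is filled from these two cells, i.e. the expected output would be: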

     grp REGIONNAME RegionName    RegionName
   <int> <chr>      <chr>            <chr>
 1     1 ANDALUCÍA  ANDALUCÍA        ANDALUCIA/NA
 2     1 ANDALUCÍA  Almería            Almeria
 3     1 ANDALUCÍA  Abla               Almeria
 4     1 ANDALUCÍA  Abrucena           Almeria
 5     1 ANDALUCÍA  Adra               Almeria
 6     1 ANDALUCÍA  Albánchez          Almeria
 7     1 ANDALUCÍA  Alboloduy          Almeria
 8     1 ANDALUCÍA  Albox                ...
 9     1 ANDALUCÍA  Alcolea              ...
10     1 ANDALUCÍA  Alcóntar             ...
               ...............
  
 1     1 ANDALUCÍA  Bacares              ...
 2     1 ANDALUCÍA  Balanegra          Balanegra
 3     1 ANDALUCÍA  Bayárcal           Balanegra
 4     1 ANDALUCÍA  Bayarque           Balanegra
 5     1 ANDALUCÍA  Bédar              Balanegra
 6     1 ANDALUCÍA  Beires    
 7     1 ANDALUCÍA  Benahadux           ....
 8     1 ANDALUCÍA  Benitagla           ....
 9     1 ANDALUCÍA  Benizalón 
10     1 ANDALUCÍA  Bentarique         Balanegra

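So whenever a row has NA in all three value columns, it marks the start of a new "region".

Finally, I want to group_by this newly created region and compute the cumsum in order to fill in the NA values.

I did something "similar" with the REGIONNAME column when I wanted to fill in the NA values for ANDALUCIA.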

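... %>%
  # start a new group at every all-uppercase name (the top-level region row)
  group_by(grp = cumsum(RegionName == toupper(RegionName))) %>%
  mutate(REGIONNAME = first(RegionName)) %>% 
  relocate(REGIONNAME, .before = RegionName) %>% 
  # on the region's own row, replace each value with the sum of its other rows
  mutate(across(starts_with("Año"), 
                ~ ifelse(REGIONNAME == RegionName, sum(.x[REGIONNAME != RegionName], na.rm = TRUE), .x)))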
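For readers unfamiliar with the grouping trick in that snippet: cumsum() over a logical vector increases by one at every TRUE, so each all-uppercase name opens a new group id. A minimal sketch on a made-up vector (the "ARAGÓN" and "Huesca" entries are hypothetical and not part of my data):

```r
# Hypothetical names: two top-level regions, each followed by sub-rows
region_names <- c("ANDALUCÍA", "Almería", "Abla", "ARAGÓN", "Huesca")

# TRUE only where the name is already all uppercase (a top-level region row)
is_header <- region_names == toupper(region_names)

# Running count of headers seen so far, usable as a group id
grp <- cumsum(is_header)
grp
```

Every row inherits the id of the most recent header above it, which is why first(RegionName) within each group returns the region's own name.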

Data:

df = structure(list(grp = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L), REGIONNAME = c("ANDALUCÍA", "ANDALUCÍA", "ANDALUCÍA", 
"ANDALUCÍA", "ANDALUCÍA", "ANDALUCÍA", "ANDALUCÍA", "ANDALUCÍA", 
"ANDALUCÍA", "ANDALUCÍA", "ANDALUCÍA", "ANDALUCÍA", "ANDALUCÍA", 
"ANDALUCÍA", "ANDALUCÍA", "ANDALUCÍA", "ANDALUCÍA", "ANDALUCÍA", 
"ANDALUCÍA", "ANDALUCÍA", "ANDALUCÍA", "ANDALUCÍA", "ANDALUCÍA", 
"ANDALUCÍA", "ANDALUCÍA", "ANDALUCÍA", "ANDALUCÍA", "ANDALUCÍA", 
"ANDALUCÍA", "ANDALUCÍA"), RegionName = c("ANDALUCÍA", "Almería", 
"Abla", "Abrucena", "Adra", "Albánchez", "Alboloduy", "Albox", 
"Alcolea", "Alcóntar", "Alcudia de Monteagud", "Alhabia", "Alhama de Almería", 
"Alicún", "Almería", "Almócita", "Alsodux", "Antas", "Arboleas", 
"Armuña de Almanzora", "Bacares", "Balanegra", "Bayárcal", 
"Bayarque", "Bédar", "Beires", "Benahadux", "Benitagla", "Benizalón", 
"Bentarique"), `Año 2004_1` = c(32143, NA, 58, 6, 146, 12, 2, 
33, 0, 1, 1, 1, 13, 0, 748, 0, 1, 6, 16, 0, 2, NA, 0, 0, 8, 0, 
18, 1, 2, 0), `Año 2004_2` = c(37962, NA, 61, 2, 211, 3, 2, 
66, 1, 1, 1, 0, 15, 1, 770, 0, 10, 12, 16, 0, 1, NA, 1, 0, 2, 
0, 21, 0, 0, 0), `Año 2004_3` = c(32374, NA, 54, 1, 101, 3, 
2, 35, 1, 2, 0, 0, 14, 0, 701, 0, 3, 26, 14, 0, 0, NA, 0, 3, 
8, 0, 25, 0, 2, 0)), class = c("grouped_df", "tbl_df", "tbl", 
"data.frame"), row.names = c(NA, -30L), groups = structure(list(
    grp = 1L, .rows = structure(list(1:30), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L), .drop = TRUE))

Recommended answer

You can use c_across and fill:

library(tidyverse)

df %>% 
  rowwise() %>% 
  mutate(Region = case_when(all(is.na(c_across(starts_with("Año")))) ~ RegionName)) %>% 
  ungroup() %>% 
  fill(Region)

# A tibble: 30 × 7
     grp REGIONNAME RegionName           `Año 2004_1` `Año 2004_2` `Año 2004_3` Region   
   <int> <chr>      <chr>                       <dbl>        <dbl>        <dbl> <chr>    
 1     1 ANDALUCÍA  ANDALUCÍA                   32143        37962        32374 NA       
 2     1 ANDALUCÍA  Almería                        NA           NA           NA Almería  
 3     1 ANDALUCÍA  Abla                           58           61           54 Almería  
 4     1 ANDALUCÍA  Abrucena                        6            2            1 Almería  
 5     1 ANDALUCÍA  Adra                          146          211          101 Almería  
 6     1 ANDALUCÍA  Albánchez                      12            3            3 Almería  
 7     1 ANDALUCÍA  Alboloduy                       2            2            2 Almería  
 8     1 ANDALUCÍA  Albox                          33           66           35 Almería  
 9     1 ANDALUCÍA  Alcolea                         0            1            1 Almería  
10     1 ANDALUCÍA  Alcóntar                        1            1            2 Almería  
11     1 ANDALUCÍA  Alcudia de Monteagud            1            1            0 Almería  
12     1 ANDALUCÍA  Alhabia                         1            0            0 Almería  
13     1 ANDALUCÍA  Alhama de Almería              13           15           14 Almería  
14     1 ANDALUCÍA  Alicún                          0            1            0 Almería  
15     1 ANDALUCÍA  Almería                       748          770          701 Almería  
16     1 ANDALUCÍA  Almócita                        0            0            0 Almería  
17     1 ANDALUCÍA  Alsodux                         1           10            3 Almería  
18     1 ANDALUCÍA  Antas                           6           12           26 Almería  
19     1 ANDALUCÍA  Arboleas                       16           16           14 Almería  
20     1 ANDALUCÍA  Armuña de Almanzora             0            0            0 Almería  
21     1 ANDALUCÍA  Bacares                         2            1            0 Almería  
22     1 ANDALUCÍA  Balanegra                      NA           NA           NA Balanegra
23     1 ANDALUCÍA  Bayárcal                        0            1            0 Balanegra
24     1 ANDALUCÍA  Bayarque                        0            0            3 Balanegra
25     1 ANDALUCÍA  Bédar                           8            2            8 Balanegra
26     1 ANDALUCÍA  Beires                          0            0            0 Balanegra
27     1 ANDALUCÍA  Benahadux                      18           21           25 Balanegra
28     1 ANDALUCÍA  Benitagla                       1            0            0 Balanegra
29     1 ANDALUCÍA  Benizalón                       2            0            2 Balanegra
30     1 ANDALUCÍA  Bentarique                      0            0            0 Balanegra
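
The code above stops at creating Region. For the last step in the question (grouping by the new region to fill the all-NA rows), a rough sketch on made-up toy data could look like the following. The columns x and y and all values are assumptions for illustration. It also assumes a plain group sum is what is wanted for the header row rather than a running cumsum, and it uses if_all() as a vectorised alternative to rowwise() with c_across().

```r
library(dplyr)
library(tidyr)

# Toy data (hypothetical): "A" and "B" are header rows with all-NA values
toy <- tibble(
  RegionName = c("TOP", "A", "a1", "a2", "B", "b1"),
  x = c(10, NA, 1, 2, NA, 7),
  y = c(20, NA, 3, 4, NA, 8)
)

filled <- toy %>%
  # mark header rows: every value column is NA
  mutate(Region = if_else(if_all(c(x, y), is.na), RegionName, NA_character_)) %>%
  # carry each header name down to the rows below it
  fill(Region) %>%
  group_by(Region) %>%
  # replace the NA on the header row with the sum of the group's other rows
  mutate(across(c(x, y), ~ if_else(is.na(.x), sum(.x, na.rm = TRUE), .x))) %>%
  ungroup()

filled
```

With the real data, c(x, y) would become starts_with("Año"). Note that in the sample Almería also appears further down as an ordinary row with values, so that row would be included in the total written to the Almería header row.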
