Note:

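Please note that before posting, I tried the following approaches to solve my problem:

I attempted to solve my problem with them, but without success.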

Problem

Suppose I have the following data, which shows how items flow from a start location to an end location within each run.

> run = c(1, 2, 3, 3, 4, 5, 5, 5, 6, 7, 7, 7, 8, 9, 10, 10, 11)
> start_location = c("A", "C", "A", "B", "A", "B", "C", "A", "B", "C", "B", "A", "A", "A", "A", "B", "C")
> end_location = c("B", "B", "B", "C", "C", "C", "A", "C", "A", "B", "A", "C", "B", "C", "B", "C", "B")
> df = data.frame(run, start_site = start_location, end_site = end_location)
> df
   run start_site end_site
1    1          A        B
2    2          C        B
3    3          A        B
4    3          B        C
5    4          A        C
6    5          B        C
7    5          C        A
8    5          A        C
9    6          B        A
10   7          C        B
11   7          B        A
12   7          A        C
13   8          A        B
14   9          A        C
15  10          A        B
16  10          B        C
17  11          C        B

I would like to convert the data to a "wide" format like the one shown below, with a new column for each stage of a run.

> # Desired result
      run  first_location second_location third_location fourth_location
 [1,] "1"  "A"            "B"             NA             NA             
 [2,] "2"  "C"            "B"             NA             NA             
 [3,] "3"  "A"            "B"             "C"            NA             
 [4,] "4"  "A"            "C"             NA             NA             
 [5,] "5"  "B"            "C"             "A"            "C"            
 [6,] "6"  "B"            "A"             NA             NA             
 [7,] "7"  "C"            "B"             "A"            "C"            
 [8,] "8"  "A"            "B"             NA             NA             
 [9,] "9"  "A"            "C"             NA             NA             
[10,] "10" "A"            "B"             "C"            NA             
[11,] "11" "C"            "B"             NA             NA     

Attempted Solution

I have tried the following approach, but it does not give the expected result: I end up with more columns than I need.

> library(dplyr)
> library(tidyr)
> library(reshape2)  # melt() is assumed to come from reshape2
>
> # Unsuccessful attempt
> df_long = melt(df, id.vars=c("run"))
> df_long %>%
  select(!variable) %>%
  group_by(run) %>%
  dplyr::mutate(rn = paste0("location_",row_number())) %>%
  spread(rn, value)

# A tibble: 11 x 7
# Groups:   run [11]
     run location_1 location_2 location_3 location_4 location_5 location_6
   <dbl> <chr>      <chr>      <chr>      <chr>      <chr>      <chr>     
 1     1 A          B          NA         NA         NA         NA        
 2     2 C          B          NA         NA         NA         NA        
 3     3 A          B          B          C          NA         NA        
 4     4 A          C          NA         NA         NA         NA        
 5     5 B          C          A          C          A          C         
 6     6 B          A          NA         NA         NA         NA        
 7     7 C          B          A          B          A          C         
 8     8 A          B          NA         NA         NA         NA        
 9     9 A          C          NA         NA         NA         NA        
10    10 A          B          B          C          NA         NA        
11    11 C          B          NA         NA         NA         NA    

Can anyone help me find the mistake and get the desired result?

Thanks for reading my post.

Recommended Answer

A solution based on rle and tidyr::unnest_wider.

run = c(1, 2, 3, 3, 4, 5, 5, 5, 6, 7, 7, 7, 8, 9, 10, 10, 11)
start_location = c("A", "C", "A", "B", "A", "B", "C", "A", "B", "C", "B", "A", "A", "A", "A", "B", "C")
end_location = c("B", "B", "B", "C", "C", "C", "A", "C", "A", "B", "A", "C", "B", "C", "B", "C", "B")
df = data.frame(run = run, from = start_location, to = end_location)

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidyr)

df %>% group_by(run) %>% 
  summarise(location = list(rle(as.vector(t(cbind(from, to))))$values)) %>%
  unnest_wider(location, names_sep = "_")
#> # A tibble: 11 × 5
#>      run location_1 location_2 location_3 location_4
#>    <dbl> <chr>      <chr>      <chr>      <chr>     
#>  1     1 A          B          <NA>       <NA>      
#>  2     2 C          B          <NA>       <NA>      
#>  3     3 A          B          C          <NA>      
#>  4     4 A          C          <NA>       <NA>      
#>  5     5 B          C          A          C         
#>  6     6 B          A          <NA>       <NA>      
#>  7     7 C          B          A          C         
#>  8     8 A          B          <NA>       <NA>      
#>  9     9 A          C          <NA>       <NA>      
#> 10    10 A          B          C          <NA>      
#> 11    11 C          B          <NA>       <NA>

Created on 2022-11-25 with reprex v2.0.2
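
To see why this works, it may help to look at a single run in base R. The attempt in the question keeps every start and every end value, so each intermediate location shows up twice (once as an end, once as the next start). The sketch below uses the three rows of run 5 from the question's data; the names path and visited are only illustrative and do not appear in the solution above.

```r
# Rows of run 5 from the question's data (illustrative subset)
from <- c("B", "C", "A")
to   <- c("C", "A", "C")

# cbind() builds a 3 x 2 matrix; t() followed by as.vector() reads it
# row by row, so each start is immediately followed by its end
path <- as.vector(t(cbind(from, to)))
path
#> [1] "B" "C" "C" "A" "A" "C"

# rle() collapses runs of consecutive identical values, leaving one
# entry per location actually visited
visited <- rle(path)$values
visited
#> [1] "B" "C" "A" "C"
```

summarise() then stores one such vector per run in a list column, and unnest_wider() spreads each vector across the location_1, location_2, ... columns, filling shorter runs with NA.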
