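I am using the bupaR package for process analysis. Suppose my data, stored in a CSV file, looks like this (the file is already correctly sorted by CASEID and timestamp):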

STATUS;timestamp;CASEID
created;16-02-2023 09:46:32;1
revised;13-04-2023 23:58:59;1
accepted;13-04-2023 23:59:59;1
created;16-02-2023 09:46:32;2
accepted;13-04-2023 23:59:59;2
created;14-12-2022 13:17:54;3
revised;02-01-2023 23:59:59;3
accepted;28-02-2023 19:37:01;3
submitted;03-03-2023 23:59:59;3
created;02-01-2023 07:45:43;5
created;24-01-2022 16:05:58;6
accepted;03-02-2022 23:59:59;6
created;24-01-2022 15:52:53;7
accepted;03-02-2022 23:59:59;7
created;15-08-2022 12:54:23;8
rejected;18-08-2022 23:59:59;8
created;21-03-2022 15:32:05;9
accepted;26-04-2022 23:59:59;9
created;21-03-2022 15:42:39;10

The trace of the first case (ID 1) is "created-revised-accepted": first the created event, then revised, then accepted.

I now create the process map with the following code:

library(bupaR)
library(processmapR)
library(edeaR)

datafile <- read.csv(file = "pathtofile\\testfile.csv", header = TRUE, sep = ";")
datafile$timestampcolumn <- as.POSIXct(datafile$timestamp, format="%d-%m-%Y %H:%M:%S")

mytest <- simple_eventlog(datafile, case_id = "CASEID", activity_id = "STATUS", timestamp = "timestampcolumn")

process_map(mytest, type = frequency("absolute"))

This produces:

[Image: outputmap (the process map generated by the code above)]

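Now I would like to add the trace of each case to my original file. The trace is of course the same for every row of a case, so the output should look like this (the events in a trace are separated by "-" in this example):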

STATUS;timestamp;CASEID;trace
created;16-02-2023 09:46:32;1;created-revised-accepted
revised;13-04-2023 23:58:59;1;created-revised-accepted
accepted;13-04-2023 23:59:59;1;created-revised-accepted
created;16-02-2023 09:46:32;2;created-accepted
accepted;13-04-2023 23:59:59;2;created-accepted
created;14-12-2022 13:17:54;3;created-revised-accepted-submitted
revised;02-01-2023 23:59:59;3;created-revised-accepted-submitted
accepted;28-02-2023 19:37:01;3;created-revised-accepted-submitted
submitted;03-03-2023 23:59:59;3;created-revised-accepted-submitted
created;02-01-2023 07:45:43;5;created
created;24-01-2022 16:05:58;6;created-accepted
accepted;03-02-2022 23:59:59;6;created-accepted
created;24-01-2022 15:52:53;7;created-accepted
accepted;03-02-2022 23:59:59;7;created-accepted
created;15-08-2022 12:54:23;8;created-rejected
rejected;18-08-2022 23:59:59;8;created-rejected
created;21-03-2022 15:32:05;9;created-accepted
accepted;26-04-2022 23:59:59;9;created-accepted
created;21-03-2022 15:42:39;10;created

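I tried filter_activitytrace_list (from the edeaR package) and other commands, but I could not figure it out. I want to use the result of the process_map algorithm / bupaR package code, so that the traces correspond to what the graph shows.

I could of course implement my own algorithm that walks through each case and writes down the statuses, but this is somehow already done inside the bupaR eventlog / process_map commands, and I want to reuse it. I want to dig into the details and see which case has which specific trace according to the graph. That is why the result has to be consistent with the bupaR output rather than programmed separately. This information must already be in there somehow, otherwise the graph would not exist.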

So how can I achieve this?

Recommended answer

I have never used these packages, but this is how I approached the problem:

  1. I looked at the class of mytest:
class(mytest)
# [1] "eventlog"   "log"        "tbl_df"     "tbl"        "data.frame"
  2. I looked at the methods defined for the class eventlog:
methods(class = "eventlog")
# [1] act_collapse                     activities                       activity_frequency              
# [4] activity_instance_id             activity_presence                add_end_activity                
# [7] add_start_activity               arrange                          calculate_queuing_times         
# [10] case_id                          case_list                        cases                           
# [13] detect_resource_inconsistencies  dotted_chart                     durations                       
# [16] end_activities                   events_to_activitylog            filter                          
# [19] filter_activity_instance         filter_attributes                filter_endpoints_condition      
# [22] filter_infrequent_flows          filter_lifecycle                 filter_lifecycle_presence       
# [25] filter_precedence_resource       filter_time_period               filter_trim                     
# [28] filter_trim_lifecycle            first_n                          fix_resource_inconsistencies    
# [31] group_by                         group_by_activity                group_by_activity_instance      
# [34] group_by_case                    group_by_resource                group_by_resource_activity      
# [37] idle_time                        last_n                           lifecycle_id                    
# [40] lifecycle_labels                 lifecycles                       lined_chart                     
# [43] mapping                          mutate                           n_activity_instances            
# [46] n_events                         number_of_repetitions            number_of_selfloops             
# [49] process_map                      process_matrix                   processing_time                 
# [52] redo_repetitions_referral_matrix redo_selfloops_referral_matrix   resource_frequency              
# [55] resource_id                      resource_map                     resource_matrix                 
# [58] resources                        sample_n                         select                          
# [61] set_activity_instance_id         set_timestamp                    setdiff                         
# [64] size_of_repetitions              size_of_selfloops                slice_activities                
# [67] slice_events                     standardize_lifecycle            start_activities                
# [70] summarise                        summary                          throughput_time                 
# [73] timestamp                        timestamps                       to_activitylog                  
# [76] trace_explorer                   trace_length                     trace_list                      
# [79] ungroup_eventlog                 unite
  3. I tried a few of these functions until I found the one that solves your problem: case_list
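As a rough illustration of step 3, here is a minimal sketch on a tiny made-up log. The data frame tiny and the names tiny_log and cases_tbl are hypothetical and not part of the question. It only assumes what the rest of this answer relies on: that case_list() returns one row per case and that one of its columns is called trace.

```r
library(bupaR)

# Hypothetical three-event log: case 1 has two events, case 2 has one
tiny <- data.frame(
  STATUS = c("created", "accepted", "created"),
  CASEID = c(1, 1, 2),
  timestampcolumn = as.POSIXct(
    c("2023-02-16 09:46:32", "2023-04-13 23:59:59", "2023-01-02 07:45:43"),
    tz = "UTC"
  )
)

tiny_log <- simple_eventlog(tiny,
                            case_id = "CASEID",
                            activity_id = "STATUS",
                            timestamp = "timestampcolumn")

# One row per case; inspect which columns are available
cases_tbl <- as.data.frame(case_list(tiny_log))
names(cases_tbl)

# The two columns needed for the join
cases_tbl[, c("CASEID", "trace")]
```

Looking at names(cases_tbl) first is a quick way to confirm which per-case columns your installed bupaR version provides before joining anything back onto the raw data.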

Setup

library(bupaR)
library(processmapR)
library(edeaR)
library(dplyr)

d <- readr::read_delim(
"STATUS;timestamp;CASEID
created;16-02-2023 09:46:32;1
revised;13-04-2023 23:58:59;1
accepted;13-04-2023 23:59:59;1
created;16-02-2023 09:46:32;2
accepted;13-04-2023 23:59:59;2
created;14-12-2022 13:17:54;3
revised;02-01-2023 23:59:59;3
accepted;28-02-2023 19:37:01;3
submitted;03-03-2023 23:59:59;3
created;02-01-2023 07:45:43;5
created;24-01-2022 16:05:58;6
accepted;03-02-2022 23:59:59;6
created;24-01-2022 15:52:53;7
accepted;03-02-2022 23:59:59;7
created;15-08-2022 12:54:23;8
rejected;18-08-2022 23:59:59;8
created;21-03-2022 15:32:05;9
accepted;26-04-2022 23:59:59;9
created;21-03-2022 15:42:39;10", delim = ";")

d$timestampcolumn <- as.POSIXct(d$timestamp, format="%d-%m-%Y %H:%M:%S")
mytest <- simple_eventlog(d, 
                          case_id = "CASEID", 
                          activity_id = "STATUS", 
                          timestamp = "timestampcolumn")
process_map(mytest, type = frequency("absolute"))

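# case_list() returns one row per case, including its trace;
# join that trace back onto every original row by CASEID
d %>%
  inner_join(case_list(mytest) %>%
               select(CASEID, trace),
             by = "CASEID")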
# # A tibble: 19 × 5
#    STATUS    timestamp           CASEID timestampcolumn     trace                             
#    <chr>     <chr>                <dbl> <dttm>              <chr>                             
#  1 created   16-02-2023 09:46:32      1 2023-02-16 09:46:32 created,revised,accepted          
#  2 revised   13-04-2023 23:58:59      1 2023-04-13 23:58:59 created,revised,accepted          
#  3 accepted  13-04-2023 23:59:59      1 2023-04-13 23:59:59 created,revised,accepted          
#  4 created   16-02-2023 09:46:32      2 2023-02-16 09:46:32 created,accepted                  
#  5 accepted  13-04-2023 23:59:59      2 2023-04-13 23:59:59 created,accepted                  
#  6 created   14-12-2022 13:17:54      3 2022-12-14 13:17:54 created,revised,accepted,submitted
#  7 revised   02-01-2023 23:59:59      3 2023-01-02 23:59:59 created,revised,accepted,submitted
#  8 accepted  28-02-2023 19:37:01      3 2023-02-28 19:37:01 created,revised,accepted,submitted
#  9 submitted 03-03-2023 23:59:59      3 2023-03-03 23:59:59 created,revised,accepted,submitted
# 10 created   02-01-2023 07:45:43      5 2023-01-02 07:45:43 created                           
# 11 created   24-01-2022 16:05:58      6 2022-01-24 16:05:58 created,accepted                  
# 12 accepted  03-02-2022 23:59:59      6 2022-02-03 23:59:59 created,accepted                  
# 13 created   24-01-2022 15:52:53      7 2022-01-24 15:52:53 created,accepted                  
# 14 accepted  03-02-2022 23:59:59      7 2022-02-03 23:59:59 created,accepted                  
# 15 created   15-08-2022 12:54:23      8 2022-08-15 12:54:23 created,rejected                  
# 16 rejected  18-08-2022 23:59:59      8 2022-08-18 23:59:59 created,rejected                  
# 17 created   21-03-2022 15:32:05      9 2022-03-21 15:32:05 created,accepted                  
# 18 accepted  26-04-2022 23:59:59      9 2022-04-26 23:59:59 created,accepted                  
# 19 created   21-03-2022 15:42:39     10 2022-03-21 15:42:39 created 
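The trace column above is comma-separated, while the question asks for "-" as the separator and a semicolon-delimited file. A minimal sketch of that last step follows. It assumes the joined result is stored in a data frame, here a small hypothetical stand-in called joined, and that no activity name itself contains a comma. The output path out_file is a temporary file chosen for illustration.

```r
library(dplyr)

# Hypothetical stand-in for the joined result shown above (cases 1 and 5 only)
joined <- data.frame(
  STATUS = c("created", "revised", "accepted", "created"),
  timestamp = c("16-02-2023 09:46:32", "13-04-2023 23:58:59",
                "13-04-2023 23:59:59", "02-01-2023 07:45:43"),
  CASEID = c(1, 1, 1, 5),
  trace = c(rep("created,revised,accepted", 3), "created"),
  stringsAsFactors = FALSE
)

# Replace the comma separator with "-" (literal match, not a regex)
result <- joined %>%
  mutate(trace = gsub(",", "-", trace, fixed = TRUE))

# Write a semicolon-delimited file in the same layout as the input
out_file <- tempfile(fileext = ".csv")
write.table(result, file = out_file, sep = ";",
            row.names = FALSE, quote = FALSE)
```

With the real data, the helper column timestampcolumn can be dropped first, for example with select(-timestampcolumn), so that the file has exactly the four columns requested.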
