我有一个用户的表历史任务,看起来是这样的.1个用户有9个任务.
user_id task IS_COMPLETED updated_at
123 Task 1 1 2024-01-01
123 Task 2 1 2024-01-01
123 Task 3 0 2024-01-01
123 Task 4 0 2024-01-01
123 Task 5 0 2024-01-01
123 Task 6 1 2024-01-01
123 Task 7 1 2024-01-01
123 Task 8 1 2024-01-01
123 Task 9 1 2024-01-01
我想把可能的任务组合在一起,变成这样的表.所以我知道什么样的任务组合是用户快速完成的.
combination total_user_completed
task1_task2 20
task1_task3 15
task1_task_4 15
task1_task_5 14
: : (and so on for 2 combination)
task1_task2_task3 10
task4_task5_task_6 11
: : (and so on for 3 combination)
(有关组合示例的更多详细信息:从A、B、C我得到了A-B、B-C、A-C和A-B-C之间的组合.不需要按顺序)
我曾try 使用递归SQL,但效果不佳
"递归联接内存不足.请重新运行此查询 更大的仓库"
然后我也试着用Case When来定义,但我需要像这样一个一个地定义,这并不理想.
with all_task as (
select
distinct user_id,
task_name as task,
is_completed,
updated_at
from task
where
is_completed = 1
group by all
),
agg as (
select
distinct user_id,
activated_at as period,
max(case when task in ('Task 1') and IS_COMPLETED = 1 then date(updated_at) end) as task_1,
max(case when task in ('Task 2') and IS_COMPLETED = 1 then date(updated_at) end) as task_2,
max(case when task in ('Task 3') and IS_COMPLETED = 1 then date(updated_at) end) as task_3,
max(case when task in ('Task 4') and IS_COMPLETED = 1 then date(updated_at) end) as task_4,
max(case when task in ('Task 5') and IS_COMPLETED = 1 then date(updated_at) end) as task_5,
max(case when task in ('Task 6') and IS_COMPLETED = 1 then date(updated_at) end) as task_6,
max(case when task in ('Task 7') and IS_COMPLETED = 1 then date(updated_at) end) as task_7,
max(case when task in ('Task 8') and IS_COMPLETED = 1 then date(updated_at) end) as task_8,
max(case when task in ('Task 9') and IS_COMPLETED = 1 then date(updated_at) end) as task_9
from all_task
group by all
),
task_group as (
select user_id,
case when task_1 is not null and task_2 is not null then user_id end as task_1_2
case when task_2 not null and task_3 is not null then user_id end as task_2_3
.........
from agg
)
select 'Task1_task2' as combination,
count(distinct task_1_2) as total
from task_group
union all
select 'task2_task3' as combination,
count(distinct task_2_3) as total
from task_group
.... (and so on)
有没有解决这一问题的建议?非常感谢!
*注意:目前我需要至少2/3的任务组合.谢谢