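I have a subscription table with four fields: id, customer_id, start_at and end_at. It lists all of my customers' subscriptions. No subscription has an empty end_at, and a customer can hold several subscriptions at the same time. For example, customer 37 could have the following subscriptions: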
    id     customer_id  start_at    end_at
    44     37           2019-03-21  2019-03-21
    17819  37           2020-03-23  2020-03-23
    22302  37           2020-04-24  2021-07-25
    42213  37           2021-04-25  2023-04-26
    92013  37           2023-04-26  2024-04-26
These records mean that customer 37 was a subscriber on 2019-03-21, then on 2020-03-23, and then continuously from 2020-04-24 to 2024-04-26, for a total of 1463 days.
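As a quick check on that figure: 1463 is what `DATEDIFF` returns for the merged span alone, so it seems to assume that the two single-day subscriptions contribute zero days. That reading is an assumption on my part, shown here only to make the counting convention explicit:

```sql
-- Assumption: single-day subscriptions (start_at = end_at) add 0 days,
-- so only the merged span 2020-04-24 .. 2024-04-26 is counted.
SELECT DATEDIFF('2024-04-26', '2020-04-24') AS merged_span_days;  -- 1463
```

Counting both endpoints of every span instead (`DATEDIFF(...) + 1`) would give a slightly higher total, so whichever convention is chosen should be applied consistently.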
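I am trying to write a query that returns the number of days each customer was subscribed during a given period. For example, customer 37 was a subscriber for 365 days in 2023. Subscriptions can overlap, because a customer can hold several subscriptions at once.
The result of the query should look like this: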
    customer_id  total_subscription_days
    37           1463
    38           526
    39           426
    40           365
    41           325
My database runs on MySQL 8.2.12.
I have tried LAG, LEAD, CTEs, MIN and MAX, to no avail. I have also tried ChatGPT and Stack Overflow.
Edit: here is what I have tried so far.
First attempt:
    SELECT
        customer_id,
        SUM(DATEDIFF(
            LEAST(end_at, '2023-12-31'),
            GREATEST(start_at, '2023-01-01')
        ) + 1) AS total_subscription_days
    FROM (
        SELECT
            customer_id,
            start_at,
            end_at
        FROM
            subscription
        WHERE
            start_at <= '2023-12-31' AND end_at >= '2023-01-01'
        UNION ALL
        SELECT
            s1.customer_id,
            LEAD(s1.end_at) OVER (PARTITION BY s1.customer_id ORDER BY s1.end_at),
            '2023-12-31'
        FROM
            subscription s1
        LEFT JOIN
            subscription s2 ON s1.customer_id = s2.customer_id
            AND s1.end_at < s2.start_at
        WHERE
            s2.start_at IS NOT NULL
    ) AS merged_subscriptions
    GROUP BY
        customer_id;
Although I only want the number of subscription days in 2023, I get results above 365 days. So the join seems to make it count duplicates.
Second attempt:
    WITH subscription_periods AS (
        SELECT
            customer_id,
            start_at,
            end_at,
            ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY start_at) AS period_number
        FROM
            subscription
        WHERE
            start_at <= '2023-12-31' AND end_at >= '2023-01-01' AND customer_id < 100
    ),
    subscription_days AS (
        SELECT
            customer_id,
            SUM(
                DATEDIFF(
                    LEAST(end_at, '2023-12-31'),
                    GREATEST(start_at, '2023-01-01')
                ) + 1
            ) AS days
        FROM
            (
                SELECT
                    customer_id,
                    start_at,
                    LEAD(end_at) OVER (PARTITION BY customer_id ORDER BY start_at) AS end_at
                FROM
                    subscription_periods
            ) AS overlapping_periods
        WHERE
            end_at >= '2023-01-01'
        GROUP BY
            customer_id
    )
    SELECT
        customer_id,
        SUM(days) AS total_subscription_days
    FROM
        subscription_days
    GROUP BY
        customer_id;
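I limited this to customers with an ID below 100, otherwise I get a 504 error. This query does not seem to take gaps between subscriptions into account. For a customer subscribed from 2023-01-01 to 2023-04-01 and then from 2023-05-01 to 2023-08-01, it appears to count all the days between 2023-01-01 and 2023-08-01.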
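The behaviour I am after looks like the usual "gaps and islands" merge: clip each subscription to the reporting window, start a new island whenever a row begins after every earlier row for that customer has ended, then sum the length of each island. Below is a minimal sketch of that idea, not a tested solution. It assumes start_at and end_at are DATE columns, that end_at is never NULL (as stated above), and that both endpoints of a span count as subscribed days. The CTE and column names (clipped, flagged, grouped, islands, s, e, is_new_island, island_id) are made up for illustration.

```sql
WITH clipped AS (
    -- Keep only subscriptions touching 2023 and trim them to the window.
    SELECT
        customer_id,
        GREATEST(start_at, DATE '2023-01-01') AS s,
        LEAST(end_at, DATE '2023-12-31')      AS e
    FROM subscription
    WHERE start_at <= DATE '2023-12-31'
      AND end_at   >= DATE '2023-01-01'
),
flagged AS (
    -- A row opens a new island when it starts after the latest end date
    -- seen so far for the same customer (NULL for the first row, so ELSE 1).
    SELECT
        customer_id, s, e,
        CASE
            WHEN s <= MAX(e) OVER (
                     PARTITION BY customer_id
                     ORDER BY s, e
                     ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
            THEN 0
            ELSE 1
        END AS is_new_island
    FROM clipped
),
grouped AS (
    -- A running sum of the flags gives every row of an island the same id.
    SELECT
        customer_id, s, e,
        SUM(is_new_island) OVER (
            PARTITION BY customer_id
            ORDER BY s, e) AS island_id
    FROM flagged
),
islands AS (
    -- Collapse each island to a single start/end pair.
    SELECT
        customer_id,
        MIN(s) AS island_start,
        MAX(e) AS island_end
    FROM grouped
    GROUP BY customer_id, island_id
)
SELECT
    customer_id,
    -- + 1 counts both endpoints; drop it for plain DATEDIFF semantics.
    SUM(DATEDIFF(island_end, island_start) + 1) AS total_subscription_days
FROM islands
GROUP BY customer_id;
```

Worked by hand on the data above, customer 37's two 2023 rows clip to 2023-01-01..2023-04-26 and 2023-04-26..2023-12-31, merge into one island, and give 364 + 1 = 365 days. For the gap example (2023-01-01 to 2023-04-01, then 2023-05-01 to 2023-08-01) the two spans stay separate islands and add up to 91 + 93 = 184 days rather than 213.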