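I have a table of activated subscriptions in a PostgreSQL database. The table has the following columns: User ID, Start Date of Subscription, End Date of Subscription, and Type of Subscription.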

If a subscription is currently active, its end date is set to today's date.

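There are two separate subscription types: 1. Premium, 2. Books. They are activated independently of each other. For example, the data might look like this: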

User ID  Start Date of Subscription  End Date of Subscription  Type of Subscription
675      2023-01-01                  2023-05-10                Premium
675      2023-02-15                  2023-02-28                Books
675      2023-04-18                  2023-06-18                Books
726      2023-01-01                  2023-10-10                Premium
726      2023-03-16                  2023-05-28                Books
855      2023-04-05                  2023-05-28                Books
855      2023-04-20                  2023-07-25                Premium
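
To reproduce the sample data, a setup along the following lines could be used. It is only a sketch: the table name subscriptions and the column names user_id, start_date, end_date and subscription_type follow the query shown further down, while the column types are assumptions.

```sql
-- Assumed schema: names follow the query below, types are a guess.
create table subscriptions (
    user_id           integer not null,
    start_date        date    not null,
    end_date          date    not null,
    subscription_type text    not null
);

-- The seven sample rows from the table above.
insert into subscriptions (user_id, start_date, end_date, subscription_type) values
    (675, '2023-01-01', '2023-05-10', 'Premium'),
    (675, '2023-02-15', '2023-02-28', 'Books'),
    (675, '2023-04-18', '2023-06-18', 'Books'),
    (726, '2023-01-01', '2023-10-10', 'Premium'),
    (726, '2023-03-16', '2023-05-28', 'Books'),
    (855, '2023-04-05', '2023-05-28', 'Books'),
    (855, '2023-04-20', '2023-07-25', 'Premium');
```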

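What I need is to adjust the period of a Premium subscription whenever another subscription (Books) is activated during that period. In other words, the Premium subscription should be split into the periods before and after the Books subscription. The desired output looks like this: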

User ID  Start Date of Subscription  End Date of Subscription  Type of Subscription
675      2023-01-01                  2023-02-15                Premium
675      2023-02-15                  2023-02-28                Books
675      2023-02-28                  2023-04-18                Premium
675      2023-04-18                  2023-06-18                Books
726      2023-01-01                  2023-03-16                Premium
726      2023-03-16                  2023-05-28                Books
726      2023-05-28                  2023-10-10                Premium
855      2023-04-05                  2023-05-28                Books
855      2023-05-28                  2023-07-25                Premium

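I wrote something like the query below, but it only splits the first Premium subscription into the period before the next Books subscription. I don't understand how to resume the Premium subscription after the Books subscription has finished.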

WITH ordered_subscriptions AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY start_date) as rn
    FROM subscriptions
),
date_ranges AS (
    SELECT 
        a.user_id, 
        a.start_date, 
        MIN(b.start_date) as end_date,
        a.subscription_type
    FROM ordered_subscriptions a
    LEFT JOIN ordered_subscriptions b ON a.user_id = b.user_id AND a.rn < b.rn
    GROUP BY a.user_id, a.start_date, a.subscription_type
),
filtered_subscriptions AS (
    SELECT 
        user_id, 
        start_date, 
        COALESCE(end_date, CURRENT_DATE) as end_date, 
        subscription_type
    FROM date_ranges
    WHERE subscription_type = 'Premium' AND NOT EXISTS (
        SELECT 1 
        FROM date_ranges d2 
        WHERE d2.user_id = date_ranges.user_id 
        AND d2.subscription_type = 'Books' 
        AND d2.start_date < date_ranges.end_date 
        AND (d2.end_date IS NULL OR d2.end_date > date_ranges.start_date)
    )
)

SELECT * FROM filtered_subscriptions
UNION ALL
SELECT user_id, start_date, COALESCE(end_date, CURRENT_DATE), subscription_type 
FROM date_ranges 
WHERE subscription_type = 'Books'
ORDER BY user_id, start_date;

Recommended answer

I'll try to write this up better later, and maybe clean up the query a little.

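The assumption is that you want to keep every Books subscription intact. Where a Premium subscription overlaps one in any way, the overlapping period is absorbed into the Books subscription, and the non-overlapping parts of the Premium subscription (possibly two of them) are adjusted or created.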

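Two such subscriptions can intersect in four different ways. It is also possible for several Books subscriptions to fall under a single Premium. The first part of the query determines whether there is an overlap and which kind it is, and collects the data needed for the forward and backward "links" between neighbouring Books subscriptions.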
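
The four overlap shapes can be made concrete with a small standalone query. This is a sketch, not part of the answer's query: the pairs relation and its column names are made up for illustration, the first three rows reuse dates from the sample data, and the fourth row is invented to cover the remaining case. The CASE expression mirrors the one in the query below. The first letter says which subscription starts first, and the second says which one ends last.

```sql
-- Hypothetical Premium/Books pairs, one per overlap shape.
with pairs(label, p_start, p_end, b_start, b_end) as (
    values
        -- Premium starts first and ends last: expect 'pp'
        ('Premium contains Books',
         date '2023-01-01', date '2023-05-10', date '2023-02-15', date '2023-02-28'),
        -- Premium starts first, Books ends last: expect 'pb'
        ('Premium first, Books last',
         date '2023-01-01', date '2023-05-10', date '2023-04-18', date '2023-06-18'),
        -- Books starts first, Premium ends last: expect 'bp'
        ('Books first, Premium last',
         date '2023-04-20', date '2023-07-25', date '2023-04-05', date '2023-05-28'),
        -- Books starts first and ends last (invented dates): expect 'bb'
        ('Books contains Premium',
         date '2023-04-20', date '2023-05-10', date '2023-04-05', date '2023-05-28')
)
select label,
       case when p_start < b_start then 'p'
            when p_start > b_start then 'b' else '' end ||
       case when p_end   < b_end   then 'b'
            when p_end   > b_end   then 'p' else '' end as overlap
from pairs;
```

When the start dates or the end dates coincide, the corresponding letter is an empty string, so the code can be shorter than two characters.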

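The second part generates a set of four dates that mark the boundaries between the different segments of the overlap. It also creates the extra row needed when a Premium has to be split into three parts (the middle one being kept by Books), and it handles the case where a Premium is completely contained in a Books subscription and has to disappear entirely.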
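
As a worked example of the four boundary dates, the following sketch applies the same expressions to user 675's Premium subscription and its first Books subscription. The inline relation t and its column names are hypothetical; nb_start stands for the start of the next overlapping Books subscription (2023-04-18 in the sample data).

```sql
-- One Premium/Books pair from the sample data (user 675), plus the
-- start date of the next overlapping Books subscription.
select
    least(p_start, b_start)                    as d1,  -- 2023-01-01
    greatest(p_start, b_start)                 as d2,  -- 2023-02-15
    least(p_end, b_end)                        as d3,  -- 2023-02-28
    coalesce(nb_start, greatest(p_end, b_end)) as d4   -- 2023-04-18
from (values (date '2023-01-01', date '2023-05-10',
              date '2023-02-15', date '2023-02-28',
              date '2023-04-18'))
     as t(p_start, p_end, b_start, b_end, nb_start);
```

Here d1 to d2 is the Premium segment before Books, d2 to d3 is the Books subscription itself, and d3 to d4 is the Premium segment after it, cut short at the start of the next Books subscription.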

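The remaining logic just drives the assembly of all the pieces. Care is taken not to emit the same segment twice where "linking" occurs.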

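There is no recursion and only a single join, so I think this should be efficient. An appropriate index on the columns user_id, start_date, end_date would probably help.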
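
Such an index might be created as follows. This is a sketch: the index name is made up, and the table name subscriptions is taken from the question. Whether the planner actually uses the index depends on the data volume, so checking the plan with EXPLAIN is advisable.

```sql
-- Hypothetical index name; the column order matches the join and overlap predicates.
create index subscriptions_user_dates_idx
    on subscriptions (user_id, start_date, end_date);
```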

with data as (
    select
        s.user_id, s.subscription_type, s.start_date, s.end_date,
        b.start_date as b_start_date, b.end_date as b_end_date,
        case when b.subscription_type is not null
             then lag(b.end_date)
                      over (partition by s.user_id, s.start_date order by b.start_date) end as pb_end_date,
        case when b.subscription_type is not null
             then lead(b.start_date)
                      over (partition by s.user_id, s.start_date order by b.start_date) end as nb_start_date,
        case when s.start_date < b.start_date then 'p'
             when s.start_date > b.start_date then 'b' else '' end ||
        case when s.end_date   < b.end_date   then 'b'
             when s.end_date   > b.end_date   then 'p' else '' end as overlap
    from subscriptions s left outer join subscriptions as b
        on      s.subscription_type = 'Premium' and b.subscription_type = 'Books'
            and s.user_id = b.user_id
            and s.start_date < b.end_date and s.end_date > b.start_date
), data2 as (
    select
        user_id, subscription_type, overlap,
        case when pb_end_date is not null then 'Y' else 'N' end as linked,
        least(start_date, b_start_date) as d1,
        greatest(start_date, b_start_date) as d2,
        least(end_date, b_end_date) as d3,
        coalesce(nb_start_date, greatest(end_date, b_end_date)) as d4,
        seg_num
    from data left outer join lateral (
            select * from (values (1), (2)) as v(seg_num)
            where subscription_type = 'Premium' and (seg_num = 1 or overlap = 'pp')
        ) as s(seg_num) on 1 = 1
    where (subscription_type = 'Books' or overlap <> 'bb')
)
select
    user_id, subscription_type,
    case when seg_num is null then d1
         when seg_num = 1 then case when overlap = 'bp' then d3 else d1 end
         when seg_num = 2 then d3 end as start_date,
    case when seg_num is null then d3
         when seg_num = 1 then case when overlap = 'bp' then d4 else d2 end
         when seg_num = 2 then d4 end as end_date
from data2
where (linked = 'N' or seg_num = 2)
order by user_id, start_date, end_date;

https://dbfiddle.uk/chT-SLiC

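I originally thought that using an outer join would be cleaner than a union and would produce a better plan. In practice, even though the two queries are essentially equivalent, it turns out to be better to go the other way.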

https://dbfiddle.uk/raMy-wOL
