Sql 在当前行之后和之前创建附加日期并基于它创建新列

发布于01月23日

假设我有这样的数据

create table example
(cust_id VARCHAR, product VARCHAR, price float, datetime varchar);

insert into example (cust_id, product, price, datetime)
VALUES
('1', 'scooter', 2000, '2022-01-10'),
('1', 'skateboard', 1500, '2022-01-20'),
('1', 'beefmeat', 300, '2022-06-08'),
('2', 'wallet', 200, '2022-02-25'),
('2', 'hairdryer', 250, '2022-04-28'),
('3', 'skateboard', 1600, '2022-03-29')

我想要添加一些行，然后根据这些添加的行创建新的列

我的预期yields 会是这样的

cust_id	total_price	date	is_active
1	3500	2022-01	active
1	0	2022-02	active
1	0	2022-03	active
1	0	2022-04	inactive
1	0	2022-05	inactive
1	300	2022-06	active
1	0	2022-07	active
2	0	2022-01	inactive
2	200	2022-02	active
2	0	2022-03	active
2	250	2022-04	active
2	0	2022-05	active
2	0	2022-06	active
2	0	2022-07	inactive
3	0	2022-01	inactive
3	0	2022-02	inactive
3	1600	2022-03	active
3	0	2022-04	active
3	0	2022-05	active
3	0	2022-06	inactive
3	0	2022-07	inactive

规则是这样的

the first month when the customer make transaction is called active, before this transaction called inactive.
ex: first transaction in month 2, then month 2 is active, month 1 is inactive (look cust_id 2 and 3)
if more than 2 months there isnt transaction, the next month is inactive until there is new transaction is active.
ex: if last transaction in month 1, then month 2 and month 3 is inactive, and month 4, month 5 inactive if month 6 there is new transaction (look cust_id 1 and 3)

嗯，我的第一个 idea 是使用这个代码，但我不知道下一步是什么

select *, 
    date_part('month', age(to_date(date, 'YYYY-MM'), to_date(lag(date) over (partition by cust_id order by date),'YYYY-MM')))date_diff
from(
    select 
        cust_id,
        sum(price)total_price,
        to_char(to_date(datetime, 'YYYY-MM-DD'),'YYYY-MM')date
    from example
    group BY
        cust_id,
        date
    order by 
    cust_id, 
    date)test

我愿意接受任何建议

/* use generate_series to generate a series of dates starting from the min date of datetime up to the max datetime with one-month intervals, then do a cross join with the distinct cust_id to map each cust_id to each generated date.*/ WITH cust_dates AS ( SELECT EX.cust_id, to_char(dts, 'YYYY-mm') dts FROM generate_series ( (SELECT MIN(datetime)::timestamp FROM example), (SELECT MAX(datetime)::timestamp + '2 month'::interval FROM example), '1 month'::interval ) dts CROSS JOIN (SELECT DISTINCT cust_id FROM example) EX ), /* do a left join with your table to find prices for each cust_id/ month, and aggregate for cust_id, month_date to find the sum of prices for each cust_id, month_date. */ monthly_price AS ( SELECT CD.cust_id, CD.dts AS month_date, COALESCE(SUM(price), 0) total_price FROM cust_dates CD LEFT JOIN example EX ON CD.cust_id = EX.cust_id AND CD.dts = to_char(EX.datetime, 'YYYY-mm') GROUP BY CD.cust_id, CD.dts ) /* Now, we have the sum of monthly prices for each cust_id, we can use the max window function with "ROWS BETWEEN 2 PRECEDING AND CURRENT ROW" to check if one of the (current month or the previous two months) has a sum of prices > 0. */ SELECT cust_id, month_date, total_price, CASE MAX(total_price) OVER (PARTITION BY cust_id ORDER BY month_date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) WHEN 0 THEN 'inactive' ELSE 'active' END AS is_active FROM monthly_price ORDER BY cust_id, month_date

Sql 在当前行之后和之前创建附加日期并基于它创建新列

推荐答案

Sql相关问答推荐

判断时间之间是否有时间

LAG函数通过丢弃空值返回前一行

如何在幂函数中正确使用Power()和Exp()

R中对Arrow duckdb工作流的SQL查询

替换条件中的单元格值

编写一个SQL查询来返回收视率较高的类型，并取这些收视率的平均值，我该如何做呢？

PATINDEX中与[A-Z]匹配(U除外)的正则表达式

在Athena中使用regexp提取括号前的字符串值

PostgreSQL：查找继承表中的特定记录属于哪个表

在VB.NET中如何在MS Access数据库中创建SQL项目历史库存卡

Select 最频繁的值以及分组依据

Netezza SQL：判断两个表是否相同

如何根据 SQL 中的阈值标记一个集群中的所有值？

使用 XML 作为 SQL 表

如何在 case 语句中使用聚合？

在自引用表中使用分组和计数的SQL查询语句

在Snowflake中如何使用SQL对版本字符串进行排序？

如何将输出转换为二维格式？

如何从 2 个 SQLite 表构建嵌套对象？

根据条件列出不同的值