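SQL: count values per day based on the previous day's maximum

Posted on July 30

A PostgreSQL database has a table that stores the words in a book together with the pages on which each word appears. It looks roughly like this: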

 headword | dict_pages       |   timestamp_updated    
----------+------------------+------------------------
 abcdefg  | {229}            | 2023-07-28 14:49:13+00
 ccdsd    | {213}            | 2023-07-28 18:48:11+00
 zdx      | {228}            | 2023-07-27 18:37:42+00
 xcdferc  | {227, 228}       | 2023-07-27 14:47:55+00
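To try the queries in this post locally, the sample rows can be loaded into a scratch table. This is only a minimal sketch: the question does not state the column types, so `text`, `integer[]`, and `timestamptz` are assumptions, and a temporary table is used so nothing permanent is created.

```sql
-- Assumed schema: the column types are guesses, not taken from the question.
CREATE TEMP TABLE dictionary_word (
    headword          text PRIMARY KEY,
    dict_pages        integer[],
    timestamp_updated timestamptz
);

-- The four sample rows shown in the table above.
INSERT INTO dictionary_word (headword, dict_pages, timestamp_updated) VALUES
    ('abcdefg', '{229}',      '2023-07-28 14:49:13+00'),
    ('ccdsd',   '{213}',      '2023-07-28 18:48:11+00'),
    ('zdx',     '{228}',      '2023-07-27 18:37:42+00'),
    ('xcdferc', '{227, 228}', '2023-07-27 14:47:55+00');
```

Because `to_char` on a `timestamptz` formats the value in the session time zone, running `SET TIME ZONE 'UTC';` first keeps the day boundaries aligned with the `+00` timestamps shown here.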

I want to know how many pages were processed on each day. This is what I do now:

SELECT
    to_char(timestamp_updated, 'YYYY-MM-DD') as dt,
    count(distinct(page_num))
FROM dictionary_word dtw, unnest(dict_pages) page_num
WHERE dict_pages IS NOT NULL
GROUP BY dt
ORDER BY dt ASC

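My goal is to know how many new pages were processed. Old pages that had already been processed earlier are irrelevant to this statistic.

With my current query, the example above gives the following result: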

2023-07-27 - 2
2023-07-28 - 2

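However, the word ccdsd was added days/weeks/months ago, so page 213 should be excluded from the count of processed pages. I think a solution might be to take the maximum page number from the previous day (2023-07-27), which is 228, and then count only the current day's pages that are greater than 228.

Is there an elegant way to do this in SQL?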

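Recommended answer

select
    day_date,
    count(distinct(page)) as page_count
from (
    select
        to_char(timestamp_updated, 'YYYY-MM-DD') as day_date,
        page,
        -- first_value() returns the first row of the frame, which spans the
        -- previous day's rows (if any) and the current day's rows; the order
        -- of rows within one day is not specified
        first_value(page) over (
            order by to_char(timestamp_updated, 'YYYY-MM-DD')
            groups between 1 preceding and current row
        ) as prev_day_max_page
    from dictionary_word, unnest(dict_pages) page
) prev_day_page_data
where page >= prev_day_max_page
group by day_date
order by day_date;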

with all_pages as (
    select
        to_char(timestamp_updated, 'YYYY-MM-DD') as day_date,
        page
    from dictionary_word dtw, unnest(dict_pages) page
)
select
    day_date,
    count(distinct(page)) as pages_count
from all_pages
join (
    select
        day_date,
        -- previous day's maximum page; the first day falls back to its own maximum
        coalesce(
            lag(daily_max_page) over (order by day_date),
            daily_max_page
        ) as prev_max_page
    from (
        select
            to_char(timestamp_updated, 'YYYY-MM-DD') as day_date,
            max(page::integer) as daily_max_page
        from dictionary_word dtw, unnest(dict_pages) page
        group by day_date
        -- order by day_date
    ) s
) prev_day_max_page_stat using (day_date)
where page::integer >= prev_max_page
group by day_date
order by day_date;

  day_date  | pages_count
------------+-------------
 2023-07-26 |           1
 2023-07-27 |           2
 2023-07-28 |           1
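To see how the `lag()`-based query above arrives at its threshold, it can help to run its inner part on its own. The following is a sketch of that intermediate step only; the alias `daily_stat` is introduced here for illustration, and the table and column names are the ones used in the question.

```sql
-- Each day's maximum page next to the previous day's maximum.
SELECT
    day_date,
    daily_max_page,
    -- The first day has no predecessor, so lag() yields NULL there and
    -- coalesce() falls back to that day's own maximum.
    coalesce(
        lag(daily_max_page) OVER (ORDER BY day_date),
        daily_max_page
    ) AS prev_max_page
FROM (
    SELECT
        to_char(timestamp_updated, 'YYYY-MM-DD') AS day_date,
        max(page::integer) AS daily_max_page
    FROM dictionary_word, unnest(dict_pages) AS page
    GROUP BY day_date
) AS daily_stat
ORDER BY day_date;
```

For the four sample rows in the question (with the session time zone set to UTC), this should report a daily maximum of 228 for 2023-07-27 and 229 for 2023-07-28, with `prev_max_page` equal to 228 on both days. The outer query then keeps only pages `>=` that threshold, so page 213 on 2023-07-28 is dropped. Note that the comparison is `>=` rather than the strict `>` described in the question; with a strict comparison the first day would always count zero pages because of the `coalesce` fallback.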
