I'm trying to group task counts by state across a series of dates, using the following tables:

tasks
-----
| id | title       | state_id | inserted_at         |
| -- | ----------- | -------- | ------------------- |
| 1  | First Task  | 1        | 2022-05-05 19:16:44 |
| 2  | Second Task | 1        | 2022-05-07 18:54:40 |
| 3  | Third Task  | 1        | 2022-05-07 19:18:28 |
| 4  | Fourth Task | 1        | 2022-05-10 15:28:57 |

task_states
-----
| id | label      |
| -- | ---------- |
| 1  | Assigns    |
| 2  | In Progress|
| 3  | Completed  |

task_logs
-----
| id | event   | target | value      | task_id | inserted_at        |
| -- | ------- | ------ | ---------- | ------- | -------------------|
| 1  | changed | state  | Assigns    | 1       | 2022-05-05 19:16:44|
| 2  | changed | state  | In Progress| 1       | 2022-05-06 11:43:14|
| 3  | changed | state  | Assigns    | 2       | 2022-05-07 18:54:40|
| 4  | changed | state  | Assigns    | 3       | 2022-05-07 19:18:28|
| 5  | changed | state  | Completed  | 1       | 2022-05-08 12:11:38|
| 6  | changed | state  | In Progress| 2       | 2022-05-09 09:22:53|
| 7  | changed | state  | Assigns    | 4       | 2022-05-10 15:28:57|
| 8  | changed | state  | Completed  | 2       | 2022-05-11 11:21:53|
| 9  | changed | state  | In Progress| 3       | 2022-05-11 17:42:02|

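There is no consistent daily "state" record for each task, because task_logs only has an entry when a task changes state. This means I have to fetch each task's last "state change" log on or before the given date. I use the following query to get the task count per state as of a single day: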

SELECT date('2022-05-10'), state.id as state_id, state.label, count(sub.id)
FROM (
   SELECT DISTINCT ON (t.id) t.id, logs.value
   FROM tasks t
   INNER JOIN task_logs logs ON logs.task_id = t.id
   WHERE date(logs.inserted_at) <= date('2022-05-10') AND logs.target = 'state'
   ORDER BY t.id, logs.inserted_at DESC
) sub
RIGHT JOIN task_states state ON state.label = sub.value
GROUP BY state.id
ORDER BY state.id;
------------------
| date       | state_id | label      | count |
| ---------- | -------- | ---------- | ----- |
| 2022-05-10 | 1        | Assigns    | 2     |
| 2022-05-10 | 2        | In Progress| 1     |
| 2022-05-10 | 3        | Completed  | 1     |

My problem is combining the query above with generate_series to get daily counts over a range of dates, e.g.:

| date       | state_id | label       | count |
| ---------- | -------- | ----------- | ----- |
| 2022-05-05 | 1        | Assigns     | 1     |
| 2022-05-05 | 2        | In Progress | 0     |
| 2022-05-05 | 3        | Completed   | 0     |
| 2022-05-06 | 1        | Assigns     | 0     |
| 2022-05-06 | 2        | In Progress | 1     |
| 2022-05-06 | 3        | Completed   | 0     |
| 2022-05-07 | 1        | Assigns     | 2     |
| 2022-05-07 | 2        | In Progress | 1     |
| 2022-05-07 | 3        | Completed   | 0     |
| 2022-05-08 | 1        | Assigns     | 2     |
| 2022-05-08 | 2        | In Progress | 0     |
| 2022-05-08 | 3        | Completed   | 1     |
| 2022-05-09 | 1        | Assigns     | 1     |
| 2022-05-09 | 2        | In Progress | 1     |
| 2022-05-09 | 3        | Completed   | 1     |
| 2022-05-10 | 1        | Assigns     | 2     |
| 2022-05-10 | 2        | In Progress | 1     |
| 2022-05-10 | 3        | Completed   | 1     |
| 2022-05-11 | 1        | Assigns     | 1     |
| 2022-05-11 | 2        | In Progress | 1     |
| 2022-05-11 | 3        | Completed   | 2     |

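Here is a dbfiddle set up with the tables above. Any ideas on how to run the above query (or a rewrite of it) for each date in a series of dates (generate_series(current_date - interval '5 day', current_date, '1 day')) would be greatly appreciated!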

Recommended answer

Consider using a stored function that loops over the generated series of dates and captures a snapshot of each daily aggregate:

CREATE OR REPLACE FUNCTION build_daily_log_agg(_interval_days TEXT)
  RETURNS TABLE ("date" TEXT,
                 state_id INTEGER, 
                 state_label TEXT,
                 "count" INTEGER)
  LANGUAGE plpgsql AS
$func$

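DECLARE dt RECORD;
BEGIN 

    -- Drop any leftover copy first; otherwise a second call in the same
    -- session fails because the temporary table already exists.
    DROP TABLE IF EXISTS daily_log_agg;

    -- Scratch table that accumulates one snapshot per day.
    CREATE TEMPORARY TABLE daily_log_agg (
        "date" TEXT, 
        state_id INTEGER, 
        state_label TEXT, 
        "count" INTEGER
    );
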
    FOR dt IN SELECT dates FROM generate_series( 
            current_date - _interval_days::interval,
            current_date, '1 day' 
        ) AS dates LOOP
        
        INSERT INTO daily_log_agg ("date", state_id, state_label, "count")
        SELECT dt.dates AS "date",
               state.id AS state_id, 
               state.label, 
               COUNT(sub.id) AS "count"
        FROM (
            SELECT DISTINCT ON (t.id) t.id, logs.value
            FROM tasks t
            INNER JOIN task_logs logs ON logs.task_id = t.id
            WHERE date(logs.inserted_at) <= dt.dates
              AND logs.target = 'state'
            ORDER BY t.id, logs.inserted_at DESC
        ) sub
        RIGHT JOIN task_states state ON state.label = sub.value
        GROUP BY state.id
        ORDER BY state.id;
        
   END LOOP; 
   
   RETURN QUERY
   SELECT * FROM daily_log_agg;
END
$func$;


SELECT * FROM build_daily_log_agg('12 days');
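
As a set-based alternative to the loop, the same daily snapshot can probably be computed in a single query by cross joining the date series with task_states and looking up each task's latest state log in a LATERAL subquery. This is only a sketch, not tested against the fiddle. It assumes that inserted_at is a timestamp column, that every task_logs row refers to an existing task, and that the labels in task_states match the value strings written to task_logs. The date range is hard-coded for illustration, and the aliases d, day and latest are made up for this example.

```sql
-- One row per (day, state); states with no tasks on a given day get count 0.
SELECT d.day::date           AS "date",
       state.id              AS state_id,
       state.label           AS state_label,
       count(latest.task_id) AS "count"
FROM generate_series(timestamp '2022-05-05',
                     timestamp '2022-05-11',
                     interval '1 day') AS d(day)
CROSS JOIN task_states AS state
LEFT JOIN LATERAL (
    -- Latest state-change log per task, up to the end of this day.
    SELECT DISTINCT ON (logs.task_id) logs.task_id, logs.value
    FROM task_logs AS logs
    WHERE logs.target = 'state'
      AND logs.inserted_at < d.day + interval '1 day'
    ORDER BY logs.task_id, logs.inserted_at DESC
) AS latest ON latest.value = state.label
GROUP BY d.day, state.id, state.label
ORDER BY d.day, state.id;
```

Comparing inserted_at against d.day + interval '1 day', rather than wrapping it in date(...), keeps the predicate usable by an index on inserted_at if one exists. The subquery is re-evaluated for every (day, state) pair, which should be fine for small tables but may be worth restructuring for large ones.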

