I have a changes table like this:

CREATE TABLE IF NOT EXISTS changes (
    entity_id TEXT NOT NULL,
    column_id TEXT NOT NULL,
    value JSONB NOT NULL,
    updated_at TIMESTAMP NOT NULL
);
INSERT INTO changes VALUES ('1', 'height', to_jsonb(140), '01-01-2021 00:00:00'::TIMESTAMP);
INSERT INTO changes VALUES ('1', 'weight', to_jsonb(30), '01-01-2021 00:00:00'::TIMESTAMP);
INSERT INTO changes VALUES ('1', 'height', to_jsonb(145), '01-02-2021 00:00:00'::TIMESTAMP);
INSERT INTO changes VALUES ('1', 'weight', to_jsonb(34),'01-03-2021 00:00:00'::TIMESTAMP);


 entity_id | column_id | value |     updated_at      
-----------+-----------+-------+---------------------
 1         | height    | 140   | 2021-01-01 00:00:00
 1         | weight    | 30    | 2021-01-01 00:00:00
 1         | height    | 145   | 2021-01-02 00:00:00
 1         | weight    | 34    | 2021-01-03 00:00:00

I want to get a cumulative view of this table:

 entity_id | height | weight |     updated_at      
-----------+--------+--------+---------------------
 1         | 140    | 30     | 2021-01-01 00:00:00
 1         | 145    | 30     | 2021-01-02 00:00:00
 1         | 145    | 34     | 2021-01-03 00:00:00

My current query appears to work:

SELECT
  entity_id,
  coalesce(change->'height', lag(change->'height', 1, null) over (partition by entity_id order by updated_at)) as height,
  coalesce(change->'weight', lag(change->'weight', 1, null) over (partition by entity_id order by updated_at)) as weight,
  updated_at
FROM (
    SELECT entity_id, json_object_agg(column_id, value) as change, updated_at FROM changes
    GROUP BY entity_id, updated_at
) as changes;

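But I don't like the json_object_agg here. I believe there is a way to do this without the redundant aggregation? I am probably missing some way to use window aggregate functions.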

UPD. @SelVazi helped make the query better, but I feel this is not the final solution.

with cte as (
  SELECT
    entity_id,
    max(case when column_id = 'height' then value::int end) as height,
    max(case when column_id = 'weight' then value::int end) as weight,
    updated_at
  from changes
  GROUP by entity_id, updated_at
)
select
  entity_id,
  coalesce(height, lag(height) over (partition by entity_id order by updated_at)) as height,
  coalesce(weight, lag(weight) over (partition by entity_id order by updated_at)) as weight,
  updated_at
from cte;

Recommended answer

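This is more complicated than it looks. We can use conditional aggregation to pivot height and weight into columns, but then we have to fill in the "missing" values.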

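I assume that either measure may have gaps spanning more than one date. That makes lag() the wrong tool in Postgres, since it can only look back a predefined number of rows (and cannot ignore null values).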

We can demonstrate the problem with lag() by adding just one row at the end of the sample data:

 entity_id | column_id | value |     updated_at      
-----------+-----------+-------+---------------------
 1         | height    | 140   | 2021-01-01 00:00:00
 1         | weight    | 30    | 2021-01-01 00:00:00
 1         | height    | 145   | 2021-02-01 00:00:00
 1         | weight    | 34    | 2021-03-01 00:00:00
 1         | weight    | 140   | 2021-04-01 00:00:00
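
To make the failure concrete, here is a minimal sketch that runs the lag() approach over the five rows above. The names sample_changes and pivoted are hypothetical; sample_changes is only an inline stand-in for the changes table so the query runs on its own.

```sql
-- sample_changes: hypothetical inline copy of the five sample rows
with sample_changes (entity_id, column_id, value, updated_at) as (
    values
        ('1', 'height', to_jsonb(140), timestamp '2021-01-01 00:00:00'),
        ('1', 'weight', to_jsonb(30),  timestamp '2021-01-01 00:00:00'),
        ('1', 'height', to_jsonb(145), timestamp '2021-02-01 00:00:00'),
        ('1', 'weight', to_jsonb(34),  timestamp '2021-03-01 00:00:00'),
        ('1', 'weight', to_jsonb(140), timestamp '2021-04-01 00:00:00')
),
-- pivoted: one row per entity and timestamp, null where a column did not change
pivoted as (
    select entity_id, updated_at,
        max(value::text) filter (where column_id = 'height') as height,
        max(value::text) filter (where column_id = 'weight') as weight
    from sample_changes
    group by entity_id, updated_at
)
select entity_id, updated_at,
    -- lag() looks back exactly one row, whether or not that row holds a value
    coalesce(height, lag(height) over (partition by entity_id order by updated_at)) as height,
    coalesce(weight, lag(weight) over (partition by entity_id order by updated_at)) as weight
from pivoted
order by entity_id, updated_at;
```

The row for 2021-04-01 should come back with a null height: height is null on both 2021-03-01 and 2021-04-01, so looking back a single row finds another null instead of 145.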

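One workaround is the gaps-and-islands technique: put each run of "missing" values into a group that starts with a non-null value, and that value then becomes the value for every row in the group.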

select entity_id, updated_at,
    -- each group holds at most one non-null value (on its first row), so max() just picks it up
    max(height) over(partition by entity_id, grp_height) height,
    max(weight) over(partition by entity_id, grp_weight) weight
from (
    select c.*,
        -- running count of non-null values: increments on each change, stays flat across gaps
        count(height) over (partition by entity_id order by updated_at) grp_height,
        count(weight) over (partition by entity_id order by updated_at) grp_weight
    from (
        -- pivot: one row per entity and timestamp, null where a column did not change
        select entity_id, updated_at,
            max(value::text) filter(where column_id = 'height') height,
            max(value::text) filter(where column_id = 'weight') weight
        from changes
        group by entity_id, updated_at
    ) c
) c
order by entity_id, updated_at;


 entity_id |     updated_at      | height | weight
-----------+---------------------+--------+--------
 1         | 2021-01-01 00:00:00 | 140    | 30
 1         | 2021-02-01 00:00:00 | 145    | 30
 1         | 2021-03-01 00:00:00 | 145    | 34
 1         | 2021-04-01 00:00:00 | 145    | 140
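
To see why the grouping works, it can help to look at the intermediate group counters on their own. The sketch below is an illustration under the same assumptions as the sample data with the extra row; sample_changes and pivoted are hypothetical names, and sample_changes is an inline stand-in for the changes table.

```sql
-- sample_changes: hypothetical inline copy of the five sample rows
with sample_changes (entity_id, column_id, value, updated_at) as (
    values
        ('1', 'height', to_jsonb(140), timestamp '2021-01-01 00:00:00'),
        ('1', 'weight', to_jsonb(30),  timestamp '2021-01-01 00:00:00'),
        ('1', 'height', to_jsonb(145), timestamp '2021-02-01 00:00:00'),
        ('1', 'weight', to_jsonb(34),  timestamp '2021-03-01 00:00:00'),
        ('1', 'weight', to_jsonb(140), timestamp '2021-04-01 00:00:00')
),
-- pivoted: one row per entity and timestamp, null where a column did not change
pivoted as (
    select entity_id, updated_at,
        max(value::text) filter (where column_id = 'height') as height,
        max(value::text) filter (where column_id = 'weight') as weight
    from sample_changes
    group by entity_id, updated_at
)
select entity_id, updated_at, height, weight,
    -- count() skips nulls, so the counter only moves when a new value arrives
    count(height) over (partition by entity_id order by updated_at) as grp_height,
    count(weight) over (partition by entity_id order by updated_at) as grp_weight
from pivoted
order by entity_id, updated_at;
```

On this data grp_height should read 1, 2, 2, 2 and grp_weight 1, 1, 2, 3. The last three rows share grp_height = 2, and the only non-null height among them is 145, which is what max(height) over (partition by entity_id, grp_height) then spreads across the group.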
