Is there a way to add an extra GROUP BY to the toolkit_experimental.interpolated_average function? Say my data has power measurements from different sensors; how can I add a grouping on sensor_id?

with s as (
  select sensor_id,
    time_bucket('30 minutes', timestamp) bucket,
    time_weight('LOCF', timestamp, value) agg
  from
    measurements m
    inner join sensor_definition sd on m.sensor_id = sd.id
  where asset_id = '<battery_id>' and sensor_name = 'power' and
    timestamp between '2023-01-05 23:30:00' and '2023-01-07 00:30:00'
  group by sensor_id, bucket)
select sensor_id,
  bucket,
  toolkit_experimental.interpolated_average(
      agg,
      bucket,
      '30 minutes'::interval,
      lag(agg) over (order by bucket),
      lead(agg) over (order by bucket)
    )
from s
group by sensor_id;

The query above doesn't work, because I would also need to add bucket and agg as GROUP BY columns.

You can find the relevant schema below.

create table measurements
(
    sensor_id uuid                     not null,
    timestamp timestamp with time zone not null,
    value     double precision         not null
);

create table sensor_definition
(
    id          uuid default uuid_generate_v4() not null
        primary key,
    asset_id    uuid                            not null,
    sensor_name varchar(256)                    not null,
    sensor_type varchar(256)                    not null,
    unique (asset_id, sensor_name, sensor_type)
);

Any suggestions?

Recommended answer

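This is a great question and a cool use case. There's definitely a way to do this! I like the CTE you have at the top, though I prefer to name them a bit more descriptively. The join looks fine for the selection, and at some point in the future you could quite easily swap the "on-the-fly" aggregation for a continuous aggregate and do the same join against the continuous aggregate... so that's great!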

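The only thing you need to do is modify the window clause of the lead and lag functions so that they know they are not working over the whole ordered data set, but over one sensor's rows at a time (PARTITION BY sensor_id). Once you do that, you don't need a GROUP BY clause in the outer query at all!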

WITH weighted_sensor AS (
  SELECT 
    sensor_id,
    time_bucket('30 minutes', timestamp) bucket,
    time_weight('LOCF', timestamp, value) agg 
  FROM
    measurements m
    INNER JOIN sensor_definition sd ON m.sensor_id = sd.id
  WHERE asset_id = '<battery_id>' AND sensor_name = 'power' AND
    timestamp BETWEEN '2023-01-05 23:30:00' AND '2023-01-07 00:30:00'
  GROUP BY sensor_id, bucket)
SELECT 
  sensor_id,
  bucket,
  toolkit_experimental.interpolated_average(
      agg,
      bucket,
      '30 minutes'::interval,
      lag(agg) OVER (PARTITION BY sensor_id ORDER BY bucket),
      lead(agg) OVER (PARTITION BY sensor_id ORDER BY bucket)
    )
FROM weighted_sensor;
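
To see what the PARTITION BY changes, here is a minimal sketch in plain PostgreSQL, with no Toolkit functions involved. The sensor names 'a' and 'b' and all the values are made up for illustration, and I've added sensor_id as a tie-breaker in the unpartitioned window only so that the result is deterministic:

```sql
-- Toy data: two hypothetical sensors, two buckets each (values are made up).
WITH readings (sensor_id, bucket, val) AS (
  VALUES
    ('a', 1, 10),
    ('a', 2, 20),
    ('b', 1, 100),
    ('b', 2, 200)
)
SELECT
  sensor_id,
  bucket,
  val,
  -- No PARTITION BY: rows from both sensors share one ordered window,
  -- so the "previous" row can belong to a different sensor.
  lag(val) OVER (ORDER BY bucket, sensor_id) AS prev_any_sensor,
  -- PARTITION BY: each sensor gets its own ordered window.
  lag(val) OVER (PARTITION BY sensor_id ORDER BY bucket) AS prev_same_sensor
FROM readings
ORDER BY sensor_id, bucket;
```

For sensor 'a' in bucket 2, prev_any_sensor is 100 (a reading from sensor 'b'), while prev_same_sensor is 10, the sensor's own previous value. The partitioned form is what you want feeding into interpolated_average, so that each bucket is interpolated against its own sensor's neighbours.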

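You can also pull the window definition out into a separate WINDOW clause in the query and give it a name, which is especially helpful if you use it more than once. So if you also wanted to use the integral function, for instance to get total energy utilization over a time period, you could do something like this: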

WITH weighted_sensor AS (
  SELECT 
    sensor_id,
    time_bucket('30 minutes', timestamp) bucket,
    time_weight('LOCF', timestamp, value) agg 
  FROM
    measurements m
    INNER JOIN sensor_definition sd ON m.sensor_id = sd.id
  WHERE asset_id = '<battery_id>' AND sensor_name = 'power' AND
    timestamp BETWEEN '2023-01-05 23:30:00' AND '2023-01-07 00:30:00'
  GROUP BY sensor_id, bucket)
SELECT 
  sensor_id,
  bucket,
  toolkit_experimental.interpolated_average(
      agg,
      bucket,
      '30 minutes'::interval,
      lag(agg) OVER sensor_times,
      lead(agg) OVER sensor_times
    ),
  toolkit_experimental.interpolated_integral(
      agg,
      bucket,
      '30 minutes'::interval,
      lag(agg) OVER sensor_times,
      lead(agg) OVER sensor_times,
      'hours'
    )
FROM weighted_sensor
WINDOW sensor_times AS (PARTITION BY sensor_id ORDER BY bucket);

I used hours as the unit because I figure energy is usually measured in watt-hours or something similar.
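
As a quick sanity check on the unit, here is the arithmetic for a made-up case: a constant 100 W reading held across one full 30-minute bucket. This is just plain SQL arithmetic under that assumption, not a call to the Toolkit function:

```sql
-- Hypothetical case: a constant 100 W held for one 30-minute bucket.
SELECT
  100 * 1800            AS watt_seconds,  -- 30 minutes = 1800 s
  100 * (1800 / 3600.0) AS watt_hours;    -- 30 minutes = 0.5 h
```

That works out to 180000 watt-seconds, or 50 watt-hours, for the same bucket, which is why picking the time unit matters when you read the integral as energy.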
