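SQL: select rows with a jsonb column that have different values in one jsonb property and matching values in another jsonb property

Posted on June 23

I have a table notifications with a payload column of type jsonb, and that column has a GIN index. The table currently contains 2,742,691 rows.

The table looks like this: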

id	payload	created_at
1	{"customer": {"email": "foo@example.com", "externalId": 111}}	2022-06-21
2	{"customer": {"email": "foo@example.com", "externalId": 222}}	2022-06-20
3	{"customer": {"email": "bar@example.com", "externalId": 333}}	2022-06-20
4	{"customer": {"email": "baz@example.com", "externalId": 444}}	2022-04-14
5	{"customer": {"email": "baz@example.com", "externalId": 555}}	2022-04-12
6	{"customer": {"email": "gna@example.com", "externalId": 666}}	2022-06-10
7	{"customer": {"email": "gna@example.com", "externalId": 666}}	2022-06-11
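
To reproduce the sample, a minimal setup sketch might look like the following. The column types and the name of the GIN index are assumptions, since the real schema is not shown; only the table name, the column names, and index_notifications_created_at come from the question.

```sql
-- Assumed schema: the column types are guesses, only the names are given.
CREATE TABLE notifications (
  id         bigint PRIMARY KEY,
  payload    jsonb NOT NULL,
  created_at timestamptz NOT NULL
);

-- GIN index on payload; the index name here is hypothetical.
CREATE INDEX index_notifications_payload ON notifications USING gin (payload);
-- B-tree index on created_at, named as in the query plan.
CREATE INDEX index_notifications_created_at ON notifications (created_at);

-- The seven sample rows from the table above.
INSERT INTO notifications (id, payload, created_at) VALUES
  (1, '{"customer": {"email": "foo@example.com", "externalId": 111}}', '2022-06-21'),
  (2, '{"customer": {"email": "foo@example.com", "externalId": 222}}', '2022-06-20'),
  (3, '{"customer": {"email": "bar@example.com", "externalId": 333}}', '2022-06-20'),
  (4, '{"customer": {"email": "baz@example.com", "externalId": 444}}', '2022-04-14'),
  (5, '{"customer": {"email": "baz@example.com", "externalId": 555}}', '2022-04-12'),
  (6, '{"customer": {"email": "gna@example.com", "externalId": 666}}', '2022-06-10'),
  (7, '{"customer": {"email": "gna@example.com", "externalId": 666}}', '2022-06-11');
```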

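I am trying to query a list of email addresses that meet the following criteria:

there are multiple rows with the same email address
one of those rows has an externalId that differs from the externalId of the previous row
created_at is within the last month

For the sample table contents, this should return only foo@example.com, because

bar@example.com appears only once
baz@example.com has no rows created within the last month
gna@example.com has multiple rows, but all of them have the same externalId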
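
The criteria above can also be written down as a plain aggregation. This is only a sketch of the intended result, under the assumption that "differs from the previous row" can be relaxed to "more than one distinct externalId among the rows from the last month"; it has not been tested against the real table.

```sql
-- Sketch: emails with more than one distinct externalId
-- among the rows created within the last month.
SELECT
  n.payload -> 'customer' ->> 'email' AS email
FROM
  notifications n
WHERE
  n.created_at > now() - interval '1 month'
GROUP BY
  n.payload -> 'customer' ->> 'email'
HAVING
  -- A single row, or several rows sharing one externalId, is filtered out here.
  count(DISTINCT n.payload -> 'customer' ->> 'externalId') > 1;
```

Run shortly after the dates in the sample rows, this would be expected to keep foo@example.com and drop the other three addresses for the reasons listed above.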

I tried using a LEFT JOIN LATERAL like this:

select
  n.payload -> 'customer' -> 'email'
from
  notifications n
  left join lateral (
    select
      n2.payload -> 'customer' ->> 'externalId' tid
    from
      notifications n2
    where
      n2.payload @> jsonb_build_object(
        'customer',
        jsonb_build_object('email', n.payload -> 'customer' -> 'email')
      )
      and not (n2.payload @> jsonb_build_object(
        'customer',
        jsonb_build_object('externalId', n.payload -> 'customer' -> 'externalId')
      ))
      and n2.created_at > NOW() - INTERVAL '1 month' 
    limit
      1
  ) sub on true
where
  n.created_at > NOW() - INTERVAL '1 month'
  and sub.tid is not null;

However, this takes a very long time to complete. The query plan for it looks like this: https://explain.depesz.com/s/mriB

QUERY PLAN
Nested Loop  (cost=0.17..53386349.38 rows=259398 width=32)
  ->  Index Scan using index_notifications_created_at on notifications n  (cost=0.09..51931.08 rows=259398 width=514)
        Index Cond: (created_at > (now() - '1 mon'::interval))
  ->  Subquery Scan on sub  (cost=0.09..205.60 rows=1 width=0)
        Filter: (sub.tid IS NOT NULL)
        ->  Limit  (cost=0.09..205.60 rows=1 width=32)
              ->  Index Scan using index_notifications_created_at on notifications n2  (cost=0.09..53228.33 rows=259 width=32)
                    Index Cond: (created_at > (now() - '1 mon'::interval))
                    Filter: ((payload @> jsonb_build_object('customer', jsonb_build_object('email', ((n.payload -> 'customer'::text) -> 'email'::text)))) AND (NOT (payload @> jsonb_build_object('customer', jsonb_build_object('externalId', ((n.payload -> 'customer'::text) -> 'externalId'::text))))))
JIT:
  Functions: 13
  Options: Inlining true, Optimization true, Expressions true, Deforming true

Any hints on what I am doing wrong here, or how I could optimize this?
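
For reference, the plan shows the lateral subquery being re-run for each of the estimated 259,398 outer rows, with the inner scan going through index_notifications_created_at and the two containment checks applied only as a filter. One direction that might avoid the per-row subquery is a single pass with the lag() window function, which compares each row to the previous row for the same email. This is an untested sketch, assuming "previous" means ordered by created_at; the aliases email, external_id and prev_external_id are made up for illustration.

```sql
-- Sketch: read last month's rows once and compare each row's
-- externalId with the previous row for the same email.
SELECT DISTINCT email
FROM (
  SELECT
    payload -> 'customer' ->> 'email'      AS email,
    payload -> 'customer' ->> 'externalId' AS external_id,
    lag(payload -> 'customer' ->> 'externalId') OVER (
      PARTITION BY payload -> 'customer' ->> 'email'
      ORDER BY created_at
    ) AS prev_external_id
  FROM notifications
  WHERE created_at > now() - interval '1 month'
) t
WHERE
  -- The first row per email has no previous row, so it is skipped.
  prev_external_id IS NOT NULL
  AND external_id <> prev_external_id;
```

Whether this is actually faster would need to be checked with EXPLAIN ANALYZE on the real data.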
