我有一个表notifications
,其中包含jsonb
类型的payload
列,该列上有一个gin索引.talbe当前包含2742691行
该表如下所示:
id | payload | created_at |
---|---|---|
1 | {"customer": {"email": "foo@example.com", "externalId": 111 } | 2022-06-21 |
2 | {"customer": {"email": "foo@example.com", "externalId": 222 } | 2022-06-20 |
3 | {"customer": {"email": "bar@example.com", "externalId": 333 } | 2022-06-20 |
4 | {"customer": {"email": "baz@example.com", "externalId": 444 } | 2022-04-14 |
5 | {"customer": {"email": "baz@example.com", "externalId": 555 } | 2022-04-12 |
6 | {"customer": {"email": "gna@example.com", "externalId": 666 } | 2022-06-10 |
7 | {"customer": {"email": "gna@example.com", "externalId": 666 } | 2022-06-11 |
我正在try 查询符合以下条件的邮箱地址列表:
- 存在相同
email
地址的多行 - 其中一行的
externalId
与前一行的externalId
不同 -
created_at
在上个月内
对于示例表内容,这应该只返回foo@example.com
,因为
-
bar@example.com
只出现一次 -
baz@example.com
没有上个月内创建的行 -
gna@example.com
有多行,但所有行都有相同的externalId
我试着用这样的LEFT JOIN LATERAL
:
select
n.payload -> 'customer' -> 'email'
from
notifications n
left join lateral (
select
n2.payload -> 'customer' ->> 'externalId' tid
from
notifications n2
where
n2.payload @> jsonb_build_object(
'customer',
jsonb_build_object('email', n.payload -> 'customer' -> 'email')
)
and not (n2.payload @> jsonb_build_object(
'customer',
jsonb_build_object('externalId', n.payload -> 'customer' -> 'externalId')
))
and n2.created_at > NOW() - INTERVAL '1 month'
limit
1
) sub on true
where
n.created_at > NOW() - INTERVAL '1 month'
and sub.tid is not null;
然而,这需要很长时间才能完成.此的查询计划看起来像https://explain.depesz.com/s/mriB
QUERY PLAN
Nested Loop (cost=0.17..53386349.38 rows=259398 width=32)
-> Index Scan using index_notifications_created_at on notifications n (cost=0.09..51931.08 rows=259398 width=514)
Index Cond: (created_at > (now() - '1 mon'::interval))
-> Subquery Scan on sub (cost=0.09..205.60 rows=1 width=0)
Filter: (sub.tid IS NOT NULL)
-> Limit (cost=0.09..205.60 rows=1 width=32)
-> Index Scan using index_notifications_created_at on notifications n2 (cost=0.09..53228.33 rows=259 width=32)
Index Cond: (created_at > (now() - '1 mon'::interval))
Filter: ((payload @> jsonb_build_object('customer', jsonb_build_object('email', ((n.payload -> 'customer'::text) -> 'email'::text)))) AND (NOT (payload @> jsonb_build_object('customer', jsonb_build_object('externalId', ((n.payload -> 'customer'::text) -> 'externalId'::text))))))
JIT:
Functions: 13
Options: Inlining true, Optimization true, Expressions true, Deforming true
有没有什么提示我这里做错了什么/如何优化?