我在Postgr 15.1上运行以下查询:

EXPLAIN (ANALYZE, VERBOSE)
WITH segmented_customer AS (SELECT *
                            FROM customer
                            WHERE integration_id = 162
                              AND (1 = 1 OR 1 = 1)),
     segmented_subscription
         AS (SELECT * FROM subscription WHERE integration_id = 162 AND 1 = 1),
     segmented_billing_cycle
         AS (SELECT * FROM billing_cycle WHERE integration_id = 162),
     segmented_product_version
         AS (SELECT * FROM product_version WHERE integration_id = 162 AND 1 = 1)
SELECT segmented_customer.id                         AS customer_id,
       segmented_customer.name                       AS customer_name,
       segmented_subscription.override_start_date    AS sub_override_start_ts,
       segmented_subscription.source_start_date      AS sub_source_start_ts,
       segmented_subscription.override_end_date      AS sub_override_end_ts,
       segmented_subscription.computed_end_date      AS sub_computed_end_ts,
       segmented_product_version.override_mrr_amount AS override_mrr_amount,
       segmented_product_version.total_mrr           AS total_mrr,
       segmented_product_version.version_start_date  AS version_start_ts,
       segmented_product_version.version_end_date    AS version_end_ts
FROM segmented_customer
         JOIN segmented_subscription
              ON segmented_customer.id = segmented_subscription.customer_id
         JOIN segmented_billing_cycle
              ON segmented_subscription.id = segmented_billing_cycle.subscription_id
         JOIN segmented_product_version
              ON segmented_billing_cycle.id = segmented_product_version.billing_cycle_id;

查询计划器显示以下内容:

Gather  (cost=48644.92..114335.21 rows=1 width=105) (actual time=869.975..8133.738 rows=88061 loops=1)
"  Output: customer.id, customer.name, subscription.override_start_date, subscription.source_start_date, subscription.override_end_date, subscription.computed_end_date, product_version.override_mrr_amount, product_version.total_mrr, product_version.version_start_date, product_version.version_end_date"
  Workers Planned: 2
  Workers Launched: 2
  ->  Nested Loop  (cost=47644.92..113335.11 rows=1 width=105) (actual time=799.597..7986.501 rows=29354 loops=3)
"        Output: customer.id, customer.name, subscription.override_start_date, subscription.source_start_date, subscription.override_end_date, subscription.computed_end_date, product_version.override_mrr_amount, product_version.total_mrr, product_version.version_start_date, product_version.version_end_date"
        Worker 0:  actual time=755.136..7930.513 rows=28962 loops=1
          JIT:
            Functions: 29
"            Options: Inlining false, Optimization false, Expressions true, Deforming true"
"            Timing: Generation 5.048 ms, Inlining 0.000 ms, Optimization 0.922 ms, Emission 15.867 ms, Total 21.838 ms"
        Worker 1:  actual time=775.757..7927.262 rows=28968 loops=1
          JIT:
            Functions: 29
"            Options: Inlining false, Optimization false, Expressions true, Deforming true"
"            Timing: Generation 5.482 ms, Inlining 0.000 ms, Optimization 0.897 ms, Emission 16.110 ms, Total 22.489 ms"
        ->  Nested Loop  (cost=47644.49..113256.53 rows=9 width=81) (actual time=776.203..3896.817 rows=27877 loops=3)
"              Output: customer.id, customer.name, subscription.override_start_date, subscription.source_start_date, subscription.override_end_date, subscription.computed_end_date, billing_cycle.id"
              Worker 0:  actual time=730.393..3856.151 rows=27652 loops=1
              Worker 1:  actual time=730.376..3842.788 rows=27320 loops=1
              ->  Parallel Hash Join  (cost=47644.05..105823.47 rows=192 width=81) (actual time=752.267..1282.571 rows=9149 loops=3)
"                    Output: customer.id, customer.name, subscription.override_start_date, subscription.source_start_date, subscription.override_end_date, subscription.computed_end_date, subscription.id"
                    Inner Unique: true
                    Hash Cond: ((subscription.customer_id)::text = (customer.id)::text)
                    Worker 0:  actual time=694.538..1252.813 rows=8950 loops=1
                    Worker 1:  actual time=694.479..1230.973 rows=9024 loops=1
                    ->  Parallel Bitmap Heap Scan on public.subscription  (cost=322.92..58470.94 rows=11962 width=62) (actual time=8.967..527.330 rows=9149 loops=3)
"                          Output: subscription.id, subscription.source_id, subscription.source_start_date, subscription.source_end_date, subscription.status, subscription.attributes, subscription.created_at, subscription.updated_at, subscription.integration_id, subscription.organization_id, subscription.trial_end_date, subscription.trial_start_date, subscription.last_synced_at, subscription.computed_end_date, subscription.override_end_date, subscription.customer_id, subscription.override_start_date"
                          Recheck Cond: (subscription.integration_id = 162)
                          Heap Blocks: exact=4856
                          Worker 0:  actual time=8.969..556.056 rows=8950 loops=1
                          Worker 1:  actual time=8.954..533.827 rows=9024 loops=1
                          ->  Bitmap Index Scan on subscription_integration_id_e53b1bd5  (cost=0.00..315.75 rows=28709 width=0) (actual time=7.151..7.155 rows=27446 loops=1)
                                Index Cond: (subscription.integration_id = 162)
                                Worker 1:  actual time=7.151..7.155 rows=27446 loops=1
                    ->  Parallel Hash  (cost=47189.88..47189.88 rows=10500 width=41) (actual time=741.375..741.385 rows=8743 loops=3)
"                          Output: customer.id, customer.name"
                          Buckets: 32768  Batches: 1  Memory Usage: 2176kB
                          Worker 0:  actual time=683.803..683.810 rows=7868 loops=1
                          Worker 1:  actual time=683.794..683.808 rows=8225 loops=1
                          ->  Parallel Bitmap Heap Scan on public.customer  (cost=283.74..47189.88 rows=10500 width=41) (actual time=26.742..736.863 rows=8743 loops=3)
"                                Output: customer.id, customer.name"
                                Recheck Cond: (customer.integration_id = 162)
                                Heap Blocks: exact=6420
                                Worker 0:  actual time=17.125..679.507 rows=7868 loops=1
                                Worker 1:  actual time=17.345..679.678 rows=8225 loops=1
                                ->  Bitmap Index Scan on customer_integration_id_afcc30ba  (cost=0.00..277.44 rows=25201 width=0) (actual time=15.062..15.069 rows=26228 loops=1)
                                      Index Cond: (customer.integration_id = 162)
              ->  Index Scan using billing_cycle_subscription_id_b7f65c8b on public.billing_cycle  (cost=0.43..38.70 rows=1 width=16) (actual time=0.131..0.285 rows=3 loops=27446)
"                    Output: billing_cycle.id, billing_cycle.period_start_date, billing_cycle.period_end_date, billing_cycle.created_at, billing_cycle.updated_at, billing_cycle.subscription_id, billing_cycle.invoice_id, billing_cycle.integration_id, billing_cycle.organization_id, billing_cycle.last_synced_at, billing_cycle.is_trial_period"
                    Index Cond: (billing_cycle.subscription_id = subscription.id)
                    Filter: (billing_cycle.integration_id = 162)
                    Worker 0:  actual time=0.135..0.290 rows=3 loops=8950
                    Worker 1:  actual time=0.132..0.288 rows=3 loops=9024
        ->  Index Scan using product_version_billing_cycle_id_c8642f61 on public.product_version  (cost=0.43..8.72 rows=1 width=40) (actual time=0.140..0.146 rows=1 loops=83631)
"              Output: product_version.id, product_version.name, product_version.version_start_date, product_version.version_end_date, product_version.product_mrr, product_version.override_mrr_amount, product_version.quantity, product_version.source_product_id, product_version.source_price_id, product_version.is_metered, product_version.created_at, product_version.updated_at, product_version.billing_cycle_id, product_version.changed_date, product_version.invoice_id, product_version.integration_id, product_version.organization_id, product_version.last_synced_at, product_version.coupon_mrr, product_version.coupon_name, product_version.is_trial_version, product_version.tax_mrr, product_version.tax_name, product_version.total_mrr, product_version.attributes, product_version.query_grouping_id"
              Index Cond: (product_version.billing_cycle_id = billing_cycle.id)
              Filter: (product_version.integration_id = 162)
              Worker 0:  actual time=0.141..0.147 rows=1 loops=27652
              Worker 1:  actual time=0.142..0.149 rows=1 loops=27320
Planning Time: 9.807 ms
JIT:
  Functions: 87
"  Options: Inlining false, Optimization false, Expressions true, Deforming true"
"  Timing: Generation 22.063 ms, Inlining 0.000 ms, Optimization 4.291 ms, Emission 57.598 ms, Total 83.952 ms"
Execution Time: 8151.987 ms

请注意,例如,对于billing_cycle,即使只有idsubscription_id被用作连接键,每个列都包含在输出中.为什么会这样,这会带来性能问题吗?

我试着在Google上搜索了一下,并查看了大量的源文档,但是找不到这个输出的确切含义.

推荐答案

除非您确实想控制这些是否被实体化,否则您只会输入更多的内容并嵌套更多的内容.从技术上讲,无论您保留CTE还是切换到子查询,还是在末尾添加常规的where条件,或者作为附加的join条件:demo,几乎没有什么区别.

我认为Output个列表的行为是因为这些扫描的工作方式:seq和non—covering index扫描最终仍然从堆中抓取行.它会把所有这些都传递到更远的地方,即使在没有请求的情况下,也会把丢弃的工作留给查询中真正需要它的任何其他部分来处理.它甚至会声称它传递了我故意在storage externalTOASTed的巨大值:

QUERY PLAN
Nested Loop (cost=0.00..58.14 rows=1 width=292) (actual time=0.018..0.021 rows=1 loops=1)
  Output: customer.id, customer.name, subscription.override_start_date, subscription.source_start_date, subscription.override_end_date, subscription.computed_end_date, product_version.override_mrr_amount, product_version.total_mrr, product_version.version_start_date, product_version.version_end_date
...
              -> Seq Scan on public.billing_cycle (cost=0.00..1.02 rows=1 width=8) (actual time=0.006..0.007 rows=1 loops=1)
                    Output: billing_cycle.id, billing_cycle.subscription_id, billing_cycle.integration_id, billing_cycle.something_toasted, billing_cycle.organization_id
...

但是,即使something被通过,我怀疑这些真的是de—toasted值:在this demo中,每个值是长和toasted还是101没有区别.

你可以看到index-only scan on covering index的行为变化,它恰好保存了你想在查询中使用的所有东西,在它的索引列的子集中:

QUERY PLAN
Nested Loop (cost=0.29..65.84 rows=1 width=292) (actual time=0.033..0.037 rows=1 loops=1)
  Output: segmented_customer.id, segmented_customer.name, segmented_subscription.override_start_date, segmented_subscription.source_start_date, segmented_subscription.override_end_date, segmented_subscription.computed_end_date, segmented_product_version.override_mrr_amount, segmented_product_version.total_mrr, segmented_product_version.version_start_date, segmented_product_version.version_end_date
...
        -> Index Only Scan using billing_cycle_id_subscription_id_integration_id_idx on public.billing_cycle segmented_billing_cycle (cost=0.29..4.31 rows=1 width=8) (actual time=0.009..0.010 rows=1 loops=1)
              Output: segmented_billing_cycle.id, segmented_billing_cycle.subscription_id, segmented_billing_cycle.integration_id

它认识到除了索引中的内容之外,您没有引用任何东西,所以它不会try go 获取其余的列.

Sql相关问答推荐

Snowflake SQL比较克隆x原始计数

将SEMI JOIN、ANTI JOIN转换为非连接SQL

如何更改函数返回的列名?

有没有办法在每次计算每一行的数据时更新2个值?

雅典娜嵌套Json提取液

使用列表作为参数进行 Select ,如果为空,则在PostgreSQL中不使用参数进行 Select

导出部分条形码字符串GS1-128

带上最后日期(结果)

如何找到一个组合的两个列,这是不是在其他表在ORACLE SQL?

在WHERE EXISTS子查询中,列返回是否重要?

PostgreSQL中递归CTE查询的故障过滤

配置单元查询失败:无法识别';附近的输入;LEFT'';(select子句中的';';col'

Netezza SQL:判断两个表是否相同

将二维数组的第一个和第二个元素取消嵌套到两个一维数组中

将具有嵌套 XML 的列转换为 SQL 中的表格格式?

通过ID和数据找到每个不同的值

MS ACCESS 错误插入 X(...) 从 A 联合 Select ... 从 B

根据是否存在值组合分组并 Select 行

As400 (IBM i) SQL 表 QSYS2.SYSTABLES 上的元数据

基于源表的 SQL INSERT、UPDATE 和 DELETE