我正在为升级准备一个遗留的PostgreSQL9.5数据库.

我需要从表中删除OID,而不需要长时间锁定表.我已经开发了一种似乎有效的策略,但我希望专家们能参与进来,因为我计划对系统表做一些通常不受欢迎的事情.

在运行pg_repack之前,我执行以下修改:

mydata=# update pg_class set relhasoids = false where oid = 'some_schema.a_very_large_table_with_oids'::regclass;
UPDATE 1
mydata=# delete from pg_attribute where attrelid = 'some_schema.a_very_large_table_with_oids'::regclass and attname = 'oid';
DELETE 1
mydata=# \d+ some_schema.a_very_large_table_with_oids;
        Table "some_schema.a_very_large_table_with_oids"
 Column | Type | Modifiers | Storage  | Stats target | Description 
--------+------+-----------+----------+--------------+-------------
 k      | text | not null  | extended |              | 
 v      | text |           | extended |              | 
Indexes:
    "a_very_large_table_with_oids_pkey" PRIMARY KEY, btree (k)

mydata=# select oid,* from some_schema.a_very_large_table_with_oids;
ERROR:  column "oid" does not exist

到目前为止一切顺利.我可以插入更新和删除行,但磁盘上的表 struct 不变.我得把它修好

因此,在进行了这些修改之后,我用普通的pg_repack重新打包了表.这会将数据复制到新的磁盘上的表中,即SANS OID.

在我开始在线对生产数据执行这些操作之前,我想要一些专家观察这个过程,因为这是关键任务.

在非常大的表上,我们将在很长一段时间内处于这种边缘状态,其中PG_CLASS和PG_ATTRIBUTE将进行那些强制修改,而PG_REPACK则发挥其魔力.

如果INSERT/UPDATE/DELETE似乎起作用了,有什么需要担心的吗?

在pg_repack运行时,我对表的插入/更新/删除似乎工作得很好.

我还在有烤面包桌的桌子上try 了这一点.乍一看,一切似乎都井然有序.

这是一个 destruct 性的操作,我不会使用规范的ALTER TABLE ... SET WITHOUT OIDS.如果我遗漏了一些细节(例如在备份-恢复或复制过程中),我希望确保自己不会在将来遇到问题.

请告知.

最新情况:

Laurenz出色的回答给了我另外两件我没有想到的事情--即依赖判断和缓存/计划无效.将这些纳入计划将是重要的.执行短真空来清除陈旧的列是一个很好的后备位置,但根据我所看到的pg_repack发生的事情,我认为这将完成同样的事情.pg_repack中的日志(log)公开了复制表和交换定义所涉及的所有SQL:

mydata=# begin;
BEGIN
mydata=# ALTER TABLE perm.permission_cache SET WITHOUT OIDS;
^CCancel request sent
ERROR:  canceling statement due to user request
mydata=# rollback;
ROLLBACK
mydata=# \d+ perm.permission_cache
                                                       Table "perm.permission_cache"
    Column    |           Type           |                        Modifiers                        | Storage  | Stats target | Description 
--------------+--------------------------+---------------------------------------------------------+----------+--------------+-------------
 id           | integer                  | not null default nextval('perm.cache_id_seq'::regclass) | plain    |              | 
 company_uuid | uniqueidentifier         | not null                                                | plain    |              | 
 user_uuid    | uniqueidentifier         | not null                                                | plain    |              | 
 value        | boolean                  | not null                                                | plain    |              | 
 cache_date   | timestamp with time zone | not null default now()                                  | plain    |              | 
 token_name   | character varying(255)   |                                                         | extended |              | 
Indexes:
    "cache_id_pkey" PRIMARY KEY, btree (id)
    "permission_cache_user_token_idx" UNIQUE, btree (user_uuid, token_name)
Foreign-key constraints:
    "company_uuid_fkey" FOREIGN KEY (company_uuid) REFERENCES company_table(company_uuid) ON UPDATE RESTRICT ON DELETE RESTRICT DEFERRABLE INITIALLY DEFERRED
    "user_uuid_fkey" FOREIGN KEY (user_uuid) REFERENCES user_table(user_uuid) ON UPDATE RESTRICT ON DELETE RESTRICT DEFERRABLE INITIALLY DEFERRED
Has OIDs: yes

mydata=# \e
BEGIN
ALTER TABLE
UPDATE 1
DELETE 1
COMMIT
mydata=# \q

$ pg_repack -h 127.0.0.1 -d mydata -Upostgres -t 'perm.permission_cache' -e
LOG: (query) SET search_path TO pg_catalog, pg_temp, public
LOG: (query) SET search_path TO pg_catalog, pg_temp, public
LOG: (query) select repack.version(), repack.version_sql()
LOG: (query) SET statement_timeout = 0
LOG: (query) SET search_path = pg_catalog, pg_temp, public
LOG: (query) SET client_min_messages = warning
LOG: (query) SELECT t.*, coalesce(v.tablespace, t.tablespace_orig) as tablespace_dest FROM repack.tables t,  (VALUES (quote_ident($1::text))) as v (tablespace) WHERE (relid = $2::regclass) ORDER BY t.relname, t.schemaname
LOG:    (param:0) = (null)
LOG:    (param:1) = perm.permission_cache
INFO: repacking table "perm.permission_cache"
LOG: (query) SELECT pg_try_advisory_lock($1, CAST(-2147483648 + $2::bigint AS integer))
LOG:    (param:0) = 16185446
LOG:    (param:1) = 1046889774
LOG: (query) BEGIN ISOLATION LEVEL READ COMMITTED
LOG: (query) SET LOCAL statement_timeout = 100
LOG: (query) LOCK TABLE perm.permission_cache IN ACCESS EXCLUSIVE MODE
LOG: (query) RESET statement_timeout
LOG: (query) SELECT pg_get_indexdef(indexrelid) FROM pg_index WHERE indrelid = $1 AND NOT indisvalid
LOG:    (param:0) = 1046889774
LOG: (query) SELECT indexrelid, repack.repack_indexdef(indexrelid, indrelid, $2, FALSE)  FROM pg_index WHERE indrelid = $1 AND indisvalid
LOG:    (param:0) = 1046889774
LOG:    (param:1) = (null)
LOG: (query) SELECT repack.conflicted_triggers($1)
LOG:    (param:0) = 1046889774
LOG: (query) CREATE TYPE repack.pk_1046889774 AS (id integer)
LOG: (query) CREATE TABLE repack.log_1046889774 (id bigserial PRIMARY KEY, pk repack.pk_1046889774, row perm.permission_cache)
LOG: (query) CREATE TRIGGER repack_trigger AFTER INSERT OR DELETE OR UPDATE ON perm.permission_cache FOR EACH ROW EXECUTE PROCEDURE repack.repack_trigger('INSERT INTO repack.log_1046889774(pk, row) VALUES( CASE WHEN $1 IS NULL THEN NULL ELSE (ROW($1.id)::repack.pk_1046889774) END, $2)')
LOG: (query) ALTER TABLE perm.permission_cache ENABLE ALWAYS TRIGGER repack_trigger
LOG: (query) SELECT repack.disable_autovacuum('repack.log_1046889774')
LOG: (query) BEGIN ISOLATION LEVEL READ COMMITTED
LOG: (query) SELECT pg_backend_pid()
LOG: (query) SELECT pid FROM pg_locks WHERE locktype = 'relation' AND granted = false AND relation = 1046889774 AND mode = 'AccessExclusiveLock' AND pid <> pg_backend_pid()
LOG: (query) COMMIT
LOG: (query) BEGIN ISOLATION LEVEL SERIALIZABLE
LOG: (query) SELECT set_config('work_mem', current_setting('maintenance_work_mem'), true)
LOG: (query) SELECT coalesce(array_agg(l.virtualtransaction), '{}')   FROM pg_locks AS l   LEFT JOIN pg_stat_activity AS a     ON l.pid = a.pid   LEFT JOIN pg_database AS d     ON a.datid = d.oid   WHERE l.locktype = 'virtualxid'   AND l.pid NOT IN (pg_backend_pid(), $1)   AND (l.virtualxid, l.virtualtransaction) <> ('1/1', '-1/0')   AND (a.application_name IS NULL OR a.application_name <> $2)  AND a.query !~* E'^\\s*vacuum\\s+'   AND a.query !~ E'^autovacuum: '   AND ((d.datname IS NULL OR d.datname = current_database()) OR l.database = 0)
LOG:    (param:0) = 11918
LOG:    (param:1) = pg_repack
LOG: (query) DELETE FROM repack.log_1046889774
LOG: (query) SELECT pid FROM pg_locks WHERE locktype = 'relation' AND granted = false AND relation = 1046889774 AND mode = 'AccessExclusiveLock' AND pid <> pg_backend_pid()
LOG: (query) SET LOCAL statement_timeout = 100
LOG: (query) LOCK TABLE perm.permission_cache IN ACCESS SHARE MODE
LOG: (query) RESET statement_timeout
LOG: (query) CREATE TABLE repack.table_1046889774 WITH (oids = false) TABLESPACE pg_default AS SELECT id,company_uuid,user_uuid,NULL::integer AS "........pg.dropped.4........",value,cache_date,token_name FROM ONLY perm.permission_cache WITH NO DATA
LOG: (query) INSERT INTO repack.table_1046889774 SELECT id,company_uuid,user_uuid,NULL::integer AS "........pg.dropped.4........",value,cache_date,token_name FROM ONLY perm.permission_cache
LOG: (query) ALTER TABLE repack.table_1046889774 DROP COLUMN "........pg.dropped.4........"
LOG: (query) SELECT repack.disable_autovacuum('repack.table_1046889774')
LOG: (query) COMMIT
LOG: (query) CREATE UNIQUE INDEX index_1046889779 ON repack.table_1046889774 USING btree (id)
LOG: (query) CREATE UNIQUE INDEX index_1050932923 ON repack.table_1046889774 USING btree (user_uuid, token_name)
LOG: (query) SELECT repack.repack_apply($1, $2, $3, $4, $5, $6)
LOG:    (param:0) = SELECT * FROM repack.log_1046889774 ORDER BY id LIMIT $1
LOG:    (param:1) = INSERT INTO repack.table_1046889774 VALUES ($1.*)
LOG:    (param:2) = DELETE FROM repack.table_1046889774 WHERE (id) = ($1.id)
LOG:    (param:3) = UPDATE repack.table_1046889774 SET (id, company_uuid, user_uuid, value, cache_date, token_name) = ($2.id, $2.company_uuid, $2.user_uuid, $2.value, $2.cache_date, $2.token_name) WHERE (id) = ($1.id)
LOG:    (param:4) = DELETE FROM repack.log_1046889774 WHERE id IN (
LOG:    (param:5) = 1000
LOG: (query) SELECT repack.repack_apply($1, $2, $3, $4, $5, $6)
LOG:    (param:0) = SELECT * FROM repack.log_1046889774 ORDER BY id LIMIT $1
LOG:    (param:1) = INSERT INTO repack.table_1046889774 VALUES ($1.*)
LOG:    (param:2) = DELETE FROM repack.table_1046889774 WHERE (id) = ($1.id)
LOG:    (param:3) = UPDATE repack.table_1046889774 SET (id, company_uuid, user_uuid, value, cache_date, token_name) = ($2.id, $2.company_uuid, $2.user_uuid, $2.value, $2.cache_date, $2.token_name) WHERE (id) = ($1.id)
LOG:    (param:4) = DELETE FROM repack.log_1046889774 WHERE id IN (
LOG:    (param:5) = 1000
LOG: (query) SELECT pid FROM pg_locks WHERE locktype = 'virtualxid' AND pid <> pg_backend_pid() AND virtualtransaction = ANY($1)
LOG:    (param:0) = {}
LOG: (query) SAVEPOINT repack_sp1
LOG: (query) SET LOCAL statement_timeout = 100
LOG: (query) LOCK TABLE perm.permission_cache IN ACCESS EXCLUSIVE MODE
LOG: (query) RESET statement_timeout
LOG: (query) SELECT repack.repack_apply($1, $2, $3, $4, $5, $6)
LOG:    (param:0) = SELECT * FROM repack.log_1046889774 ORDER BY id LIMIT $1
LOG:    (param:1) = INSERT INTO repack.table_1046889774 VALUES ($1.*)
LOG:    (param:2) = DELETE FROM repack.table_1046889774 WHERE (id) = ($1.id)
LOG:    (param:3) = UPDATE repack.table_1046889774 SET (id, company_uuid, user_uuid, value, cache_date, token_name) = ($2.id, $2.company_uuid, $2.user_uuid, $2.value, $2.cache_date, $2.token_name) WHERE (id) = ($1.id)
LOG:    (param:4) = DELETE FROM repack.log_1046889774 WHERE id IN (
LOG:    (param:5) = 0
LOG: (query) SELECT repack.repack_swap($1)
LOG:    (param:0) = 1046889774
LOG: (query) COMMIT
LOG: (query) BEGIN ISOLATION LEVEL READ COMMITTED
LOG: (query) SAVEPOINT repack_sp1
LOG: (query) SET LOCAL statement_timeout = 100
LOG: (query) LOCK TABLE perm.permission_cache IN ACCESS EXCLUSIVE MODE
LOG: (query) RESET statement_timeout
LOG: (query) SELECT repack.repack_drop($1, $2)
LOG:    (param:0) = 1046889774
LOG:    (param:1) = 4
LOG: (query) COMMIT
LOG: (query) BEGIN ISOLATION LEVEL READ COMMITTED
LOG: (query) ANALYZE perm.permission_cache
LOG: (query) COMMIT
LOG: (query) SELECT pg_advisory_unlock($1, CAST(-2147483648 + $2::bigint AS integer))
LOG:    (param:0) = 16185446
LOG:    (param:1) = 1046889774

看起来OID并不像Laurenz猜测的那样在副本中幸存下来.对我来说,最好使用pg_repack,因为它可以在无人值守的情况下运行.对于在没有OID的情况下重写磁盘上的表,这难道不是和增量真空一样好吗?我可能还漏掉了一个细节.

推荐答案

您希望避免因ALTER TABLE ... SET WITHOUT OIDS的表重写而导致的长时间停机,对吗?

安全的解决方案是运行ALTER TABLE ... SET WITHOUT OIDS次,然后停机.你的方式更危险,但它避免了桌子上的长ACCESS EXCLUSIVE锁.

您所做的等同于PostgreSQL在commit 6d1e361852之前对ALTER TABLE ... SET WITHOUT OIDS所做的事情,只不过您忘记了对表进行ACCESS EXCLUSIVE锁定,没有判断oid上的依赖项,并且不会使其他会话对该表拥有的计划和缓存元数据信息无效.

要避免这些问题,请按以下方式操作:

  • 首先,确保没有任何东西依赖于oid列:

    BEGIN;
    
    -- would cause an error if there were dependencies on "oid"
    ALTER TABLE some_schema.a_very_large_table_with_oids SET WITHOUT OIDS;
    -- hit Ctrl+C to interrupt the statement
    
    ROLLBACK;
    
  • 然后,执行您建议的更改,但锁定表并使所有计划无效:

    BEGIN;
    
    -- perform a no-op change that locks the table and invalidates plans
    ALTER TABLE some_schema.a_very_large_table_with_oids
       ALTER id SET NOT NULL;
    
    UPDATE pg_class SET relhasoids = false
    WHERE oid = 'some_schema.a_very_large_table_with_oids'::regclass;
    
    DELETE FROM pg_attribute
    WHERE attrelid = 'some_schema.a_very_large_table_with_oids'::regclass
      AND attname = 'oid';
    
    COMMIT;
    
  • 最后,您可以使用pg_repack重写表,而无需使用长达ACCESS EXCLUSIVE的锁.

作为pg_repack的替代方案,您可以分块更新表,以便重写单个行.确保批次之间为VACUUM:

UPDATE some_schema.a_very_large_table_with_oids
SET id = id
WHERE id BETWEEN 1 AND 1000000;

VACUUM some_schema.a_very_large_table_with_oids;

UPDATE some_schema.a_very_large_table_with_oids
SET id = id
WHERE id BETWEEN 1000001 AND 2000000;

VACUUM some_schema.a_very_large_table_with_oids;

...

在那之后,桌子应该没问题了.尽管如此,执行测试升级并测试良好.


一些背景信息说明了为什么ALTER TABLE ... SET WITHOUT OIDS会重写表格,而不是ALTER TABLE ... DROP COLUMN ...:

在8.4版之前,PostgreSQL不需要对这两个操作进行表重写.但是,将ALTER TABLE ... SET WITHOUT OIDS更改为重写表是有原因的;请参见this bug reportthis ensuing thread中的讨论.这里提到的问题在pg_upgrade之后不会影响到您,但是如果您有一个在其元组中实际包含OID的表,而PostgreSQL不期望有任何OID,那么可能会有其他微妙的问题.这就是更改ALTER TABLE ... SET WITHOUT OIDS以重写表的原因.

您的目录操作将使该表处于类似于PostgreSQL 8.3及更早版本中的ALTER TABLE ... SET WITHOUT OIDS的状态.现在pg_upgrade文档提到9.2及更高版本支持升级,commit e469f0aaf3之前提到8.4及更高版本支持.虽然历史文档中没有明确提到其OID被以您的方式删除的表构成问题,但这种情况显然是出乎意料的,根本没有经过测试,也不受支持.在future ,你可能会遇到有趣的问题.对于我的数据库,我当然不会冒这个险.

Postgresql相关问答推荐

Postgs SQL用于比较两个表之间的数据并填充到目标中

PostgreSQL:函数结果表内冲突(...)上的";中的字段名称

为什么我的应用程序接收的是空值而不是布尔值?

如何删除Devtainer之前创建的PostgreSQL数据库?

在特定值的组聚合中添加标签列作为结果

PostgreSQL 不删除旧的 WAL 档案

空表上的 Postgres 意外行为

如何在plpgsql中找到一行中的最大值?

Postgresql如何从jsonB列的数组中的多个json中 Select 一个值

如何将 postgres 数据库转换为 sqlite

LAG 函数和 GROUP BY

为什么 PostgreSQL 将用户和组合并为角色?

Mac psql/readline - 未加载库

如何将两个 PostgreSQL 列聚合到一个用括号分隔的数组

Postgres 日期重叠约束

Postgres 类似于 SQL Server 中的 CROSS APPLY

ruby on rails jsonb 列默认值

PostgreSQL - 如何在不同时区呈现日期?

错误:索引表达式中的函数必须在 Postgres 中标记为 IMMUTABLE

JOIN (SELECT ... ) ue ON 1=1?