SQL: insert and update records at the same time

Posted on September 24

I have the following Postgres table:

create table test (
    id serial, 
    contract varchar, 
    amount int, 
    aggregated int, 
    is_aggregate int
);

insert into test (contract, amount, aggregated) 
values ('abc', 100, 0), 
       ('abc', 200, 0), 
       ('xyz',  50, 0), 
       ('xyz',  60, 0);

id	contract	amount
1	abc	100
2	abc	200
3	xyz	50
4	xyz	60

I am trying to write a SQL statement that inserts one aggregate row per contract, so the result should look like this:

id	contract	amount	aggregated	is_aggregate
1	abc	100	1
2	abc	200	1
3	xyz	50	1
4	xyz	60	1
5	abc	300		1
6	xyz	110		1

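Because new rows are continuously added from another data source, the aggregate rows must be inserted and 'aggregated' set to 1 on the rows that were aggregated in a single step, to avoid concurrency issues.

I have tried several approaches, but this is as far as I got: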

INSERT INTO test (contract, amount, aggregated, is_aggregate)
SELECT contract,
       SUM(amount) AS sum_amount,
       1,
       1
FROM test
WHERE aggregated IS NULL 
   OR aggregated = 0
GROUP BY contract
HAVING COUNT(*) > 1;

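This inserts the aggregate rows, but I can't work out how to include a clause that sets 'aggregated' to 1 on the rows that have just been aggregated.

Edit:

After several "condensation" runs, the data would look like this: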

id	contract	amount	aggregated	is_aggregate
1	abc	100	1
2	abc	200	1
3	xyz	50	1
4	xyz	60	1
5	abc	300	1	1
6	xyz	110	1	1
7	abc	20	1
8	abc	30	1
9	xyz	70	1
10	xyz	80	1
11	abc	50	1	1
12	xyz	150	1	1
13	abc	350		1
14	xyz	260		1

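...where IDs 11 and 12 are again first-run aggregates (like IDs 5 and 6 before), and IDs 13 and 14 are aggregates of the previous aggregation runs (i.e. of IDs 5 and 11, and of IDs 6 and 12, respectively). The aggregates of the latest run can be identified by their 'aggregated' being NULL.

Note: there are some additional 'time' columns from which the aggregation level of each row can be determined for later business logic. They are omitted here for brevity.

Can anyone help me? Thanks in advance.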

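Recommended answer

One table

Filtering with created_at

Add a timestamp column named created_at with a default of NOW(). Note that the value of NOW() does not change within a transaction in Postgres. Postgres also provides read consistency (snapshot isolation), which prevents rows inserted or modified by other processes after the transaction started from being included in it. The snapshot spans the whole transaction only at REPEATABLE READ or SERIALIZABLE; under the default READ COMMITTED level each statement takes a fresh snapshot.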

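-- Start a transaction with one snapshot for all statements
-- (plain BEGIN uses READ COMMITTED, where each statement sees newly committed rows)
BEGIN ISOLATION LEVEL REPEATABLE READ;

-- Aggregate
INSERT INTO test (contract, amount, aggregated, is_aggregate)
SELECT contract, SUM(amount), NULL, 1
FROM test
WHERE aggregated IS NULL
   OR aggregated = 0
GROUP BY contract
HAVING COUNT(*) > 1;

-- Mark aggregated rows as aggregated.
-- created_at <> NOW() skips the aggregate rows inserted above in this transaction.
UPDATE test
SET aggregated = 1
WHERE (aggregated IS NULL OR aggregated = 0)
  AND created_at <> NOW()
  AND contract IN (
      SELECT contract
      FROM test
      WHERE (aggregated IS NULL OR aggregated = 0)
        AND created_at <> NOW()
      GROUP BY contract
      HAVING COUNT(*) > 1
  );

-- Commit the transaction
COMMIT;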
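Since the question asks for a single statement, here is a minimal sketch of one way that might be done in PostgreSQL with a data-modifying CTE. It is not part of the original answer and makes two assumptions: the table is the one from the question, and merging every not-yet-aggregated row of a contract (whatever its level) is acceptable, as in the statements above. All parts of a WITH statement run against the same snapshot, so the UPDATE cannot see the rows the INSERT adds, and no created_at filter is needed.

```sql
-- Sketch: mark the source rows and insert their per-contract sums in one statement.
WITH marked AS (
    UPDATE test
    SET aggregated = 1
    WHERE (aggregated IS NULL OR aggregated = 0)
      AND contract IN (
          -- only contracts with more than one row still waiting to be aggregated
          SELECT contract
          FROM test
          WHERE aggregated IS NULL OR aggregated = 0
          GROUP BY contract
          HAVING COUNT(*) > 1
      )
    RETURNING contract, amount   -- exactly the rows that were just marked
)
INSERT INTO test (contract, amount, aggregated, is_aggregate)
SELECT contract, SUM(amount), NULL, 1
FROM marked
GROUP BY contract;
```

Because the sums are computed from the rows returned by the UPDATE, the inserted totals always match the rows that were marked. The statement does not by itself coordinate two aggregation runs executing at the same moment.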

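Using explicit row locks

Add a boolean column named is_locked with a default of NULL. This is needed because the aggregation condition returns different rows before and after the aggregation. An alternative is to use the insert timestamp of the rows to filter them out in the final update, but I wasn't sure which columns are available, since the table definition is partial.

This code is not safe to run in parallel: run only one aggregation process at any given time.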

-- Start a transaction
BEGIN;

-- Lock the rows you want to aggregate
UPDATE test
SET is_locked = TRUE
WHERE id IN (
    SELECT id
    FROM test
    WHERE is_locked IS NULL
      AND (aggregated IS NULL OR aggregated = 0)
      AND contract IN (
          SELECT contract
          FROM test
          WHERE is_locked IS NULL
            AND (aggregated IS NULL OR aggregated = 0)
          GROUP BY contract
          HAVING COUNT(*) > 1
      )
);

-- Insert the data from the locked rows
INSERT INTO test (contract, amount, aggregated, is_aggregate)
SELECT contract, SUM(amount) AS sum_amount, NULL, 1
FROM test
WHERE is_locked = TRUE
GROUP BY contract;

-- Update the aggregated column to set it to 1 for the locked rows
-- and remove the locks
UPDATE test
SET aggregated = 1,
    is_locked = NULL
WHERE is_locked = TRUE;

-- Commit the transaction
COMMIT;
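One possible way to enforce the "only one aggregation process at a time" rule inside the database is a transaction-level advisory lock. This is a sketch rather than part of the original answer: pg_advisory_xact_lock is a built-in PostgreSQL function, while the key 727001 is an arbitrary, hypothetical number that every aggregation job would have to agree on.

```sql
BEGIN;

-- Waits until no other session holds advisory lock 727001 (hypothetical key);
-- the lock is released automatically at COMMIT or ROLLBACK.
SELECT pg_advisory_xact_lock(727001);

-- ... run the lock / insert / unlock statements of the aggregation here ...

COMMIT;
```

Processes that only insert new rows do not take this lock, so they are not blocked by it.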

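Two tables

If you would rather use a separate aggregate table, see below.

Add an insert timestamp to the test table and use that timestamp for aggregation purposes.

Conceptual solution proposal below. The exact syntax depends on which database engine you are using.

Modified test table definition: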

create table test (
    id serial,
    contract varchar,
    amount int,
    created_at timestamp DEFAULT CURRENT_TIMESTAMP
);

New aggregate table:

create table test_agg (
    contract varchar,
    amount int,
    aggregated_to timestamp
);

CREATE UNIQUE INDEX test_agg_contract_uq ON test_agg (contract);

Modified SQL statement that inserts a row if none exists, or updates the existing row in the aggregate table:

-- Note: ON DUPLICATE KEY UPDATE and VALUES() are MySQL syntax
INSERT INTO test_agg (contract, amount, aggregated_to)
SELECT t.contract, SUM(t.amount), MAX(t.created_at)
FROM test t
LEFT OUTER JOIN test_agg ta ON ta.contract = t.contract
WHERE ta.aggregated_to IS NULL
   OR t.created_at > ta.aggregated_to
GROUP BY t.contract
ON DUPLICATE KEY UPDATE
    amount = amount + VALUES(amount),
    aggregated_to = VALUES(aggregated_to);
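Since the question is about Postgres, the same upsert could be sketched there with ON CONFLICT. This is an adaptation, not the answer's own statement; it assumes the test_agg table and the unique index on contract defined above, which ON CONFLICT (contract) relies on.

```sql
-- PostgreSQL sketch of the insert-or-update statement
INSERT INTO test_agg (contract, amount, aggregated_to)
SELECT t.contract, SUM(t.amount), MAX(t.created_at)
FROM test t
LEFT OUTER JOIN test_agg ta ON ta.contract = t.contract
WHERE ta.aggregated_to IS NULL           -- contract not aggregated yet
   OR t.created_at > ta.aggregated_to    -- or row is newer than the last run
GROUP BY t.contract
ON CONFLICT (contract) DO UPDATE
SET amount        = test_agg.amount + EXCLUDED.amount,  -- add the new rows' sum
    aggregated_to = EXCLUDED.aggregated_to;             -- advance the watermark
```

EXCLUDED refers to the row that was proposed for insertion, playing the role that VALUES() has in the MySQL form.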


