I have the stored procedure shown below, which reads data from a table with two columns: the first column, `raw_id`, contains the row's ID, and the second column, `raw_data`, contains JSON holding the data. The procedure converts that JSON into columns and inserts the result into a new table.
In other words, it reads from a table containing JSON, normalizes that JSON into a new table, and removes duplicate rows along the way:
```sql
CREATE OR ALTER PROCEDURE dbo.basic_json_normalization_incr
    @tableNamep NVARCHAR(MAX),
    @tableName  NVARCHAR(MAX),
    @rows_count INT OUTPUT
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @raw_tableName   NVARCHAR(MAX);
    DECLARE @chargement      NVARCHAR(MAX);
    DECLARE @truncateQuery   NVARCHAR(MAX);
    DECLARE @columns_list    NVARCHAR(MAX);
    DECLARE @json_query_part NVARCHAR(MAX);
    DECLARE @json_query      NVARCHAR(MAX) = '';
    DECLARE @separator       NVARCHAR(2) = '';
    DECLARE @sqldelete       NVARCHAR(MAX);
    DECLARE @sqlinsert       NVARCHAR(MAX);

    SET @raw_tableName = 'dbo.raw' + @tableNamep;

    -- Comma-separated list of enabled target columns for this table
    SELECT @columns_list = COALESCE(@columns_list + ', ' + '[' + column_name + ']', '[' + column_name + ']')
    FROM dbo.schemaSource
    WHERE _enabled = 1 AND table_name = @tableNamep;

    -- Build the OPENJSON WITH-clause column definitions, one per schema row
    DECLARE json_cursor CURSOR FOR
        SELECT CASE
                   WHEN single_mult = 'M'
                       THEN CONCAT('[', column_name, '] NVARCHAR(MAX) ', 'AS JSON')
                   ELSE CONCAT('[', column_name, '] NVARCHAR(MAX) ', '''' + '$.', column_name, '''')
               END
        FROM dbo.schemaSource WITH (NOLOCK)
        WHERE _enabled = 1
          AND table_name = @tableNamep;

    OPEN json_cursor;
    FETCH NEXT FROM json_cursor INTO @json_query_part;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        SET @json_query = @json_query + @separator + @json_query_part;
        SET @separator = ',';
        FETCH NEXT FROM json_cursor INTO @json_query_part;
    END;
    CLOSE json_cursor;
    DEALLOCATE json_cursor;

    -- Determine the duplicate check column based on table name
    DECLARE @duplicateCheckColumn NVARCHAR(MAX);
    SET @duplicateCheckColumn = '_id';

    -- Build the dynamic SQL for the duplicate-removing delete and the insert
    SET @sqldelete = N'DELETE FROM ' + @tableName + ' WHERE _id COLLATE French_CI_AS IN (SELECT raw_id COLLATE French_CI_AS FROM raw' + @tableNamep + ');';
    SET @sqlinsert = N'
        INSERT INTO dbo.' + @tableName + '(_id, ORIGINAL_ID, ' + @columns_list + ')
        SELECT j.[_id], j.[ORIGINAL_ID], ' + @columns_list + '
        FROM ' + @raw_tableName + ' r
        CROSS APPLY OPENJSON(r.raw_data) WITH (
            _id NVARCHAR(155) ''strict $._id'',
            ORIGINAL_ID NVARCHAR(150) ''$.ORIGINAL_ID'',
            ' + @json_query + '
        ) AS j;';

    BEGIN
        BEGIN TRANSACTION;
        EXECUTE sp_executesql @sqldelete;
        EXECUTE sp_executesql @sqlinsert;
        SELECT @rows_count = @@ROWCOUNT;
        COMMIT;
    END;
END;
GO
```
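For reference, here is roughly what the generated `@sqlinsert` expands to for the `ACCOUNT` sample below (a static sketch using an assumed subset of the columns; the real column list comes from `dbo.schemaSource`):

```sql
-- Illustration only: what the dynamic insert looks like once built for dbo.ACCOUNT
SELECT j.[_id], j.[ORIGINAL_ID], j.[CATEGORY], j.[OPENING_DATE]
FROM dbo.rawACCOUNT AS r
CROSS APPLY OPENJSON(r.raw_data) WITH (
    _id          NVARCHAR(155) 'strict $._id',   -- strict: fail if the key is missing
    ORIGINAL_ID  NVARCHAR(150) '$.ORIGINAL_ID',
    CATEGORY     NVARCHAR(MAX) '$.CATEGORY',
    OPENING_DATE NVARCHAR(MAX) '$.OPENING_DATE'
) AS j;
```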
The problem is that I am working with tables containing a lot of data, so this stored procedure is very slow.
Here is sample data for table `ACCOUNT`.

Table `rawACCOUNT`:

| raw_id | raw_data |
|--------|----------|
| 001 | {"_id":"001","ORIGINAL_ID":"001","CATEGORY":"65","ACCOUNT_OFFICER":"1","OPENING_DATE":"20100322","CURR_NO":"3","DATE_TIME":"2112091237","DEPT_CODE":"13"} |
Desired table `ACCOUNT`:

| ORIGINAL_ID | _id | INSERT_DATE | ACCOUNT_OFFICER | CATEGORY | CURR_NO | CUSTOMER | DATE_LAST_UPDATE | DATE_TIME | DEPT_CODE | OPENING_DATE | INACTIV_MARKER |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 001 | 001 | 20230906085135 | 1 | 65 | 3 | NULL | NULL | 2112091237 | 13 | 20100322 | NULL |
Is there any optimization I can make so that it runs faster?
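One thing I noticed while looking at the generated `@sqldelete`: wrapping the target column `_id` in `COLLATE` prevents SQL Server from seeking an index on it. A possible rewrite that applies the collation only to the raw side (a sketch; it assumes an index exists on `ACCOUNT._id` and that the two collations are otherwise compatible):

```sql
-- Hypothetical rewrite of @sqldelete for table ACCOUNT:
-- keep t._id bare so its index stays seekable
DELETE t
FROM dbo.ACCOUNT AS t
WHERE EXISTS (
    SELECT 1
    FROM dbo.rawACCOUNT AS r
    WHERE r.raw_id COLLATE French_CI_AS = t._id
);
```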
PS: I tried using STRING_AGG instead of the cursor, but it gave me this error:
The result of STRING_AGG aggregation has exceeded the 8,000-byte limit. Use LOB types to avoid truncation of the result.
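That error occurs because STRING_AGG infers its result type from the input expression, which here is not a LOB type; the documented workaround is to cast the input to `NVARCHAR(MAX)` first. A sketch of the cursor loop rewritten this way, reusing the same `dbo.schemaSource` query as the procedure:

```sql
-- Cursor-free version of building @json_query; the CONVERT lifts the 8,000-byte cap
SELECT @json_query = STRING_AGG(
    CONVERT(NVARCHAR(MAX),
        CASE WHEN single_mult = 'M'
             THEN CONCAT('[', column_name, '] NVARCHAR(MAX) AS JSON')
             ELSE CONCAT('[', column_name, '] NVARCHAR(MAX) ''$.', column_name, '''')
        END),
    ',')
FROM dbo.schemaSource
WHERE _enabled = 1 AND table_name = @tableNamep;
```

Since the fragments are only concatenated into a column list, their order does not matter here; if it ever did, `WITHIN GROUP (ORDER BY ...)` could be appended.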