我有一个表(表的描述):

Name Type
KEYVALUE VARCHAR2(100)
TEXT CLOB

示例

Keyvalue Text
101 Customer Input 05/15/2023 07:20:20 My name is ABX +++ Private Notes What is you name+++Customer Input 04/30/2023 19:40:58 I have issue related to water purifier purchased on Jan 23 +++ Public Notes 04/30/2023 18:19:18 +++Customer Input 04/30/2023 Requesting to send a technicial, we could not bring them up due to the same issue that was looked into in ticket 20092. We dont know if this is the same issue as the previous ticket, but need to know the reason Language Preference: English
102 Customer Input 05/15/2023 07:20:20 20424596 Reference to the above ticket+++Customer Input 04/30/2023 19:40:58 Plesae replace the item as this is a faulty one +++ Public Notes 04/30/2023 18:19:18 +++Customer Input 04/30/2023 17:54:54 Shared the faulty machine pics for quick action Problem Context: When was the issue first observed? - 4/30, 1AM or so Were there any recent changes or maintenance performed? - Language Preference: English

我基本上是用"客户输入"一词将整个阶段分成多行. 类似于以下内容:

SELECT distinct keyvalue, level pos, trim(regexp_substr(text, 'Customer Input[^+++]*', 1, level))  x
  FROM 
  (
    SELECT 101 as keyvalue,'Customer Input 05/15/2023 07:20:20 My name is ABX +++ Private Notes What is you name+++Customer Input 04/30/2023 19:40:58 I have issue related to water purifier purchased on Jan 23 
    +++ Public Notes 04/30/2023 18:19:18 +++Customer Input 04/30/2023 Requesting to send a technicial, we could not bring them up due to the same issue that was looked into in ticket 20092. We dont know if this is the same issue as the previous ticket, but need to know the reason Language Preference: English| '
    as text from dual
    union all 
    SELECT 102 as keyvalue,' Customer Input 05/15/2023 07:20:20 20424596 Reference to the above ticket+++Customer Input 04/30/2023 19:40:58 Plesae replace the item as this is a faulty one 
    +++ Public Notes 04/30/2023 18:19:18 +++Customer Input 04/30/2023 17:54:54 Shared the faulty machine pics for quick action Problem Context: When was the issue first observed? - 4/30, 1AM or so Were there any recent changes 
    or maintenance performed? - Language Preference: English| '
    as text from dual  

  ) t
CONNECT BY instr(text, 'Customer Input', 1, level - 1) > 0
order by keyvalue;
keyvalue pos text
101 1 Customer Input 05/15/2023 07:20:20 My name is ABX
101 2 Customer Input 04/30/2023 19:40:58 I have issue related to water purifier purchased on Jan 23
101 3 Customer Input 04/30/2023 Requesting to send a technicial, we could not bring them up due to the same issue that was looked into in ticket 20092. We dont know if this is the same issue as the previous ticket, but need to know the reason Language Preference: English
101 4
102 1 Customer Input 05/15/2023 07:20:20 20424596 Reference to the above ticket
102 2 Customer Input 04/30/2023 19:40:58 Please replace the item as this is a faulty one
102 3 Customer Input 04/30/2023 17:54:54 Shared the faulty machine pics for quick action Problem Context: When was the issue first observed? - 4/30, 1AM or so Were there any recent changes or maintenance performed? - Language Preference: English
102 4

这可以很好地工作,因为文本列是CHARACTER数据类型.

但是当我运行下面的查询时(在数据类型为lob的Actual列上)

SELECT distinct keyvalue, level pos, trim(regexp_substr(customer_input_info, 'Customer Input[^+++]*', 1, level)) str
  FROM (select 101 as keyvalue,to_clob('Customer Input 05/15/2023 07:20:20 20424596 Reference to the above ticket+++Customer Input 04/30/2023 19:40:58 Plesae replace the item as this is a faulty one 
    +++ Public Notes 04/30/2023 18:19:18 +++Customer Input 04/30/2023 17:54:54 Shared the faulty machine pics for quick action Problem Context: When was the issue first observed? - 4/30, 1AM or so Were there any recent changes 
    or maintenance performed? - Language Preference: English| ') as customer_input_info from dual) t
CONNECT BY instr(customer_input_info, 'Customer Input', 1, level - 1) > 0
order by 1

我的错误越来越小了

ORA-00932: inconsistent datatypes: expected - got CLOB 00932. 00000 - "inconsistent datatypes: expected %s got %s" *Cause:
*Action: Error at Line: 44 Column: 38.

我无法更改内部SQL查询,因为源表是CLOB数据类型.我应该对外部查询进行哪些更改.

推荐答案

如果源文本是CLOB,不管长度有多长,您发布的代码都会收到该错误.问题不在于长度本身,而在于每个拆分行值也是一个CLOB,您不能将distinct与CLOB一起使用.

使用distinct通常是一个迹象,表明有一个更深层次的问题正在被掩盖.如果没有它,您确实会得到重复项,但这是针对多个源行的CONNECT-BY查询的一个众所周知的问题,并且随着行数的增加,情况会变得越来越糟.

您需要将CONNECT-BY限制到相同的源行,这很简单,假设keyvalue是唯一的;但您还需要引入非确定性函数调用,以防止它inflating 结果,例如:

CONNECT BY instr(text, 'Customer Input', 1, level - 1) > 0
AND keyvalue = PRIOR keyvalue
AND PRIOR dbms_random.value IS NOT NULL

fiddle

如果您切换到使用递归子查询分解而不是分层查询,您可能会发现更容易理解和维护:

WITH r (keyvalue, text, pos, x) as (
  SELECT keyvalue, text, 1, trim(regexp_substr(text, 'Customer Input[^+++]*', 1, 1))
  FROM t
  UNION ALL
  SELECT keyvalue, text, pos + 1, trim(regexp_substr(text, 'Customer Input[^+++]*', 1, pos + 1))
  FROM r
  WHERE instr(text, 'Customer Input', 1, pos) > 0
)
SELECT keyvalue, pos, x
FROM r
order by keyvalue, pos;
KEYVALUE POS X
101 1 Customer Input 05/15/2023 07:20:20 My name is ABX
101 2 Customer Input 04/30/2023 19:40:58 I have issue related to water purifier purchased on Jan 23
101 3 Customer Input 04/30/2023 Requesting to send a technicial, we could not bring them up due to the same issue that was looked into in ticket 20092. We dont know if this is the same issue as the previous ticket, but need to know the reason Language Preference: English|
101 4
102 1 Customer Input 05/15/2023 07:20:20 20424596 Reference to the above ticket
102 2 Customer Input 04/30/2023 19:40:58 Plesae replace the item as this is a faulty one
102 3 Customer Input 04/30/2023 17:54:54 Shared the faulty machine pics for quick action Problem Context: When was the issue first observed? - 4/30, 1AM or so Were there any recent changes
    or maintenance performed? - Language Preference: English|
102 4

fiddle

我给它留了相同的停止条件,这将生成一个最终的空条目.无论是哪种方法,你都可能想要重新审视这一点.

Sql相关问答推荐

GROUP BY和GROUP_CONCAT用于计算比赛排名

使用`lag()`获取上一个时间戳

没有循环的SQL更新多个XML node 值

PostgreSQL使用SQL子查询在时间间隔内 Select 数据

直接加法(1+1)与聚合函数SUM(1+1)的区别是什么

如何实现同一列的递归计算?

如何在Postgres中为单值输入多行?

如何在 SQL Server 中解决这个复杂的窗口查询?

如何将 START 和 END 日期之间的日期差异作为 SQL 中的单独列获取

如何为 ActiveRecord 联接应用附加条件

删除每个不同日期中未 Select 的最短最长时间

INSERT INTO 语法

如何在 SQL 中将两行(或多行)jsonb 数组合并为一行

删除重复记录但保留最新的SQL查询

IN子句使用的表值用户定义函数参数

计数时如何为所有时间间隔返回 0 而不是什么都不返回

Postgresql 需要一个查询,为我提供所有没有具有特定状态值的子元素的父母

如何根据与 BigQuery 中另一个表的匹配更新一个表中的列?

Select 多年的日期范围

当我按 PK 分组时,该表中的所有列在每个组中都具有相同的值.那么为什么 SQL Server 需要对这些列进行聚合呢?