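Suppose I have Table1 and Table2 with the data below. For each ID, I want to find the next closest date after Table1.JoiningDt that is available in Table2.ClosestDt. Note: the match from Table2 must be later than the date in Table1, and no two IDs may share the same closest match (for example, ID 2 should get 08-Apr-2024 even though 07-Apr-2024 is nearer, because 07-Apr-2024 has already been taken by ID 1).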

Table1:

ID JoiningDt DocNum
1 05-Apr-2024 A123
2 06-Apr-2024 A123
3 04-Apr-2024 B123

Table2:

DocNum ClosestDt
A123 03-Apr-2024
A123 04-Apr-2024
A123 07-Apr-2024
A123 08-Apr-2024
B123 02-Apr-2024
B123 05-Apr-2024

My expected output is:

ID JoiningDt DocNum ClosestDt
1 05-Apr-2024 A123 07-Apr-2024
2 06-Apr-2024 A123 08-Apr-2024
3 04-Apr-2024 B123 05-Apr-2024

When I try a left outer join, I get:

ID JoiningDt DocNum ClosestDt
1 2024-04-05 A123 2024-04-07
1 2024-04-05 A123 2024-04-08
2 2024-04-06 A123 2024-04-07
2 2024-04-06 A123 2024-04-08
3 2024-04-04 B123 2024-04-05

The query I used:

select t1.ID, t1.JoiningDt, t1.DocNum, t2.ClosestDt
from #Table1 t1
left join #Table2 t2 on
    t1.DocNum = t2.DocNum
    and t2.ClosestDt > t1.JoiningDt

I also tried ROW_NUMBER(), but the hardest part is getting the next match for ID 2 (08-Apr rather than 07-Apr), because ID 1 has already taken 07-Apr.

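Recommended answer

You can use a correlated subquery in which you select the nearest date.

Because of this dynamic, you need to pass over the data at least twice. The basic idea is to use ROW_NUMBER() to detect where two or more rows would select the same date, and then pick the next date in line for each of them according to its row number.

This assumes that Table2 contains enough dates to fill the column; otherwise you need more programming to decide which date follows the last one.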

CREATE TABLE Table1 (
  ID INTEGER,
  JoiningDt DATETIME,
  DocNum VARCHAR(4)
);

INSERT INTO Table1
  (ID, JoiningDt, DocNum)
VALUES
  (1, '05-Apr-2024', 'A123'),
  (2, '06-Apr-2024', 'A123'),
  (3, '04-Apr-2024', 'B123');

CREATE TABLE Table2 (
  DocNum VARCHAR(4),
  ClosestDt DATETIME
);

INSERT INTO Table2
  (DocNum, ClosestDt)
VALUES
  ('A123', '03-Apr-2024'),
  ('A123', '04-Apr-2024'),
  ('A123', '07-Apr-2024'),
  ('A123', '08-Apr-2024'),
  ('B123', '02-Apr-2024'),
  ('B123', '05-Apr-2024');
9 rows affected
WITH CTE1 As (
SELECT t1.ID, t1.JoiningDt, t1.DocNum,
  (SELECT TOP 1 ClosestDt FROM Table2 
  WHERE DocNum = t1.DocNum AND ClosestDt > t1.JoiningDt ORDER BY ClosestDt  ASC   ) ClosestDt
  , ROW_NUMBER() OVER(PARTITION BY DocNum,   (SELECT TOP 1 ClosestDt FROM Table2 
  WHERE DocNum = t1.DocNum AND ClosestDt > t1.JoiningDt ORDER BY ClosestDt  ASC   ) ORDER BY ID) rn
FROM Table1 t1
 )
SELECT ID, JoiningDt, DocNum, 
  CASE WHEN rn = 1 then ClosestDt ELSE
  (SELECT ClosestDt FROM Table2 
  WHERE DocNum = c1.DocNum AND ClosestDt > c1.JoiningDt ORDER BY ClosestDt  ASC   
  OFFSET c1.rn -1  ROWS FETCH NEXT 1 ROWS ONLY) END 
  FROM CTE1 c1
ID JoiningDt DocNum (No column name)
1 2024-04-05 00:00:00.000 A123 2024-04-07 00:00:00.000
2 2024-04-06 00:00:00.000 A123 2024-04-08 00:00:00.000
3 2024-04-04 00:00:00.000 B123 2024-04-05 00:00:00.000
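
To make the two passes easier to follow, here is a rough sketch of the same rank-then-offset logic in plain Python rather than T-SQL. It is only an illustration: the helper names later_dates and rank_and_offset are made up for this sketch, and the rows are the sample data from the question.

```python
from collections import defaultdict
from datetime import date

# Sample rows copied from the question: (ID, JoiningDt, DocNum)
table1 = [
    (1, date(2024, 4, 5), "A123"),
    (2, date(2024, 4, 6), "A123"),
    (3, date(2024, 4, 4), "B123"),
]
# (DocNum, ClosestDt)
table2 = [
    ("A123", date(2024, 4, 3)),
    ("A123", date(2024, 4, 4)),
    ("A123", date(2024, 4, 7)),
    ("A123", date(2024, 4, 8)),
    ("B123", date(2024, 4, 2)),
    ("B123", date(2024, 4, 5)),
]


def later_dates(doc_num, joining_dt):
    """Table2 dates for doc_num strictly after joining_dt, ascending."""
    return sorted(d for doc, d in table2 if doc == doc_num and d > joining_dt)


def rank_and_offset(rows):
    """Hypothetical helper: rank rows sharing a nearest date, then skip rn - 1 dates."""
    # Pass 1: nearest later date per row (the TOP 1 subquery).
    nearest = {}
    for row_id, joining_dt, doc_num in rows:
        candidates = later_dates(doc_num, joining_dt)
        nearest[row_id] = candidates[0] if candidates else None

    # Pass 2: ROW_NUMBER() OVER (PARTITION BY DocNum, nearest ORDER BY ID).
    seen = defaultdict(int)
    result = {}
    for row_id, joining_dt, doc_num in sorted(rows):
        key = (doc_num, nearest[row_id])
        seen[key] += 1
        rn = seen[key]
        candidates = later_dates(doc_num, joining_dt)
        # OFFSET rn - 1 ROWS FETCH NEXT 1 ROWS ONLY; None when Table2 runs out.
        result[row_id] = candidates[rn - 1] if rn - 1 < len(candidates) else None
    return result


if __name__ == "__main__":
    for row_id, closest in rank_and_offset(table1).items():
        print(row_id, closest)
```

On the sample data this gives 07-Apr, 08-Apr and 05-Apr for IDs 1, 2 and 3. If a fourth A123 row with JoiningDt 06-Apr-2024 were added, it would get no date at all, which is the "not enough data in Table2" case mentioned above.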


Another approach uses two CTEs so that the subselect does not run twice, but it is otherwise just a rewrite of the first query.

WITH CTE1 As (
SELECT t1.ID, t1.JoiningDt, t1.DocNum,
  (SELECT TOP 1 ClosestDt FROM Table2 
  WHERE DocNum = t1.DocNum AND ClosestDt > t1.JoiningDt ORDER BY ClosestDt  ASC   ) ClosestDt
FROM Table1 t1
 ), CTE2 AS (
  SELECT
       ID, JoiningDt, DocNum, ClosestDt
    , ROW_NUMBER() OVER(PARTITION BY DocNum,   ClosestDt ORDER BY ID) rn
  FROM CTE1
  )
SELECT ID, JoiningDt, DocNum, 
  CASE WHEN rn = 1 then ClosestDt ELSE
  (SELECT ClosestDt FROM Table2 
  WHERE DocNum = c1.DocNum AND ClosestDt > c1.JoiningDt ORDER BY ClosestDt  ASC   
  OFFSET c1.rn -1  ROWS FETCH NEXT 1 ROWS ONLY) END 
  FROM CTE2 c1
ID JoiningDt DocNum (No column name)
1 2024-04-05 00:00:00.000 A123 2024-04-07 00:00:00.000
2 2024-04-06 00:00:00.000 A123 2024-04-08 00:00:00.000
3 2024-04-04 00:00:00.000 B123 2024-04-05 00:00:00.000


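This is best done in a programming language. A cursor is a slow approach, and with too many rows it will take a long time, so you should limit the number of rows read from Table1.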

DECLARE @ID int
DECLARE @JoiningDt DATETIME
DECLARE @DocNum VARCHAR(4)

DECLARE @MyTableVar TABLE (
    ID INT NOT NULL,
    JoiningDt datetime,
    DocNum VARCHAR(4),
    NearestDate DATETIME);


DECLARE db_cursor CURSOR FOR 
SELECT ID, JoiningDt, DocNum FROM Table1

OPEN db_cursor  
FETCH NEXT FROM db_cursor INTO @ID, @JoiningDt, @DocNum

WHILE @@FETCH_STATUS = 0  
BEGIN  
    -- Take the earliest later date that no previous row has claimed yet
    INSERT INTO @MyTableVar
    SELECT TOP 1 @ID, @JoiningDt, @DocNum, ClosestDt
    FROM Table2
    WHERE DocNum = @DocNum AND ClosestDt > @JoiningDt
      AND NOT EXISTS (SELECT 1 FROM @MyTableVar WHERE NearestDate = ClosestDt)
    ORDER BY ClosestDt ASC;

      FETCH NEXT FROM db_cursor INTO @ID, @JoiningDt, @DocNum
END 

CLOSE db_cursor  
DEALLOCATE db_cursor
SELECT * FROM @MyTableVar ORDER BY ID 

ID JoiningDt DocNum NearestDate
1 2024-04-05 00:00:00.000 A123 2024-04-07 00:00:00.000
2 2024-04-06 00:00:00.000 A123 2024-04-08 00:00:00.000
3 2024-04-04 00:00:00.000 B123 2024-04-05 00:00:00.000
4 2024-04-07 00:00:00.000 A123 2024-04-09 00:00:00.000
5 2024-04-08 00:00:00.000 A123 2024-04-11 00:00:00.000
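
Since the answer suggests doing this in a programming language, here is a minimal sketch of the same row-by-row idea in Python. It makes a few assumptions that are not in the original: rows are visited in ID order (the cursor above does not specify an order), a date counts as taken per DocNum (the cursor compares the date alone), and assign_greedy is a made-up helper name. Only the sample data from the question is used.

```python
from datetime import date

# Sample rows copied from the question: (ID, JoiningDt, DocNum)
table1 = [
    (1, date(2024, 4, 5), "A123"),
    (2, date(2024, 4, 6), "A123"),
    (3, date(2024, 4, 4), "B123"),
]
# (DocNum, ClosestDt)
table2 = [
    ("A123", date(2024, 4, 3)),
    ("A123", date(2024, 4, 4)),
    ("A123", date(2024, 4, 7)),
    ("A123", date(2024, 4, 8)),
    ("B123", date(2024, 4, 2)),
    ("B123", date(2024, 4, 5)),
]


def assign_greedy(rows, available):
    """Hypothetical helper: give each row the earliest later date not yet taken."""
    used = set()  # (DocNum, date) pairs already handed out (assumption: per DocNum)
    result = {}
    for row_id, joining_dt, doc_num in sorted(rows):  # assumption: ID order
        candidates = sorted(
            d
            for doc, d in available
            if doc == doc_num and d > joining_dt and (doc, d) not in used
        )
        if candidates:
            used.add((doc_num, candidates[0]))
            result[row_id] = candidates[0]
        # Like the cursor's INSERT ... SELECT TOP 1, a row with no free date is skipped.
    return result


if __name__ == "__main__":
    for row_id, nearest in assign_greedy(table1, table2).items():
        print(row_id, nearest)
```

Unlike the ROW_NUMBER() queries, this keeps an explicit set of dates that are already taken, so it does not depend on the rows that collide sharing the same nearest date.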

