我需要确定表中的权重是否有效.此表有五列:USER_ID、ACTIVATION_ID、ACTIVATION_DT、Weight和Weight_Source.查询输出需要包括名为VALID_WEIGHTY的计算列.每个用户的第一行应将Valid_Weight列标记为1.Valid_Weight列中的1表示权重有效.如果当前权重与VALID_WEIGHTY列中包含1的上次权重之间的变化小于5%,则认为权重有效.如果权重被视为有效,则VALID_WEIGHT列将被标记为1.如果权重被视为无效,则VALID_WEIGHT列将被标记为0.当前行的权重不能根据Valid_Weight列中为0的前一行进行计算,而必须根据Valid Weight列中为1的最新行进行计算.

我一直在try 使用递归CTE,但由于达到max_recursion_row限制,它一直失败.我认为我的代码正在产生无限循环.

我只需要知道判断表中所有行的最有效方法.

结果应该是这样的:

Activity_ID User_ID Activity_DT Weight Weight_Source Valid Weight
1 56 2017-04-19 163 clinic 1
2 56 2017-04-25 163.80 home_scale 1
3 56 2017-04-27 164.46 home_scale 1
4 56 2017-05-11 85.32 home_scale 0
5 56 2017-05-12 190.26 home_scale 0
6 56 2017-05-20 166.45 home_scale 1
7 56 2017-05-21 191.14 home_scale 0
8 56 2017-06-09 159.17 home_scale 1
9 56 2017-06-10 160.94 home_scale 1

以下是我一直试图引用的递归CTE代码:

WITH    RECURSIVE   cteValWgt(User_ID, Activity_ID, Activity_DT, Weight, Weight_Source, Valid_Weight)
        
AS      (SELECT comb.User_ID,
                comb.Activity_ID,
                comb.Activity_DT,
                comb.Weight,
                comb.Weight_Source,
                1 AS Valid_Weight
                
         FROM   etl.rdm_weight_validation AS COMB
         
         WHERE  comb.ROW# = 1
         
        UNION ALL
        
         SELECT comb.User_ID,
                comb.Activity_ID,
                comb.Activity_DT,
                comb.Weight,
                comb.Weight_Source,
                CASE    WHEN ABS(comb.Weight - vw.Weight)/(vw.Weight) <= 0.05 THEN 1
                        ELSE 0
                END AS Valid_Weight
                
         FROM   cteValWgt AS VW
                INNER JOIN  etl.rdm_weight_validation AS comb
                            ON  vw.User_ID = comb.User_ID
                                AND
                                vw.Activity_DT < comb.Activity_DT
                                AND
                                vw.Valid_Weight = 1)

SELECT  vw.User_ID,
        vw.Activity_ID,
        vw.Activity_DT,
        vw.Weight,
        vw.Weight_Source,
        vw.Valid_Weight

FROM    cteValWgt AS VW

ORDER BY vw.User_ID, vw.Activity_DT
                                

推荐答案

您的查询似乎并未实际迭代数据集;锚点按ROW# = 1过滤,但此列在递归中不起作用,在递归中,从CTE到数据表的联接条件看起来不正确.

在没有现有测序栏的情况下,我们可以从用row_number()枚举记录开始,这样我们就可以使用该信息高效地从一个记录遍历到下一个记录.

此外,我们还可以通过跟踪遇到的最后一个有效权重来优化查询,以便使用case个表达式直接与当前权重进行比较.

以下是对查询进行措辞的一种方式:

with recursive 
    -- enumerate the dataset for each user
    data as (
        select t.*, 
            row_number() over(partition by user_id order by activity_id) rn
        from mytable t 
    ),
    -- recursive CTE
    rcte (activity_id, user_id, activity_dt, weight, weight_source, rn, weight_is_valid, last_valid_weight) as (
        -- anchor : first row per user
        select activity_id, user_id, activity_dt, weight, weight_source, rn, 1, weight
        from data
        where rn = 1
        union all
        -- recursion
        select d.activity_id, d.user_id, d.activity_dt, d.weight, d.weight_source, d.rn,
            -- conditional expressions to assess the validity of the weight
            case when abs( d.weight - r.last_valid_weight ) / r.last_valid_weight <= 0.05 then 1 else 0 end,
            case when abs( d.weight - r.last_valid_weight ) / r.last_valid_weight <= 0.05 then d.weight else r.last_valid_weight end
        from rcte r
        -- join to bring the "next" user row
        inner join data d on d.user_id = r.user_id and d.rn = r.rn + 1
    )
select * from rcte
activity_id user_id activity_dt weight weight_source rn weight_is_valid last_valid_weight
1 56 2017-04-19 163.00 clinic 1 1 163.00
2 56 2017-04-25 163.80 home_scale 2 1 163.80
3 56 2017-04-27 164.46 home_scale 3 1 164.46
4 56 2017-05-11 85.32 home_scale 4 0 164.46
5 56 2017-05-12 190.26 home_scale 5 0 164.46
6 56 2017-05-20 166.45 home_scale 6 1 166.45
7 56 2017-05-21 191.14 home_scale 7 0 166.45
8 56 2017-06-09 159.17 home_scale 8 1 159.17
9 56 2017-06-10 160.94 home_scale 9 1 160.94

100

Sql相关问答推荐

在甲骨文中查找前一个星期一的S日期

替换条件中的单元格值

没有循环的SQL更新多个XML node 值

仅 for each 唯一ID返回一个元素,并仅返回最新连接的记录

当一个视图在Postgres中失效时?

我可以在SQL的IN子句中使用比子查询包含的值更少的值吗?

用户购买平台及金额统计

根据最大值为字母数字大小写分配数值

是否可以为表中的所有列生成散列值?

动态组/转置

SQL将三个表中的三列组合为一列

显示十进制列,但尽可能显示为整数

不存在记录的国外关键点

使用多个数据库调用重载 CQRS 模式

创建具有多个子查询的 SQL 视图

计算组内多个日期间隔go 年的累计天数

使用长 IN 子句的 SQL 优化

批量更改WooCommerce中所有产品的税收状态

忽略与给定列匹配的行的 LAG 函数

Postgres:表的累积视图