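Users submit intervals of the start and end pages they have read in a book. Note that a user can submit multiple intervals for the same book. I need a query that returns the five most recommended books in the system, ranked by how many unique pages have been read across all users who submitted intervals in the first operation (sorted from the book with the most pages read to the book with the fewest).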

The book_user table is the pivot table I need to query. How can I get the result below for the following inserted records?

Reading intervals:

User 1 read from page 10 to page 30 in Book 1
User 2 read from page 2 to page 25 in Book 1
User 1 read from page 40 to page 50 in Book 2
User 3 read from page 1 to page 10 in Book 2
The most read books results:
Book 1 -> 28 pages
Book 2 -> 20 pages

I tried this query:

select book_user.book_id, books.name as book_name,
       SUM(end_page - start_page) AS num_of_read_pages
FROM book_user
JOIN books ON books.id = book_user.book_id
GROUP BY book_user.book_id, books.name
ORDER BY num_of_read_pages DESC;

But it does not count only the unique pages when intervals overlap.
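
To see why the plain SUM over-counts, here is a small Python sketch (not part of the original question; the interval list simply restates the Book 1 sample above). It compares the summed interval lengths with the number of distinct pages:

```python
# Book 1 intervals from the sample above: (start_page, end_page)
book1 = [(10, 30), (2, 25)]

# What SUM(end_page - start_page) computes: pages 10..25 are counted twice.
naive_total = sum(end - start for start, end in book1)

# Distinct pages, counting both endpoints of every interval.
unique_pages = set()
for start, end in book1:
    unique_pages.update(range(start, end + 1))

print(naive_total)        # 43
print(len(unique_pages))  # 29
```

The expected value of 28 quoted above corresponds to 30 - 2, that is, without counting the end page; counting both endpoints gives 29.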

When I asked ChatGPT, it gave me the following recursive query, but it does not work; it just loops:

WITH RECURSIVE cte AS (
    SELECT book_id, MIN(start_page) AS start_page, MAX(end_page) AS end_page
    FROM book_user
    GROUP BY book_id, start_page
    UNION ALL
    SELECT cte.book_id, cte.start_page, cte.end_page
    FROM cte
    JOIN book_user ON cte.book_id = book_user.book_id AND cte.start_page <= book_user.start_page AND cte.end_page >= book_user.end_page
)
SELECT book_id, SUM(end_page - start_page + 1) AS total_pages
FROM cte
GROUP BY book_id
ORDER BY total_pages DESC;

Recommended answer

See the example.

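  1. Group ranges (segments) that share the same start_page, e.g. (3-10), (3-10), (3-21) -> (3-21), and number them with ROW_NUMBER() ordered by start_page.
  2. Recursive query
    2.1. Anchor: take the ranges whose start_page does not fall inside any other range.
    From the set (3-10), (5-12), (4-7), take (3-10).
    2.2. Join the overlapping ranges.
  3. Group by start_page again.
  4. Count the pages.
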
with recursive t as (
  -- step 1: collapse ranges that share a start_page into the longest one,
  -- and number them by start_page so they can be joined in order
  select book_id, start_page, max(end_page) as end_page,
         row_number() over (partition by book_id order by start_page) as rn
  from book_user
  group by book_id, start_page
),
r as (
  -- step 2.1, anchor: ranges whose start_page is not covered by an earlier range
  select 0 as lvl, bu.book_id, bu.start_page, bu.end_page, bu.rn
  from t bu
  where not exists (
    select 1 from t bu2
    where bu2.book_id = bu.book_id and bu2.rn < bu.rn
      and bu.start_page between bu2.start_page and bu2.end_page)
  union all
  -- step 2.2: extend a segment with any later range that starts at or before its end_page
  select r.lvl + 1, r.book_id, r.start_page, t.end_page, t.rn
  from r inner join t on t.book_id = r.book_id and t.rn > r.rn
       and t.start_page <= r.end_page
)
-- step 4: count the pages, both endpoints included
select book_id, sum(end_page - start_page + 1) as total_pages
from ( -- step 3: keep the furthest end_page reached from each anchor start_page
      select book_id, start_page, max(end_page) as end_page
      from r
      group by book_id, start_page
  ) gr
group by book_id
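
The totals can also be cross-checked outside the database. The following is a minimal Python sketch of the same idea, not the answer's query itself: it sorts each book's ranges by start_page, merges ranges that overlap, and sums the inclusive lengths. The function names `unique_pages_per_book` and `top_books` are hypothetical, and the rows are copied from the test data listed further down.

```python
from collections import defaultdict


def unique_pages_per_book(rows):
    """rows: iterable of (user_id, book_id, start_page, end_page) tuples."""
    by_book = defaultdict(list)
    for _user_id, book_id, start, end in rows:
        by_book[book_id].append((start, end))

    totals = {}
    for book_id, ranges in by_book.items():
        ranges.sort()  # order by start_page, then end_page
        total = 0
        cur_start, cur_end = ranges[0]
        for start, end in ranges[1:]:
            if start <= cur_end:
                # Overlaps the current merged segment: extend it if needed.
                cur_end = max(cur_end, end)
            else:
                # Gap: close the current segment and start a new one.
                total += cur_end - cur_start + 1
                cur_start, cur_end = start, end
        total += cur_end - cur_start + 1  # close the last segment
        totals[book_id] = total
    return totals


def top_books(rows, n=5):
    """Return up to n (book_id, total_pages) pairs, most pages first."""
    totals = unique_pages_per_book(rows)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Same rows as the book_user test data.
rows = [
    (10, 1, 10, 30), (20, 1, 2, 25), (30, 1, 2, 26), (30, 1, 10, 31), (40, 1, 33, 40),
    (10, 2, 40, 50), (30, 2, 1, 10), (40, 2, 10, 11), (20, 2, 11, 14),
    (10, 3, 1, 10), (20, 3, 20, 40),
]
print(top_books(rows))  # [(1, 38), (3, 31), (2, 25)]
```

To get the five most-read books the question asks for, the final select of the SQL query could presumably be followed by `order by total_pages desc limit 5`.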

See the demo for details.

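Output (the query as shown above selects only book_id and total_pages, so it does not produce the path column)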

book_id total_pages path
1 38 2-2:25,26,2:25,26:31,33-33:40
2 25 1-1:10,1:10:11,1:10:11:14,40-40:50
3 31 1-1:10,20-20:40

Test data

create table books(id int,book_name varchar(20));
insert into books values(1,'Book 1'),(2,'Book 2'),(3,'Book 3');
create table users(id int,user_name varchar(20));
insert into users values(10,'User 10'),(20,'User 20'),(30,'User 30'),(40,'User 40');
create table book_user(user_id int,book_id int,start_page int,end_page int);
insert into book_user values
 (10,1, 10,30)
,(20,1,  2,25)
,(30,1,  2,26)
,(30,1, 10,31)
,(40,1, 33,40)
,(10,2, 40,50)
,(30,2,  1,10)
,(40,2, 10,11)
,(20,2, 11,14)
,(10,3,  1,10)
,(20,3, 20,40)
;
