下面是一张不同年份不同口味冰淇淋的桌子(名称:ice_cream_table):
Year Flavor
2008 Mint
2008 Mint
2008 Cookie Dough
2008 Cookie Dough
2008 Pistachio
2013 Chocolate
2013 Cookie Dough
2013 Pistachio
2013 Chocolate
2013 Pistachio
2017 Chocolate
2017 Vanilla
2017 Chocolate
2017 Cookie Dough
2017 Strawberry
2019 Mango
2019 Lemon
2019 Vanilla
2019 Mango
2019 Mango
2022 Chocolate
2022 Chocolate
2022 Mint
2022 Strawberry
2022 Cherry
CREATE TABLE ice_cream_table (
"Year" INT,
"Flavor" VARCHAR(100)
);
INSERT INTO ice_cream_table
("Year", "Flavor")
VALUES
('2008', 'Mint'),
('2008', 'Mint'),
('2008', 'Cookie Dough'),
('2008', 'Cookie Dough'),
('2008', 'Pistachio'),
('2013', 'Chocolate'),
('2013', 'Cookie Dough'),
('2013', 'Pistachio'),
('2013', 'Chocolate'),
('2013', 'Pistachio'),
('2017', 'Chocolate'),
('2017', 'Vanilla'),
('2017', 'Chocolate'),
('2017', 'Cookie Dough'),
('2017', 'Strawberry'),
('2019', 'Mango'),
('2019', 'Lemon'),
('2019', 'Vanilla'),
('2019', 'Mango'),
('2019', 'Mango'),
('2022', 'Chocolate'),
('2022', 'Chocolate'),
('2022', 'Mint'),
('2022', 'Strawberry'),
('2022', 'Cherry');
在每一年,我都想弄清楚:有多少口味是第一次出现的,有多少口味是往年出现的,有多少种不同的口味(第一次+重复)?
以下是我的代码:
with yearly_flavor as (
select
year,
flavor,
row_number() over (partition by flavor order by year) as rn
from
ice_cream_table
group by
flavor, year
),
new_flavor as (
select
year,
count(flavor) as new_flavor
from
yearly_flavor
where
rn = 1
group by
year
),
repeated_flavor as (
select
year,
case
when count(flavor) is null then 0
else count(flavor)
end as repeated_flavor,
count(flavor) as new_flavor
from
yearly_flavor
where
rn > 1
group by
year
),
total_flavor as (
select
year,
count(distinct flavor) as total_flavor
from
ice_cream_table
group by
year
)
select
n.year,
n.new_flavor,
r.repeated_flavor,
t.total_flavor
from
new_flavor n
left join
repeated_flavor r on n.year = r.year
join
total_flavor t on n.year = t.year
order by
n.year;
我的输出:
+------+-----------+----------------+--------------+
| Year | new_flavor | repeated_flavor| total_flavor |
+------+-----------+----------------+--------------+
| 2008 | 3 | NULL | 3 |
| 2013 | 1 | 2 | 3 |
| 2017 | 2 | 2 | 4 |
| 2019 | 2 | 1 | 3 |
| 2022 | 1 | 3 | 4 |
+------+-----------+----------------+--------------+
我做得对吗?我使用Python的经验更丰富,而使用SQL的经验更少.代码运行,但我不确定我的逻辑是否捕捉到了本质.