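I have several years (2003-2008) of data that is unevenly distributed with respect to date. I want to query the data for a given start and end date, grouping it by any of the intervals (day, week, month, quarter, year) supported in PostgreSQL 8.3 (http://www.postgresql.org/docs/8.3/static/functions-datetime.html#FUNCTIONS-DATETIME-TRUNC).
The problem is that some queries return continuous results over the requested period, like this one: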
select to_char(date_trunc('month',date), 'YYYY-MM-DD'),count(distinct post_id)
from some_table where category_id=1 and entity_id = 77 and entity2_id = 115
and date <= '2008-12-06' and date >= '2007-12-01' group by
date_trunc('month',date) order by date_trunc('month',date);
to_char | count
------------+-------
2007-12-01 | 64
2008-01-01 | 31
2008-02-01 | 14
2008-03-01 | 21
2008-04-01 | 28
2008-05-01 | 44
2008-06-01 | 100
2008-07-01 | 72
2008-08-01 | 91
2008-09-01 | 92
2008-10-01 | 79
2008-11-01 | 65
(12 rows)
But some of them skip intervals because there is no data for them, like this one:
select to_char(date_trunc('month',date), 'YYYY-MM-DD'),count(distinct post_id)
from some_table where category_id=1 and entity_id = 75 and entity2_id = 115
and date <= '2008-12-06' and date >= '2007-12-01' group by
date_trunc('month',date) order by date_trunc('month',date);
to_char | count
------------+-------
2007-12-01 | 2
2008-01-01 | 2
2008-03-01 | 1
2008-04-01 | 2
2008-06-01 | 1
2008-08-01 | 3
2008-10-01 | 2
(7 rows)
where the desired result set is:
to_char | count
------------+-------
2007-12-01 | 2
2008-01-01 | 2
2008-02-01 | 0
2008-03-01 | 1
2008-04-01 | 2
2008-05-01 | 0
2008-06-01 | 1
2008-07-01 | 0
2008-08-01 | 3
2008-09-01 | 0
2008-10-01 | 2
2008-11-01 | 0
(12 rows)
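with a count of 0 for the missing entries.
I have seen earlier discussions on Stack Overflow, but they don't seem to solve my problem, because my grouping period is one of (day, week, month, quarter, year) and is decided at runtime by the application. So I assume an approach like a left join with a calendar table or a sequence table would not help.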
My current solution is to fill in these gaps in Python (in a TurboGears app) using the calendar module.
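For reference, this kind of gap filling can be sketched roughly as follows. It is a minimal illustration for the monthly case only, not the actual application code: the function names `month_starts` and `fill_monthly_gaps` are hypothetical, it uses `datetime` rather than the calendar module, and it assumes the query rows arrive as (date string, count) pairs as in the output above.

```python
from datetime import date


def month_starts(start, end):
    """Yield the first day of each month from start's month through end's month."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield date(year, month, 1)
        month += 1
        if month > 12:
            year, month = year + 1, 1


def fill_monthly_gaps(rows, start, end):
    """Return one (date string, count) pair per month, using 0 where the query had no row.

    rows is assumed to be a list of ('YYYY-MM-DD', count) pairs, as produced by
    to_char(date_trunc('month', date), 'YYYY-MM-DD') in the queries above.
    """
    counts = dict(rows)
    return [
        (d.isoformat(), counts.get(d.isoformat(), 0))
        for d in month_starts(start, end)
    ]
```

Other intervals (day, week, quarter, year) would each need their own bucket generator. Note that with an end date of 2008-12-06 this sketch also emits a 2008-12-01 row with a count of 0.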
Is there a better way to do this?