# Python: how to fill missing dates with NaN values for each combination of multiple columns

My df looks like the following:

``````
ds          | col1 | col2 | col3 | values
01/01/2020  | x0   | y0   | z0   | 12
01/02/2020  | x0   | y0   | z0   | 11
01/03/2020  | x1   | y0   | z0   | 14
01/02/2020  | x0   | y1   | z0   | 19
01/03/2020  | x0   | y1   | z0   | 11
``````

With a fixed start date of 01/01/2020 and a fixed end date of 01/03/2020, I want to fill in the missing dates for each combination of col1, col2, and col3, with NaN in the values column for the added rows. The output should be the following:

``````
ds          | col1 | col2 | col3 | values
01/01/2020  | x0   | y0   | z0   | 12
01/02/2020  | x0   | y0   | z0   | 11
01/03/2020  | x0   | y0   | z0   | NaN
01/01/2020  | x1   | y0   | z0   | NaN
01/02/2020  | x1   | y0   | z0   | NaN
01/03/2020  | x1   | y0   | z0   | 14
01/01/2020  | x0   | y1   | z0   | NaN
01/02/2020  | x0   | y1   | z0   | 19
01/03/2020  | x0   | y1   | z0   | 11
``````

## Recommended answer

Try:

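``````
import pandas as pd

# ensure datetime (dayfirst=True reads the dates as day/month/year):
df["ds"] = pd.to_datetime(df["ds"], dayfirst=True)

# one entry per month start between the fixed start and end dates
dr = pd.date_range("2020-01-01", "2020-03-01", freq="MS")

def reindex(df, cols_to_fill=("col1", "col2", "col3")):
    cols = list(cols_to_fill)
    # add the missing dates as rows of NaN
    df = df.set_index("ds").reindex(dr)
    # restore the group's key columns on the newly added rows
    df.loc[:, cols] = df.loc[:, cols].ffill().bfill()
    return df.reset_index().rename(columns={"index": "ds"})

df = (
    df.groupby(["col1", "col2", "col3"], sort=False, group_keys=False)
    .apply(reindex)
    .reset_index(drop=True)
)
print(df)
``````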

Prints:

``````
          ds col1 col2 col3  values
0 2020-01-01   x0   y0   z0    12.0
1 2020-02-01   x0   y0   z0    11.0
2 2020-03-01   x0   y0   z0     NaN
3 2020-01-01   x1   y0   z0     NaN
4 2020-02-01   x1   y0   z0     NaN
5 2020-03-01   x1   y0   z0    14.0
6 2020-01-01   x0   y1   z0     NaN
7 2020-02-01   x0   y1   z0    19.0
8 2020-03-01   x0   y1   z0    11.0
``````
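As an alternative to `groupby(...).apply(...)`, the same result can probably be reached by building the full grid of dates and key combinations first and then left-joining the original rows onto it. This is only a sketch, not part of the answer above: the sample frame is retyped from the question, the dates are assumed to be day/month/year as in the answer, and the names `keys`, `dates`, `combos`, `grid`, and `filled` are made up for illustration. It relies on `merge(how="cross")`, which needs pandas 1.2 or newer.

```python
import pandas as pd

# Sample data retyped from the question (dates assumed to be day/month/year).
df = pd.DataFrame(
    {
        "ds": ["01/01/2020", "01/02/2020", "01/03/2020", "01/02/2020", "01/03/2020"],
        "col1": ["x0", "x0", "x1", "x0", "x0"],
        "col2": ["y0", "y0", "y0", "y1", "y1"],
        "col3": ["z0", "z0", "z0", "z0", "z0"],
        "values": [12, 11, 14, 19, 11],
    }
)
df["ds"] = pd.to_datetime(df["ds"], dayfirst=True)

keys = ["col1", "col2", "col3"]  # hypothetical name for the grouping columns
dates = pd.DataFrame({"ds": pd.date_range("2020-01-01", "2020-03-01", freq="MS")})

# Every observed (col1, col2, col3) combination paired with every date.
combos = df[keys].drop_duplicates()
grid = combos.merge(dates, how="cross")

# Left-join the original rows; dates with no matching row get NaN in "values".
filled = grid.merge(df, on=keys + ["ds"], how="left")[["ds", *keys, "values"]]
print(filled)
```

Because the key columns come from the grid rather than from forward and backward filling, this version does not depend on each group having at least one existing row to copy from, and it only produces combinations that actually occur in the data.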