When setting the index of a Pandas DataFrame, the last element of the column array does not merge/group its items together.

Assume the following test data:

test_data = {
    "desk": ["DESK1", "DESK2", "DESK3", "DESK4", "DESK5", "DESK6", "DESK7", "DESK8", "DESK9", "DESK10"],
    "phone": ["111-1111", "111-1111", "111-1111", "111-1111", "444-4444", "444-4444", "111-1111", "111-1111", "123-4567", "123-4567"],
    "email": ["Desk1_Email@email.com", "Desk1_Email@email.com", "Desk3_Email@email.com", "Desk4_Email@email.com", "Desk5@email.com", "Desk5@email.com", "Desk7@email.com", "Desk8@email.com", "Desk9@email.com", "Desk10@email.com"],
    "team1": ["Adam", "xxxx", "Tiana", "", "Gina", "Gina", "Ruby", "Becca", "John", ""],
    "team2": ["", "", "Dime", "", "Ed", "", "", "", "Fa", "Tim"],
}

The DataFrame is created:

import io
import pandas as pd
from django.http.response import HttpResponse
from rest_framework import status

### Create DataFrame from test_data
df = pd.DataFrame(test_data)

Then I try to write and return the file:

### Write & return the file
with io.BytesIO() as buffer:
    with pd.ExcelWriter(buffer) as writer:
        groupby_columns = ['desk', 'phone', 'email']
        df.set_index(groupby_columns, inplace=True, drop=True, append=False)
        df.to_excel(writer, index=True, sheet_name="Team Matrix", merge_cells=True)

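    # Build the response only after the ExcelWriter block has exited,
    # so the workbook has actually been written into the buffer.
    return HttpResponse(
        buffer.getvalue(),
        headers={
            "Content-Type": "application/vnd.openxmlformats-" "officedocument.spreadsheetml.sheet",
            "Content-Disposition": "attachment; filename=excel-export.xlsx",
        },
        status=status.HTTP_201_CREATED,
    )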

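It returns the following file: [image: Returned Excel File]

But what I want is for the first three columns (desk, phone, email) to be merged wherever they hold the same data. The code above does this for the desk and phone columns, but the email column is not grouped/merged like the other two. [image: Desired Returned Excel File]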

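Recommended answer

One possible solution is to put empty ("") values into the cells that should be hidden and then merge the cells.

This creates a new DataFrame with the empty cells: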

def fn(x):
    x.loc[x.index[0] + 1 :, ["desk", "phone", "email"]] = ""
    return x


empty_rows = df.loc[:, ["team1", "team2"]].eq("").all(axis=1)
groups = ((df["email"] != df["email"].shift()) | empty_rows).cumsum()
df = df.groupby(groups, group_keys=False).apply(fn)

Prints:

     desk     phone                  email  team1 team2
0   DESK1  111-1111  Desk1_Email@email.com   Adam      
1                                            xxxx      
2   DESK3  111-1111  Desk2_Email@email.com  Tiana  Dime
3   DESK4  111-1111  Desk2_Email@email.com             
4   DESK5  444-4444     my_email@email.com   Gina    Ed
5                                            Gina      
6   DESK7  111-1111         MagicSchoolbus   Ruby      
7   DESK8  111-1111        Desk8@email.com  Becca      
8   DESK9  123-4567        Desk9@email.com   John    Fa
9  DESK10  123-4567       Desk10@email.com          Tim
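
To see how the `groups` labels used above are built, here is a minimal sketch on a smaller, made-up DataFrame (the e-mail addresses and the intermediate name `changed` are assumptions for illustration only). A new group starts whenever the email differs from the previous row, or when both team columns are empty:

```python
import pandas as pd

# Hypothetical sample data, smaller than the test data in the question
df = pd.DataFrame({
    "email": ["a@x.com", "a@x.com", "b@x.com", "b@x.com", "c@x.com"],
    "team1": ["Adam", "xxxx", "Tiana", "", "Gina"],
    "team2": ["", "", "Dime", "", "Ed"],
})

# True where the email differs from the row above (always True for the first row)
changed = df["email"] != df["email"].shift()

# True where both team columns are empty strings
empty_rows = df[["team1", "team2"]].eq("").all(axis=1)

# Every True starts a new group; the running sum turns that into a group id
groups = (changed | empty_rows).cumsum()

print(groups.tolist())  # [1, 1, 2, 3, 4]
```

Rows 0 and 1 share a label and will be merged later. Row 3 has the same email as row 2 but gets its own label because its team cells are empty.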

This step merges the first three columns in Excel:

def merge_fn(g):
    if len(g) == 1:
        return
    first, last = g.index[0] + 1, g.index[-1] + 1
    worksheet.merge_range(first, 0, last, 0, g.iat[0, 0], merge_format)
    worksheet.merge_range(first, 1, last, 1, g.iat[0, 1], merge_format)
    worksheet.merge_range(first, 2, last, 2, g.iat[0, 2], merge_format)


writer = pd.ExcelWriter("out.xlsx", engine="xlsxwriter")

df.to_excel(writer, sheet_name="Team Matrix", index=False)
workbook = writer.book
worksheet = writer.sheets["Team Matrix"]
merge_format = workbook.add_format({"align": "left", "valign": "top", "border": 0})
df.groupby(groups, group_keys=False).apply(merge_fn)

writer.close()

This creates out.xlsx, with the first three columns merged within each group.
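
The row arithmetic in `merge_fn` (DataFrame position plus one, to skip the header row) can be checked without writing a file. The helper below, `merge_ranges`, is a hypothetical name and not part of the answer. It is a rough sketch that assumes one header row and group labels like the ones produced by the `cumsum()` step, and it returns the worksheet row spans that would be passed to `worksheet.merge_range`:

```python
import pandas as pd


def merge_ranges(groups: pd.Series, header_rows: int = 1) -> list[tuple[int, int]]:
    """Return (first_row, last_row) worksheet rows for every group longer than one row."""
    # 0-based row positions, independent of whatever index `groups` carries
    positions = pd.Series(range(len(groups)), index=groups.index)
    ranges = []
    for _, pos in positions.groupby(groups.to_numpy()):
        if len(pos) > 1:  # a single row needs no merging
            first = int(pos.iloc[0]) + header_rows
            last = int(pos.iloc[-1]) + header_rows
            ranges.append((first, last))
    return ranges


# Hypothetical group labels: rows 0-1 form one group, rows 3-5 another
ranges = merge_ranges(pd.Series([1, 1, 2, 3, 3, 3]))
print(ranges)  # [(1, 2), (4, 6)]
```

Each pair can then be used as `worksheet.merge_range(first, col, last, col, value, merge_format)` for the columns 0 to 2, in the same way `merge_fn` does.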
