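I am trying to write a Python script that fills an Excel worksheet with data (a list of dictionaries). The problem is that the table I want to fill sits in the middle of the sheet (it does not start at A1).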

So I wrote this function, which finds where the table starts and then fills it:

def generateExcel(data_list):
    file_path = 'Finaltemplatetesting.xlsx'

    sheet_name = 'Main'
    workbook = openpyxl.load_workbook(file_path)

    # Select the sheet
    sheet = workbook[sheet_name]
    # Find the starting cell (top-left cell) of the table
    i=0
    for row in sheet.iter_rows(values_only=True):
        i=i+1
        if 'Project' in row:
            header_row = row
            break

    header_col = header_row.index('Project')
    # initializing the starting row
    start_row=i+2
    # Fill the table with the new data
    
    for data in data_list:
        sheet.insert_rows(start_row)
        for row_data in data:
            for col_idx, col_name in enumerate(header_row):
                cell = sheet.cell(row=start_row, column=header_col + col_idx, value=row_data.get(col_name))   
        start_row += 1
    workbook.save("Finaltempalterrrrrr.xlsx")

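There are some cells below the table that I use for calculations and so on, so my idea is to add a row below the header with sheet.insert_rows() and then fill it.

When I run this function, it corrupts my Excel sheet; when I open the resulting file I get: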

we found a problem with some content in 'finaltemplaterrrrr.xlsx' , do you want us to try to recover as much as we can ? if you trust the source of this workbook, click yes.

Then, after I click Yes, it closes the sheet.

Interestingly, if I add only one row this way, it works fine and adds the row exactly the way I want:

        sheet.insert_rows(start_row)
        for row_data in data:
            for col_idx, col_name in enumerate(header_row):
                cell = sheet.cell(row=start_row, column=header_col + col_idx, value=row_data.get(col_name))   
        
    workbook.save("Finaltempalterrrrrr.xlsx")

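I don't know why it stops working when I try to add multiple rows with a for loop, as in the code above.

Part of the sheet template looks like this:

[image: sheet template]

Dummy data list: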

[
    [{'Value1': '15', 'Project': 'AVTR', 'Score': 'Normal', 'Name': 'Raul', 'Date': '2023-03-16T14:58:33+01:00'}],
    [{'Value1': '10', 'Project': 'TRG', 'Score': 'High', 'Name': 'Bob', 'Date': 'N/A'}],
    [{'Value1': '12', 'Project': 'AVTR', 'Score': 'High', 'Name': 'Alice', 'Date': 'N/A'}]
    ]

Recommended answer

In your code, for each item in data_list, you insert a new row and then populate it.

# Starting point for row insertion.
start_row = i + 2

# Iterate through the data_list.
for data in data_list:
    # Insert a row at start_row.
    sheet.insert_rows(start_row)

    # [Populate the inserted row]

    # Increment the start_row by 1 for the next iteration.
    start_row += 1

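So, as the loop progresses, the insertion index (start_row) is incremented.

For example, consider the following table, where the goal is to insert data between Alice and Bob: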

Name Value
Alice 5
Bob 10
... ...

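This means that each subsequent insert lands just below the row you filled last and pushes Bob, and everything under him, one row further down.

For example, with three data items inserted between Alice and Bob: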

Name Value
Alice 5
Data1 val1
Data2 val2
Data3 val3
Bob 10
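
To check that the loop's index arithmetic is sound on its own, here is a small sketch that models the sheet as a plain Python list. It does not use openpyxl, and the helper names `insert_one_by_one` and `insert_in_bulk` are made up for this illustration. Both strategies end with the same layout; they differ only in how many structural edits the sheet goes through.

```python
def insert_one_by_one(rows, start, new_rows):
    """Mimic the loop: one insert per item, advancing the insertion index."""
    rows = list(rows)  # work on a copy
    edits = 0
    for item in new_rows:
        rows.insert(start, item)
        start += 1
        edits += 1
    return rows, edits


def insert_in_bulk(rows, start, new_rows):
    """Mimic a single insert of all rows at once."""
    rows = list(rows)  # work on a copy
    rows[start:start] = new_rows
    return rows, 1


table = ["Name Value", "Alice 5", "Bob 10"]
new = ["Data1 val1", "Data2 val2", "Data3 val3"]

looped, looped_edits = insert_one_by_one(table, 2, new)
bulk, bulk_edits = insert_in_bulk(table, 2, new)

print(looped == bulk)            # same final layout
print(looped_edits, bulk_edits)  # number of structural edits for each strategy
```

So the order of the data is not what goes wrong; if anything does, it is more likely tied to what each structural edit does to the rest of the workbook.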

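Given the complex formatting and objects an Excel file can contain, this may be a problem: openpyxl's documentation notes that it does not manage dependencies such as formulas, tables, or charts when rows are inserted, so repeated insertions may leave formulas, merged cells, and similar objects out of step with the cells around them, which could lead to the corruption you report.

To test this, I would try inserting all the rows needed for the whole data_list up front: instead of inserting a row on every loop iteration, you perform a single insert using the amount parameter of insert_rows(self, idx, amount=1), and then fill those rows: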

sheet.insert_rows(start_row, len(data_list))

That would be:

import openpyxl


def generateExcel(data_list):
    file_path = 'Finaltemplatetesting.xlsx'

    sheet_name = 'Main'
    workbook = openpyxl.load_workbook(file_path)

    # Select the sheet
    sheet = workbook[sheet_name]
    # Find the starting cell (top-left cell) of the table
    i = 0
    for row in sheet.iter_rows(values_only=True):
        i = i+1
        if 'Project' in row:
            header_row = row
            break

    header_col = header_row.index('Project')
    # Initializing the starting row
    start_row = i+2
    # Insert all required rows at once
    sheet.insert_rows(start_row, len(data_list))

    # Now fill the data
    for data in data_list:
        for col_idx, col_name in enumerate(header_row):
            cell = sheet.cell(row=start_row, column=header_col + col_idx, value=data.get(col_name))
        start_row += 1

    workbook.save("Finaltempalterrrrrr.xlsx")
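
One detail in the function above may be worth checking: openpyxl numbers columns from 1, while `header_row.index('Project')` and `enumerate` count from 0, so as far as I can tell `header_col + col_idx` only lands under the right header when the table starts in column B. A minimal sketch of an explicit mapping follows. It assumes `header_row` is the tuple that `iter_rows(values_only=True)` returns, starting at column A; the helper name `column_map` and the sample header layout are hypothetical.

```python
def column_map(header_row):
    """Map each header name to its 1-based column number, skipping empty cells."""
    return {
        name: number
        for number, name in enumerate(header_row, start=1)
        if name is not None
    }


# Assumed layout: column A is empty and the table starts in column B.
header_row = (None, "Project", "Name", "Score", "Value1", "Date")
columns = column_map(header_row)

record = {"Value1": "15", "Project": "AVTR", "Score": "Normal", "Name": "Raul"}

# Column number -> value to write; keys missing from the record become None.
cells = {columns[name]: record.get(name) for name in columns}
print(columns)
print(cells)
```

Inside the fill loop, each value could then be written with `sheet.cell(row=start_row, column=columns[name], value=record.get(name))`.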

If your data_list structure is:

[
    [{'Value1': '15', 'Project': 'AVTR', 'Score': 'Normal', 'Name': 'Raul', 'Date': '2023-03-16T14:58:33+01:00'}],
    [{'Value1': '10', 'Project': 'TRG', 'Score': 'High', 'Name': 'Bob', 'Date': 'N/A'}],
    [{'Value1': '12', 'Project': 'AVTR', 'Score': 'High', 'Name': 'Alice', 'Date': 'N/A'}]
]

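then it has an extra level of list nesting, which the function above does not expect. Specifically, each dictionary is wrapped in a list, which makes data_list a list of lists of dictionaries. The function above expects data_list to be a plain list of dictionaries, like: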

data_list = [
    {'Value1': '15', 'Project': 'AVTR', 'Score': 'Normal', 'Name': 'Raul', 'Date': '2023-03-16T14:58:33+01:00'},
    {'Value1': '10', 'Project': 'TRG', 'Score': 'High', 'Name': 'Bob', 'Date': 'N/A'},
    {'Value1': '12', 'Project': 'AVTR', 'Score': 'High', 'Name': 'Alice', 'Date': 'N/A'}
]

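If you want to keep the original data_list structure, you need to adjust how the data is accessed in the loop. Specifically, you need to iterate over the inner lists in data_list: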

# Now fill the data
for inner_list in data_list:
    for data in inner_list:
        for col_idx, col_name in enumerate(header_row):
            cell = sheet.cell(row=start_row, column=header_col + col_idx, value=data.get(col_name))
        start_row += 1
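
If an inner list could ever hold more than one dictionary, `len(data_list)` would undercount the rows to insert. The sketch below uses only the standard library to flatten the nested structure first, so that the insert count and the fill loop agree. The names `records` and `rows_needed` are illustrative, and the dictionaries are shortened to two keys from the sample data.

```python
from itertools import chain

# Nested structure from the question: each dictionary is wrapped in its own list.
data_list = [
    [{"Project": "AVTR", "Name": "Raul"}],
    [{"Project": "TRG", "Name": "Bob"}],
    [{"Project": "AVTR", "Name": "Alice"}],
]

# Flatten one level: a list of lists of dicts becomes a list of dicts.
records = list(chain.from_iterable(data_list))

# This is the number of rows that actually need to be inserted.
rows_needed = len(records)

print(rows_needed)
print([record["Name"] for record in records])
```

With this in place, `sheet.insert_rows(start_row, rows_needed)` followed by a single loop over `records` would fill exactly the rows that were inserted.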

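After inserting the rows for three data items, you get three empty rows:

Name Value
Alice 5
(empty)
(empty)
(empty)
Bob 10

Then the subsequent loop fills those rows: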

Name Value
Alice 5
Data1 val1
Data2 val2
Data3 val3
Bob 10

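You adjust the sheet only once instead of manipulating its structure several times. This reduces the risk of inadvertently disturbing Excel's internal structure or its various objects and references.

Then check whether the problem persists.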


If the issue persists even after simplifying the code to just insert the rows, there might be other complexities within the Excel file causing the corruption. These could be related to formulas, formatting, merged cells, charts, or other objects. Merged cells, especially, can be troublesome when inserting rows/columns. If the table area has merged cells, consider unmerging them and trying the row insertion again.
As a troubleshooting step, create a very simple Excel file with a table in the middle (just as in your actual scenario, but without any other features or data). Run your script on this file to see if the issue persists. If it works, then gradually add the other features from your actual Excel file to this test file, testing at each stage. That will help identify which feature in your original Excel file is causing the issue.
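
A minimal sketch of such a test file, built in memory with openpyxl so that nothing touches the real template. The layout (header in row 4 starting at column B, one existing data row, one formula below the table) is an assumption chosen for illustration, not the asker's actual sheet.

```python
from io import BytesIO

import openpyxl


def build_test_workbook():
    """Create a bare workbook with a small table in the middle of the sheet."""
    wb = openpyxl.Workbook()
    ws = wb.active
    ws.title = "Main"
    # Header in row 4, starting at column B (assumed layout).
    for offset, name in enumerate(["Project", "Name", "Score"]):
        ws.cell(row=4, column=2 + offset, value=name)
    ws.cell(row=5, column=2, value="existing")
    # A calculation cell below the table.
    ws["B8"] = "=COUNTA(B5:B7)"
    return wb


wb = build_test_workbook()
ws = wb["Main"]

# One bulk insert of three rows, starting at row 6.
ws.insert_rows(6, 3)

# Round-trip through an in-memory file to confirm it saves and loads again.
buffer = BytesIO()
wb.save(buffer)
buffer.seek(0)
reloaded = openpyxl.load_workbook(buffer)

moved = reloaded["Main"]["B11"].value
print(moved)
```

In this sketch the formula cell moves down three rows, but its text still refers to the old range: openpyxl moves the cells without rewriting formulas, so calculations below the table may need their references updated by hand.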

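Sometimes the problem lies in the Excel template itself. Try opening the template (Finaltemplatetesting.xlsx) in Excel and saving it as a new file. Then use the new file in the script and see whether the problem persists.

You can also try other approaches for testing:

Still with openpyxl:

  • Read the entire worksheet into a Python data structure.
  • Clear the sheet.
  • Write the data back, inserting the new rows where needed.

This way you are not repeatedly editing the live sheet, which may reduce the chance of corruption.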
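
The data-handling part of those three steps can be sketched in plain Python. The function name `splice_rows` and the sample rows are hypothetical. In a real script the rows could come from `list(sheet.iter_rows(values_only=True))`, and after clearing the sheet with `sheet.delete_rows(1, sheet.max_row)` each row could be written back with `sheet.append(row)`. Note that this carries values only, so cell styles and merged ranges would not survive.

```python
def splice_rows(rows, marker, new_records):
    """Return a copy of rows with new_records inserted below the header row.

    The header row is the first row that contains `marker`; each record is
    laid out so that its values sit under the matching header cells.
    """
    rows = [tuple(row) for row in rows]
    header_idx = next(i for i, row in enumerate(rows) if marker in row)
    header = rows[header_idx]
    new_rows = [
        tuple(record.get(name) if name is not None else None for name in header)
        for record in new_records
    ]
    return rows[: header_idx + 1] + new_rows + rows[header_idx + 1 :]


# Stand-in for the values read from the sheet (assumed layout).
rows = [
    ("Report", None, None),
    (None, "Project", "Name"),
    (None, "Count", 0),
]
new_records = [
    {"Project": "AVTR", "Name": "Raul"},
    {"Project": "TRG", "Name": "Bob"},
]

result = splice_rows(rows, "Project", new_records)
for row in result:
    print(row)
```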

Or, again for testing, you could try pandas with the openpyxl engine: you would read the Excel file into a DataFrame, manipulate the DataFrame, and then write it back to Excel.

For example:

import pandas as pd

# Load the sheet with no header row, because the table does not start at A1
df = pd.read_excel('Finaltemplatetesting.xlsx', sheet_name='Main', header=None, engine='openpyxl')

# Locate the header row: the first row that contains the text 'Project'
idx = df.index[df.eq('Project').any(axis=1)][0]
header = df.iloc[idx]

# Assuming your data_list looks like [{'Project': 'proj1', 'Value': 'value1'}, {'Project': 'proj2', 'Value': 'value2'}],
# place each value under its matching header cell
data_df = pd.DataFrame([[item.get(name) for name in header] for item in data_list], columns=df.columns)

# Split the DataFrame into two parts: above and below the insertion point
df1 = df.iloc[:idx+1]
df2 = df.iloc[idx+1:]

# Concatenate the three DataFrames: df1, data_df, and df2
result = pd.concat([df1, data_df, df2], ignore_index=True)

# Save the values back to Excel (formatting and formulas from the template are not carried over)
result.to_excel('Finaltempalterrrrrr.xlsx', sheet_name='Main', index=False, header=False, engine='openpyxl')
