In your code, for each data item in data_list, you insert a new row and then populate it.
# Starting point for row insertion.
start_row = i + 2
# Iterate through the data_list.
for data in data_list:
    # Insert a row at start_row.
    sheet.insert_rows(start_row)
    # [Populate the inserted row]
    # Increment the start_row by 1 for the next iteration.
    start_row += 1
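So, as the loop progresses, the index of the insertion point (start_row) is incremented.
For example, consider the following table, where the goal is to insert data between Alice and Bob: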
| Name  | Value |
|-------|-------|
| Alice | 5     |
| Bob   | 10    |
| ...   | ...   |
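This means that each subsequent data item pushes the previously inserted items and Bob further down.
For example, with three data items inserted between Alice and Bob: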
| Name  | Value |
|-------|-------|
| Alice | 5     |
| Data1 | val1  |
| Data2 | val2  |
| Data3 | val3  |
| Bob   | 10    |
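This can become a problem once you take Excel's complex formatting and objects into account: inserting rows many times can put a strain on formatting, formulas, merged cells, and so on, which might corrupt the report.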
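To make the row shifting concrete, here is a small sketch that models the worksheet as a plain Python list. No openpyxl is involved, and insert_one_by_one and the sample values are illustrative names that are not part of your script. It records where Bob's row sits after each insertion:

```python
def insert_one_by_one(rows, start_index, items):
    """Insert items one at a time, mimicking a loop around sheet.insert_rows(start_row).

    rows is a plain list standing in for worksheet rows (0-based here,
    whereas worksheet rows are 1-based). Returns the new list and the
    index of the last original row after each insertion.
    """
    rows = list(rows)  # work on a copy
    last_row = rows[-1]
    positions = []
    for item in items:
        # Everything from start_index onwards shifts down by one.
        rows.insert(start_index, item)
        positions.append(rows.index(last_row))
        # The next item goes below the one just inserted.
        start_index += 1
    return rows, positions


table = [("Name", "Value"), ("Alice", 5), ("Bob", 10)]
new_items = [("Data1", "val1"), ("Data2", "val2"), ("Data3", "val3")]
result, bob_positions = insert_one_by_one(table, 2, new_items)
print(result)
print(bob_positions)  # Bob's list index after each insertion
```

Each pass through the loop is a separate structural change, and Bob's row moves every time.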
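As a test, I would try inserting all the rows needed for the whole data_list up front. Instead of repeatedly inserting rows in a loop, you perform a single insert operation with the def insert_rows(self, idx, amount=1) method and then populate those rows: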
sheet.insert_rows(start_row, len(data_list))
That would be:
import openpyxl

def generateExcel(data_list):
    file_path = 'Finaltemplatetesting.xlsx'
    sheet_name = 'Main'
    workbook = openpyxl.load_workbook(file_path)
    # Select the sheet
    sheet = workbook[sheet_name]
    # Find the header row: the first row that contains 'Project'
    header_row = None
    i = 0
    for row in sheet.iter_rows(values_only=True):
        i = i + 1
        if 'Project' in row:
            header_row = row
            break
    if header_row is None:
        raise ValueError("No row containing 'Project' was found")
    # Initializing the starting row
    start_row = i + 2
    # Insert all required rows at once
    sheet.insert_rows(start_row, len(data_list))
    # Now fill the data
    for data in data_list:
        # header_row starts at column A, so header index col_idx maps to
        # worksheet column col_idx + 1 (openpyxl columns are 1-based)
        for col_idx, col_name in enumerate(header_row):
            sheet.cell(row=start_row, column=col_idx + 1, value=data.get(col_name))
        start_row += 1
    workbook.save("Finaltempalterrrrrr.xlsx")
If your data_list structure is:
[
[{'Value1': '15', 'Project': 'AVTR', 'Score': 'Normal', 'Name': 'Raul', 'Date': '2023-03-16T14:58:33+01:00'}],
[{'Value1': '10', 'Project': 'TRG', 'Score': 'High', 'Name': 'Bob', 'Date': 'N/A'}],
[{'Value1': '12', 'Project': 'AVTR', 'Score': 'High', 'Name': 'Alice', 'Date': 'N/A'}]
]
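That means it has an extra level of list nesting, which the initial code does not expect. Specifically, each dictionary is wrapped in a list, making data_list a list of lists of dictionaries. The initial code expects data_list to be just a list of dictionaries, like: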
data_list = [
{'Value1': '15', 'Project': 'AVTR', 'Score': 'Normal', 'Name': 'Raul', 'Date': '2023-03-16T14:58:33+01:00'},
{'Value1': '10', 'Project': 'TRG', 'Score': 'High', 'Name': 'Bob', 'Date': 'N/A'},
{'Value1': '12', 'Project': 'AVTR', 'Score': 'High', 'Name': 'Alice', 'Date': 'N/A'}
]
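If you want to keep the original data_list structure, you need to adjust how the data is accessed in the loop. Specifically, you need to iterate through the inner lists in data_list: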
# Now fill the data
for inner_list in data_list:
    for data in inner_list:
        # header_row starts at column A, so header index col_idx maps to
        # worksheet column col_idx + 1 (openpyxl columns are 1-based)
        for col_idx, col_name in enumerate(header_row):
            sheet.cell(row=start_row, column=col_idx + 1, value=data.get(col_name))
        start_row += 1
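Alternatively, you could normalize the input once so that the rest of the code only ever sees a flat list of dictionaries. The helper below is a minimal sketch; flatten_data_list is a hypothetical name, and it assumes every entry is either a dict or an iterable of dicts:

```python
def flatten_data_list(data_list):
    """Return a flat list of dicts.

    Accepts either a list of dicts or a list of lists of dicts
    (the two shapes discussed above); other shapes are not handled.
    """
    flat = []
    for entry in data_list:
        if isinstance(entry, dict):
            flat.append(entry)   # already a single record
        else:
            flat.extend(entry)   # unwrap the inner list
    return flat


nested = [
    [{'Project': 'AVTR', 'Name': 'Raul'}],
    [{'Project': 'TRG', 'Name': 'Bob'}],
    [{'Project': 'AVTR', 'Name': 'Alice'}],
]
records = flatten_data_list(nested)
print(len(records))  # number of rows that need to be inserted
```

With the nested shape, the number of rows to insert is the total number of dictionaries, which equals len(data_list) only when every inner list holds exactly one dictionary. Using len(records) for the insert_rows amount avoids that mismatch.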
After inserting the rows for three data items, you end up with three blank rows:
| Name  | Value |
|-------|-------|
| Alice | 5     |
|       |       |
|       |       |
|       |       |
| Bob   | 10    |
Then the subsequent loop fills in those rows:
| Name  | Value |
|-------|-------|
| Alice | 5     |
| Data1 | val1  |
| Data2 | val2  |
| Data3 | val3  |
| Bob   | 10    |
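The two stages shown in the tables can be modeled with a plain Python list standing in for the worksheet. This is only an illustration: insert_block_then_fill is a hypothetical helper, None represents a blank row, and no openpyxl calls are made. The slice assignment plays the role of sheet.insert_rows(start_row, len(data_list)):

```python
def insert_block_then_fill(rows, start_index, items):
    """Open a gap for all items in one step, then fill the gap in place.

    Returns two lists: the state right after the single insertion
    (with None for blank rows) and the state after filling.
    """
    rows = list(rows)  # work on a copy
    # One structural change: open a gap of len(items) blank rows.
    rows[start_index:start_index] = [None] * len(items)
    after_insert = list(rows)
    # Fill the blank rows in place; nothing shifts any more.
    for offset, item in enumerate(items):
        rows[start_index + offset] = item
    return after_insert, rows


table = [("Name", "Value"), ("Alice", 5), ("Bob", 10)]
new_items = [("Data1", "val1"), ("Data2", "val2"), ("Data3", "val3")]
blank_state, filled_state = insert_block_then_fill(table, 2, new_items)
print(blank_state)
print(filled_state)
```

Bob's row moves exactly once here, regardless of how many items are added.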
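You adjust the worksheet only once, instead of manipulating its structure multiple times. This reduces the risk of inadvertently disturbing Excel's internal structure or its various objects and references.
Then check whether the problem persists.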
If the issue persists even after simplifying the code to just insert the rows, there might be other complexities within the Excel file causing the corruption. These could be related to formulas, formatting, merged cells, charts, or other objects. Merged cells, especially, can be troublesome when inserting rows/columns. If the table area has merged cells, consider unmerging them and trying the row insertion again.
As a troubleshooting step, create a very simple Excel file with a table in the middle (just as in your actual scenario, but without any other features or data). Run your script on this file to see if the issue persists. If it works, then gradually add the other features from your actual Excel file to this test file, testing at each stage. That will help identify which feature in your original Excel file is causing the issue.
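Sometimes the problem lies in the Excel template itself. Try opening the Excel template (Finaltemplatetesting.xlsx) and saving it as a new file. Then use the new file in your script and see whether the problem persists.
You can also try other approaches for testing:
Still with openpyxl:
- Read the entire worksheet into a Python data structure.
- Clear the sheet.
- Write the data back, inserting the new rows where needed.
This way, you are not constantly editing the live worksheet, which may reduce the chance of corruption.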
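The three steps above could look roughly like the sketch below. It is only an outline under stated assumptions: splice_rows and rebuild_sheet are hypothetical helper names, sheet is assumed to be an openpyxl worksheet, and new_rows are lists of cell values in column order (not dictionaries). Only cell values are carried over, so styles, merged cells, and other formatting in the cleared area are not handled and would have to be reapplied separately.

```python
def splice_rows(rows, insert_at, new_rows):
    """Return a new list with new_rows placed before index insert_at (0-based)."""
    return rows[:insert_at] + list(new_rows) + rows[insert_at:]


def rebuild_sheet(sheet, insert_at, new_rows):
    """Read all values, clear the sheet, and write the spliced rows back.

    Assumes `sheet` behaves like an openpyxl worksheet (iter_rows,
    max_row, delete_rows, cell). Values only; formatting is not kept.
    """
    # 1. Read the entire worksheet into a Python data structure.
    rows = [list(r) for r in sheet.iter_rows(values_only=True)]
    # 2. Clear the sheet.
    sheet.delete_rows(1, sheet.max_row)
    # 3. Write the data back, with the new rows in place.
    spliced = splice_rows(rows, insert_at, new_rows)
    for r_idx, row in enumerate(spliced, start=1):
        for c_idx, value in enumerate(row, start=1):
            sheet.cell(row=r_idx, column=c_idx, value=value)
```

In your script this would be called as rebuild_sheet(workbook['Main'], insert_at, new_rows) before workbook.save(...), where insert_at is the 0-based index of the row that the new rows should precede.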
Or, again for testing, you could try pandas, with the openpyxl backend:
You would read the Excel file into a DataFrame, manipulate the DataFrame, and then write the DataFrame back to Excel.
For example:
import pandas as pd
# Load the Excel file into a DataFrame
df = pd.read_excel('Finaltemplatetesting.xlsx', sheet_name='Main', engine='openpyxl')
# Assuming your data_list looks like [{'Project': 'proj1', 'Value': 'value1'}, {'Project': 'proj2', 'Value': 'value2'}]
data_df = pd.DataFrame(data_list)
# Locate the position of 'Project' in the original DataFrame
idx = df[df['Name'] == 'Project'].index[0]
# Split the DataFrame into two parts: above and below the insertion point
df1 = df.iloc[:idx+1]
df2 = df.iloc[idx+1:]
# Concatenate the three DataFrames: df1, data_df, and df2
result = pd.concat([df1, data_df, df2], ignore_index=True)
# Save the result back to Excel
result.to_excel('Finaltempalterrrrrr.xlsx', index=False, engine='openpyxl')