正如您可以在https://python-pptx.readthedocs.io/en/latest/api/text.html%的python-pptx文档中找到的那样
- 文本框架由段落和
- 段落由游程组成,并指定用作其游程默认设置的字体配置.
- 运行指定具有特定字体配置的段落文本的一部分-可能与段落中的默认字体配置不同
这三个字段都有一个名为text
的字段:
- 文本框架的
text
包含它的所有段落中的所有文本,它们连接在一起,并在段落之间使用适当的换行符.
- 段落的
text
包含它的所有运行的所有文本,这些文本连接成一个长字符串,在运行的任何文本中出现所谓的软换行符(软换行符)的地方都放置一个垂直制表符(\v)(软换行符类似于换行符,但不会结束段落).
- 游程
text
包含要用特定字体配置(字体系列、字体大小、斜体/粗体/下划线、 colored颜色 等)呈现的文本.它是任何文本的字体配置的最低级别.
现在,如果您在PowerPoint演示文稿的文本框架中指定一行文本,则此文本框架很可能只有一个段落,该段落将只有一个运行.
假设这一行写着:Hi there! How are you? What is your name?
,并且都是正常的(既不斜体也不粗体),大小为10.
现在,如果你继续使用PowerPoint,并通过将问题How are you? What is your name?
设置为斜体来突出它们,你将在我们的段落中得到2个连续的结果:
Hello there!
,使用段落中的默认字体配置
How are you? What is you name?
,字体配置指定附加的italic属性.
现在想象一下,我们想让How are you?
更突出,把它变成粗体和斜体.我们最终得了3分:
Hello there!
,使用段落中的默认字体配置.
How are you?
,字体配置指定BOLD和ITALIC属性
What is your name?
,其中字体配置指定ITALIC属性.
更进一步,使How are you?
人中的are
人更大.我们得到5分:
Hello there!
,使用段落中的默认字体配置.
How
,字体配置指定BOLD和ITALIC属性
are
,字体配置指定BOLD和ITALIC属性以及字体大小16
you?
,字体配置指定BOLD和ITALIC属性
What is your name?
,其中字体配置指定ITALIC属性.
因此,如果您试图用您问题中的代码I'm fine!
来替换How are you?
,您将不会成功,因为文本How are you?
实际上分布在3个运行中.
您可以再往上一层查看段落的文本,仍然是Hello there! How are you? What is your name?
,因为它是所有运行的文本的串联.
但如果您继续替换段落的文本,它将擦除所有运行,并始终使用文本Hello there! I'm fine! What is your name?
创建一个新运行,同时删除我们在What is your name?
上设置的所有格式.
因此,在不影响段落中其他文本格式的情况下更改段落中的文本是相当复杂的.即使您要查找的文本具有所有相同的格式,也不能保证它在一次运行中.因为如果你--在上面的例子中--再把are
变小,5个游程很可能会保留下来,2到4个游程现在只有相同的字体配置.
下面是生成一个测试演示文稿的代码,其中包含一个文本框,其中包含与我上面的示例中给出的完全相同的段落:
from pptx import Presentation
from pptx.chart.data import CategoryChartData
from pptx.enum.chart import XL_CHART_TYPE,XL_LABEL_POSITION
from pptx.util import Inches, Pt
from pptx.dml.color import RGBColor
from pptx.enum.dml import MSO_THEME_COLOR
# create presentation with 1 slide ------
prs = Presentation()
slide = prs.slides.add_slide(prs.slide_layouts[5])
textbox_shape = slide.shapes.add_textbox(Pt(200),Pt(200),Pt(30),Pt(240))
text_frame = textbox_shape.text_frame
p = text_frame.paragraphs[0]
font = p.font
font.name = 'Arial'
font.size = Pt(10)
font.bold = False
font.italic = False
font.color.rgb = RGBColor(0,0,0)
run = p.add_run()
run.text = 'Hello there! '
run = p.add_run()
run.text = 'How '
font = run.font
font.italic = True
font.bold = True
run = p.add_run()
run.text = 'are'
font = run.font
font.italic = True
font.bold = True
font.size = Pt(16)
run = p.add_run()
run.text = ' you?'
font = run.font
font.italic = True
font.bold = True
run = p.add_run()
run.text = ' What is your name?'
run.font.italic = True
prs.save('text-01.pptx')
如果你在PowerPoint中打开它,它看起来是这样的:
现在,如果您在其上使用此代码:
from pptx import Presentation
from pptx.chart.data import CategoryChartData
from pptx.shapes.graphfrm import GraphicFrame
from pptx.enum.chart import XL_CHART_TYPE
from pptx.util import Inches
def replace_text(replacements, shapes):
for shape in shapes:
if shape.has_text_frame:
text_frame = shape.text_frame
for (match, replacement) in replacements.items():
if text_frame.text.find(match)>=0:
for paragraph in text_frame.paragraphs:
pos = paragraph.text.find(match)
while pos>=0:
replace_runs_text(paragraph.runs, pos, len(match), replacement)
pos = paragraph.text.find(match)
def replace_runs_text(runs, pos, match_len, replacement):
cnt = len(runs)
i = 0
while i<cnt:
olen = len(runs[i].text)
if pos<olen:
# we found the run, where the match starts!
to_replace = replacement
repl_len = len(to_replace)
while i<cnt:
run = runs[i]
otext = run.text
olen = len(otext)
if pos+match_len < olen:
# our match ends before the end of the text of this run therefore
# we put the rest of our replacement string here and we are done!
run.text = otext[0:pos]+to_replace+otext[pos+match_len:]
return
if pos+match_len == olen:
# our match ends together with the text of this run therefore
# we put the rest of our replacement string here and we are done!
run.text = otext[0:pos]+to_replace
return
# we still haven't found all of our original match string
# so we process what we have here and go on to the next run
part_match_len = olen-pos
ntext = otext[0:pos]
if repl_len <= part_match_len:
# we now found at least as many characters for our match string
# as we have replacement characters for it. Thus we use up the
# the rest of our replacement string here and will replace the
# remainder of the match with an empty string (which happens
# to happen in this exact same spot for the next run ;-))
ntext += to_replace
repl_len = 0
to_replace = ''
else:
# we have got some more match characters but still more
# replacement characters than match characters found
ntext += to_replace[0:part_match_len]
to_replace = to_replace[part_match_len:]
repl_len -= part_match_len
run.text = ntext # save the new text to the run
match_len -= part_match_len # this is what is left to match
pos = 0 # in the next run, we start at pos 0 with our match
i += 1 # and off to the next run
else:
pos -= olen # the relative position of our match in the next run's text
i += 1 # and off to the next run
# create presentation with 1 slide ------
prs = Presentation('text-01.pptx')
# what is to be replaced
replacements = { 'How are you?': "I'm fine!" }
# loop through all slides and replace text in all their shapes
for slide in prs.slides:
replace_text(replacements, slide.shapes)
# save changed presentation
prs.save('text-02.pptx')
生成的演示文稿如下所示:
正如您所看到的,它将替换字符串精确地映射到现有的字体配置上,因此,如果您的匹配项和它的替换项具有相同的长度,则替换字符串将保留匹配的确切格式.
但是--作为一个重要的旁注--如果文本框架打开了自动调整大小或适合框架,即使所有的工作也不能避免你搞砸格式,如果替换后的文本需要或多或少的空间!