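powershell.exe, the Windows PowerShell CLI,[1] uses the active console window's code page to encode its stdout and stderr output, as reflected in the output from chcp. By default, that is the legacy system locale's OEM code page, e.g. (expressed in Python terms) cp437.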
By contrast, the code page you used, cp1252, is an ANSI code page.
- Note: Python uses the system's ANSI code page by default for encoding its stdout and stderr output, which, however, is nonstandard behavior: console applications are expected to use the current console's output code page, which is what powershell.exe does and which, as stated, is the system's OEM code page.
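To illustrate why decoding with the ANSI code page garbles the output, here is a minimal sketch that needs neither PowerShell nor Windows. It assumes cp437 as the OEM code page and cp1252 as the ANSI code page, matching the examples above; your system's code pages may differ.

```python
# The byte that a console application using OEM code page 437 emits for "é".
raw = "é".encode("cp437")

# Decoding with the ANSI code page (cp1252) silently yields the wrong character.
wrong = raw.decode("cp1252")

# Decoding with the code page that was actually used recovers the original text.
right = raw.decode("cp437")

print(raw, repr(wrong), repr(right))
```

The mismatch raises no error, which is why it typically only shows up as unexpected characters in the decoded output.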
One option is to simply query the console window for its active (output) code page via the WinAPI and use the encoding returned:
import subprocess
from ctypes import windll
# Get the console's (output) code page, which the PowerShell CLI
# uses to encode its output.
cp = windll.kernel32.GetConsoleOutputCP()
process = subprocess.Popen(r'C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe echo é', stdout=subprocess.PIPE)
# Decode based on the active code page.
print(process.stdout.read().decode('cp' + str(cp)))
However, note that OEM code pages limit you to 256 characters; for example, é can be represented in CP437, whereas other Unicode characters, such as €, cannot.
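This limitation can be checked directly in Python. The following is a small sketch; the helper name can_encode is made up for illustration, and cp437 and cp1252 are again assumed as the OEM and ANSI code pages.

```python
def can_encode(text, encoding):
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Check each sample character against an OEM code page, an ANSI code page, and UTF-8.
results = {
    (ch, enc): can_encode(ch, enc)
    for ch in "é€"
    for enc in ("cp437", "cp1252", "utf-8")
}

for (ch, enc), ok in results.items():
    print(f"{ch!r} in {enc}: {'representable' if ok else 'NOT representable'}")
```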
Therefore, the robust option is to (temporarily) set the console output code page to 65001, which is UTF-8:
import subprocess
from ctypes import windll
# Save the current console output code page and switch to 65001 (UTF-8)
previousCp = windll.kernel32.GetConsoleOutputCP()
windll.kernel32.SetConsoleOutputCP(65001)
process = subprocess.Popen(r'C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe echo é€', stdout=subprocess.PIPE)
# Decode as UTF-8
print(process.stdout.read().decode('utf8'))
# Restore the previous output console code page.
windll.kernel32.SetConsoleOutputCP(previousCp)
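If the decoding step raises an exception, the code above never restores the previous code page. One way to guard against that is a context manager, sketched below. The names console_output_cp and powershell_output_utf8 are hypothetical, and the optional kernel32 parameter is an assumption added only so that the logic can be exercised without a real console; by default it uses windll.kernel32, which exists only on Windows.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def console_output_cp(code_page, kernel32=None):
    """Temporarily switch the console output code page, then restore it."""
    if kernel32 is None:
        from ctypes import windll  # available on Windows only
        kernel32 = windll.kernel32
    previous = kernel32.GetConsoleOutputCP()
    # SetConsoleOutputCP returns zero on failure.
    if not kernel32.SetConsoleOutputCP(code_page):
        raise OSError(f"could not switch console output code page to {code_page}")
    try:
        yield previous
    finally:
        # Restore the previous code page even if the body raised.
        kernel32.SetConsoleOutputCP(previous)


def powershell_output_utf8(command):
    """Run a PowerShell command and decode its stdout as UTF-8 (Windows only)."""
    with console_output_cp(65001):
        result = subprocess.run(
            ["powershell.exe", "-NoProfile", "-Command", command],
            stdout=subprocess.PIPE,
        )
    return result.stdout.decode("utf-8")
```

On Windows, calling powershell_output_utf8("echo é€") from a console session should then behave like the example above, with the code page restored regardless of errors.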
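Note:
The code above only ensures that the PowerShell child process emits UTF-8 and that its output is decoded as UTF-8 in the Python process; this is independent of what character encoding Python itself uses for its output streams.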
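To put Python v3.7+ itself in UTF-8 Mode, which makes it decode input as UTF-8 and produce UTF-8 output, pass command-line option -X utf8 or define environment variable PYTHONUTF8 with a value of 1 before invocation.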
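Whether UTF-8 Mode is active can be verified via sys.flags.utf8_mode. The sketch below launches a child Python interpreter both ways and reports the flag; the helper names are made up for illustration, and it assumes that sys.executable points to a Python 3.7+ interpreter.

```python
import os
import subprocess
import sys

# One-liner run in the child interpreter: prints 1 if UTF-8 Mode is on.
CHECK = "import sys; print(sys.flags.utf8_mode)"


def utf8_mode_via_option():
    """Start a child interpreter with -X utf8 and return its utf8_mode flag."""
    result = subprocess.run(
        [sys.executable, "-X", "utf8", "-c", CHECK],
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.decode("ascii").strip()


def utf8_mode_via_env():
    """Start a child interpreter with PYTHONUTF8=1 and return its utf8_mode flag."""
    env = dict(os.environ, PYTHONUTF8="1")
    result = subprocess.run(
        [sys.executable, "-c", CHECK],
        stdout=subprocess.PIPE,
        env=env,
        check=True,
    )
    return result.stdout.decode("ascii").strip()


print("-X utf8     ->", utf8_mode_via_option())
print("PYTHONUTF8=1 ->", utf8_mode_via_env())
```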
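To additionally make an interactive shell session use UTF-8 (the 65001 code page) for the remainder of the session, switch its code page, for example with chcp 65001 in cmd.exe.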
[1] The same applies to pwsh.exe, the CLI of the modern PowerShell (Core) 7+ edition.