I am using PDFBox 2.0.26 to convert a PDF to an image. The Maven dependencies are shown below.
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>fontbox</artifactId>
    <version>2.0.26</version>
</dependency>
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>2.0.26</version>
</dependency>
The program I wrote looks like this:
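import java.awt.image.BufferedImage;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;
// Note: ImageIOUtil is provided by the pdfbox-tools artifact.
import org.apache.pdfbox.tools.imageio.ImageIOUtil;

// try-with-resources closes the document and the output stream even if rendering fails.
try (PDDocument doc = PDDocument.load(new File("/path/to/sample.pdf"));
     OutputStream fos = new FileOutputStream("/path/to/sample.png")) {
    PDFRenderer pdfRenderer = new PDFRenderer(doc);
    // Render the first page (index 0) at 300 DPI as an RGB image.
    BufferedImage bim = pdfRenderer.renderImageWithDPI(0, 300, ImageType.RGB);
    ImageIOUtil.writeImage(bim, "png", fos);
} catch (IOException e) {
    // Print the full stack trace instead of a bare "error" so the cause is visible.
    e.printStackTrace();
}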
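It works fine on my macOS machine (although the fonts in the image differ from those in the PDF), but when I run it on a Linux server, the Chinese characters are missing.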
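One possible diagnostic, offered as a rough sketch rather than a confirmed cause: when a PDF's fonts are not embedded, PDFBox has to substitute fonts installed on the machine, so it may be worth checking which font files the Linux server actually has. The class name `FontFileLister`, the helper `findFontFiles`, and the three directories below are assumptions for illustration (typical Linux font locations); they are not taken from PDFBox itself and may need adjusting for the server.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: lists font files found under a directory tree.
class FontFileLister {

    // Returns all .ttf, .ttc and .otf files under root (recursively),
    // or an empty list if root does not exist or is not a directory.
    static List<Path> findFontFiles(Path root) throws IOException {
        if (!Files.isDirectory(root)) {
            return Collections.emptyList();
        }
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> {
                        String name = p.getFileName().toString().toLowerCase(Locale.ROOT);
                        return name.endsWith(".ttf") || name.endsWith(".ttc") || name.endsWith(".otf");
                    })
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed locations: common Linux font directories, not an exhaustive list.
        String[] dirs = {
            "/usr/share/fonts",
            "/usr/local/share/fonts",
            System.getProperty("user.home") + "/.fonts"
        };
        for (String dir : dirs) {
            for (Path font : findFontFiles(Paths.get(dir))) {
                System.out.println(font);
            }
        }
    }
}
```

If the output on the server contains no font with CJK coverage, that would be consistent with the missing characters, though it does not prove it; comparing the list against the font names reported for the PDF would narrow it down.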
The source PDF file can be found here: the source file. I checked the fonts with Adobe Reader; the results are pasted below.
The resulting image file looks like this:
What should I do to fix this? Thanks.