PDFBox: Getting Image Position and Size

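In this tutorial, we will learn how to get the coordinates (position) and size of the images on every page of a PDF. This can be done with the PDFStreamEngine class, which processes the operators in a PDF document and exposes callback methods that are invoked as it does so.

To get the position and size of the images in a PDF document, we extend the PDFStreamEngine class and override its processOperator() method.

For each object drawn in the PDF document, we check whether it is an image object and, if so, read its properties, such as its (X, Y) coordinates and size. To do this, we use the processOperator() method, which is called from within PDFStreamEngine.processPage(page).

Follow the steps below to get the coordinates (position) and size of the images in an existing PDF document.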

PDFBox: Extending PDFStreamEngine

First, create a Java class that extends PDFStreamEngine, as shown in the following code.

public class GetImageLocationsAndSize extends PDFStreamEngine {

    // ...

}

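PDFBox: Calling processPage()

Call the processPage() method for every page in the PDF document. The method takes the page itself (a PDPage object) as its argument, as shown in the following code.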

for (PDPage page : document.getPages()) {
    pageNum++;
    printer.processPage(page);
}

PDFBox: Overriding processOperator()

For every operator in a PDF page, processOperator() is called from within processPage(). We override this method so that we can inspect each operator as it is processed.

@Override
protected void processOperator(Operator operator, List<COSBase> operands)
        throws IOException {

    // ...

}
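
The override-and-delegate pattern can be seen in isolation with a toy model that does not use PDFBox at all. In the sketch below, ToyEngine, ToyImageFinder, ToyDemo, and the string-based operators are hypothetical stand-ins, and the only operators modeled are cm (reduced here to a translation), q, Q, and Do. It is meant to illustrate one point: the subclass handles Do itself and hands every other operator to super, so the base class keeps the graphics state up to date.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Toy stand-in for a stream engine. NOT PDFBox; all names are hypothetical.
class ToyEngine {
    // Current translation only; a real engine tracks a full matrix.
    protected double tx = 0, ty = 0;
    private final Deque<double[]> saved = new ArrayDeque<>();

    // Feeds every operator to processOperator(), one at a time.
    // In this toy, op[0] is the operator name and the rest are its operands.
    void processPage(List<String[]> contentStream) {
        for (String[] op : contentStream) {
            processOperator(op[0], op);
        }
    }

    // Default handling: keep the graphics state in sync.
    protected void processOperator(String name, String[] op) {
        switch (name) {
            case "q": {
                saved.push(new double[] {tx, ty});
                break;
            }
            case "Q": {
                double[] state = saved.pop();
                tx = state[0];
                ty = state[1];
                break;
            }
            case "cm": {
                tx += Double.parseDouble(op[1]);
                ty += Double.parseDouble(op[2]);
                break;
            }
            default:
                break; // operators this toy does not model are ignored
        }
    }
}

class ToyImageFinder extends ToyEngine {
    final List<String> found = new ArrayList<>();

    @Override
    protected void processOperator(String name, String[] op) {
        if ("Do".equals(name)) {
            // Intercept: record the object name with the state at draw time.
            found.add(op[1] + "@" + tx + "," + ty);
        } else {
            // Delegate, otherwise cm/q/Q would never update the state.
            super.processOperator(name, op);
        }
    }
}

class ToyDemo {
    static List<String> run() {
        List<String[]> stream = Arrays.asList(
                new String[] {"q"},
                new String[] {"cm", "100", "200"},
                new String[] {"Do", "Im1"},
                new String[] {"Q"},
                new String[] {"Do", "Im2"});
        ToyImageFinder finder = new ToyImageFinder();
        finder.processPage(stream);
        return finder.found;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

If the else branch were removed, the toy would report every object at the initial state, because the translation would never change.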

PDFBox: Checking for an Image

Now we can check whether the object passed to the processOperator() method is an image object.

if (xobject instanceof PDImageXObject) {
    PDImageXObject image = (PDImageXObject) xobject;
    int imageWidth = image.getWidth();
    int imageHeight = image.getHeight();
    System.out.println("\nImage [" + objectName.getName() + "]");
}

PDFBox: Printing the Position and Size

Finally, if the given object is an image object, print its position and size.

// position of image in the PDF in terms of user space units
System.out.println("Position in PDF = " + ctmNew.getTranslateX() + ", "
        + ctmNew.getTranslateY() + " in user space units");

// raw size in pixels
System.out.println("Raw image size  = " + imageWidth + ", "
        + imageHeight + " in pixels");

// displayed size in user space units
System.out.println("Displayed size  = " + imageXScale + ", "
        + imageYScale + " in user space units");
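
The position and displayed size both come from the current transformation matrix (CTM). A PDF image is drawn into a 1 x 1 unit square, and the CTM [a b c d e f] maps that square onto the page, so (e, f) is where the square's origin corner lands (the lower-left corner when there is no rotation or flip), and the lengths of the transformed unit vectors give the displayed width and height. The sketch below works through that arithmetic in plain Java without PDFBox. The class CtmSketch and its method names are hypothetical, and it assumes the scale factors are taken as vector lengths so that rotated images are handled as well.

```java
// Plain-Java sketch of reading position and size from a PDF matrix.
// CtmSketch is a hypothetical helper, not part of PDFBox.
final class CtmSketch {
    // PDF matrix [a b c d e f]: x' = a*x + c*y + e, y' = b*x + d*y + f
    final double a, b, c, d, e, f;

    CtmSketch(double a, double b, double c, double d, double e, double f) {
        this.a = a;
        this.b = b;
        this.c = c;
        this.d = d;
        this.e = e;
        this.f = f;
    }

    // Where the origin of the unit square lands on the page.
    double translateX() { return e; }
    double translateY() { return f; }

    // Length of the transformed unit vector (1, 0): displayed width.
    double scaleX() { return Math.hypot(a, b); }

    // Length of the transformed unit vector (0, 1): displayed height.
    double scaleY() { return Math.hypot(c, d); }

    public static void main(String[] args) {
        // Illustrative matrix: an image scaled to 200 x 150 units at (72, 500).
        CtmSketch plain = new CtmSketch(200, 0, 0, 150, 72, 500);
        System.out.println("position = " + plain.translateX() + ", " + plain.translateY());
        System.out.println("size     = " + plain.scaleX() + " x " + plain.scaleY());

        // The same image rotated 90 degrees counter-clockwise: a and d are 0,
        // so reading a and d directly would report a size of zero.
        CtmSketch rotated = new CtmSketch(0, 200, -150, 0, 72, 500);
        System.out.println("rotated  = " + rotated.scaleX() + " x " + rotated.scaleY());
    }
}
```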

PDFBox: Getting Image Position and Size, Complete Example

package com.yiidian;

import org.apache.pdfbox.contentstream.PDFStreamEngine;
import org.apache.pdfbox.contentstream.operator.DrawObject;
import org.apache.pdfbox.contentstream.operator.Operator;
import org.apache.pdfbox.contentstream.operator.state.*;
import org.apache.pdfbox.cos.COSBase;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.graphics.PDXObject;
import org.apache.pdfbox.pdmodel.graphics.form.PDFormXObject;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;
import org.apache.pdfbox.util.Matrix;

import java.io.File;
import java.io.IOException;
import java.util.List;

public class GetImageLocationsAndSize extends PDFStreamEngine {

    // Constructor (no return type), so the operators are registered
    // as soon as the engine is created
    public GetImageLocationsAndSize() throws IOException {
        // preparing PDFStreamEngine
        addOperator(new Concatenate());
        addOperator(new DrawObject());
        addOperator(new SetGraphicsStateParameters());
        addOperator(new Save());
        addOperator(new Restore());
        addOperator(new SetMatrix());
    }

    public static void main(String[] args) throws IOException {

        String fileName = "d:/blank.pdf";
        // try-with-resources closes the document even if processing fails
        try (PDDocument document = PDDocument.load(new File(fileName))) {
            GetImageLocationsAndSize printer = new GetImageLocationsAndSize();
            int pageNum = 0;
            for (PDPage page : document.getPages()) {
                pageNum++;
                System.out.println("\n\nProcessing PDF page: " + pageNum + "\n-------------------------------- - ");
                printer.processPage(page);
            }
        }
    }

    // Called by processPage() for every operator in the page's content stream
    @Override
    protected void processOperator(Operator operator, List<COSBase> operands)
            throws IOException {
        String operation = operator.getName();
        if ("Do".equals(operation)) {
            COSName objectName = (COSName) operands.get(0);
            // get the PDF object
            PDXObject xobject = getResources().getXObject(objectName);

            // check if the object is an image object
            if (xobject instanceof PDImageXObject) {
                PDImageXObject image = (PDImageXObject) xobject;
                int imageWidth = image.getWidth();
                int imageHeight = image.getHeight();

                System.out.println("\nImage [" + objectName.getName() + "]");

                Matrix ctmNew = getGraphicsState().getCurrentTransformationMatrix();
                float imageXScale = ctmNew.getScalingFactorX();
                float imageYScale = ctmNew.getScalingFactorY();

                // position of image in the PDF in terms of user space units
                System.out.println("position in PDF = " + ctmNew.getTranslateX() + ", " + ctmNew.getTranslateY() + " in user space units");

                // raw size in pixels
                System.out.println("raw image size  = " + imageWidth + ", " + imageHeight + " in pixels");

                // displayed size in user space units
                System.out.println("displayed size  = " + imageXScale + ", " + imageYScale + " in user space units");

            } else if (xobject instanceof PDFormXObject) {
                PDFormXObject form = (PDFormXObject) xobject;
                showForm(form);
            }
        } else {
            super.processOperator(operator, operands);
        }
    }
} 

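When run, the program prints, for each page, the name, position, raw pixel size, and displayed size of every image it finds.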
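
The raw pixel size and the displayed size reported by the program can be combined to estimate the resolution at which an image is placed on the page. The sketch below assumes the default user space unit of 1/72 inch (a page can override this, which the sketch ignores). The class EffectiveDpi and its method are hypothetical, and the sample numbers are illustrative rather than taken from a real document.

```java
// Hypothetical helper: effective resolution of an image placed on a page.
final class EffectiveDpi {
    // Default PDF user space: 72 units per inch (assumes no per-page override).
    static final double UNITS_PER_INCH = 72.0;

    // pixels: raw image size along one axis
    // displayedUnits: displayed size along the same axis, in user space units
    static double dpi(int pixels, double displayedUnits) {
        if (displayedUnits <= 0) {
            throw new IllegalArgumentException("displayed size must be positive");
        }
        double inches = displayedUnits / UNITS_PER_INCH;
        return pixels / inches;
    }

    public static void main(String[] args) {
        // Illustrative values: a 600 x 450 pixel image shown at 144 x 108 units.
        System.out.println("horizontal dpi = " + dpi(600, 144.0));
        System.out.println("vertical dpi   = " + dpi(450, 108.0));
    }
}
```

A low value here suggests the image may look coarse when printed at the size it occupies on the page.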
