Pages

Accessing pages in the PDF document

Introduction

Internally, all SetaPDF components have to deal with the pages of PDF documents. This is handled by the SetaPDF_Core_Document_Catalog_Pages class, which can also be used on its own.

Get an Instance of the Pages Object

The SetaPDF-Core component gives you easy access to this object via the document's catalog instance:

PHP
$document = new SetaPDF_Core_Document();
$pages = $document->getCatalog()->getPages();

Count Pages

A common task is to get the page count of an existing PDF document.

The Pages instance implements the Countable interface, so it provides a count() method and can also be passed to PHP's count() function. You can get the page count this way:

PHP
$pages = $document->getCatalog()->getPages();
$pageCount = $pages->count();
// or
$pageCount = count($pages);

Get a Page Object

To get access to an existing page you can use the getPage() method: 

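Description
public SetaPDF_Core_Document_Page SetaPDF_Core_Document_Catalog_Pages::getPage ( integer $pageNumber )

Get a page.

Parameters
$pageNumber : integer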

To access the last page, a simple helper method exists: getLastPage().
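The two accessors above can be sketched as follows. This is a minimal sketch, not a definitive implementation: it builds a small in-memory document so that it does not depend on an existing file, and it assumes that the SetaPDF_Core_PageFormats::A4 constant is available as a page format and that the autoload path matches your installation.

```php
<?php
require_once('library/SetaPDF/Autoload.php');

// Build a small in-memory document with two pages (A4 is an assumed format constant).
$document = new SetaPDF_Core_Document();
$pages = $document->getCatalog()->getPages();
$pages->create(SetaPDF_Core_PageFormats::A4);
$pages->create(SetaPDF_Core_PageFormats::A4);

// Page numbers start at 1, as in the loops used elsewhere on this page.
$firstPage = $pages->getPage(1);

// Helper for the last page, so the page count does not have to be looked up first.
$lastPage = $pages->getLastPage();
```

Both calls return page objects that can be worked with further, for example with rotateBy().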

The Pages class is optimized for resolving a single page object: it makes use of the balanced tree structure that a PDF page tree can have and takes the most direct path to the object representing the requested page. If you know that you will need access to all pages, it is faster to resolve all page objects in a single run with the following method:

Description

This method makes sure that all pages are read.

It walks the complete page tree to cache/get all page objects in one iteration. This method should be used if all pages of a document are going to be handled. It is much faster than resolving each page by random access.

Exceptions

Throws BadMethodCallException
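A possible usage is sketched below. The description above does not name the method, so the sketch assumes it is ensureAllPageObjects(); please verify the name against the API reference of your SetaPDF version. The file path and the writer are taken from the other examples on this page.

```php
<?php
require_once('library/SetaPDF/Autoload.php');

$writer = new SetaPDF_Core_Writer_Http('Laboratory-Report.pdf', true);
$document = SetaPDF_Core_Document::loadByFilename(
    'files/pdfs/camtown/Laboratory-Report.pdf', $writer
);
$pages = $document->getCatalog()->getPages();

// Assumed method name: walk the page tree once and cache every page object.
$pages->ensureAllPageObjects();

$pageCount = count($pages);
for ($pageNo = 1; $pageNo <= $pageCount; $pageNo++) {
    // Each page object was already resolved by the call above.
    $pages->getPage($pageNo)->rotateBy(90);
}

$document->save()->finish();
```

For a script that touches only one or two pages, the extra walk through the whole tree is probably not worth it, and plain getPage() calls are enough.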

Delete Pages

To delete a page you can use the deletePage() method.  

Use this method with care: it only removes the page object from the page tree. If the page includes, for example, form fields, these fields will not be removed from the document!

Description
public void SetaPDF_Core_Document_Catalog_Pages::deletePage ( integer $pageNumber )

Deletes a page.

Parameters
$pageNumber : integer
 
Exceptions

Throws SetaPDF_Core_SecHandler_Exception
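A minimal sketch of a deletion, using an in-memory document so that it is independent of any file. The SetaPDF_Core_PageFormats::A4 constant and the autoload path are assumptions about your installation.

```php
<?php
require_once('library/SetaPDF/Autoload.php');

// Create a document with three empty pages (A4 is an assumed format constant).
$document = new SetaPDF_Core_Document();
$pages = $document->getCatalog()->getPages();
for ($i = 0; $i < 3; $i++) {
    $pages->create(SetaPDF_Core_PageFormats::A4);
}

// Remove the second page from the page tree. Only the page object is removed;
// anything else that refers to it, such as form fields, stays in the document.
$pages->deletePage(2);

$remaining = count($pages);
```

When deleting several pages in a loop, keep in mind that the pages following a deleted page move up by one, so iterating from the last page number down to the first is usually the safer order.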

Create a Page

To create a new page the class offers you a helper method:

Description
public SetaPDF_Core_Document_Page SetaPDF_Core_Document_Catalog_Pages::create ( string|array $format [, string $orientation = \SetaPDF_Core_PageFormats::ORIENTATION_PORTRAIT [, boolean $append = true ]] )

Create a page.

Parameters
$format : string|array

The page format. See constants in SetaPDF_Core_PageFormats and the getFormat() method.

$orientation : string

The orientation. See constants in SetaPDF_Core_PageFormats.

$append : boolean

Whether the page should be appended to the page tree or not.

Append or Prepend a Page

A newly created page is appended to the existing pages automatically if the $append parameter of the create() method is set to true (the default). Internally, the create() method calls the append() method in that case.

If the $append parameter is set to false, the page is not attached to the page tree at all. In that case it has to be passed to the append() or prepend() method afterwards.
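Both variants can be sketched as follows. This is only an illustration: the format constant SetaPDF_Core_PageFormats::A4 and the orientation constant SetaPDF_Core_PageFormats::ORIENTATION_LANDSCAPE are assumed to exist alongside the ORIENTATION_PORTRAIT default shown in the signature above.

```php
<?php
require_once('library/SetaPDF/Autoload.php');

$document = new SetaPDF_Core_Document();
$pages = $document->getCatalog()->getPages();

// Default behavior: the new page is appended to the page tree right away.
$contentPage = $pages->create(SetaPDF_Core_PageFormats::A4);

// With $append = false the page is created but not attached to the page tree.
$coverPage = $pages->create(
    SetaPDF_Core_PageFormats::A4,
    SetaPDF_Core_PageFormats::ORIENTATION_LANDSCAPE,
    false
);

// Attach it explicitly with prepend(); append() would add it at the end instead.
$pages->prepend($coverPage);
```

Creating a page detached first is useful when its position is decided later, or when the page should be prepared before it becomes part of the document.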

Copy Pages Between Document Instances

Sometimes it is necessary to add simple existing PDF pages to another PDF document.

This will really only work with simple, flat PDF pages. If a page includes foreign structures which refer to other PDF objects or pages in the original document, ALL of these objects will be copied to the resulting document (even though they are not needed). This technique should only be used if you have control over the added PDF page. The SetaPDF-Merger component offers a more robust solution that handles these situations.

Adding a foreign PDF page is possible by passing its page instance to the append() or prepend() method of the resulting document after calling flattenInheritedAttributes() on the page instance that should be appended:

PHP
<?php
require_once('library/SetaPDF/Autoload.php');

$writer = new SetaPDF_Core_Writer_Http('Noisy-Tube.pdf', true);
$document = SetaPDF_Core_Document::loadByFilename(
    'files/pdfs/camtown/products/Noisy-Tube.pdf', $writer
);
$pagesToAppendTo = $document->getCatalog()->getPages();

$documentToAppend = SetaPDF_Core_Document::loadByFilename(
    'files/pdfs/camtown/Laboratory-Report.pdf'
);
$pagesToAppend = $documentToAppend->getCatalog()->getPages();

$pageCount = $pagesToAppend->count();
for ($pageNo = 1; $pageNo <= $pageCount; $pageNo++) {
    $pageToAppend = $pagesToAppend->getPage($pageNo);
    $pageToAppend->flattenInheritedAttributes();
    $pagesToAppendTo->append($pageToAppend);
}

$document->save()->finish();

The call to flattenInheritedAttributes() is necessary because a page can inherit attributes from parent nodes in the page tree of its original document; the call writes these attributes into the page dictionary itself. Because the page object is shared, this call technically modifies the page dictionary in both document instances.

If you need to keep working with both document instances and want the modification to happen only in the appended page instance, you need to use the extract() method. This decouples the objects from the original document instance:

PHP
<?php
require_once('library/SetaPDF/Autoload.php');

$writer = new SetaPDF_Core_Writer_Http('Noisy-Tube.pdf', true);
$document = SetaPDF_Core_Document::loadByFilename(
    'files/pdfs/camtown/products/Noisy-Tube.pdf', $writer
);
$pagesToAppendTo = $document->getCatalog()->getPages();

$documentToAppend = SetaPDF_Core_Document::loadByFilename(
    'files/pdfs/camtown/Laboratory-Report.pdf'
);
$pagesToAppend = $documentToAppend->getCatalog()->getPages();

$pageCount = $pagesToAppend->count();
for ($pageNo = 1; $pageNo <= $pageCount; $pageNo++) {
    $pageToAppend = $pagesToAppend->extract($pageNo, $document);
    $pageToAppend->flattenInheritedAttributes();
    $pagesToAppendTo->append($pageToAppend);
}

// modify a page in the original document without any effect on the resulting document
$page = $pagesToAppend->getPage(1);
$page->rotateBy(90);

$document->save()->finish();