The Main Class

Introduction

The main functionality of the SetaPDF-Merger component is encapsulated in a single class: SetaPDF_Merger.

Get an Instance

A Merger instance can be created by passing an existing document instance to its constructor. All documents or files will be appended to this document instance. If no instance is passed, the component creates its own empty document instance, which can then be used further.

Reading and writing the PDF document is up to the Core component.

To start without an initial document, the common logic is as follows:

PHP
$merger = new \SetaPDF_Merger();
// ...
$merger->merge();

$document = $merger->getDocument();
$document->setWriter(new \SetaPDF_Core_Writer_Http());
$document->save()->finish();

Start with an existing document instance: 

PHP
$writer = new \SetaPDF_Core_Writer_Http();
$document = \SetaPDF_Core_Document::loadByFilename('document.pdf', $writer);

$merger = new \SetaPDF_Merger($document);
// ...
$merger->merge();

$document->save()->finish();

Additional Helper Methods

The SetaPDF_Merger class comes with some helper methods that can be used to easily prepare the merging process:

getCurrentDocument()

Get the currently processed document instance.

getDocument()

Alias for getInitialDocument().

getDocumentByFilename()

Get a document instance by a filename.

getInitialDocument()

Returns the initial document.

getPageCount()

Helper method to get the page count of a document or file.
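
As a rough sketch of how these helpers can support the preparation step, the function below sums the page counts of several files before anything is merged. The name totalPageCount() is hypothetical and not part of SetaPDF, and passing a plain filename to getPageCount() is an assumption derived from the description above ("a document or file").

```php
<?php
/**
 * Hypothetical helper (not part of SetaPDF): sums the page counts of several files.
 * Assumes that $merger->getPageCount() accepts a filename and returns an integer.
 *
 * @param SetaPDF_Merger $merger
 * @param string[] $filenames
 * @return int
 */
function totalPageCount($merger, array $filenames)
{
    $total = 0;
    foreach ($filenames as $filename) {
        // ask the merger for the page count of each file
        $total += $merger->getPageCount($filename);
    }

    return $total;
}
```

With a real instance this could be called as totalPageCount(new \SetaPDF_Merger(), array('good.pdf', 'better.pdf')), for example to check a page limit before starting the merge.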

Handle Errors

It's always possible that a PDF document you want to merge is not valid and produces errors during the merge process. The cause can be a corrupted PDF document or something else, such as passing a non-PDF file to the component.

If an error occurs, the whole merge process stops because an exception is thrown.

Because the SetaPDF components don't require PDF documents to be files reachable through a unique path, but also support documents represented as a simple string or stream, it can sometimes be hard to find the document that triggers the error.

There are two different situations in which an error can occur:

  1. When starting to read the PDF file's main structure.
  2. While processing the PDF file.

In the first situation there is no document instance at all. In that case the SetaPDF_Merger component throws a SetaPDF_Merger_Exception instance and forwards the original exception through the $previous parameter. Additionally, this exception gives you access to the filename of the currently processed file through the getPdfFilename() method.
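
The forwarding through $previous can be illustrated with plain PHP. The following sketch does not use SetaPDF at all: DemoMergerException and demoOpen() are hypothetical stand-ins that only mimic the behaviour described above, so that the relation between the wrapping exception, its filename and the original error becomes visible.

```php
<?php
/**
 * Hypothetical stand-in for illustration only. This is NOT the real
 * SetaPDF_Merger_Exception, it merely mimics the behaviour described above.
 */
class DemoMergerException extends Exception
{
    private $pdfFilename;

    public function __construct($message, $pdfFilename, $previous = null)
    {
        // hand the original exception over to PHP's exception chaining
        parent::__construct($message, 0, $previous);
        $this->pdfFilename = $pdfFilename;
    }

    public function getPdfFilename()
    {
        return $this->pdfFilename;
    }
}

/**
 * Hypothetical function that simulates a failure while reading the main structure.
 */
function demoOpen($filename)
{
    try {
        // simulated low-level parser error
        throw new RuntimeException('Unable to find the PDF file header.');
    } catch (Exception $e) {
        // wrap the low-level error and remember which file caused it
        throw new DemoMergerException('Cannot read the document.', $filename, $e);
    }
}

try {
    demoOpen('faulty.pdf');
} catch (DemoMergerException $e) {
    // the filename comes from the wrapper, the root cause from getPrevious()
    echo $e->getPdfFilename(), ': ', $e->getPrevious()->getMessage(), PHP_EOL;
}
```

In real code the same pattern applies to the exception thrown by the Merger component: the wrapper tells you which file failed, and getPrevious() (a standard PHP method) returns the underlying cause.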

The second situation is an error that occurs while reading the PDF's internal objects. These errors are not caught by the Merger component and need to be caught individually.

The error handling also depends on the method you use to add the PDF document to the merge process. 

When using the addFile() method, all possible errors are raised during the run of merge(), because there's no logic in addFile() that throws an exception. But if you use addDocument(), you also need to ensure that no error is thrown during the document initialization process.

The reader instance of the passed document instance is also a factor (a string reader doesn't have a filename at all).
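
Since not every reader is backed by a stream, it can help to check the reader type before asking for a URI. The helper below is a minimal sketch: getReaderUri() is a hypothetical name, and the check assumes that the File and MaxFile readers are subclasses of SetaPDF_Core_Reader_Stream. For any other reader it simply returns null.

```php
<?php
/**
 * Hypothetical helper (not part of SetaPDF): returns the stream URI behind a
 * reader, or null if the reader is not stream based (e.g. a string reader).
 *
 * Assumes that all stream-based readers are instances of SetaPDF_Core_Reader_Stream.
 */
function getReaderUri($reader)
{
    // only stream-based readers expose an underlying stream resource
    if (!($reader instanceof \SetaPDF_Core_Reader_Stream)) {
        return null;
    }

    $stream = $reader->getStream();
    if (!is_resource($stream)) {
        return null;
    }

    $meta = stream_get_meta_data($stream);

    // not every stream type reports a URI
    return isset($meta['uri']) ? $meta['uri'] : null;
}
```

A null result then signals that the triggering document has to be identified in another way, for example by tracking which string was passed in.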

If all readers are stream readers (File, MaxFile, Stream), errors can be caught this way:

PHP
$merger = new \SetaPDF_Merger();
// add some files
$merger->addFile('good.pdf');
$merger->addFile('better.pdf');

// special handling if the document instance is created individually
try {
    $document = \SetaPDF_Core_Document::loadByFilename('whatever.pdf');
    $merger->addDocument($document);
} catch (\Exception $e) {
    var_dump($e->getMessage());
    die();
}

// add more files
$merger->addFile('faulty.pdf');
$merger->addFile('best.pdf');

// merge
try {
    $merger->merge();

} catch (\SetaPDF_Merger_Exception $e) {
    var_dump($e->getMessage());
    $filename = $e->getPdfFilename();
    var_dump($filename);
    die();

} catch (\Exception $e) {
    var_dump($e->getMessage());

    $document = $merger->getCurrentDocument();
    $reader = $document->getParser()->getReader();
    /**
     * If using addFile() the reader instance IS a stream reader throughout
     * @var SetaPDF_Core_Reader_Stream $reader
     */
    $data = stream_get_meta_data($reader->getStream());
    var_dump($data['uri']);
    die();
}

Additional errors can occur in the final save() call of the document instance, so don't forget to wrap it in a try/catch block, too. Unfortunately, it is currently impossible to get access to the document instance that triggers such a deep error, so the triggering document has to be found manually.
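
A minimal sketch of such a guarded save step could look like the following. The function name saveDocument() is hypothetical. It only relies on the save()->finish() call chain used in the examples above and returns the caught exception instead of terminating the script.

```php
<?php
/**
 * Hypothetical helper (not part of SetaPDF): saves the document and returns
 * the caught exception on failure, or null on success.
 *
 * @param SetaPDF_Core_Document $document
 * @return Exception|null
 */
function saveDocument($document)
{
    try {
        $document->save()->finish();

        return null;
    } catch (\Exception $e) {
        // the failing source document is unknown at this point,
        // so the exception itself is all that can be handed back
        return $e;
    }
}
```

The caller can then decide how to report the problem, for example by logging the message and checking the source documents one by one.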