PDF Portfolios (aka PDF Packages or Collections)
Since PDF 1.4 it is possible to embed external files in the body of a PDF document and link them through e.g. file attachment annotations or through the embedded files name tree.
PDF 1.7 introduced a feature that allows an enhanced presentation of file attachments stored in a PDF document: it specifies how a conforming reader application should present them. The PDF specification names such a presentation a "portable collection" or, more generally, a "collection". Sadly, none of these terms made it into any viewer or creator application. Acrobat 8, for example, called a file that makes use of collections a PDF Package, while Acrobat 9 called it a PDF Portfolio. PDF Portfolios in Acrobat 9 were also enriched with a compiled ActionScript program.
Other reader and creator applications also use the term PDF Portfolio when it comes to collections. We use this term in the documentation as well, while our code uses the terms of the PDF specification.
The SetaPDF-Merger component allows you to create and interact with PDF Portfolios in a very intuitive way.
A PDF Portfolio starts with a PDF document that represents the container. This document could display e.g. a message that a conforming reader application is needed to display PDF Portfolios (it is also called cover sheet in some applications). It can be an existing PDF document or a completely new document.
The SetaPDF_Merger_Collection class is the main class for working with PDF Portfolios.
Its constructor requires a document instance that represents such a container PDF or, if you want to edit an existing PDF Portfolio, the loaded document instance:
$document = SetaPDF_Core_Document::load(...);
$collection = new SetaPDF_Merger_Collection($document);
To simply check if a document is a PDF Portfolio, you can use the isCollection() method:
$isCollection = $collection->isCollection();
The class sets the appropriate entries in the document structure automatically as soon as you add at least one file or folder. To force the creation of a PDF Portfolio without adding files or folders, a simple call to getDictionary(true) is needed. The following example creates a simple cover sheet and defines the document as a PDF Portfolio:
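The essential step can be sketched as follows, using only the isCollection() and getDictionary(true) calls described above. The helper name ensurePortfolio() and the file name in the usage comment are assumptions made for illustration; drawing the cover sheet content and saving the document are not shown.

```php
<?php
// Sketch only: ensurePortfolio() is a hypothetical helper, not part of SetaPDF.
// $collection is expected to be a SetaPDF_Merger_Collection instance.
function ensurePortfolio($collection): bool
{
    // isCollection() reports whether the document already is a PDF Portfolio.
    if (!$collection->isCollection()) {
        // Passing true forces the creation of the collection entries.
        $collection->getDictionary(true);
    }

    // Re-check instead of assuming that the call above succeeded.
    return (bool) $collection->isCollection();
}

// Usage with the library (requires SetaPDF and its autoloader;
// 'cover-sheet.pdf' is a placeholder path):
// $document = SetaPDF_Core_Document::load('cover-sheet.pdf');
// $collection = new SetaPDF_Merger_Collection($document);
// ensurePortfolio($collection);
```

The helper is only needed for a portfolio that should stay empty; adding a file or folder creates the same entries implicitly.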
A PDF Portfolio isn't restricted to PDF files: you can add any file type. Adding files is straightforward with the addFile() method:
Add a file to the collection.
Parameters:
- $pathOrReader : SetaPDF_Core_Reader_ReaderInterface|string
  A reader instance or a path to a file.
- $filename : string
  The filename in UTF-8 encoding.
- $description : null|string
  The description of the file in UTF-8 encoding.
- $fileStreamParams : array
  See SetaPDF_Core_EmbeddedFileStream::setParams() method.
- $mimeType : null|string
  The subtype of the embedded file. Shall conform to the MIME media type names defined in Internet RFC 2046.
- $collectionItem : null|array|SetaPDF_Merger_Collection_Item
  The data described by the collection schema.
Return value:
The name that was used to register the file specification in the embedded files name tree.
The following example adds an existing PDF file from a local path and a dynamically created text file:
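A minimal sketch of adding files from local paths, following the parameter order listed above. The helper attachFiles(), the array keys and the file names are assumptions made for illustration. All six arguments are passed explicitly because the parameter defaults are not stated here.

```php
<?php
// Sketch only: attachFiles() is a hypothetical helper, not part of SetaPDF.
// $collection is expected to be a SetaPDF_Merger_Collection instance.
function attachFiles($collection, array $files): array
{
    $names = [];
    foreach ($files as $file) {
        $names[] = $collection->addFile(
            $file['path'],                // $pathOrReader: here always a local path
            $file['filename'],            // $filename (UTF-8)
            $file['description'] ?? null, // $description (UTF-8) or null
            [],                           // $fileStreamParams: none
            $file['mimeType'] ?? null,    // $mimeType or null
            null                          // $collectionItem: no schema data
        );
    }

    // Each entry is the name under which the file specification was registered.
    return $names;
}

// Usage with the library (placeholder paths and names):
// $names = attachFiles($collection, [
//     ['path' => 'files/report.pdf', 'filename' => 'report.pdf',
//      'description' => 'Annual report', 'mimeType' => 'application/pdf'],
// ]);
```

Keeping the returned names is useful because they, not the filenames, identify the file specifications in later calls such as deleteFile().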
If you pass files through a reader instance, as shown with the text file in the previous example, you may add additional parameters for the generated embedded file stream. You can do this via the $fileStreamParams parameter or by resolving the file specification through the returned name:
PDF Portfolios use the files attached to a PDF document in the global embedded files name tree. The collection class offers a proxy method, which will return all embedded file specifications. Their names are the keys of the returned array:
A single file specification can be resolved by its name with the
As all files in a PDF Portfolio are located in the global embedded files name tree, the collection instance offers a proxy method, deleteFile(), which proxies the remove() call on that name tree:
$collection->deleteFile('registered-filename.pdf');
// is the same as calling
$document->getCatalog()
    ->getNames()
    ->getEmbeddedFiles()
    ->remove('registered-filename.pdf');
The filename is the name under which the file specification is registered in the embedded files name tree of the PDF document. It doesn't need to be identical to the filename of the embedded file itself.
Folders in PDF Portfolios are an extension to the PDF specification (ExtensionLevel 3 by Adobe) and were also adopted in PDF 2.0.
With folders you can organize files into a hierarchical tree structure.
The collection instance offers a simple method which allows you to check if folders are in use or not:
$hasFolders = $collection->hasFolders();
Folders are represented by the
To get all files located in a folder, just use the getFiles() method of the folder instance.
The following example shows all files and folders in a PDF Portfolio (without sorting):
As shown in the previous example, you can also use the getFile() method to resolve a single file specification by its name in a folder.
To access a subfolder by its name just use the
Moving a folder is done by calling its
A PDF Portfolio can be presented in a table view with individual fields. By default a reader application will use the standard fields available in a file specification.
By using a schema it is possible to define all fields and their types individually. A schema can reference standard file-related fields such as the filename or its description but also allows you to define completely individual fields. These fields refer to data in a collection item: its dictionary can be assigned to a file specification, or the item instance itself can be assigned to a folder instance.
A schema is defined through a SetaPDF_Merger_Collection_Schema instance, which can be obtained from the collection instance:
$schema = $collection->getSchema();
The schema class offers various methods which allow you to interact with the schema and its fields:
Add a field to the schema.
Adds several fields to the schema.
Get the collection instance.
Get a field instance by its name.
Get all field instances.
Check if a field exists.
Remove a field from the schema.
A field is represented by a SetaPDF_Merger_Collection_Schema_Field instance. Most of the above methods allow you to just pass strings and constants, while the field instances are created internally. The following example shows some ways to create fields with default or individual data:
As you may have noticed, the constants prefixed with DATA_* refer to data provided by the default fields of a file specification or folder. The constants prefixed with TYPE_* define a data type. All available constants are:
Constant defining the compressed size property
Constant defining the creation date property
Constant defining the description property
Constant defining the file name property
Constant defining the modification date property
Constant defining the size property
Constant defining a date data type
Constant defining a number type
Constant defining a string data type (value needs to be in PdfDocEncoding or UTF-16BE)
Collection items are used to assign data described by the collection schema to a particular file or folder. The data or a collection item instance can be passed as the $collectionItem parameter in the addFolder() method of either the collection or a folder instance.
A collection item instance is a wrapper class around the collection item dictionary and optionally validates the data against a given collection schema:
// create a collection item
$collectionItem = new SetaPDF_Merger_Collection_Item();

// add the company value
$collectionItem->setEntry('company', SetaPDF_Core_Encoding::toPdfString('tektown'), $schema);

// ignore the schema
$collectionItem->setEntry(
    'secret',
    'value',
    SetaPDF_Merger_Collection_Schema::TYPE_STRING
);

// add several entries
$collectionItem->setData([
    'company' => SetaPDF_Core_Encoding::toPdfString('lenstown'),
    'order' => 5
], $schema);
If you set the collection item data through the addFolder() methods you can pass an instance of a collection item or an array, which will be forwarded to the setData() method of a newly created instance.
Note that string values need to be passed in PdfDocEncoding or UTF-16BE. You can use the SetaPDF_Core_Encoding::toPdfString() method to convert them from your local encoding.
Set the initial view.
- $view : string
A view constant.
The initial document that should be presented can be set using the
Set the name of the document, that should be initially presented.
If you want to open a document that is located in a subfolder, you need to pass the id of the subfolder, enclosed in angle brackets, as a prefix to the name:
$collection->setInitialDocument('<' . $folder->getId() . '>' . $name);
- $name : string
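The prefix rule above can be wrapped in a small helper. This is a sketch in plain PHP: the function name initialDocumentName() is an assumption made for illustration, and the folder id is whatever getId() returned for the folder.

```php
<?php
// Sketch only: initialDocumentName() is a hypothetical helper, not part of SetaPDF.
// It builds the value for setInitialDocument() from a registered file name
// and, optionally, the id of the folder that contains the file.
function initialDocumentName(string $name, $folderId = null): string
{
    if ($folderId === null) {
        // File is not located in a subfolder: the plain name is sufficient.
        return $name;
    }

    // File in a subfolder: prefix the name with "<id>".
    return '<' . $folderId . '>' . $name;
}

// Usage with the library:
// $collection->setInitialDocument(initialDocumentName($name, $folder->getId()));
```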
The splitter bar can be configured through the following methods of the collection instance:
Get the orientation of the splitter bar.
Get the initial position of the splitter bar.
Set the orientation of the splitter bar.
Set the initial position of the splitter bar.
The sorting can be defined by using the
Set the data that specifies the order in which the collection shall be sorted in the user interface.
- $sort : array
The key is the field name, while the value defines the direction. Valid key names are field names defined in the schema or SetaPDF_Merger_Collection_Schema::DATA_* constants.