Tagged PDF
Introduction
As of version 2.49, the SetaPDF-Merger component allows you to handle tagged PDF files.
Tagged PDF files, also known as "accessible PDF files" or "PDF files with tags", include structural information that enhances their accessibility for individuals with disabilities. While the PDF specification itself defines "Logical Structures" and "Tagged PDF", separate standards evolved to define how to create accessible PDF documents in the real world. These standards are known as ISO 14289-1 / PDF/UA-1 and ISO 14289-2 / PDF/UA-2 (PDF 2.0). Because the SetaPDF-Merger component is not a converter, the source PDF documents need to have already been created in conformance with the expected standard.
Handling PDF files with tag structures can be a performance-intensive task, as these kinds of structures may consist of several thousand objects. Because of this, we suggest not processing foreign PDF files if you want to keep their tag structures.
We also suggest using compressed cross-reference streams for these kinds of structures to reduce the output size.
Enable Handling of Tags
The handling of tags can be enabled by calling the following method on the merger instance:
Description
?string $subTag = 'Part'
Set the flag if tag structures should be handled during the merge process.
Parameters
- $handleTags : bool
- $subTag : ?string
  The default sub-tag name for each addDocument()/addFile() call. If null, all tags are put on the same level.
Exceptions
Then you can add the documents or PDF files through the addDocument() or addFile() methods and finally call merge() to create a well-tagged PDF document.
Please note that merging pages of the same document instance several times is not possible if tags are handled. Also, the document instances should not be reused after the whole process, as the structure tree may have changed.
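To make the effect of the $subTag parameter easier to picture, here is a small conceptual sketch. It does not use the SetaPDF API at all: the function mergeTagTrees(), the root tag name "Document", and the array layout are hypothetical and only model the behavior described above, where each added document is wrapped in its own sub-tag, or, with null, all tags are placed on the same level.

```php
<?php
declare(strict_types=1);

/**
 * Conceptual model only, NOT part of SetaPDF.
 *
 * @param array<int, string[]> $documents One list of top-level tag names per added document (hypothetical input).
 * @param string|null $subTag Wrapper tag per document; null puts all tags on the same level.
 * @return array{tag: string, children: array} A simplified structure tree.
 */
function mergeTagTrees(array $documents, ?string $subTag = 'Part'): array
{
    // The root name "Document" is an assumption made for this illustration.
    $root = ['tag' => 'Document', 'children' => []];

    foreach ($documents as $tags) {
        // Turn each tag name into a simple leaf node.
        $nodes = array_map(
            static fn (string $tag): array => ['tag' => $tag, 'children' => []],
            $tags
        );

        if ($subTag === null) {
            // No sub-tag: the tags of all documents become siblings below the root.
            $root['children'] = array_merge($root['children'], $nodes);
        } else {
            // One wrapper element per added document.
            $root['children'][] = ['tag' => $subTag, 'children' => $nodes];
        }
    }

    return $root;
}

$first  = ['H1', 'P'];
$second = ['H1', 'Table'];

$grouped = mergeTagTrees([$first, $second]);       // default sub-tag 'Part'
$flat    = mergeTagTrees([$first, $second], null); // all tags on the same level

echo count($grouped['children']), "\n"; // 2 (one Part per document)
echo count($flat['children']), "\n";    // 4 (H1, P, H1, Table)
```

In the grouped result each source document stays recognizable as its own element, while the flat result reads as one continuous document. Which of the two is appropriate probably depends on whether the merged files are independent parts or continuations of the same content.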
Examples
Merge Simple Tagged PDFs
The following script merges two tagged PDF documents. The new tag structure will contain a sub-tag (Part by default) for each merged document:
Merge PDF/UA-1 PDF files
The following script merges two PDF/UA-1 conforming PDF files, keeps all tags on the same level, and updates the result to PDF/UA-1 as well:
Split PDF/UA-2 PDF files
The following script extracts a single page of a PDF/UA-2 conforming document, resulting in a PDF/UA-2 conforming PDF.
The document we use for demonstration purposes is an example file from the LaTeX Project.