SetaPDF_Core_Document A class representing a PDF document

File: /SetaPDF v2/Core/Document.php

This class represents a PDF document in all SetaPDF components. It provides the core functionality for managing the objects, cross-reference tables and writers of a document instance.

It also tracks changes of objects and security handlers.

Constants

CACHE_ENCAPSULATED_CONTENT_STREAMS

public const string SetaPDF_Core_Document::CACHE_ENCAPSULATED_CONTENT_STREAMS = 'EncapsulatedContentStream'

Cache constant

CACHE_FONT

public const string SetaPDF_Core_Document::CACHE_FONT = 'Font'

Cache constant

CACHE_FONT_DESCRIPTOR

public const string SetaPDF_Core_Document::CACHE_FONT_DESCRIPTOR = 'FontDescriptor'

Cache constant

CACHE_ICC_PROFILE

public const string SetaPDF_Core_Document::CACHE_ICC_PROFILE = 'ICCProfile'

Cache constant

CACHE_X_OBJECT

public const string SetaPDF_Core_Document::CACHE_X_OBJECT = 'XObject'

Cache constant

SAVE_METHOD_REWRITE

public const integer SetaPDF_Core_Document::SAVE_METHOD_REWRITE = 0

Save method constant defining a rewrite by resolving objects starting at the root object

SAVE_METHOD_REWRITE_ALL

Save method constant defining a rewrite by writing all available objects

SAVE_METHOD_UPDATE

public const integer SetaPDF_Core_Document::SAVE_METHOD_UPDATE = 1

Save method constant defining an incremental update

STATE_CLEANED_UP

public const string SetaPDF_Core_Document::STATE_CLEANED_UP = 'cleanedUp'

State constant

STATE_FINISHED

public const string SetaPDF_Core_Document::STATE_FINISHED = 'finished'

State constant

STATE_NONE

public const string SetaPDF_Core_Document::STATE_NONE = 'none'

State constant

STATE_SAVED

public const string SetaPDF_Core_Document::STATE_SAVED = 'saved'

State constant

STATE_WRITING_BODY

public const string SetaPDF_Core_Document::STATE_WRITING_BODY = 'writingBody'

State constant

STATE_WRITING_XREF

public const string SetaPDF_Core_Document::STATE_WRITING_XREF = 'writingXRef'

State constant


Static Properties

$_instanceCounter

static protected int SetaPDF_Core_Document::$_instanceCounter = 0

A counter for generating unique instance identifications

$_instanceIdentPrefix

A random prefix for generating unique instance identifications


Properties

$_beforeSaveCallbacks

protected array SetaPDF_Core_Document::$_beforeSaveCallbacks = array()

An array of callbacks that should be called before the save method is executed.

$_blockedReferencedObjects

Blocked referenced objects

This array holds objects which should NOT be automatically resolved.

$_cache

protected array SetaPDF_Core_Document::$_cache = array(...)

An array for cached objects and data.

$_cacheReferencedObjects

Defines if referenced objects should be cached or not

$_catalog

The document's catalog instance

$_changedObjects

Changed objects

$_cleanUpObjects

Flag saying that objects should be cleaned up automatically

$_compressXref

protected bool SetaPDF_Core_Document::$_compressXref = false

Defines if the cross-reference table will be compressed

$_currentObject

The indirect object which is currently written

$_currentObjectData

The object id and generation number of the currently written object

$_deletedObjects

Deleted objects

$_directWrite

protected bool SetaPDF_Core_Document::$_directWrite = false

Defines whether the PDF objects should be written at once or object by object

$_fileBodyMethod

A method/function which should be called to fill the document body

$_info

The document's info object instance

$_instanceIdent

Identification of a document instance

$_maxObjId

Current/max object id

$_newFileIdentifier

The non-permanent file identifier

$_objectStreams

protected array SetaPDF_Core_Document::$_objectStreams = array()

Array for information about object streams

$_objectStreamsParser

The parser object used for parsing object streams

$_objects

protected array SetaPDF_Core_Document::$_objects = array()

Newly created or resolved objects

$_objectsToIds

protected array SetaPDF_Core_Document::$_objectsToIds = array()

A relation between objects and ids

$_parser

The parser object of the existing document

$_pdfVersion

$_referencedObjects

protected array SetaPDF_Core_Document::$_referencedObjects = array()

Referenced objects

This array holds information about objects to which references were written. It is needed to create deep copies of an object from one document to another.

$_saveMethod

Incremental update or rewrite the document

$_secHandler

The security handler of the new document

$_secHandlerIn

The security handler of the existing document

$_state

protected string SetaPDF_Core_Document::$_state = 'none'

A flag defining the state of the document object instance

$_trailer

$_trailerChanged

protected bool SetaPDF_Core_Document::$_trailerChanged = false

Flag defining if the trailer was touched/changed

$_useWriteCallbacks

Flag saying that write callbacks are in use

$_writeCallbacks

protected array SetaPDF_Core_Document::$_writeCallbacks = array()

Array of write callbacks

$_xref

An instance of a cross-reference

If the document is created from an existing one, this will be an instance of SetaPDF_Core_Parser_CrossReferenceTable


Static Methods

load()

public static SetaPDF_Core_Document::load (
SetaPDF_Core_Reader_ReaderInterface $reader [, SetaPDF_Core_Writer_WriterInterface $writer = null [, string $className = 'SetaPDF_Core_Document' ]]
): SetaPDF_Core_Document

Creates an instance of a document based on an existing PDF.

Parameters
$reader : SetaPDF_Core_Reader_ReaderInterface

A reader instance

$writer : SetaPDF_Core_Writer_WriterInterface

A writer instance

$className : string

The class name to instantiate

Return Values

Returns a SetaPDF_Core_Document instance

Exceptions

Throws SetaPDF_Core_Parser_CrossReferenceTable_Exception, Exception

loadByFilename()

public static SetaPDF_Core_Document::loadByFilename (
string $filename [, SetaPDF_Core_Writer_WriterInterface $writer = null [, string $className = 'SetaPDF_Core_Document' ]]
): SetaPDF_Core_Document

Creates an instance from a filename.

Parameters
$filename : string

The path to the PDF file

$writer : SetaPDF_Core_Writer_WriterInterface

A writer instance

$className : string

The class name to instantiate

Exceptions

Throws SetaPDF_Core_Parser_CrossReferenceTable_Exception

Throws SetaPDF_Core_Reader_Exception
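
A typical load and save round trip might look like the following minimal sketch. The helper name copyPdf and its parameters are hypothetical, and the sketch assumes that the SetaPDF autoloader is already registered and that SetaPDF_Core_Writer_File is used as the writer.

```php
<?php
// Assumption: the SetaPDF autoloader has been registered elsewhere,
// e.g. via require_once 'library/SetaPDF/Autoload.php';

/**
 * Hypothetical helper: loads $source and writes the result to $target.
 */
function copyPdf(string $source, string $target): void
{
    // The writer receives the bytes produced by save().
    $writer = new SetaPDF_Core_Writer_File($target);
    $document = SetaPDF_Core_Document::loadByFilename($source, $writer);

    try {
        // save() returns the document instance, so finish() can be chained.
        $document->save()->finish();
    } finally {
        // Releases objects; the instance is unusable afterwards.
        $document->cleanUp();
    }
}
```

A call such as copyPdf('in.pdf', 'out.pdf') would then write the document to the second path; both file names are placeholders.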

loadByString()

public static SetaPDF_Core_Document::loadByString (
string $string [, SetaPDF_Core_Writer_WriterInterface $writer = null [, string $className = 'SetaPDF_Core_Document' ]]
): SetaPDF_Core_Document

Creates an instance from a PDF string.

Parameters
$string : string

Content of the PDF

$writer : SetaPDF_Core_Writer_WriterInterface

A writer instance

$className : string

The class name to instantiate

Exceptions

Throws SetaPDF_Core_Parser_CrossReferenceTable_Exception

Throws SetaPDF_Core_Reader_Exception


Methods

__construct()

The constructor.

Parameters
$writer : SetaPDF_Core_Writer_WriterInterface

The writer to which the document should be written

__call()

public SetaPDF_Core_Document::__call (
string $method, array $arguments
): mixed

Implement magic methods for getting helper objects.

You can use the methods from SetaPDF_Core_Document_Catalog::getDocumentMagicMethods().

Additionally, you can use "getFormFiller", "getMerger", "getSigner" and "getStamper" if you want to receive instances of these components.

Parameters
$method : string

The method name

$arguments : array

The arguments

Exceptions

Throws BadMethodCallException

_cleanUpTrailer()

Cleans up trailer entries.

Parameters
$trailer : SetaPDF_Core_Type_Dictionary
 

_releaseObjects()

protected SetaPDF_Core_Document::_releaseObjects (
void
): void

Release objects.

_updateFileIdentifier()

protected SetaPDF_Core_Document::_updateFileIdentifier (
void
): string

Update or create a file identifier.

Return Values

The new file identifier

_writeCrossReferenceTable()

Write the cross-reference table.

Exceptions

Throws SetaPDF_Core_Exception

_writeFileBody()

protected SetaPDF_Core_Document::_writeFileBody (
void
): array

Main method which writes the file body.

This method should be extended/overridden to implement individual logic if the document should be built at runtime.

Return Values

Metadata of the objects that were written

_writeFileHeader()

protected SetaPDF_Core_Document::_writeFileHeader (
void
): void

Writes the file header.

Exceptions

Throws SetaPDF_Core_Exception

_writeObject()

protected SetaPDF_Core_Document::_writeObject (
SetaPDF_Core_Document $document, integer $objectId, integer|null $generation, boolean $cache
): void

Writes an object to the resulting document, but first evaluates whether this is necessary.

Parameters
$document : SetaPDF_Core_Document
 
$objectId : integer
 
$generation : integer|null
 
$cache : boolean
 
Exceptions

Throws SetaPDF_Core_Document_ObjectNotDefinedException

Throws SetaPDF_Core_Document_ObjectNotFoundException

Throws SetaPDF_Core_Exception

Throws SetaPDF_Core_Parser_Exception

Throws SetaPDF_Core_Type_Exception

Throws SetaPDF_Exception

Throws SetaPDF_Exception_NotImplemented

_writeTrailer()

protected SetaPDF_Core_Document::_writeTrailer (
void
): void

Write the trailer dictionary and the pointer to the initial xref table.

Exceptions

Throws SetaPDF_Core_Exception

addBeforeSaveCallback()

public SetaPDF_Core_Document::addBeforeSaveCallback (
string $name, callable $callback
): bool

Adds a callback that will get executed before the save method is processed.

Parameters
$name : string
 
$callback : callable
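
As a rough sketch, a named callback could be registered and removed again as shown below. The helper names, the callback name 'saveLogger' and the logging array are hypothetical. The arguments passed to the callback are not documented here, so the closure deliberately declares none.

```php
<?php
/**
 * Hypothetical helper: records a message each time save() is about to run.
 */
function registerSaveLogger(SetaPDF_Core_Document $document, array &$log): bool
{
    // Assumption: the callback may ignore whatever arguments it receives.
    return $document->addBeforeSaveCallback('saveLogger', function () use (&$log) {
        $log[] = 'save() is about to run';
    });
}

/**
 * Hypothetical helper: removes the callback by the name it was added under.
 */
function unregisterSaveLogger(SetaPDF_Core_Document $document): bool
{
    return $document->removeBeforeSaveCallback('saveLogger');
}
```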
 
addCache()

public SetaPDF_Core_Document::addCache (
string $type, string $name, mixed $value
): void

Adds a cache item by its type and name.

Parameters
$type : string
 
$name : string
 
$value : mixed
 

addIndirectObjectReferenceWritten()

Cache written object references.

This method is called if an indirect object reference is written. This makes sure that the class knows about objects that may not have been written yet.

Parameters
$indirectObject : SetaPDF_Core_Type_IndirectObjectInterface

The indirect object

Exceptions

Throws SetaPDF_Core_Document_ObjectNotFoundException

blockReferencedObject()

Prevents a reference to this object from being written.

Objects defined via this method will not automatically be resolved if a reference to them was written.

Parameters
$indirectObject : SetaPDF_Core_Type_IndirectObjectInterface

The indirect object

cleanUp()

public SetaPDF_Core_Document::cleanUp (
void
): void

Release objects to free memory and cyclic references.

After calling this method the instance of this object is unusable!

clearCache()

public SetaPDF_Core_Document::clearCache (
[ string $type = null [, null|string $name = null ]]
): void

Clears the complete cache, an item by type or by type and name.

Parameters
$type : string
 
$name : null|string
 

cloneIndirectObject()

Clones an indirect object.

Parameters
$indirectObject : SetaPDF_Core_Type_IndirectObject

The indirect object to clone

createNewObject()

Create a new indirect object.

Parameters
$value : SetaPDF_Core_Type_AbstractType

The value of the new indirect object

deleteObject()

Delete an indirect object.

Parameters
$object : SetaPDF_Core_Type_IndirectObjectInterface

The indirect object to delete

deleteObjectById()

public SetaPDF_Core_Document::deleteObjectById (
integer $objectId [, integer $generation = 0 ]
): void

Deletes an indirect object by its object id and generation number.

Parameters
$objectId : integer

The object id of the object

$generation : integer

The generation id of the object

Exceptions

Throws SetaPDF_Core_Document_ObjectNotDefinedException

Throws SetaPDF_Core_Document_ObjectNotFoundException

Throws SetaPDF_Core_Exception

Throws SetaPDF_Core_Parser_Pdf_InvalidTokenException

Throws SetaPDF_Core_Reader_Exception

Throws SetaPDF_Core_Type_Exception

Throws SetaPDF_Exception

Throws SetaPDF_Exception_NotImplemented

ensureObject()

Makes sure that an object is ensured through this document (if possible).

Parameters
$indirectObject : SetaPDF_Core_Type_IndirectObjectInterface

The indirect object to ensure

finish()

Forwards a finish signal to the attached writer.

getCache()

public SetaPDF_Core_Document::getCache (
string $type, string $name
): mixed

Get a cache item by its type and name.

Parameters
$type : string
 
$name : string
 

getCacheReferencedObjects()

Returns whether referenced objects are cached or not.

getCatalog()

Get the catalog object.

getCompressXref()

public SetaPDF_Core_Document::getCompressXref (
void
): bool

Check whether the xref tables are/should be compressed or not.

getCurrentObject()

Returns the currently written object.

getCurrentObjectData()

Returns the currently written object data.

getCurrentObjectDocument()

Get the document object of the currently written/handled object.

getDirectWrite()

public SetaPDF_Core_Document::getDirectWrite (
void
): bool

Gets whether the PDF objects should be written individually (true) or after assembling a single string (false).

getFileIdentifier()

public SetaPDF_Core_Document::getFileIdentifier (
[ boolean $permanent = false [, boolean $create = true ]]
): string

Get a file identifier.

Parameters
$permanent : boolean
 
$create : boolean
 
Exceptions

Throws SetaPDF_Core_Type_Exception

getIdForObject()

Return the object id and generation number for an indirect object or reference.

This method makes sure that objects are nearly independent of their original document and the matching between document, object and their ids is handled at one place: in this method.

Parameters
$indirectObject : SetaPDF_Core_Type_IndirectObjectInterface

The indirect object

getInfo()

Get the document's info object.

getInstanceIdent()

public SetaPDF_Core_Document::getInstanceIdent (
void
): string

Get the instance identifier of this document.

getOwnerPdfDocument()

Implementation of the SetaPDF_Core_Type_Owner interface.

getParser()

Get the parser object.

getPdfVersion()

public SetaPDF_Core_Document::getPdfVersion (
void
): string

Returns the PDF version of the document.

getSaveMethod()

public SetaPDF_Core_Document::getSaveMethod (
void
): integer

Get the currently used save method.

This method can be used by objects at writing time to evaluate if it is possible to edit referencing values or not.

getSecHandler()

Get the security handler for the output document.

getSecHandlerIn()

Returns the security handler of the original document.

getState()

public SetaPDF_Core_Document::getState (
void
): string

Return the current object state.

getTrailer()

Returns the trailer dictionary.

getWriter()

Get current writer object.

getXref()

Get the cross-reference object.

handleWriteCallback()

Method called when a PDF type will be written.

This method could be used to manipulate a value just before it will get written to the writer object.

Parameters
$value : SetaPDF_Core_Type_AbstractType
 

hasCache()

public SetaPDF_Core_Document::hasCache (
string $type, string $name
): bool

Checks if a cache item with a specific type and name exists.

Parameters
$type : string
 
$name : string
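
The cache methods (addCache(), getCache(), hasCache() and clearCache()) can be combined into a simple compute-once pattern, sketched below. The helper names and the cache type 'MyComponent' are hypothetical. The sketch assumes that an arbitrary type string is accepted in addition to the CACHE_* constants.

```php
<?php
/**
 * Hypothetical helper: returns a cached value for $name and computes it
 * with $factory only if no cache item exists yet.
 */
function rememberInDocument(SetaPDF_Core_Document $document, string $name, callable $factory)
{
    // Assumption: custom cache types are allowed besides the CACHE_* constants.
    $type = 'MyComponent';

    if (!$document->hasCache($type, $name)) {
        $document->addCache($type, $name, $factory());
    }

    return $document->getCache($type, $name);
}

/**
 * Hypothetical helper: drops all items of the custom cache type.
 */
function forgetInDocument(SetaPDF_Core_Document $document): void
{
    $document->clearCache('MyComponent');
}
```

Keeping such data in the document's cache ties its lifetime to the document instance instead of to a separate global registry.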
 

hasSecHandler()

public SetaPDF_Core_Document::hasSecHandler (
void
): boolean

Checks whether a security handler is attached to this document.

hasSecurityHandler()

WARNING: This method is marked as deprecated!

Use SetaPDF_Core_Document::hasSecHandler() instead.

Alias for hasSecHandler().

objectRegistered()

Checks if an indirect object is already registered for/in this document instance.

Parameters
$indirectObject : SetaPDF_Core_Type_IndirectObjectInterface

The indirect object to check

registerWriteCallback()

public SetaPDF_Core_Document::registerWriteCallback (
callback $callback, string $type, string $name
): void

Register a write callback.

Parameters
$callback : callback
 
$type : string
 
$name : string
 

releaseObject()

public SetaPDF_Core_Document::releaseObject (
SetaPDF_Core_Type_IndirectObject $object [, boolean $cleanUp = true ]
): boolean

Releases an indirect object from the internal object cache.

Parameters
$object : SetaPDF_Core_Type_IndirectObject
 
$cleanUp : boolean

Whether to call cleanUp() on the object or not.

removeBeforeSaveCallback()

public SetaPDF_Core_Document::removeBeforeSaveCallback (
string $name
): bool

Removes a callback that was added before.

Parameters
$name : string
 
removeReferencedObject()

Remove referenced objects.

This method is needed if an object is, for example, moved to a compressed cross-reference stream and already written there. In that case it needs to be removed from this "list".

Parameters
$indirectObject : SetaPDF_Core_Type_IndirectObjectInterface
 

resolveIndirectObject()

public SetaPDF_Core_Document::resolveIndirectObject (
integer $objectId [, integer|null $generation = 0 [, boolean $cache = true ]]
): SetaPDF_Core_Type_IndirectObject

Resolves an indirect object.

Parameters
$objectId : integer

The object id

$generation : integer|null

The generation number. Can also be "null" to find an object with an unknown generation number with the xref parser

$cache : boolean

Should the object be cached?

Exceptions

Throws SetaPDF_Core_Document_ObjectNotDefinedException

Throws SetaPDF_Core_Document_ObjectNotFoundException

Throws SetaPDF_Core_Exception

Throws SetaPDF_Core_Parser_Pdf_InvalidTokenException

Throws SetaPDF_Core_Reader_Exception

Throws SetaPDF_Core_Type_Exception

Throws SetaPDF_Exception

Throws SetaPDF_Exception_NotImplemented
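
Because resolving can fail for ids that are missing or not defined, a caller may want to wrap the call, as in this minimal sketch. The helper name tryResolve is hypothetical, and treating both exception types as "not available" is a design choice of the example and not behavior of the library.

```php
<?php
/**
 * Hypothetical helper: resolves an indirect object or returns null if the
 * document does not provide it.
 */
function tryResolve(
    SetaPDF_Core_Document $document,
    int $objectId,
    ?int $generation = 0
): ?SetaPDF_Core_Type_IndirectObject {
    try {
        // Passing null as the generation lets the xref parser look it up.
        return $document->resolveIndirectObject($objectId, $generation);
    } catch (SetaPDF_Core_Document_ObjectNotFoundException $e) {
        return null;
    } catch (SetaPDF_Core_Document_ObjectNotDefinedException $e) {
        return null;
    }
}
```

The other exceptions listed above, such as parser or reader errors, are intentionally left to propagate.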

save()

public SetaPDF_Core_Document::save (
[ boolean|integer $method = true ]
): SetaPDF_Core_Document

Saves the document.

The PDF format offers a way to add changes to a document by simply appending the changes to the end of the file. This method is called an incremental update and has the advantage that it is very fast, because only changed objects have to be written. This is the default behavior when calling the save() method. The drawback is that the document can easily be reverted to its previous state by cutting off the bytes of the last revision.

The parameter of the save() method allows you to define that the document should be rebuilt from scratch by resolving the complete object structure. Just pass SetaPDF_Core_Document::SAVE_METHOD_REWRITE to it. This task is very performance intensive, because the complete document has to be parsed, interpreted and rewritten.

Additionally, it is possible to rewrite the whole document with all available objects by passing SetaPDF_Core_Document::SAVE_METHOD_REWRITE_ALL. The benefit of this solution is that it keeps compressed object streams intact. The disadvantage is that unused objects may be copied/written, too.

Parameters
$method : boolean|integer

Update or rewrite the document

Exceptions

Throws SetaPDF_Core_Document_ObjectNotDefinedException

Throws SetaPDF_Core_Document_ObjectNotFoundException

Throws SetaPDF_Core_Exception

Throws SetaPDF_Core_Parser_CrossReferenceTable_Exception

Throws SetaPDF_Core_Parser_Exception

Throws SetaPDF_Core_SecHandler_Exception

Throws SetaPDF_Core_Type_Exception

Throws SetaPDF_Core_Type_IndirectReference_Exception

Throws SetaPDF_Exception

Throws SetaPDF_Exception_NotImplemented

Throws BadMethodCallException
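
The choice between an incremental update and a rewrite can be sketched as follows. The helper name saveWith and its $rewrite flag are hypothetical. Only the two constants whose values are listed in this reference are used.

```php
<?php
/**
 * Hypothetical helper: saves the document either as an incremental update
 * or as a rewrite starting at the root object, then finishes the writer.
 */
function saveWith(SetaPDF_Core_Document $document, bool $rewrite = false): void
{
    $method = $rewrite
        ? SetaPDF_Core_Document::SAVE_METHOD_REWRITE  // rebuild from the root object
        : SetaPDF_Core_Document::SAVE_METHOD_UPDATE;  // append changed objects only

    // save() returns the document instance, so finish() can be chained.
    $document->save($method)->finish();
}
```

A rewrite is the safer choice when earlier revisions must not be recoverable from the output file, at the cost of the additional parsing and writing work described above.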

setCacheReferencedObjects()

public SetaPDF_Core_Document::setCacheReferencedObjects (
boolean $cacheReferencedObjects
): void

Define if referenced objects should be cached or not.

Parameters
$cacheReferencedObjects : boolean

The flag status

setCleanUpObjects()

public SetaPDF_Core_Document::setCleanUpObjects (
boolean $cleanUpObjects
): void

Sets whether the cleanUp() methods of objects are called automatically.

Parameters
$cleanUpObjects : boolean

The flag status

setCompressXref()

public SetaPDF_Core_Document::setCompressXref (
bool $compressXref
): void

Define whether the cross-reference should be compressed or not.

By default, the SetaPDF-Core component writes the cross-reference in the standard format or in the format which is defined in the source document, if any available.

Parameters
$compressXref : bool

Pass true to enforce that the cross-reference will be compressed. Pass false to enforce a standard uncompressed cross-reference table.

Exceptions

Throws SetaPDF_Core_SecHandler_Exception

Throws SetaPDF_Core_Type_Exception

Throws BadMethodCallException

setDirectWrite()

public SetaPDF_Core_Document::setDirectWrite (
bool $directWrite
): void

Defines whether the PDF objects should be written individually (true) or after assembling a single string (false).

Parameters
$directWrite : bool
 

setFileBodyMethod()

public SetaPDF_Core_Document::setFileBodyMethod (
callback $callback
): void

Set the callback method/function which will write the file body.

Parameters
$callback : callback
 

setMinPdfVersion()

public SetaPDF_Core_Document::setMinPdfVersion (
string $minPdfVersion
): void

Set the minimal PDF version.

Parameters
$minPdfVersion : string

The minimal pdf version

Exceptions

Throws SetaPDF_Core_SecHandler_Exception

Throws SetaPDF_Core_Type_Exception

setNewFileIdentifier()

public SetaPDF_Core_Document::setNewFileIdentifier (
string $newFileIdentifier
): void

Set a custom non-permanent file identifier.

Parameters
$newFileIdentifier : string
 

setPdfVersion()

public SetaPDF_Core_Document::setPdfVersion (
string|float $pdfVersion
): void

Set the PDF version of the document.

Parameters
$pdfVersion : string|float

The pdf version

Exceptions

Throws SetaPDF_Core_SecHandler_Exception

Throws SetaPDF_Core_Type_Exception

setSecHandler()

Set the security handler for this document.

Parameters
$secHandler : SetaPDF_Core_SecHandler_SecHandlerInterface

The new secHandler

Exceptions

Throws SetaPDF_Core_SecHandler_Exception

Throws SetaPDF_Core_Type_Exception

setWriter()

Set the writer object.

A writer instance can only be set prior to the first call to save() or after a finish() call.

Parameters
$writer : SetaPDF_Core_Writer_WriterInterface

The new writer object

Exceptions

Throws BadMethodCallException

unBlockReferencedObject()

unRegisterWriteCallback()

public SetaPDF_Core_Document::unRegisterWriteCallback (
string $type, string $name
): void

Unregisters a write callback.

Parameters
$type : string
 
$name : string
 

update()

public SetaPDF_Core_Document::update (
SplSubject $subject
): void

Implementation of the observer pattern.

This method is automatically called if an observed object was changed.

Parameters
$subject : SplSubject

The SplSubject notifying the observer of an update.

write()

public SetaPDF_Core_Document::write (
string $bytes
): mixed

Writes content to the attached writer.

Parameters
$bytes : string
 
Exceptions

Throws SetaPDF_Core_Exception

writeChangedObjects()

Write changed objects.

Return Values

The objects that were written

writeObject()

Writes an object to the resulting document.

This method should only be called in the _writeFileBody()-method or in the callback method of it.

Parameters
$object : SetaPDF_Core_Type_IndirectObject