SetaPDF_Core_Document A class representing a PDF document

File: /SetaPDF v2/Core/Document.php

This class represents a PDF document in all SetaPDF components. It provides the core functionality for managing the objects, cross-reference tables and writers of a document instance.

It also tracks changes to objects and security handlers.

Constants

SAVE_METHOD_REWRITE

const string SetaPDF_Core_Document::SAVE_METHOD_REWRITE = 0

Save method constant defining a rewrite by resolving objects starting at the root object

SAVE_METHOD_REWRITE_ALL

const string SetaPDF_Core_Document::SAVE_METHOD_REWRITE_ALL = NULL

Save method constant defining a rewrite by writing all available objects

SAVE_METHOD_UPDATE

const string SetaPDF_Core_Document::SAVE_METHOD_UPDATE = 1

Save method constant defining an incremental update

STATE_CLEANED_UP

const string SetaPDF_Core_Document::STATE_CLEANED_UP = 'cleanedUp'

State constant

STATE_FINISHED

const string SetaPDF_Core_Document::STATE_FINISHED = 'finished'

State constant

STATE_NONE

const string SetaPDF_Core_Document::STATE_NONE = 'none'

State constant

STATE_SAVED

const string SetaPDF_Core_Document::STATE_SAVED = 'saved'

State constant

STATE_WRITING_BODY

const string SetaPDF_Core_Document::STATE_WRITING_BODY = 'writingBody'

State constant

STATE_WRITING_XREF

const string SetaPDF_Core_Document::STATE_WRITING_XREF = 'writingXRef'

State constant


Static Properties

$_instanceCounter

static protected integer SetaPDF_Core_Document::$_instanceCounter = 0

A counter for generating unique instance identifications

$_instanceIdentPrefix

static protected string SetaPDF_Core_Document::$_instanceIdentPrefix

A random prefix for generating unique instance identifications


Properties

$_beforeSaveCallbacks

protected array SetaPDF_Core_Document::$_beforeSaveCallbacks = array()

An array of callbacks that should be called before the save method is executed.

$_blockedReferencedObjects

protected array SetaPDF_Core_Document::$_blockedReferencedObjects = array()

Blocked referenced objects

This array holds objects which should NOT be automatically resolved.

$_cacheReferencedObjects

protected boolean SetaPDF_Core_Document::$_cacheReferencedObjects = false

Defines if referenced objects should be cached or not

$_catalog

protected SetaPDF_Core_Document_Catalog SetaPDF_Core_Document::$_catalog

The document's catalog instance

$_changedObjects

protected array SetaPDF_Core_Document::$_changedObjects

Changed objects

$_cleanUpObjects

protected boolean SetaPDF_Core_Document::$_cleanUpObjects = true

Flag saying that objects should be cleaned up automatically

$_compressXref

protected boolean SetaPDF_Core_Document::$_compressXref = false

Defines if the cross reference table will be compressed

$_currentObject

protected SetaPDF_Core_Type_IndirectObject SetaPDF_Core_Document::$_currentObject

The indirect object which is currently written

$_currentObjectData

protected array SetaPDF_Core_Document::$_currentObjectData

The object id and generation number of the currently written object

$_directWrite

protected boolean SetaPDF_Core_Document::$_directWrite = false

Defining whether the PDF objects should be written at once or object by object

$_fileBodyMethod

protected callback SetaPDF_Core_Document::$_fileBodyMethod

A method/function which should be called to fill the document body

$_info

protected SetaPDF_Core_Document_Info SetaPDF_Core_Document::$_info

The document's info object instance

$_instanceIdent

protected string SetaPDF_Core_Document::$_instanceIdent

Identification of a document instance

$_maxObjId

protected integer SetaPDF_Core_Document::$_maxObjId = 0

Current/max object id

$_newFileIdentifier

protected string SetaPDF_Core_Document::$_newFileIdentifier

The non-permanent file identifier

$_objectStreams

protected array SetaPDF_Core_Document::$_objectStreams = array()

Array for information about object streams

$_objectStreamsParser

protected SetaPDF_Core_Parser_Pdf SetaPDF_Core_Document::$_objectStreamsParser

The parser object used for parsing object streams

$_objects

protected array SetaPDF_Core_Document::$_objects = array()

Newly created or resolved objects

$_objectsToIds

protected array SetaPDF_Core_Document::$_objectsToIds = array()

A relation between objects and ids

$_parser

protected SetaPDF_Core_Parser_Pdf SetaPDF_Core_Document::$_parser

The parser object of the existing document

$_pdfVersion

protected string SetaPDF_Core_Document::$_pdfVersion = '1.3'

PDF version

$_referencedObjects

protected array SetaPDF_Core_Document::$_referencedObjects = array()

Referenced objects

This array holds information about objects to which references were written. It is needed to create deep copies of an object from one document to another.

$_saveMethod

protected integer SetaPDF_Core_Document::$_saveMethod = 0

Defines whether the document is saved as an incremental update or rewritten

$_secHandler

protected SetaPDF_Core_SecHandler_SecHandlerInterface SetaPDF_Core_Document::$_secHandler

The security handler of the new document

$_secHandlerIn

protected SetaPDF_Core_SecHandler_SecHandlerInterface SetaPDF_Core_Document::$_secHandlerIn

The security handler of the existing document

$_state

protected string SetaPDF_Core_Document::$_state = 'none'

A flag defining the state of the document object instance

$_trailer

protected SetaPDF_Core_Type_Dictionary SetaPDF_Core_Document::$_trailer

The trailer dictionary

$_trailerChanged

protected boolean SetaPDF_Core_Document::$_trailerChanged = false

Flag defining if the trailer was touched/changed

$_useWriteCallbacks

protected boolean SetaPDF_Core_Document::$_useWriteCallbacks = false

Flag saying that write callbacks are in use

$_writeCallbacks

protected array SetaPDF_Core_Document::$_writeCallbacks = array()

Array of write callbacks

$_writer

$_xref

protected SetaPDF_Core_Document_CrossReferenceTable SetaPDF_Core_Document::$_xref

An instance of a cross reference

If the document is created from an existing one, this will be an instance of SetaPDF_Core_Parser_CrossReferenceTable.


Static Methods

load()

public static SetaPDF_Core_Document::load (
SetaPDF_Core_Reader_ReaderInterface $reader [, SetaPDF_Core_Writer_WriterInterface $writer = null [, string $className = 'SetaPDF_Core_Document' ]]
): SetaPDF_Core_Document

Creates an instance of a document based on an existing PDF.

Parameters
$reader : SetaPDF_Core_Reader_ReaderInterface

A reader instance

$writer : SetaPDF_Core_Writer_WriterInterface

A writer instance

$className : string

The class name to instantiate

Return Values

Returns a SetaPDF_Core_Document instance

Exceptions

Throws SetaPDF_Core_Parser_CrossReferenceTable_Exception, Exception

loadByFilename()

public static SetaPDF_Core_Document::loadByFilename (
string $filename [, SetaPDF_Core_Writer_WriterInterface $writer = null [, string $className = 'SetaPDF_Core_Document' ]]
): SetaPDF_Core_Document

Creates an instance from a filename.

Parameters
$filename : string

The path to the PDF file

$writer : SetaPDF_Core_Writer_WriterInterface

A writer instance

$className : string

The class name to instantiate

loadByString()

public static SetaPDF_Core_Document::loadByString (
string $string [, SetaPDF_Core_Writer_WriterInterface $writer = null [, string $className = 'SetaPDF_Core_Document' ]]
): SetaPDF_Core_Document

Creates an instance from a PDF string.

Parameters
$string : string

Content of the PDF

$writer : SetaPDF_Core_Writer_WriterInterface

A writer instance

$className : string

The class name to instantiate


Methods

__construct()

The constructor.

Parameters
$writer : SetaPDF_Core_Writer_WriterInterface

The writer to which the document should be written

__call()

public SetaPDF_Core_Document::__call (
string $method, array $arguments
): mixed

Implement magic methods for getting helper objects.

You can use the methods from SetaPDF_Core_Document_Catalog::getDocumentMagicMethods().

Additionally, you can use "getFormFiller", "getMerger", "getSigner" and "getStamper" if you want to receive instances of these components.

Parameters
$method : string

The method name

$arguments : array

The arguments

Exceptions

Throws BadMethodCallException

_cleanUpTrailer()

Cleans up trailer entries.

Parameters
$trailer : SetaPDF_Core_Type_Dictionary
 

_releaseObjects()

protected SetaPDF_Core_Document::_releaseObjects (
void
): void

Release objects.

_updateFileIdentifier()

protected SetaPDF_Core_Document::_updateFileIdentifier (
void
): string

Update or create a file identifier.

Return Values

The new file identifier

_writeCrossReferenceTable()

Write the cross reference table.

_writeFileBody()

protected SetaPDF_Core_Document::_writeFileBody (
void
): boolean

Main method which writes the file body.

This method should be extended/overridden to implement custom logic if the document is to be built at runtime.

Return Values

Whether the body was written

_writeFileHeader()

protected SetaPDF_Core_Document::_writeFileHeader (
void
): void

Writes the file header.

_writeObject()

protected SetaPDF_Core_Document::_writeObject (
SetaPDF_Core_Document $document, integer $objectId, integer $generation, boolean $cache
): void

Writes an object to the resulting document, but first evaluates whether a write is necessary.

Parameters
$document : SetaPDF_Core_Document
 
$objectId : integer
 
$generation : integer
 
$cache : boolean
 

_writeTrailer()

protected SetaPDF_Core_Document::_writeTrailer (
void
): void

Write the trailer dictionary and the pointer to the initial xref table.

addBeforeSaveCallback()

public SetaPDF_Core_Document::addBeforeSaveCallback (
$name, $callback
): bool

Adds a callback that will get executed before the save method is processed.

Parameters
$name
 
$callback
 
addIndirectObjectReferenceWritten()

Cache written object references.

This method is called when an indirect object reference is written. This makes sure that the class knows about objects that may not have been written yet.

Parameters
$indirectObject : SetaPDF_Core_Type_IndirectObjectInterface

The indirect object

Exceptions

Throws SetaPDF_Core_Document_ObjectNotFoundException

blockReferencedObject()

Prevents a reference to this object from being written.

Objects defined via this method will not be resolved automatically if a reference to them was written.

Parameters
$indirectObject : SetaPDF_Core_Type_IndirectObjectInterface

The indirect object

cleanUp()

public SetaPDF_Core_Document::cleanUp (
void
): void

Releases objects to free memory and break cyclic references.

After calling this method the instance of this object is unusable!

cloneIndirectObject()

Clones an indirect object.

Parameters
$indirectObject : SetaPDF_Core_Type_IndirectObject

The indirect object to clone

createNewObject()

Create a new indirect object.

Parameters
$value : SetaPDF_Core_Type_AbstractType

The value of the new indirect object

deleteObject()

Delete an indirect object.

Parameters
$object : SetaPDF_Core_Type_IndirectObjectInterface

The indirect object to delete

deleteObjectById()

public SetaPDF_Core_Document::deleteObjectById (
integer $objectId [, integer $generation = 0 ]
): void

Deletes an indirect object by its object id and generation number.

Parameters
$objectId : integer

The object id of the object

$generation : integer

The generation id of the object

ensureObject()

Makes sure that an object is ensured through this document (if possible).

Parameters
$indirectObject : SetaPDF_Core_Type_IndirectObjectInterface

The indirect object to ensure

finish()

Forwards a finish signal to the attached writer.

getCacheReferencedObjects()

Returns whether referenced objects are cached.

getCatalog()

Get the catalog object.

getCurrentObject()

Returns the currently written object.

getCurrentObjectData()

Returns the currently written object data.

getCurrentObjectDocument()

Get the document of the currently written/handled object.

getDirectWrite()

public SetaPDF_Core_Document::getDirectWrite (
void
): bool

Gets whether the PDF objects should be written individually (true) or after assembling a single string (false).

getFileIdentifier()

public SetaPDF_Core_Document::getFileIdentifier (
[ boolean $permanent = false [, boolean $create = true ]]
): string

Get a file identifier.

Parameters
$permanent : boolean
 
$create : boolean
 

getIdForObject()

Return the object id and generation number for an indirect object or reference.

This method makes sure that objects are nearly independent of their original document and that the matching between documents, objects and their ids is handled in one place: this method.

Parameters
$indirectObject : SetaPDF_Core_Type_IndirectObjectInterface

The indirect object

getInfo()

Get the document's info object.

getInstanceIdent()

public SetaPDF_Core_Document::getInstanceIdent (
void
): string

Get the instance identifier of this document.

getOwnerPdfDocument()

Implementation of the SetaPDF_Core_Type_Owner interface.

getParser()

Get the parser object.

getPdfVersion()

public SetaPDF_Core_Document::getPdfVersion (
void
): string

Returns the PDF version of the document.

getSaveMethod()

public SetaPDF_Core_Document::getSaveMethod (
void
): integer

Get the currently used save method.

This method can be used by objects at writing time to evaluate if it is possible to edit referencing values or not.

getSecHandler()

getSecHandlerIn()

Returns the security handler of the original document.

getState()

public SetaPDF_Core_Document::getState (
void
): string

Return the current object state.

getTrailer()

Returns the trailer dictionary.

getWriter()

Get current writer object.

getXref()

Get the cross reference object.

handleWriteCallback()

Method called when a PDF type will be written.

This method could be used to manipulate a value just before it will get written to the writer object.

Parameters
$value : SetaPDF_Core_Type_AbstractType
 

hasSecHandler()

public SetaPDF_Core_Document::hasSecHandler (
void
): boolean

Checks whether a security handler is attached to this document.

hasSecurityHandler()

Alias for hasSecHandler().

objectRegistered()

Checks if an indirect object is already registered for/in this document instance.

Parameters
$indirectObject : SetaPDF_Core_Type_IndirectObjectInterface

The indirect object to check

registerWriteCallback()

public SetaPDF_Core_Document::registerWriteCallback (
callback $callback, string $type, string $name
): void

Register a write callback.

Parameters
$callback : callback
 
$type : string
 
$name : string
 

releaseObject()

Releases an indirect object from the internal object cache.

Parameters
$object : SetaPDF_Core_Type_IndirectObject
 

removeBeforeSaveCallback()

Removes a callback that was added before.

Parameters
$name
 
resolveIndirectObject()

public SetaPDF_Core_Document::resolveIndirectObject (
integer $objectId [, integer|null $generation = 0 [, boolean $cache = true ]]
): SetaPDF_Core_Type_IndirectObject

Resolves an indirect object.

Parameters
$objectId : integer

The object id

$generation : integer|null

The generation number. Can also be null to find an object with an unknown generation number via the xref parser.

$cache : boolean

Should the object be cached?

Exceptions

Throws SetaPDF_Core_Document_ObjectNotDefinedException, SetaPDF_Core_Document_ObjectNotFoundException

save()

public SetaPDF_Core_Document::save (
[ boolean|integer $method = true ]
): SetaPDF_Core_Document

Saves the document.

The PDF format offers a way to add changes to a document by simply appending the changes to the end of the file. This method is called incremental update and has the advantage that it is very fast, because only changed objects have to be written. This behavior is the default one, when calling the save()-method. Sadly it makes it easy to revert the document to the previous state by simply cutting the bytes of the last revision.
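
The append-only nature of an incremental update, and why it can be reverted by cutting bytes, can be illustrated with plain strings. This is only a toy sketch: the byte strings below are invented placeholders, not valid PDF syntax, and no SetaPDF classes are involved.

```php
<?php
// Toy stand-ins for two revisions of a file (assumption: not real PDF data).
$original  = "%PDF-1.4\n1 0 obj (first revision) endobj\n%%EOF\n";
$increment = "1 0 obj (second revision) endobj\n%%EOF\n";

// An incremental update leaves the original bytes untouched and appends the changes.
$updated = $original . $increment;

// Cutting off the bytes of the last revision restores the previous state.
$reverted = substr($updated, 0, strlen($original));

var_dump($reverted === $original);
```

The same property that makes the update fast (nothing before the appended part is rewritten) is what allows the earlier revision to be recovered.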

The parameter of the save()-method allows you to define that the document should be rebuilt from scratch by resolving the complete object structure. Just pass SetaPDF_Core_Document::SAVE_METHOD_REWRITE to it. This task is very performance intensive, because the complete document has to be parsed, interpreted and rewritten.

Additionally it is possible to rewrite the whole document with all available objects. The benefit of this solution is that it will keep compressed object streams intact: SetaPDF_Core_Document::SAVE_METHOD_REWRITE_ALL. The disadvantage is that unused objects may be copied/written, too.

Parameters
$method : boolean|integer

Update or rewrite the document

Exceptions

Throws InvalidArgumentException

Throws SetaPDF_Core_Exception

Throws BadMethodCallException
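
A minimal usage sketch of the save methods described above. It assumes the SetaPDF library is installed and autoloadable; the autoloader path, the file names and the choice of SetaPDF_Core_Writer_File as writer are assumptions for illustration and need to be adapted to your setup.

```php
<?php
// Assumption: location of the SetaPDF autoloader in your installation.
require_once 'library/SetaPDF/Autoload.php';

// Assumption: a file writer and placeholder file names.
$writer = new SetaPDF_Core_Writer_File('result.pdf');
$document = SetaPDF_Core_Document::loadByFilename('source.pdf', $writer);

// ... modify the document here ...

// Calling save() without an argument performs an incremental update (the default).
// Here the document is rebuilt instead, starting at the root object.
// save() returns the document instance, so finish() can be chained to
// forward the finish signal to the attached writer.
$document->save(SetaPDF_Core_Document::SAVE_METHOD_REWRITE)->finish();
```

Passing SetaPDF_Core_Document::SAVE_METHOD_REWRITE_ALL instead would write all available objects, as described above.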

setCacheReferencedObjects()

public SetaPDF_Core_Document::setCacheReferencedObjects (
boolean $cacheReferencedObjects
): void

Define if referenced objects should be cached or not.

Parameters
$cacheReferencedObjects : boolean

The flag status

setCleanUpObjects()

public SetaPDF_Core_Document::setCleanUpObjects (
boolean $cleanUpObjects
): void

Defines whether the cleanUp()-methods of objects are called automatically.

Parameters
$cleanUpObjects : boolean

The flag status

setCompressXref()

public SetaPDF_Core_Document::setCompressXref (
boolean $compressXref
): void

Define whether the cross reference should be compressed or not.

By default the SetaPDF-Core component writes the cross-reference in the standard format, or in the format defined in the source document if one is available.

Parameters
$compressXref : boolean

Pass true to enforce that the cross reference will be compressed. Pass false to enforce a standard uncompressed cross reference table.

Exceptions

Throws BadMethodCallException

setDirectWrite()

public SetaPDF_Core_Document::setDirectWrite (
bool $directWrite
): void

Defines whether the PDF objects should be written individually (true) or after assembling a single string (false).

Parameters
$directWrite : bool
 

setFileBodyMethod()

public SetaPDF_Core_Document::setFileBodyMethod (
callback $callback
): void

Set the callback method/function which will write the file body.

Parameters
$callback : callback
 

setMinPdfVersion()

public SetaPDF_Core_Document::setMinPdfVersion (
string $minPdfVersion
): void

Set the minimal PDF version.

Parameters
$minPdfVersion : string

The minimal PDF version

setNewFileIdentifier()

public SetaPDF_Core_Document::setNewFileIdentifier (
string $newFileIdentifier
): void

Set a custom non-permanent file identifier.

Parameters
$newFileIdentifier : string
 

setPdfVersion()

public SetaPDF_Core_Document::setPdfVersion (
string $pdfVersion
): void

Set the PDF version of the document.

Parameters
$pdfVersion : string

The PDF version

setSecHandler()

Set the security handler for this document.

Parameters
$secHandler : SetaPDF_Core_SecHandler_SecHandlerInterface

The new secHandler

Exceptions

Throws BadMethodCallException, SetaPDF_Core_SecHandler_Exception

setWriter()

Set the writer object.

A writer instance can only be set prior to the first call to save() or after a finish() call.

Parameters
$writer : SetaPDF_Core_Writer_WriterInterface

The new writer object

Exceptions

Throws BadMethodCallException

unBlockReferencedObject()

unRegisterWriteCallback()

public SetaPDF_Core_Document::unRegisterWriteCallback (
string $type, string $name
): void

Un-Register a write callback.

Parameters
$type : string
 
$name : string
 

update()

public SetaPDF_Core_Document::update (
SplSubject $subject
): void

Implementation of the observer pattern.

This method is automatically called if an observed object was changed.

Parameters
$subject : SplSubject

The SplSubject notifying the observer of an update.
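
The observer mechanism behind update() can be sketched with PHP's built-in SplSubject and SplObserver interfaces. The classes ToyObject and ToyDocument below are hypothetical stand-ins invented for this illustration; they are not SetaPDF classes and do not reflect how the library tracks changes internally.

```php
<?php
// Hypothetical observable value, standing in for a PDF object.
final class ToyObject implements SplSubject
{
    /** @var SplObserver[] observers keyed by their object id */
    private $observers = [];
    private $value;

    public function __construct(string $value)
    {
        $this->value = $value;
    }

    public function attach(SplObserver $observer): void
    {
        $this->observers[spl_object_id($observer)] = $observer;
    }

    public function detach(SplObserver $observer): void
    {
        unset($this->observers[spl_object_id($observer)]);
    }

    public function notify(): void
    {
        foreach ($this->observers as $observer) {
            $observer->update($this);
        }
    }

    public function setValue(string $value): void
    {
        $this->value = $value;
        $this->notify(); // tell every observer that this object changed
    }
}

// Hypothetical observer that remembers which subjects have changed.
final class ToyDocument implements SplObserver
{
    private $changed = [];

    public function update(SplSubject $subject): void
    {
        // Keyed by object id, so repeated changes are recorded only once.
        $this->changed[spl_object_id($subject)] = $subject;
    }

    public function countChanged(): int
    {
        return count($this->changed);
    }
}

$document = new ToyDocument();
$a = new ToyObject('a');
$b = new ToyObject('b');
$a->attach($document);
$b->attach($document);

$a->setValue('x');
$a->setValue('y'); // second change to the same object

echo $document->countChanged(), "\n";
```

Only the objects that actually notified the observer end up in its list, which is the kind of bookkeeping an incremental update needs: unchanged objects do not have to be written again.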

write()

public SetaPDF_Core_Document::write (
string $s
): mixed

Writes content to the attached writer.

Parameters
$s : string
 
Exceptions

Throws SetaPDF_Core_Exception

writeChangedObjects()

public SetaPDF_Core_Document::writeChangedObjects (
void
): boolean

Write changed objects.

Return Values

Whether an object was written

writeObject()

Writes an object to the resulting document.

This method should only be called in the _writeFileBody()-method or in the callback method of it.

Parameters
$object : SetaPDF_Core_Type_IndirectObject
 

writeReferencedObjects()

Write referenced objects.