setasign\SetaPDF2\Extractor
Extractor
The main class of the SetaPDF-Extractor component.
File: /SetaPDF v2/Extractor/Extractor.php
Old class name (alias):
\SetaPDF_Extractor
Summary
Constants
VERSION
Properties
$_document
The document instance
$_strategy
The extraction strategy
Methods
__construct()
\setasign\SetaPDF2\Core\Document $document,
?Strategy\AbstractStrategy $strategy = null,
bool $ignoreFaultyStreams = false
The constructor.
Parameters
- $document : \setasign\SetaPDF2\Core\Document
- $strategy : ?Strategy\AbstractStrategy
- $ignoreFaultyStreams : bool
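The constructor parameters above can be wired together as in the following minimal sketch. It assumes the library is installed and autoloadable, and that `Document::loadByFilename()` is the way a document instance is obtained; that call is not documented on this page and should be checked against the Core component's reference. `createExtractor()` is a hypothetical helper name used only for illustration.

```php
<?php

use setasign\SetaPDF2\Core\Document;
use setasign\SetaPDF2\Extractor\Extractor;
use setasign\SetaPDF2\Extractor\Strategy\AbstractStrategy;

/**
 * Hypothetical helper: create an Extractor for a PDF file on disk.
 *
 * Passing null as $strategy leaves the choice of the default strategy to
 * the Extractor itself. $ignoreFaultyStreams is forwarded unchanged.
 */
function createExtractor(
    string $pdfPath,
    ?AbstractStrategy $strategy = null,
    bool $ignoreFaultyStreams = false
): Extractor {
    // Assumption: Document::loadByFilename() returns the document instance.
    $document = Document::loadByFilename($pdfPath);

    return new Extractor($document, $strategy, $ignoreFaultyStreams);
}
```

The helper only mirrors the constructor's argument order; it adds no behaviour of its own.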
getResultByPage()
Get the result for a specific page, identified by its page object, using the default or an individually set strategy.
Parameters
- $page : \setasign\SetaPDF2\Core\Document\Page
- $boundaryBox : ?string
If set, the result is limited to the rectangle of the given page boundary. See the \setasign\SetaPDF2\Core\PageBoundaries::XXX_BOX constants for possible values.
Exceptions
Throws \setasign\SetaPDF2\Core\Exception
Throws \setasign\SetaPDF2\Core\Parser\Pdf\InvalidTokenException
getResultByPageNumber()
int $pageNumber,
?string $boundaryBox = null
Get the result for a specific page, identified by its page number, using the default or an individually set strategy.
Parameters
- $pageNumber : int
- $boundaryBox : ?string
If set, the result is limited to the rectangle of the given page boundary. See the \setasign\SetaPDF2\Core\PageBoundaries::XXX_BOX constants for possible values.
Exceptions
Throws \setasign\SetaPDF2\Core\Exception
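A typical use of getResultByPageNumber() is to walk over all pages of a document. The sketch below is one possible way to do that, not the library's own helper: `extractPages()` is a hypothetical function, page numbers are assumed to start at 1, and the extractor is typed as `object` so the snippet can be loaded without the library being present.

```php
<?php

/**
 * Hypothetical helper: collect the result of every page, keyed by page number.
 *
 * @param object      $extractor   Assumed to be an Extractor instance.
 * @param int         $pageCount   Number of pages in the document.
 * @param string|null $boundaryBox Optional page boundary name, forwarded as-is.
 */
function extractPages(object $extractor, int $pageCount, ?string $boundaryBox = null): array
{
    $results = [];

    // Assumption: page numbers are 1-based.
    for ($pageNumber = 1; $pageNumber <= $pageCount; $pageNumber++) {
        $results[$pageNumber] = $extractor->getResultByPageNumber($pageNumber, $boundaryBox);
    }

    return $results;
}
```

The page count has to come from the document itself, and the boundary value from one of the PageBoundaries constants referenced above; both lookups are left to the caller here because they are not documented on this page.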
getTextItemsByPage()
Get all text items of a specific page, identified by its page object, using the default or an individually set strategy.
These text items can be used to get a result through an individual method of a strategy (e.g. the Strategy\PlainStrategy::getResultByTextItems() method). This intermediate state makes it possible to use several filters, which may collect the same text items.
Parameters
- $page : \setasign\SetaPDF2\Core\Document\Page
- $boundaryBox : ?string
If set, the result is limited to the rectangle of the given page boundary. See the \setasign\SetaPDF2\Core\PageBoundaries::XXX_BOX constants for possible values.
Exceptions
Throws \setasign\SetaPDF2\Core\Exception
Throws \setasign\SetaPDF2\Core\Parser\Pdf\InvalidTokenException
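The two-step flow described for getTextItemsByPage() (fetch the text items first, then hand them to a strategy) might look like the sketch below. `resultFromTextItems()` is a hypothetical helper; it assumes the strategy object exposes getResultByTextItems() taking the text items as its only argument, as the mention of Strategy\PlainStrategy above suggests. The arguments are typed as `object` so the snippet loads without the library.

```php
<?php

/**
 * Hypothetical helper: extract the text items of a page once and let a
 * strategy turn them into a result.
 *
 * @param object      $extractor   Assumed to be an Extractor instance.
 * @param object      $strategy    Assumed to offer getResultByTextItems().
 * @param object      $page        Assumed to be a Core\Document\Page instance.
 * @param string|null $boundaryBox Optional page boundary name, forwarded as-is.
 */
function resultFromTextItems(
    object $extractor,
    object $strategy,
    object $page,
    ?string $boundaryBox = null
) {
    // Step 1: the intermediate state, independent of the final result format.
    $textItems = $extractor->getTextItemsByPage($page, $boundaryBox);

    // Step 2: let the strategy build its result from the same text items.
    return $strategy->getResultByTextItems($textItems);
}
```

Keeping the text items in a variable is the point of the design: the same collection can be passed to further processing without parsing the page again.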