SetaPDF_Extractor
The main class of the SetaPDF-Extractor component.
File: /SetaPDF v2/Extractor.php
Constants
VERSION
The version
Properties
$_document
protected SetaPDF_Core_Document SetaPDF_Extractor::$_document
The document instance
Methods
__construct()
public SetaPDF_Extractor::__construct (
SetaPDF_Core_Document $document [, SetaPDF_Extractor_Strategy_AbstractStrategy|null $strategy = null [, bool $ignoreFaultyStreams = false ]]
)
The constructor.
Parameters
- $document : SetaPDF_Core_Document
- $strategy : SetaPDF_Extractor_Strategy_AbstractStrategy|null
- $ignoreFaultyStreams : bool
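The constructor signature above can be illustrated with a minimal sketch. It assumes that the SetaPDF autoloader is already registered and that the document is loaded through SetaPDF_Core_Document::loadByFilename(); the helper name createExtractor() and the file path in the usage note are hypothetical and not part of the library.

```php
<?php
// Sketch only. Assumes the SetaPDF autoloader is already registered, e.g.
//   require_once 'library/SetaPDF/Autoload.php'; // this path is an assumption
// createExtractor() is a hypothetical helper, not part of SetaPDF.

function createExtractor(string $pdfPath, bool $ignoreFaultyStreams = false): SetaPDF_Extractor
{
    // Load the PDF into a SetaPDF_Core_Document instance.
    $document = SetaPDF_Core_Document::loadByFilename($pdfPath);

    // null is the documented default for $strategy, so the extractor
    // falls back to its own default strategy.
    return new SetaPDF_Extractor($document, null, $ignoreFaultyStreams);
}
```

A call such as createExtractor('document.pdf', true) would then return an extractor with $ignoreFaultyStreams enabled.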
getResultByPageNumber()
public SetaPDF_Extractor::getResultByPageNumber (
integer $pageNumber [, string $boundaryBox = null ]
): SetaPDF_Extractor_Result_Collection|SetaPDF_Extractor_Result_Words|SetaPDF_Extractor_Result_WordGroups|string|string[]
Get the result for a specific page, using the default or an individually set strategy.
Parameters
- $pageNumber : integer
- $boundaryBox : string
If set, the result is limited to the rectangle of the given page boundary. See the SetaPDF_Core_PageBoundaries::XXX_BOX constants for possible values.
Exceptions
Throws SetaPDF_Core_Exception
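As a rough illustration of getResultByPageNumber(), the sketch below collects the results of several pages and optionally restricts them to a page boundary. The helper extractPages() is hypothetical, and the assumption that page numbers start at 1 is not stated on this page.

```php
<?php
// extractPages() is a hypothetical helper, not part of SetaPDF.
// It does not catch the SetaPDF_Core_Exception that
// getResultByPageNumber() is documented to throw.

function extractPages(SetaPDF_Extractor $extractor, int $pageCount, ?string $boundaryBox = null): array
{
    $results = [];

    // Assumption: page numbers are 1-based.
    for ($pageNumber = 1; $pageNumber <= $pageCount; $pageNumber++) {
        // With $boundaryBox set, the result is limited to that boundary's rectangle.
        $results[$pageNumber] = $extractor->getResultByPageNumber($pageNumber, $boundaryBox);
    }

    return $results;
}
```

To limit the output to the crop box, a caller could pass one of the SetaPDF_Core_PageBoundaries::XXX_BOX constants (for example SetaPDF_Core_PageBoundaries::CROP_BOX) as the third argument. The type of each array entry depends on the active strategy.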
getStrategy()
public SetaPDF_Extractor::getStrategy (
void
): SetaPDF_Extractor_Strategy_AbstractStrategy|SetaPDF_Extractor_Strategy_Plain
Get the extraction strategy.
setStrategy()
Set the extraction strategy.
Parameters
- $strategy : SetaPDF_Extractor_Strategy_AbstractStrategy
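The interplay of getStrategy() and setStrategy() might look like the following sketch, which switches to a word-based strategy for one page and then restores the previous strategy. It assumes that SetaPDF_Extractor_Strategy_Word can be constructed without arguments, that its result is iterable, and that each item offers getString(); none of this is confirmed on this page. extractWords() is a hypothetical helper.

```php
<?php
// extractWords() is a hypothetical helper, not part of SetaPDF.

function extractWords(SetaPDF_Extractor $extractor, int $pageNumber): array
{
    // Remember the current strategy so it can be restored afterwards.
    $previous = $extractor->getStrategy();

    // Assumption: the word strategy needs no constructor arguments.
    $extractor->setStrategy(new SetaPDF_Extractor_Strategy_Word());

    try {
        $strings = [];
        // Assumption: the result is iterable and each item has getString().
        foreach ($extractor->getResultByPageNumber($pageNumber) as $word) {
            $strings[] = $word->getString();
        }
        return $strings;
    } finally {
        // Restore the original strategy even if extraction throws.
        $extractor->setStrategy($previous);
    }
}
```

Restoring the strategy in a finally block keeps the extractor usable for later calls that expect the original result type.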