SetaPDF_Extractor_Filter_Rectangle

A rectangle filter.

File: /SetaPDF v2/Extractor/Filter/Rectangle.php

This filter allows you to define a rectangle which is used to filter text items. This filter automatically takes care of rotated pages/coordinate systems.

The origin of the coordinate system is the lower left throughout.

Constants

MODE_CONTACT

const string SetaPDF_Extractor_Filter_Rectangle::MODE_CONTACT = 'contact'

A mode constant.

In this mode a text item is accepted if it touches or intersects the rectangle of this filter instance at any point.

MODE_CONTAINS

const string SetaPDF_Extractor_Filter_Rectangle::MODE_CONTAINS = 'contains'

A mode constant.

In this mode a text item is accepted only if it lies entirely within the rectangle of this filter instance.
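
The difference between the two modes can be illustrated with plain bounding boxes. The following is a minimal sketch, not the library's implementation: it assumes axis-aligned boxes given as [llx, lly, urx, ury] arrays with the origin at the lower left, and the function names boxesContact() and boxContains() are hypothetical. Real text items may be rotated, which this sketch does not handle.

```php
<?php
// Illustrative only: axis-aligned boxes as [llx, lly, urx, ury],
// origin at the lower left. This is NOT the SetaPDF implementation.

/** True if the two boxes touch or overlap at any point (edges count). */
function boxesContact(array $rect, array $item): bool
{
    return $item[0] <= $rect[2] && $item[2] >= $rect[0]
        && $item[1] <= $rect[3] && $item[3] >= $rect[1];
}

/** True if $item lies entirely inside $rect (edges count). */
function boxContains(array $rect, array $item): bool
{
    return $item[0] >= $rect[0] && $item[2] <= $rect[2]
        && $item[1] >= $rect[1] && $item[3] <= $rect[3];
}

$rect     = [100, 100, 300, 200];
$inside   = [120, 120, 180, 140]; // fully within the rectangle
$crossing = [280, 150, 340, 170]; // sticks out over the right edge
$outside  = [400, 400, 450, 420]; // nowhere near the rectangle
```

In this sketch $inside satisfies both checks, $crossing satisfies only the contact check, and $outside satisfies neither.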


Properties

$_id

protected string|null SetaPDF_Extractor_Filter_Rectangle::$_id

The id of this filter.

$_mode

protected string SetaPDF_Extractor_Filter_Rectangle::$_mode = 'contact'

The mode to work with.

$_page

protected SetaPDF_Core_Document_Page SetaPDF_Extractor_Filter_Rectangle::$_page

The current page object.

$_rectangle

protected SetaPDF_Core_Geometry_Rectangle SetaPDF_Extractor_Filter_Rectangle::$_rectangle

The rectangle to filter by.

$_rotation

protected int SetaPDF_Extractor_Filter_Rectangle::$_rotation = 0

The rotation value.

$_transformedRectangle

protected SetaPDF_Core_Geometry_Rectangle SetaPDF_Extractor_Filter_Rectangle::$_transformedRectangle

The rotated rectangle (if needed).


Methods

__construct()

public SetaPDF_Extractor_Filter_Rectangle::__construct (
SetaPDF_Core_Geometry_Rectangle $rectangle [, string $mode = self::MODE_CONTACT [, null|string $id = null ]]
)

The constructor.

The filter works in one of two modes, selected by the second parameter of the constructor:

1. MODE_CONTACT: This mode matches text items that touch or intersect the rectangle.
2. MODE_CONTAINS: This mode matches only if the rectangle contains the whole text item.

Parameters
$rectangle : SetaPDF_Core_Geometry_Rectangle

The rectangle to filter by.

$mode : string

A mode constant.

$id : null|string

The filter id.
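
A possible way to use the constructor is sketched below. It assumes that the SetaPDF autoloader is already registered and relies on classes that are not described in this section (SetaPDF_Core_Document, SetaPDF_Extractor, SetaPDF_Extractor_Strategy_ExactPlain, SetaPDF_Core_Geometry_Point). The helper name extractTextInRectangle() and the id 'example-area' are hypothetical and chosen for illustration only.

```php
<?php
// Assumes the SetaPDF autoloader is already registered and that the
// SetaPDF-Extractor component is installed.

/**
 * Hypothetical helper: extracts the text of one page, limited to a rectangle.
 *
 * @param string      $pdfPath    Path to the PDF file.
 * @param int         $pageNumber Page number, starting at 1.
 * @param array       $box        [llx, lly, urx, ury], origin at the lower left.
 * @param string|null $mode       A mode constant; defaults to MODE_CONTACT.
 */
function extractTextInRectangle(string $pdfPath, int $pageNumber, array $box, ?string $mode = null): string
{
    [$llx, $lly, $urx, $ury] = $box;
    $mode = $mode ?? SetaPDF_Extractor_Filter_Rectangle::MODE_CONTACT;

    $document  = SetaPDF_Core_Document::loadByFilename($pdfPath);
    $extractor = new SetaPDF_Extractor($document);

    // Rectangle, mode and (optional) filter id, in constructor order.
    $filter = new SetaPDF_Extractor_Filter_Rectangle(
        new SetaPDF_Core_Geometry_Rectangle(
            new SetaPDF_Core_Geometry_Point($llx, $lly),
            new SetaPDF_Core_Geometry_Point($urx, $ury)
        ),
        $mode,
        'example-area'
    );

    // Attach the filter to an extraction strategy and run it for one page.
    $strategy = new SetaPDF_Extractor_Strategy_ExactPlain();
    $strategy->setFilter($filter);
    $extractor->setStrategy($strategy);

    return $extractor->getResultByPageNumber($pageNumber);
}
```

Passing SetaPDF_Extractor_Filter_Rectangle::MODE_CONTAINS as the fourth argument would restrict the result to text items that lie completely inside the box.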

_contact()

Checks whether the rectangle and the text item are in contact with each other.

Parameters
$textItem : SetaPDF_Extractor_TextItem
 

_contains()

Checks whether the rectangle contains the item or not.

Parameters
$textItem : SetaPDF_Extractor_TextItem
 

accept()

Method that is called to decide whether a text item is accepted or not.

Parameters
$textItem : SetaPDF_Extractor_TextItem
 
Exceptions

Throws SetaPDF_Extractor_Exception

getId()

public SetaPDF_Extractor_Filter_Rectangle::getId (
void
): null|string

Get the filter id.

getMode()

Get the mode.

getRectangle()

Get the rectangle.

Parameters
$ignoreTransform : bool

Whether to ignore the transformation (rotation and translation of the page coordinate system) or not.

setPage()

Set the current page object.

Parameters
$page : SetaPDF_Core_Document_Page