SetaPDF_Extractor_Filter_Rectangle

A rectangle filter.

File: /SetaPDF/Extractor/Filter/Rectangle.php

This filter allows you to define a rectangle which is used to filter text items. This filter automatically takes care of rotated pages/coordinate systems.

The origin of the coordinate system is the lower left throughout.

Constants

MODE_CONTACT

In this mode, a text item matches if it touches or intersects the rectangle of this filter instance at any point.

MODE_CONTAINS

In this mode, a text item matches only if it lies entirely within the rectangle of this filter instance.


Properties

$_id

The id of this filter.

$_mode

protected string SetaPDF_Extractor_Filter_Rectangle::$_mode = 'contact'

The mode to work with.

$_page

$_rectangle

$_rotatedRectangle

$_rotation

The rotation value.


Methods

__construct()

public SetaPDF_Extractor_Filter_Rectangle::__construct ( SetaPDF_Core_Geometry_Rectangle $rectangle [, string $mode = self::MODE_CONTACT [, null|string $id = null ]] )

The constructor.

The filter can work in two modes, selected by the second parameter of the constructor:

1. MODE_CONTACT: Matches text items that touch or intersect the rectangle at any point.
2. MODE_CONTAINS: Matches only text items that lie entirely within the rectangle.
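
The difference between the two modes can be illustrated with a small, library-independent sketch. The helper functions `boxesContact()` and `boxContains()` and the array-based box layout are hypothetical and are not part of SetaPDF. The sketch assumes axis-aligned boxes given as `[llx, lly, urx, ury]` with the origin at the lower left, and it assumes that touching edges count as contact. The actual filter may work differently internally, for example when handling rotated text items.

```php
<?php
// Hypothetical helpers for illustration only; not part of SetaPDF.
// A box is an array [llx, lly, urx, ury], origin at the lower left.

/** Contact-style check: true if the boxes touch or overlap at any point. */
function boxesContact(array $rect, array $item): bool
{
    return $item[0] <= $rect[2] && $item[2] >= $rect[0]
        && $item[1] <= $rect[3] && $item[3] >= $rect[1];
}

/** Contains-style check: true if $item lies entirely inside $rect. */
function boxContains(array $rect, array $item): bool
{
    return $item[0] >= $rect[0] && $item[2] <= $rect[2]
        && $item[1] >= $rect[1] && $item[3] <= $rect[3];
}

$filterRect = [100, 100, 300, 200]; // the filter rectangle
$inside     = [120, 120, 180, 140]; // fully inside the rectangle
$straddling = [280, 150, 340, 170]; // crosses the right edge
$outside    = [400, 400, 450, 420]; // no overlap at all

var_dump(boxesContact($filterRect, $inside));     // bool(true)
var_dump(boxesContact($filterRect, $straddling)); // bool(true)
var_dump(boxesContact($filterRect, $outside));    // bool(false)

var_dump(boxContains($filterRect, $inside));      // bool(true)
var_dump(boxContains($filterRect, $straddling));  // bool(false)
var_dump(boxContains($filterRect, $outside));     // bool(false)
```

In this sketch, an item that crosses the border of the rectangle is accepted by the contact-style check but rejected by the contains-style check, which is the practical difference to keep in mind when choosing a mode.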

Parameters
$rectangle : SetaPDF_Core_Geometry_Rectangle

The rectangle to filter by.

$mode : string

A mode constant.

$id : null|string

The filter id.

_contact()

public bool SetaPDF_Extractor_Filter_Rectangle::_contact ( SetaPDF_Extractor_TextItem $textItem )

Checks whether the rectangle and the text item are in contact with each other.

Parameters
$textItem : SetaPDF_Extractor_TextItem
 

_contains()

public bool SetaPDF_Extractor_Filter_Rectangle::_contains ( SetaPDF_Extractor_TextItem $textItem )

Checks whether the rectangle contains the item or not.

Parameters
$textItem : SetaPDF_Extractor_TextItem
 

accept()

public bool|string SetaPDF_Extractor_Filter_Rectangle::accept ( SetaPDF_Extractor_TextItem $textItem )

Method that is called to decide whether a text item is accepted or not.

Parameters
$textItem : SetaPDF_Extractor_TextItem
 
Exceptions

Throws SetaPDF_Extractor_Exception

getId()

public null|string SetaPDF_Extractor_Filter_Rectangle::getId ( void )

Get the filter id.

getMode()

public string SetaPDF_Extractor_Filter_Rectangle::getMode ( void )

Get the mode.

getRectangle()

public SetaPDF_Core_Geometry_Rectangle SetaPDF_Extractor_Filter_Rectangle::getRectangle ( [ bool $ignoreRotation = false ] )

Get the rectangle.

Parameters
$ignoreRotation : bool

Whether to ignore the rotation or not.

getRotation()

public int SetaPDF_Extractor_Filter_Rectangle::getRotation ( void )

Get the current rotation.

setPage()

public void SetaPDF_Extractor_Filter_Rectangle::setPage ( [ SetaPDF_Core_Document_Page $page = null ] )

Set the current page object.

Parameters
$page : SetaPDF_Core_Document_Page