SetaPDF_Extractor_Filter_Rectangle

A rectangle filter.

File: /SetaPDF v2/Extractor/Filter/Rectangle.php

This filter allows you to define a rectangle which is used to filter text items. This filter automatically takes care of rotated pages/coordinate systems.

The origin of the coordinate system is the lower left throughout.

Constants

MODE_CONTACT

public const string SetaPDF_Extractor_Filter_Rectangle::MODE_CONTACT = 'contact'

A mode constant.

This mode requires the text item to touch or intersect the rectangle of this filter instance at any point.

MODE_CONTAINS

public const string SetaPDF_Extractor_Filter_Rectangle::MODE_CONTAINS = 'contains'

A mode constant.

This mode requires the whole text item to lie inside the rectangle of this filter instance.
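
The difference between the two modes can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the functions `rectanglesContact()` and `rectangleContains()` are hypothetical, rectangles are plain arrays `[llx, lly, urx, ury]`, and text items are assumed to be axis-aligned boxes. Whether edges count as contact is also an assumption here.

```php
<?php
// Hypothetical helpers for illustration only, not part of SetaPDF.
// Rectangles are arrays [llx, lly, urx, ury]; the origin is the lower left.

/** True if both rectangles share at least one point (edges included). */
function rectanglesContact(array $filter, array $item): bool
{
    return $item[0] <= $filter[2] && $item[2] >= $filter[0]
        && $item[1] <= $filter[3] && $item[3] >= $filter[1];
}

/** True if $item lies completely inside $filter (edges included). */
function rectangleContains(array $filter, array $item): bool
{
    return $item[0] >= $filter[0] && $item[2] <= $filter[2]
        && $item[1] >= $filter[1] && $item[3] <= $filter[3];
}

$filter      = [40, 700, 220, 720];
$inside      = [50, 705, 100, 715];  // fully inside the filter rectangle
$overlapping = [200, 705, 260, 715]; // crosses the right edge
$outside     = [300, 705, 360, 715]; // no common point

foreach (['inside' => $inside, 'overlapping' => $overlapping, 'outside' => $outside] as $name => $item) {
    printf(
        "%s: contact=%s, contains=%s\n",
        $name,
        var_export(rectanglesContact($filter, $item), true),
        var_export(rectangleContains($filter, $item), true)
    );
}
```

In this sketch every item that is contained is also in contact, so the 'contains' mode is the stricter of the two.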


Properties

$_id

The id of this filter.

$_mode

protected string SetaPDF_Extractor_Filter_Rectangle::$_mode = 'contact'

The mode to work with.

$_page

$_rectangle

$_rotation

The rotation value.

$_transformedRectangle


Methods

__construct()

The constructor.

The filter can work in two modes, which are controlled by the second parameter of the constructor:

  1. MODE_CONTACT: This mode matches items that touch or intersect the rectangle.
  2. MODE_CONTAINS: This mode matches only if the rectangle contains the whole text item.

Parameters
$rectangle : SetaPDF_Core_Geometry_Rectangle

The rectangle to filter by.

$mode : string

A mode constant.

$id : null|string

The filter id.

_contact()

Checks whether the rectangle and the text item are in contact with each other.

Parameters
$textItem : SetaPDF_Extractor_TextItem
 
Exceptions

Throws SetaPDF_Core_Exception

_contains()

Checks whether the rectangle contains the item or not.

Parameters
$textItem : SetaPDF_Extractor_TextItem
 
Exceptions

Throws SetaPDF_Core_Exception

accept()

Method that is called to decide whether a text item is accepted or not.

Parameters
$textItem : SetaPDF_Extractor_TextItem
 
Exceptions

Throws SetaPDF_Extractor_Exception

Throws SetaPDF_Core_Exception

getId()

public SetaPDF_Extractor_Filter_Rectangle::getId ( void ): null|string

Get the filter id.

getMode()

Get the mode.

getRectangle()

Get the rectangle.

Parameters
$ignoreTransform : bool

Whether to ignore the transformation (rotation and translation of the page coordinate system) or not.
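
To give an idea of what such a transformation can involve, the following sketch maps a rectangle given in the coordinates of the page as displayed (lower-left origin) back to the unrotated page coordinate system. It is an assumption-laden illustration, not the library's code: `toUnrotatedRectangle()` is a hypothetical helper, the rotation is taken to be clockwise in multiples of 90 degrees, `$width` and `$height` are the dimensions of the unrotated page, and the translation of the page origin is left out.

```php
<?php
// Hypothetical helper for illustration only, not part of SetaPDF.
// $rect is [llx, lly, urx, ury] in the coordinates of the displayed (rotated) page.
function toUnrotatedRectangle(array $rect, int $rotation, float $width, float $height): array
{
    // Normalize the rotation to 0, 90, 180 or 270 (negative values included).
    $rotation = (($rotation % 360) + 360) % 360;

    // Maps one point from displayed coordinates to unrotated coordinates.
    $map = function (float $x, float $y) use ($rotation, $width, $height): array {
        switch ($rotation) {
            case 0:
                return [$x, $y];
            case 90:
                return [$width - $y, $x];
            case 180:
                return [$width - $x, $height - $y];
            case 270:
                return [$y, $height - $x];
            default:
                throw new InvalidArgumentException('Rotation must be a multiple of 90.');
        }
    };

    [$x1, $y1] = $map($rect[0], $rect[1]);
    [$x2, $y2] = $map($rect[2], $rect[3]);

    // The mapped corners may be swapped, so restore the lower-left/upper-right order.
    return [min($x1, $x2), min($y1, $y2), max($x1, $x2), max($y1, $y2)];
}

// A 600 x 800 page rotated by 90 degrees is displayed as 800 x 600.
// A box in the displayed lower-left corner ends up in the unrotated
// lower-right corner, with corners (550, 0) and (600, 100).
$unrotated = toUnrotatedRectangle([0, 0, 100, 50], 90, 600, 800);
```

The width and height of the box swap for 90 and 270 degrees, which is why the corners are re-sorted after mapping.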

setPage()

Set the current page object.

Parameters
$page : SetaPDF_Core_Document_Page
 
Exceptions

Throws SetaPDF_Core_Type_Exception