SetaPDF_Extractor_Sorter_Baseline

A sorter class that sorts lines by comparing the baselines of text items.

File: /SetaPDF v2/Extractor/Sorter/Baseline.php

Properties

$_baselineThreshold

protected float SetaPDF_Extractor_Sorter_Baseline::$_baselineThreshold = 0.7

Threshold which keeps items on the same line.

$_matrix

A temporary matrix used in the sort process.


Methods

getBaselineThreshold()

Get the threshold which keeps items on the same line.

groupByLines()

Groups all text items by lines.

Parameters
$textItems : SetaPDF_Extractor_TextItem[]

The text items
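To illustrate the idea of grouping by baseline, here is a minimal, self-contained sketch. It is not the library's implementation: the function groupByBaseline(), the array keys 'text', 'x' and 'y', and the rule "same line if the baselines differ by at most the threshold" are all assumptions made for this example. It also assumes PDF-style coordinates, where a larger y value is higher on the page.

```php
<?php
// Hypothetical helper, NOT part of SetaPDF: puts items whose baseline (y)
// differs from the first item of the current line by at most $threshold
// into the same line.
function groupByBaseline(array $items, float $threshold = 0.7): array
{
    // Order items from top to bottom (assumption: larger y is higher up).
    usort($items, fn(array $a, array $b): int => $b['y'] <=> $a['y']);

    $lines = [];
    $lineBaseline = null;
    foreach ($items as $item) {
        // Start a new line when the baseline moves further than the threshold.
        if ($lineBaseline === null || abs($lineBaseline - $item['y']) > $threshold) {
            $lines[] = [];
            $lineBaseline = $item['y'];
        }
        $lines[count($lines) - 1][] = $item;
    }

    // Within each line, order items from left to right.
    foreach ($lines as &$line) {
        usort($line, fn(array $a, array $b): int => $a['x'] <=> $b['x']);
    }
    unset($line);

    return $lines;
}

// Example data (made up): two items share a baseline within 0.7 units.
$lines = groupByBaseline([
    ['text' => 'world', 'x' => 60, 'y' => 700.2],
    ['text' => 'Hello', 'x' => 10, 'y' => 700.0],
    ['text' => 'Next',  'x' => 10, 'y' => 686.0],
]);

foreach ($lines as $line) {
    echo implode(' ', array_column($line, 'text')), PHP_EOL;
}
// Prints:
// Hello world
// Next
```

A smaller threshold splits slightly offset items (for example superscripts) into separate lines, while a larger one merges them. How the real class applies its threshold may differ from this sketch.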

horizontallyThenVertically()

itemsJoining()

Checks whether two items join each other.

Parameters
$prevItem : SetaPDF_Extractor_Result_CompareableInterface

The left item.

$item : SetaPDF_Extractor_Result_CompareableInterface

The right item.

$spaceWidthFactor : float

The space width factor.

setBaselineThreshold()

public SetaPDF_Extractor_Sorter_Baseline::setBaselineThreshold(float $threshold): void

Set the threshold which keeps items on the same line.

Parameters
$threshold : float

The threshold.

verticallyThenHorizontally()