SetaPDF_Extractor_ContentStreamCleaner
Helper class to clean up content streams.
File: /SetaPDF v2/Extractor/ContentStreamCleaner.php
Constants
REGEX_COLORS
public const string SetaPDF_Extractor_ContentStreamCleaner::REGEX_COLORS = '/(?<=[}\\]\\x00\\x09\\x0A\\x0C\\x0D\\x20]|^)([\\d\\.\\-]+[\\x00\\x09\\x0A\\x0C\\x0D\\x20]+){1,4}(k|K|SC|sc|SCN|scn|rg|RG|g|G)(?=[\\x00\\x09\\x0A\\x0C\\x0D\\x20{\\[\\/]|$)/S'
Constant defining a regex for color operators.
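To make the pattern more concrete, here is a minimal sketch that applies it directly with `preg_replace()`. The pattern string is copied from the value shown above so that the snippet runs without the library; the sample content stream and the variable names are assumptions made for illustration only.

```php
<?php
// Pattern copied verbatim from the documented REGEX_COLORS value, so this
// sketch does not need SetaPDF to be installed.
$regexColors = '/(?<=[}\\]\\x00\\x09\\x0A\\x0C\\x0D\\x20]|^)([\\d\\.\\-]+[\\x00\\x09\\x0A\\x0C\\x0D\\x20]+){1,4}(k|K|SC|sc|SCN|scn|rg|RG|g|G)(?=[\\x00\\x09\\x0A\\x0C\\x0D\\x20{\\[\\/]|$)/S';

// Hypothetical sample stream: set an RGB fill color, then show some text.
$stream = "1 0 0 rg\nBT /F1 12 Tf (Hello) Tj ET";

// Remove each color operator together with its one to four numeric operands.
$withoutColors = preg_replace($regexColors, '', $stream);

echo $withoutColors, PHP_EOL;
```

Applying the pattern to a raw stream like this also touches text inside literal strings, which is why the class routes patterns through `clean()` instead.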
REGEX_PATHOPERATORS
public const string SetaPDF_Extractor_ContentStreamCleaner::REGEX_PATHOPERATORS = '/(?<=[}\\]\\x00\\x09\\x0A\\x0C\\x0D\\x20]|^)([\\d\\.\\-]+[\\x00\\x09\\x0A\\x0C\\x0D\\x20]+){0,6}(m|l|c|v|y|re|h|S|s|f|F|f\\*|B|B\\*|b|b\\*|n|W|W\\*)(?=[\\x00\\x09\\x0A\\x0C\\x0D\\x20{\\[\\/]|$)/S'
Constant defining a regex for path operators.
TYPE_ALL
Constant defining all content types.
TYPE_INLINE_IMAGE
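Constant defining the content type for inline images.
TYPE_NONE
Constant defining that no content type is selected.
TYPE_OPERATOR
Constant defining the content type for operators (everything that is neither a literal string nor an inline image).
TYPE_STRING
Constant defining the content type for literal strings.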
Static Methods
_strposa()
private static SetaPDF_Extractor_ContentStreamCleaner::_strposa (
string $haystack, array $needles [, int $offset = 0 ]
): bool|int
Searches for the closest needle in the string.
If there is no needle in the string, it will return false.
Parameters
- $haystack : string
- $needles : array
- $offset : int
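Since `_strposa()` is private, its behavior can only be illustrated with a stand-in. The sketch below assumes that "closest" means the smallest position at or after `$offset` and that this position is what the `int` return value carries; `closestNeedle()` is a hypothetical name, not part of the library.

```php
<?php
/**
 * Hypothetical stand-in for the documented behavior: returns the smallest
 * position at which any of the needles occurs at or after $offset, or false
 * if none of them occurs.
 *
 * @return int|false
 */
function closestNeedle(string $haystack, array $needles, int $offset = 0)
{
    $closest = false;

    foreach ($needles as $needle) {
        $pos = strpos($haystack, $needle, $offset);
        // Keep the position only if it is the first hit or nearer than the last one.
        if ($pos !== false && ($closest === false || $pos < $closest)) {
            $closest = $pos;
        }
    }

    return $closest;
}

var_dump(closestNeedle('q BT (Hi) Tj ET Q', ['(', 'BI']));   // a needle is present
var_dump(closestNeedle('q 1 0 0 1 0 0 cm Q', ['(', 'BI']));  // no needle is present
```

As with `strpos()`, the result should be compared with `=== false`, because a match at position 0 is a valid return value.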
clean()
public static SetaPDF_Extractor_ContentStreamCleaner::clean (
string|array $data, array $regexes [, int $target = SetaPDF_Extractor_ContentStreamCleaner::TYPE_OPERATOR ]
): string
Cleans a content stream string by applying the regexes to the chosen targets.
The regexes will NOT affect literal string objects.
Parameters
- $data : string|array
- $regexes : array
- $target : int
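A call might look like the following sketch, which relies only on the signature and constants documented on this page. It assumes that text matched by a regex is removed (the description does not say what matches are replaced with) and that the class has already been made available through SetaPDF's autoloader; `stripColorOperators()` is a hypothetical helper name.

```php
<?php
/**
 * Hypothetical helper: strips color operators from a content stream using
 * clean(), or returns null when the SetaPDF class is not loaded.
 */
function stripColorOperators(string $stream): ?string
{
    $cleaner = 'SetaPDF_Extractor_ContentStreamCleaner';

    if (!class_exists($cleaner)) {
        return null; // library not available, e.g. autoloader not registered
    }

    // $target is left at its documented default, TYPE_OPERATOR.
    return $cleaner::clean($stream, [$cleaner::REGEX_COLORS]);
}

// The literal string below contains text that looks like a color operator;
// per the description above, the regexes should leave it untouched.
$result = stripColorOperators("1 0 0 rg\nBT /F1 12 Tf (1 0 0 rg ) Tj ET");
var_dump($result);
```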
splitStream()
public static SetaPDF_Extractor_ContentStreamCleaner::splitStream (
string $string [, int $ignore = SetaPDF_Extractor_ContentStreamCleaner::TYPE_INLINE_IMAGE ]
): array
Splits a content stream string into literal strings, inline images and operators (everything that is left).
Each piece carries information about its type.
Parameters
- $string : string
- $ignore : int
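The idea behind the split can be sketched with a much simplified stand-in that only separates literal strings from the rest. It is not the library's implementation: inline images, comments and the `$ignore` parameter are left out, and the function name as well as the `type` and `string` keys are assumptions chosen for this example.

```php
<?php
/**
 * Simplified, hypothetical splitter: separates PDF literal strings "(...)"
 * from everything else. Nested parentheses and backslash escapes are honored.
 *
 * @return array<int, array{type: string, string: string}>
 */
function splitLiteralStrings(string $stream): array
{
    $pieces = [];
    $buffer = '';
    $length = strlen($stream);

    for ($i = 0; $i < $length; $i++) {
        if ($stream[$i] !== '(') {
            $buffer .= $stream[$i];
            continue;
        }

        // Flush whatever was collected before the literal string starts.
        if ($buffer !== '') {
            $pieces[] = ['type' => 'operator', 'string' => $buffer];
            $buffer = '';
        }

        // Scan forward to the matching ")" of this literal string.
        $start = $i;
        $depth = 0;
        for (; $i < $length; $i++) {
            if ($stream[$i] === '\\') {
                $i++; // skip the escaped character
            } elseif ($stream[$i] === '(') {
                $depth++;
            } elseif ($stream[$i] === ')' && --$depth === 0) {
                break;
            }
        }
        $pieces[] = ['type' => 'string', 'string' => substr($stream, $start, $i - $start + 1)];
    }

    if ($buffer !== '') {
        $pieces[] = ['type' => 'operator', 'string' => $buffer];
    }

    return $pieces;
}

$pieces = splitLiteralStrings('BT (a \\) (b)) Tj ET');
print_r($pieces);
```

Concatenating the `string` values of all pieces reproduces the input, so a cleaner built on such a split can rewrite the operator pieces and leave the string pieces byte for byte intact.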