SetaPDF_Core_Encoding

A wrapper class for handling PDF-specific encodings

File: /SetaPDF v2/Core/Encoding.php

This class wraps the iconv and mb_* functions to offer transparent support for PDF-specific encodings that are independent of, and unknown to, these libraries.

By default the class uses the mb functions if they are available. Otherwise it falls back to the iconv functions. To force a specific library, call the static setLibrary() method:

SetaPDF_Core_Encoding::setLibrary('mb');
// or
SetaPDF_Core_Encoding::setLibrary('iconv');

Constants

MAC_EXPERT

public const string SetaPDF_Core_Encoding::MAC_EXPERT = 'MacExpertEncoding'

MacExpertEncoding

MAC_ROMAN

public const string SetaPDF_Core_Encoding::MAC_ROMAN = 'MacRomanEncoding'

MacRomanEncoding

MAX_EXPERT

public const string SetaPDF_Core_Encoding::MAX_EXPERT = 'MacExpertEncoding'

PDF_DOC

public const string SetaPDF_Core_Encoding::PDF_DOC = 'PDFDocEncoding'

PDFDocEncoding

STANDARD

public const string SetaPDF_Core_Encoding::STANDARD = 'StandardEncoding'

StandardEncoding

SYMBOL

public const string SetaPDF_Core_Encoding::SYMBOL = 'Symbol'

Symbol

WIN_ANSI

public const string SetaPDF_Core_Encoding::WIN_ANSI = 'WinAnsiEncoding'

WinAnsiEncoding

ZAPF_DINGBATS

public const string SetaPDF_Core_Encoding::ZAPF_DINGBATS = 'ZapfDingbats'

ZapfDingbats


Static Properties

$library

static public string SetaPDF_Core_Encoding::$library

Library to use for conversion between encodings


Static Methods

convert()

public static SetaPDF_Core_Encoding::convert (
string $string, string $inEncoding, string $outEncoding
): string

Converts a string from one encoding to another.

Wraps iconv()/mb_convert_encoding() and additionally processes PDF-specific encodings separately.

Parameters
$string : string

The string to convert, encoded in $inEncoding

$inEncoding : string

The input encoding

$outEncoding : string

The output encoding
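
The dispatch between the two backends can be pictured roughly as follows. This is a minimal sketch, not the library's implementation: `sketchConvert()` and its extra `$library` argument are hypothetical, and the separate handling of PDF-specific encodings is left out.

```php
<?php
// Hypothetical dispatcher, not part of SetaPDF. It only forwards to the
// selected PHP extension; PDF-specific encodings are not handled here.
function sketchConvert(string $string, string $inEncoding, string $outEncoding, string $library): string
{
    if ($library === 'mb') {
        // mbstring takes the target encoding first, the source second.
        return mb_convert_encoding($string, $outEncoding, $inEncoding);
    }

    // iconv takes the source encoding first, the target second.
    $result = iconv($inEncoding, $outEncoding, $string);
    if ($result === false) {
        throw new RuntimeException('Conversion failed.');
    }

    return $result;
}

$library = function_exists('mb_convert_encoding') ? 'mb' : 'iconv';
$utf16 = sketchConvert('Hi', 'UTF-8', 'UTF-16BE', $library);
```

Note the reversed argument order of the two underlying functions, which is one reason a single wrapper is convenient.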

convertPdfString()

public static SetaPDF_Core_Encoding::convertPdfString (
string $string [, string $outEncoding = 'UTF-8' ]
): string

Converts a PDF string (in PDFDocEncoding or UTF-16BE) to another encoding.

This method automatically detects UTF-16BE encoding in the input string and removes the BOM.

Parameters
$string : string

The string to convert in PDFDocEncoding or UTF-16BE

$outEncoding : string

The output encoding
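
The BOM detection described for this method could look like the sketch below. It is an approximation under stated assumptions: `sketchConvertPdfString()` is a hypothetical name, mbstring is required, and ISO-8859-1 stands in for PDFDocEncoding, which it matches for printable ASCII and most of the upper range but not for every byte.

```php
<?php
// Hypothetical stand-in for the documented behaviour; requires mbstring.
function sketchConvertPdfString(string $string, string $outEncoding = 'UTF-8'): string
{
    if (strncmp($string, "\xFE\xFF", 2) === 0) {
        // UTF-16BE text string: drop the 2-byte BOM, then convert the rest.
        return mb_convert_encoding(substr($string, 2), $outEncoding, 'UTF-16BE');
    }

    // Assumption: ISO-8859-1 approximates PDFDocEncoding in this sketch.
    // A real implementation needs the full PDFDocEncoding table.
    return mb_convert_encoding($string, $outEncoding, 'ISO-8859-1');
}

$fromUtf16 = sketchConvertPdfString("\xFE\xFF\x00H\x00i");
$fromPdfDoc = sketchConvertPdfString("caf\xE9");
```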

fromUtf16Be()

public static SetaPDF_Core_Encoding::fromUtf16Be (
array|SetaPDF_Core_Font_Cmap_CmapInterface $table, string $string [, boolean $ignore = false [, boolean $translit = false [, string $substituteChar = '' ]]]
): string

Converts a string from UTF-16BE to another predefined encoding.

Parameters
$table : array|SetaPDF_Core_Font_Cmap_CmapInterface

The translation table

$string : string

The input string

$ignore : boolean

Characters that cannot be represented in the target charset are silently discarded

$translit : boolean

Transliteration activated

$substituteChar : string
 

getLibrary()

public static SetaPDF_Core_Encoding::getLibrary (
void
): string

Get the library to use for multibyte string operations.

If none is defined, the method checks for the mbstring module and selects it automatically, or selects iconv otherwise.
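
The automatic selection can be sketched as below. `detectEncodingLibrary()` is a hypothetical helper, and throwing when neither extension is loaded is an assumption of this sketch, not documented behaviour.

```php
<?php
// Hypothetical helper, not part of SetaPDF: picks a backend in the order
// described above (mbstring first, iconv as the fallback).
function detectEncodingLibrary(): string
{
    if (function_exists('mb_convert_encoding')) {
        return 'mb';
    }

    if (function_exists('iconv')) {
        return 'iconv';
    }

    // Assumption: the sketch fails loudly if no backend exists.
    throw new RuntimeException('Neither mbstring nor iconv is available.');
}

$detected = detectEncodingLibrary();
```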

getPredefinedEncodingTable()

public static SetaPDF_Core_Encoding::getPredefinedEncodingTable (
string $encoding
): array

Gets the translation table of a predefined PDF-specific encoding.

Parameters
$encoding : string
 
Exceptions

Throws InvalidArgumentException

isPredefinedEncoding()

public static SetaPDF_Core_Encoding::isPredefinedEncoding (
string $encoding
): boolean

Checks if an encoding is a PDF specific predefined encoding.

Parameters
$encoding : string
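
One plausible way to perform this check is a lookup against the constant values listed in the Constants section. The sketch assumes that every one of those values counts as predefined, which the documentation does not state explicitly; the names used here are hypothetical.

```php
<?php
// Values copied from the class constants documented above.
// Assumption: all of them are treated as predefined encodings.
const SKETCH_PREDEFINED_ENCODINGS = [
    'StandardEncoding',
    'WinAnsiEncoding',
    'MacRomanEncoding',
    'MacExpertEncoding',
    'PDFDocEncoding',
    'Symbol',
    'ZapfDingbats',
];

// Hypothetical helper, not part of SetaPDF.
function sketchIsPredefinedEncoding(string $encoding): bool
{
    // Strict comparison keeps the check case-sensitive.
    return in_array($encoding, SKETCH_PREDEFINED_ENCODINGS, true);
}
```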
 

isUtf16Be()

public static SetaPDF_Core_Encoding::isUtf16Be (
string $string
): bool

Checks whether a string starts with a UTF-16BE BOM.

Parameters
$string : string
 

setLibrary()

public static SetaPDF_Core_Encoding::setLibrary (
string $library
): void

Set the library to use for multibyte string operations.

Parameters
$library : string

Possible values are 'mb' for mbstring functions or 'iconv' for iconv functions.

strSplit()

public static SetaPDF_Core_Encoding::strSplit (
string $string [, string $encoding = 'UTF-8' ]
): array

Splits a string into an array.

Parameters
$string : string
 
$encoding : string
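
To illustrate what splitting by character rather than by byte means, the sketch below uses PHP's own `mb_str_split()` (PHP 7.4 or later, mbstring required). Whether the class relies on this function internally is not stated here, so this is only an analogy.

```php
<?php
// "A" followed by the euro sign, encoded as UTF-16BE (2 bytes each).
$utf16 = "\x00A\x20\xAC";

// Byte-wise split: four single bytes, characters are torn apart.
$bytes = str_split($utf16);

// Character-wise split: two elements, one per character.
$chars = mb_str_split($utf16, 1, 'UTF-16BE');
```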
 

strlen()

public static SetaPDF_Core_Encoding::strlen (
string $string [, string $encoding = 'UTF-8' ]
): int

Get the length of a string in a specific encoding.

Parameters
$string : string
 
$encoding : string
 

substr()

public static SetaPDF_Core_Encoding::substr (
string $string, int $start [, int $length = null [, string $encoding = 'UTF-8' ]]
): string|bool

Return part of a string.

Parameters
$string : string
 
$start : int
 
$length : int
 
$encoding : string
 
Return Values

Returns false on error
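
The difference between byte-based and encoding-aware length and slicing can be seen with the mbstring functions directly. The sketch assumes mbstring is loaded and does not call the class itself.

```php
<?php
// The euro sign (3 bytes in UTF-8) followed by "uro".
$text = "\xE2\x82\xACuro";

$byteLength = strlen($text);              // counts bytes
$charLength = mb_strlen($text, 'UTF-8');  // counts characters

// Take two characters starting at character offset 1.
$slice = mb_substr($text, 1, 2, 'UTF-8');
```

Offsets and lengths are expressed in characters of the given encoding, so the same call behaves consistently for UTF-8 and UTF-16BE input.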

toPdfString()

public static SetaPDF_Core_Encoding::toPdfString (
string $string [, string $inEncoding = 'UTF-8' ]
): string

Converts a string into PDFDocEncoding or UTF-16BE.

Currently this method always converts to UTF-16BE to support Unicode. It should be optimized to choose the appropriate encoding (PDFDocEncoding or UTF-16BE) depending on the characters used.

Parameters
$string : string
 
$inEncoding : string
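
A minimal sketch of the "always UTF-16BE" behaviour follows. `sketchToPdfString()` is a hypothetical name, mbstring is required, and the leading BOM is an assumption based on the PDF convention that UTF-16BE text strings start with the bytes FE FF.

```php
<?php
// Hypothetical helper, not part of SetaPDF; requires mbstring.
function sketchToPdfString(string $string, string $inEncoding = 'UTF-8'): string
{
    // Assumption: the result carries the UTF-16BE BOM so that PDF
    // consumers can tell it apart from PDFDocEncoding.
    return "\xFE\xFF" . mb_convert_encoding($string, 'UTF-16BE', $inEncoding);
}

$pdfString = sketchToPdfString('Hi');
```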
 

toUtf16Be()

public static SetaPDF_Core_Encoding::toUtf16Be (
array|SetaPDF_Core_Font_Cmap_CmapInterface $table, string $string [, boolean $ignore = false [, boolean $translit = false ]]
): string

Converts a string to UTF-16BE from another predefined 1-byte encoding.

Parameters
$table : array|SetaPDF_Core_Font_Cmap_CmapInterface

The translation table

$string : string

The input string

$ignore : boolean

Characters that cannot be represented in the target charset are silently discarded

$translit : boolean

Transliteration activated
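
A table-driven conversion of this kind can be sketched as below. The table layout (single byte as key, UTF-16BE sequence as value) and the exception for unmapped bytes are assumptions of the sketch; the actual table format and error handling of the class are not documented here, and transliteration is omitted.

```php
<?php
// Hypothetical helper, not part of SetaPDF.
// Assumed table layout: 1-byte character => UTF-16BE sequence.
function sketchToUtf16Be(array $table, string $string, bool $ignore = false): string
{
    $result = '';
    $length = strlen($string);

    for ($i = 0; $i < $length; $i++) {
        $byte = $string[$i];

        if (isset($table[$byte])) {
            $result .= $table[$byte];
        } elseif (!$ignore) {
            // Assumption: unmapped bytes are an error unless $ignore is set.
            throw new InvalidArgumentException('Unmapped byte at offset ' . $i);
        }
        // With $ignore enabled, unmapped bytes are silently dropped.
    }

    return $result;
}

$table = ['A' => "\x00A", "\x80" => "\x20\xAC"];
$converted = sketchToUtf16Be($table, "A\x80");
$lenient = sketchToUtf16Be($table, "A?\x80", true);
```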

unicodePointToUtf16Be()

public static SetaPDF_Core_Encoding::unicodePointToUtf16Be (
integer $unicodePoint
): false|string

Converts a Unicode code point to UTF-16BE.

Parameters
$unicodePoint : integer
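
The standard UTF-16 encoding rule behind such a conversion is sketched below. `sketchCodePointToUtf16Be()` is a hypothetical name, and returning false for out-of-range or surrogate values is an assumption that merely mirrors the documented false|string return type.

```php
<?php
// Hypothetical helper, not part of SetaPDF.
function sketchCodePointToUtf16Be(int $codePoint)
{
    // Assumption: invalid code points yield false.
    if ($codePoint < 0 || $codePoint > 0x10FFFF
        || ($codePoint >= 0xD800 && $codePoint <= 0xDFFF)
    ) {
        return false;
    }

    if ($codePoint < 0x10000) {
        // Basic Multilingual Plane: one big-endian 16-bit unit.
        return pack('n', $codePoint);
    }

    // Supplementary planes: split the 20-bit offset into a surrogate pair.
    $offset = $codePoint - 0x10000;
    $high = 0xD800 | ($offset >> 10);
    $low = 0xDC00 | ($offset & 0x3FF);

    return pack('nn', $high, $low);
}
```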
 

utf16BeToUnicodePoint()

public static SetaPDF_Core_Encoding::utf16BeToUnicodePoint (
string $utf16
): int|bool

Converts a UTF-16BE character to a Unicode code point.

Parameters
$utf16 : string
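
Decoding a single UTF-16BE character follows the standard surrogate arithmetic, sketched below. `sketchUtf16BeToCodePoint()` is a hypothetical name, and returning false for malformed input is an assumption that mirrors the documented int|bool return type.

```php
<?php
// Hypothetical helper, not part of SetaPDF.
function sketchUtf16BeToCodePoint(string $utf16)
{
    $length = strlen($utf16);
    if ($length !== 2 && $length !== 4) {
        return false; // not exactly one or two 16-bit units
    }

    // unpack() returns 1-based keys; re-index to 0-based.
    $units = array_values(unpack('n*', $utf16));

    if ($length === 2) {
        // A lone surrogate is not a complete character.
        $isSurrogate = $units[0] >= 0xD800 && $units[0] <= 0xDFFF;
        return $isSurrogate ? false : $units[0];
    }

    [$high, $low] = $units;
    if ($high < 0xD800 || $high > 0xDBFF || $low < 0xDC00 || $low > 0xDFFF) {
        return false; // not a valid high/low surrogate pair
    }

    return 0x10000 + (($high - 0xD800) << 10) + ($low - 0xDC00);
}
```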