SetaPDF_Core_Encoding

A wrapper class for handling PDF-specific encodings

File: /SetaPDF/Core/Encoding.php

This class wraps the iconv and mb_* functions to offer transparent support for PDF-specific encodings as well as encodings that are independent of PDF and unknown to the class itself.

By default the class uses the mb functions if they are available and otherwise falls back to the iconv functions. To force a specific library, call the static setLibrary() method:

SetaPDF_Core_Encoding::setLibrary('mb');
// or
SetaPDF_Core_Encoding::setLibrary('iconv');

Constants

MAC_ROMAN

const string SetaPDF_Core_Encoding::MAC_ROMAN = 'MacRomanEncoding'

MacRomanEncoding

MAX_EXPERT

const string SetaPDF_Core_Encoding::MAX_EXPERT = 'MacExpertEncoding'

MacExpertEncoding

PDF_DOC

const string SetaPDF_Core_Encoding::PDF_DOC = 'PDFDocEncoding'

PDFDocEncoding

STANDARD

const string SetaPDF_Core_Encoding::STANDARD = 'StandardEncoding'

StandardEncoding

SYMBOL

const string SetaPDF_Core_Encoding::SYMBOL = 'Symbol'

Symbol

WIN_ANSI

const string SetaPDF_Core_Encoding::WIN_ANSI = 'WinAnsiEncoding'

WinAnsiEncoding

ZAPF_DINGBATS

const string SetaPDF_Core_Encoding::ZAPF_DINGBATS = 'ZapfDingbats'

ZapfDingbats


Static Properties

$library

static public string SetaPDF_Core_Encoding::$library

Library to use for conversion between encodings


Static Methods

convert()

static public string SetaPDF_Core_Encoding::convert ( string $string, string $inEncoding, string $outEncoding )

Converts a string from one encoding to another.

Wraps iconv()/mb_convert_encoding() and adds separate handling for PDF-related encodings.

Parameters
$string : string

The string to convert, encoded in $inEncoding

$inEncoding : string

The input encoding

$outEncoding : string

The output encoding

convertPdfString()

static public string SetaPDF_Core_Encoding::convertPdfString ( string $string [, string $outEncoding = 'UTF-8' ] )

Converts a PDF string (in PDFDocEncoding or UTF-16BE) to another encoding.

This method automatically detects UTF-16BE encoding in the input string and removes the BOM.

Parameters
$string : string

The string to convert, encoded in PDFDocEncoding or UTF-16BE

$outEncoding : string

The output encoding
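
The detection step described above can be illustrated with a stand-alone sketch. It is not the library's implementation: the function name pdfStringToUtf8Sketch() is hypothetical, the output encoding is fixed to UTF-8, and PDFDocEncoding is only approximated with ISO-8859-1, which agrees for printable ASCII and most of 0xA1 to 0xFF but differs in the 0x18 to 0x1F and 0x80 to 0xA0 ranges. The sketch needs either the mbstring or the iconv extension.

```php
<?php
// Hypothetical sketch, not SetaPDF code: decode a PDF text string to UTF-8.
function pdfStringToUtf8Sketch(string $string): string
{
    if (strncmp($string, "\xFE\xFF", 2) === 0) {
        // A leading 0xFE 0xFF byte order mark signals UTF-16BE; strip it.
        $from = 'UTF-16BE';
        $string = (string) substr($string, 2);
    } else {
        // ASSUMPTION: ISO-8859-1 stands in for PDFDocEncoding here.
        $from = 'ISO-8859-1';
    }

    // Prefer mbstring, fall back to iconv, as the class description suggests.
    return function_exists('mb_convert_encoding')
        ? mb_convert_encoding($string, 'UTF-8', $from)
        : iconv($from, 'UTF-8', $string);
}
```

With the real class, the equivalent call would be SetaPDF_Core_Encoding::convertPdfString($string), which also handles the PDFDocEncoding code points that this approximation gets wrong.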

fromUtf16Be()

static public string SetaPDF_Core_Encoding::fromUtf16Be ( array|SetaPDF_Core_Font_Cmap_CmapInterface $table, string $string [, boolean $ignore = false [, boolean $translit = false [, string $substituteChar = '' ]]] )

Converts a string from UTF-16BE to another predefined encoding.

Parameters
$table : array|SetaPDF_Core_Font_Cmap_CmapInterface

The translation table

$string : string

The input string

$ignore : boolean

Characters that cannot be represented in the target charset are silently discarded

$translit : boolean

Whether transliteration is activated

$substituteChar : string
 

getLibrary()

static public string SetaPDF_Core_Encoding::getLibrary ( void )

Get the library to use for multibyte string operations.

If none is defined, the method checks for the mbstring extension and selects it if it is available, or iconv otherwise.
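
The automatic selection can be pictured with the following minimal sketch. The helper detectEncodingLibrary() is hypothetical and only mirrors the documented behaviour; how the class performs the check internally is not stated here.

```php
<?php
// Hypothetical helper, not part of SetaPDF: returns 'mb' when the mbstring
// extension is loaded and 'iconv' otherwise, mirroring the documented default.
function detectEncodingLibrary(): string
{
    return extension_loaded('mbstring') ? 'mb' : 'iconv';
}
```

The returned value matches the strings accepted by setLibrary(), so a result of 'mb' or 'iconv' could be passed on unchanged.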

getPredefinedEncodingTable()

static public array SetaPDF_Core_Encoding::getPredefinedEncodingTable ( string $encoding )

Get the translation table of a predefined PDF-specific encoding.

Parameters
$encoding : string
 
Exceptions

Throws InvalidArgumentException

isPredefinedEncoding()

static public boolean SetaPDF_Core_Encoding::isPredefinedEncoding ( string $encoding )

Checks whether an encoding is a predefined PDF-specific encoding.

Parameters
$encoding : string
 

isUtf16Be()

static public bool SetaPDF_Core_Encoding::isUtf16Be ( string $string )

Checks whether a string starts with a UTF-16BE BOM.

Parameters
$string : string
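
As an illustration of what such a check involves, the UTF-16BE byte order mark is the two-byte sequence 0xFE 0xFF at the start of the string. The function below is a hypothetical stand-alone sketch, not the library's code.

```php
<?php
// Hypothetical stand-alone check, not SetaPDF code: true when the string
// begins with the UTF-16BE byte order mark 0xFE 0xFF.
function startsWithUtf16BeBom(string $string): bool
{
    // strncmp() compares at most two bytes; shorter strings never match.
    return strncmp($string, "\xFE\xFF", 2) === 0;
}
```

A little-endian mark (0xFF 0xFE) is deliberately not accepted by this sketch.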
 

setLibrary()

static public void SetaPDF_Core_Encoding::setLibrary ( string $library )

Set the library to use for multibyte string operations.

Parameters
$library : string

Possible values are 'mb' for mbstring functions or 'iconv' for iconv functions.

strSplit()

static public array SetaPDF_Core_Encoding::strSplit ( string $string [, string $encoding = 'UTF-8' ] )

Splits a string into an array.

Parameters
$string : string
 
$encoding : string
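
The sketch below shows one way an encoding-aware split can be written with the mb or iconv functions. It assumes that the result holds one character per array element, which the description above does not state explicitly, and the name splitCharactersSketch() is hypothetical.

```php
<?php
// Hypothetical sketch, not SetaPDF code.
// ASSUMPTION: the array holds one character (not one byte) per element.
function splitCharactersSketch(string $string, string $encoding = 'UTF-8'): array
{
    if (function_exists('mb_str_split')) {
        // Available since PHP 7.4 when the mbstring extension is loaded.
        return mb_str_split($string, 1, $encoding);
    }

    // iconv fallback: extract the characters one by one.
    $chars = [];
    $length = iconv_strlen($string, $encoding);
    for ($i = 0; $i < $length; $i++) {
        $chars[] = iconv_substr($string, $i, 1, $encoding);
    }

    return $chars;
}
```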
 

strlen()

static public int SetaPDF_Core_Encoding::strlen ( string $string [, string $encoding = 'UTF-8' ] )

Get the length of a string in a specific encoding.

Parameters
$string : string
 
$encoding : string
 

substr()

static public string|bool SetaPDF_Core_Encoding::substr ( string $string, int $start [, int $length = null [, string $encoding = 'UTF-8' ]] )

Return part of a string.

Parameters
$string : string
 
$start : int
 
$length : int
 
$encoding : string
 
Return Values

Returns false on error
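
The difference between byte-based and encoding-aware string functions, which strlen() and substr() of this class address, can be seen in the following sketch. Both helper names are hypothetical and the code is not taken from the library; it only relies on the standard mb and iconv functions.

```php
<?php
// Hypothetical sketches, not SetaPDF code.

// Character count in the given encoding, mb first and iconv as fallback.
function charLengthSketch(string $string, string $encoding = 'UTF-8'): int
{
    return function_exists('mb_strlen')
        ? mb_strlen($string, $encoding)
        : iconv_strlen($string, $encoding);
}

// Character-based substring; a null $length means "up to the end".
function charSubstrSketch(string $string, int $start, ?int $length = null, string $encoding = 'UTF-8')
{
    if (function_exists('mb_substr')) {
        return mb_substr($string, $start, $length, $encoding);
    }

    // Pass an explicit upper bound so the fallback does not depend on how
    // iconv_substr() treats a null length in older PHP versions.
    $length = $length ?? iconv_strlen($string, $encoding);

    return iconv_substr($string, $start, $length, $encoding);
}
```

For the UTF-8 string "a\xC3\xA9b" (the letters a, é, b), PHP's native strlen() reports four bytes while the sketch counts three characters.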

toPdfString()

static public string SetaPDF_Core_Encoding::toPdfString ( string $string [, string $inEncoding = 'UTF-8' ] )

Converts a string into PDFDocEncoding or UTF-16BE.

Currently the method always converts to UTF-16BE to support Unicode. It should be optimized to choose the appropriate encoding (PDFDocEncoding or UTF-16BE) depending on the characters used.

Parameters
$string : string
 
$inEncoding : string
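
A rough picture of the UTF-16BE path is given below. The sketch assumes that the result is prefixed with the 0xFE 0xFF byte order mark, as PDF text strings in UTF-16BE require; whether toPdfString() produces exactly these bytes is not confirmed by this page. The function name toPdfTextStringSketch() is hypothetical.

```php
<?php
// Hypothetical sketch, not SetaPDF code.
// ASSUMPTION: the output is BOM + UTF-16BE, with no PDFDocEncoding shortcut.
function toPdfTextStringSketch(string $string, string $inEncoding = 'UTF-8'): string
{
    // Neither call emits a BOM for the explicit "UTF-16BE" target,
    // so it is prepended manually.
    $utf16 = function_exists('mb_convert_encoding')
        ? mb_convert_encoding($string, 'UTF-16BE', $inEncoding)
        : iconv($inEncoding, 'UTF-16BE', $string);

    return "\xFE\xFF" . $utf16;
}
```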
 

toUtf16Be()

static public string SetaPDF_Core_Encoding::toUtf16Be ( array|SetaPDF_Core_Font_Cmap_CmapInterface $table, string $string [, boolean $ignore = false [, boolean $translit = false ]] )

Converts a string to UTF-16BE from another predefined 1-byte encoding.

Parameters
$table : array|SetaPDF_Core_Font_Cmap_CmapInterface

The translation table

$string : string

The input string

$ignore : boolean

Characters that cannot be represented in the target charset are silently discarded

$translit : boolean

Whether transliteration is activated

unicodePointToUtf16Be()

static public string SetaPDF_Core_Encoding::unicodePointToUtf16Be ( integer $unicodePoint )

Converts a Unicode code point to UTF-16BE.

Parameters
$unicodePoint : integer
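
The conversion follows the standard UTF-16 rules: code points below U+10000 become a single 16-bit unit, all others a surrogate pair. The following sketch shows that arithmetic in plain PHP. It is a hypothetical re-implementation for illustration, not the library's code, and it does not reject lone surrogate values (U+D800 to U+DFFF).

```php
<?php
// Hypothetical sketch, not SetaPDF code: encode one code point as UTF-16BE.
function codePointToUtf16BeSketch(int $codePoint): string
{
    if ($codePoint < 0 || $codePoint > 0x10FFFF) {
        throw new InvalidArgumentException('Code point out of range.');
    }

    if ($codePoint < 0x10000) {
        // Basic Multilingual Plane: one big-endian 16-bit unit.
        return pack('n', $codePoint);
    }

    // Supplementary planes: split the 20-bit offset into two surrogates.
    $offset = $codePoint - 0x10000;
    $high = 0xD800 | ($offset >> 10);
    $low = 0xDC00 | ($offset & 0x3FF);

    return pack('nn', $high, $low);
}
```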
 

utf16BeToUnicodePoint()

static public int SetaPDF_Core_Encoding::utf16BeToUnicodePoint ( string $utf16 )

Converts a UTF-16BE character to a Unicode code point.

Parameters
$utf16 : string
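
The reverse direction can be sketched in the same way: a two-byte input is a single 16-bit unit, a four-byte input must be a high surrogate followed by a low surrogate. The function below is hypothetical and not the library's code; its error handling (throwing InvalidArgumentException) is a choice made for this sketch only.

```php
<?php
// Hypothetical sketch, not SetaPDF code: decode one UTF-16BE character.
function utf16BeToCodePointSketch(string $utf16): int
{
    $length = strlen($utf16);
    if ($length !== 2 && $length !== 4) {
        throw new InvalidArgumentException('Expected 2 or 4 bytes.');
    }

    // unpack() returns a 1-based array; re-index it from zero.
    $units = array_values(unpack('n*', $utf16));

    if ($length === 2) {
        return $units[0];
    }

    [$high, $low] = $units;
    if (($high & 0xFC00) !== 0xD800 || ($low & 0xFC00) !== 0xDC00) {
        throw new InvalidArgumentException('Invalid surrogate pair.');
    }

    // Recombine the two 10-bit halves and add the 0x10000 offset.
    return 0x10000 + (($high & 0x3FF) << 10) + ($low & 0x3FF);
}
```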