XFA Forms: XML Forms Architecture Support

Introduction

Besides "regular" AcroForms, version 1.5 of the PDF specification introduced interactive forms based on the Adobe XML Forms Architecture (XFA). These forms are commonly created with Adobe LiveCycle Designer and are described in XML.

The PDF format is only used as a kind of container for this type of form. Depending on the form type (static vs. dynamic), the PDF document may hold a PDF representation of the form, or it is up to the reader application to render the whole form from scratch.

XFA forms are supported by the SetaPDF-FormFiller component as of version 2.5.

Requirements

Parsing, handling and creating XFA information requires PHP's DOM extension, which is enabled in a default PHP setup.

Error Handling

To load and handle the XML content and packets, the component makes heavy use of PHP's DOM functionality. Unfortunately, most DOM methods trigger errors instead of throwing exceptions when a problem occurs. For example, passing an invalid XML string to the loadXML() method triggers an E_WARNING, which is undesirable in most cases. It is therefore up to the developer to check the data before passing it to the component. In any case, the component will also detect the problem and throw an exception.

To avoid the warnings, you can also disable libxml error reporting (the DOM extension is based on libxml) with the libxml_use_internal_errors() function:

PHP
libxml_use_internal_errors(true);
try {
    // ....
    $xfa->setData('<invalid>XML</innvalid>');
    // ...
} catch (Exception $e) {
    // handle exception
    // ...
    // handle details about the libxml errors
    foreach (libxml_get_errors() as $error) {
        var_dump($error);
    }
}
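
If you prefer to check the data before it reaches the component, a well-formedness check can be done with the DOM extension alone. The following is a minimal sketch: the helper name collectXmlErrors() is hypothetical and not part of SetaPDF, and it only tests whether the string parses as XML, not whether it matches the form.

```php
<?php
/**
 * Hypothetical helper (not part of SetaPDF): returns an empty array if
 * $xml is well-formed, otherwise a list of libxml error messages.
 */
function collectXmlErrors(string $xml): array
{
    // DOMDocument::loadXML() rejects an empty string, so handle it up front.
    if ($xml === '') {
        return ['empty document'];
    }

    // Collect libxml errors internally instead of emitting warnings,
    // and remember the previous setting so it can be restored.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $dom = new DOMDocument();
    $dom->loadXML($xml);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $messages;
}

// Usage: only hand the string to the component if no errors were found.
$errors = collectXmlErrors('<invalid>XML</innvalid>');
if ($errors !== []) {
    echo "XML is not well-formed:\n", implode("\n", $errors), "\n";
}
```

Restoring the previous libxml setting keeps the check from changing error behaviour elsewhere in the application.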

Check for an XFA Form

A simple check for an XFA form is possible by using the AcroForm helper class of the Catalog instance: 

PHP
$acroForm = $document->getCatalog()->getAcroForm();
if ($acroForm->isXfaForm()) {
    // xfa form
} else {
    // no xfa form
}

Additionally, the SetaPDF-FormFiller component comes with a method that returns a helper class to handle XFA forms. If this method returns false, the document doesn't make use of XFA features:

PHP
$document = \SetaPDF_Core_Document::load(...);
$formFiller = new \SetaPDF_FormFiller($document);

$xfa = $formFiller->getXfa();
if (false === $xfa) {
    // no xfa form
} else {
    // an xfa form
}

The XFA helper class offers several methods to work with the XFA information. It also offers a method to check whether the form is dynamic:

PHP
if ($xfa->isDynamic()) {
    // this is a dynamic form
} else {
    // this is a static form
}

Static XFA Forms

It is possible to create a so-called "static XFA form" with LiveCycle Designer (see the save dialog). Such a document is a hybrid and includes both an XFA and an AcroForm version of the form.

Before version 2.5 of the SetaPDF-FormFiller component, it was only possible to access the form fields of such forms by removing the XFA information via the SetaPDF_FormFiller::setRemoveXfaInformation() method.

As of version 2.5 this limitation is removed and a static form can be accessed like a regular AcroForm. Values passed to the field instances are synced automatically with the data packet in the XFA.

Fill a Static XFA Form

Generally, a static XFA form can be filled in two ways. The first is to fill in the fields through the default field instances, which you can access through the fields helper:

PHP
$formFiller = new \SetaPDF_FormFiller($document);
$fields = $formFiller->getFields();

$fields['form1[0].#subform[0].TextField1[0]']->setValue('a new value in TextField1');
// ...

You will notice the confusingly long field names. These names represent a kind of node path in the XML template structure, which allows a logical mapping between an AcroForm field name and the field node in the XML template.

As you can see the usage is the same as for a normal AcroForm. Internally the passed values are mapped to the correct data node in the XFA data packet.
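
To make the node-path structure of such a field name visible, the name can be split into its segments. This is only an illustrative sketch: parseFieldName() is a hypothetical helper, not part of SetaPDF, and it assumes that every segment has the form name[index] and that names contain no dots.

```php
<?php
/**
 * Hypothetical helper (not part of SetaPDF): splits a field name such as
 * "form1[0].#subform[0].TextField1[0]" into name/index segments.
 * Assumes each segment looks like "name[index]" and names contain no dots.
 */
function parseFieldName(string $fieldName): array
{
    $segments = [];
    foreach (explode('.', $fieldName) as $part) {
        if (!preg_match('/^(.+)\[(\d+)\]$/', $part, $matches)) {
            throw new InvalidArgumentException("Unexpected segment: $part");
        }
        $segments[] = ['name' => $matches[1], 'index' => (int) $matches[2]];
    }

    return $segments;
}

// Usage: print one segment per line, from the root node down to the field.
foreach (parseFieldName('form1[0].#subform[0].TextField1[0]') as $depth => $segment) {
    echo str_repeat('  ', $depth), $segment['name'], ' (occurrence ', $segment['index'], ")\n";
}
```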

The second way to fill in a static form is to set the XML data packet manually: 

PHP
$xfa = $formFiller->getXfa();

// set the new data
$xfa->setData('<form1><TextField1>a new value in TextField1</TextField1></form1>');
// sync AcroForm fields
$xfa->syncAcroFormFields();

The process is that simple: Pass the data packet and sync the AcroForm fields with the data packet. 
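
When the values come from user input, concatenating the XML string by hand risks producing invalid XML as soon as a value contains characters such as & or <. One possible approach is to build the data packet with the DOM extension, which escapes text automatically. The helper buildDataXml() below is hypothetical, and the flat root/element layout is an assumption taken from the example above; the real structure depends on the form.

```php
<?php
/**
 * Hypothetical helper (not part of SetaPDF): builds a flat data packet
 * of the form <root><Name>value</Name>...</root> with proper escaping.
 */
function buildDataXml(string $rootName, array $values): string
{
    $dom = new DOMDocument('1.0', 'UTF-8');
    $root = $dom->appendChild($dom->createElement($rootName));

    foreach ($values as $name => $value) {
        $element = $root->appendChild($dom->createElement($name));
        // A text node escapes special characters such as & and <.
        $element->appendChild($dom->createTextNode((string) $value));
    }

    // Serialize only the root element, without the XML declaration.
    return $dom->saveXML($root);
}

$xml = buildDataXml('form1', ['TextField1' => 'Smith & Sons']);
echo $xml, "\n"; // <form1><TextField1>Smith &amp; Sons</TextField1></form1>
```

The resulting string could then be passed to setData() in place of the hand-written literal.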

Limitations

The values passed to the field instances are used in the AcroForm representation as they were passed.

Flattening of static XFA forms is only possible by flattening the whole form through the fields instance.

Formatted fields like date/time and numeric fields will be formatted by the reader application but not by the component in their AcroForm field appearance.

Picture clauses are supported neither for input nor for output formatting.

Remove XFA Information

The creation of XFA forms is limited to very few applications. Some PDF editors allow you to open and fill in the document, but their form editing capabilities may be disabled.

A static XFA form comes with both an XFA and a normal AcroForm representation. By removing the XFA information, it is possible to edit the form in common PDF editors as any other PDF form. You can remove the XFA information this way:

PHP
// load and register the autoload function
require_once('../../../library/SetaPDF/Autoload.php');

// create a HTTP writer
$writer = new \SetaPDF_Core_Writer_Http('normal-acro-form.pdf', true);
// get the main document instance
$document = \SetaPDF_Core_Document::loadByFilename('_files/xfa/CheckRequest.pdf', $writer);

// now get an instance of the form filler
$formFiller = new \SetaPDF_FormFiller($document);

// get the XFA helper
$xfa = $formFiller->getXfa();
if ($xfa) {
    // if this is not a dynamic XFA form
    if (!$xfa->isDynamic()) {
        // remove the XFA package
        $document->getCatalog()->getAcroForm()->removeXfaInformation();
    } else {
        throw new Exception(
            'Removing the XFA package from a dynamic XFA form will result in a single PDF page showing only a ' .
            'compatibility error or loading message.'
        );
    }
}

// save the new document
$document->save()->finish();

Dynamic XFA Forms

Dynamic XFA forms are completely based on XML and use the PDF document as a container. The only PDF part may be a single page which is shown in reader applications that do not support XFA forms.

A dynamic XFA form can change its appearance in real time. This includes growing tables, flexible text areas that grow or shrink depending on their content, and flexible page breaks depending on the page content.

Fill a Dynamic XFA Form

A dynamic XFA form is not represented through any regular AcroForm fields. The fields helper class will offer no access to any field: 

PHP
$fields = $formFiller->getFields();
var_dump($fields->count() === 0); // true

A dynamic form accepts data only through the SetaPDF_FormFiller_Xfa::setData() method:

PHP
$xfa = $formFiller->getXfa();

// set the new data
$xfa->setData('<form1><TextField1>a new value in TextField1</TextField1></form1>');

Because a dynamic XFA form is rendered at viewing time by the reader application, no further action is required.
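
For a form with a growing table, the data would typically contain one element per table row. The following sketch builds such a packet with the DOM extension. It rests on assumptions that must be checked against the actual form: buildRowsXml() is a hypothetical helper, and the element names form1, Row, Item and Qty are invented for illustration.

```php
<?php
/**
 * Hypothetical helper (not part of SetaPDF): builds a data packet with
 * one repeated element per row, e.g. <form1><Row>...</Row><Row>...</Row></form1>.
 */
function buildRowsXml(string $rootName, string $rowName, array $rows): string
{
    $dom = new DOMDocument('1.0', 'UTF-8');
    $root = $dom->appendChild($dom->createElement($rootName));

    foreach ($rows as $row) {
        $rowNode = $root->appendChild($dom->createElement($rowName));
        foreach ($row as $name => $value) {
            $cell = $rowNode->appendChild($dom->createElement($name));
            // Text nodes take care of escaping special characters.
            $cell->appendChild($dom->createTextNode((string) $value));
        }
    }

    return $dom->saveXML($root);
}

// Element names are assumptions; they must match the form's data structure.
$xml = buildRowsXml('form1', 'Row', [
    ['Item' => 'Pen', 'Qty' => 2],
    ['Item' => 'Ink', 'Qty' => 1],
]);
echo $xml, "\n";
```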

Get the XFA Data Package

To access the current XFA data package, you can use the SetaPDF_FormFiller_Xfa::getData() method:

PHP
$data = $xfa->getData();
if ($data) {
    $data->ownerDocument->formatOutput = true;
    $xml = $data->ownerDocument->saveXML($data);
} else {
    $xml = '<!-- no data found -->';
}
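
Besides serializing the node, it can be useful to list the values it contains. The sketch below walks a data node and collects the text of all leaf elements. It assumes the node is a DOMElement, as the use of ownerDocument above suggests; collectLeafValues() is a hypothetical helper, and a document parsed from a string stands in for the node returned by getData(). The generated path keys are meant for inspection only and are not guaranteed to match AcroForm field names.

```php
<?php
/**
 * Hypothetical helper (not part of SetaPDF): returns path => text for
 * every leaf element below $node. Repeated siblings get their own index.
 */
function collectLeafValues(DOMElement $node, string $path): array
{
    $values = [];
    $counts = []; // number of child elements seen so far, per element name

    foreach ($node->childNodes as $child) {
        if (!$child instanceof DOMElement) {
            continue; // skip text, whitespace and comment nodes
        }
        $index = $counts[$child->nodeName] ?? 0;
        $counts[$child->nodeName] = $index + 1;
        $childPath = sprintf('%s.%s[%d]', $path, $child->nodeName, $index);
        $values += collectLeafValues($child, $childPath);
    }

    // An element without child elements is treated as a value holder.
    if ($counts === []) {
        $values[$path] = $node->textContent;
    }

    return $values;
}

// Stand-in for the node returned by getData().
$dom = new DOMDocument();
$dom->loadXML('<form1><Row><Name>A</Name></Row><Row><Name>B</Name></Row></form1>');
$root = $dom->documentElement;

$values = collectLeafValues($root, $root->nodeName . '[0]');
print_r($values);
```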

Limitations

Dynamic XFA forms are NOT accessible through the fields object of a form filler instance. In a dynamic XFA form no AcroForm fields are available.

The form data can only be passed by the SetaPDF_FormFiller_Xfa::setData() method.

Dynamic XFA form fields can NOT be flattened.