Memory Usage

Things you need to know about memory usage and management

Introduction

The SetaPDF components frequently pass objects to other objects, for observing or other purposes, which results in circular references between the objects.

Before version 5.3, PHP's garbage collection was unable to collect such circular references, which resulted in memory leaks.
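
As a minimal sketch of what such a circular reference looks like in plain PHP, the following code creates pairs of objects that reference each other. The Node class and the createCycle() function are hypothetical and not part of SetaPDF. The sketch assumes PHP 5.3 or later, because it uses gc_collect_cycles() to show that the orphaned objects are only found by the cycle collector:

```php
<?php
// Hypothetical class, used only to illustrate a reference cycle.
class Node
{
    public $other = null;
}

function createCycle()
{
    $a = new Node();
    $b = new Node();
    $a->other = $b; // $a holds a reference to $b ...
    $b->other = $a; // ... and $b holds a reference back to $a
    // When the function returns, the local variables are destroyed, but
    // each object is still referenced by the other one. Reference counting
    // alone can therefore never free them.
}

for ($i = 0; $i < 10; $i++) {
    createCycle();
}

// Only the cycle collector (PHP >= 5.3) can find and free the orphaned objects.
$collected = gc_collect_cycles();
echo 'Collected by the cycle collector: ' . $collected . "\n";
```

In PHP 5.2 there is no cycle collector, so objects like these stay in memory until the script ends unless the references are broken manually.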

Releasing Memory in PHP 5.2

To release these references, nearly all objects offer a cleanUp() method. If you need to remove circular references, start with the SetaPDF_Core_Document instance: it forwards the call to all related objects recursively.

Calling cleanUp() is only necessary in PHP 5.2, and only if the script keeps running after the finish() call and needs additional memory.

If you need to release memory of the PHP process in that situation, call this method after the finish() call:

PHP
// no leading backslash here: the namespace separator is a parse error in PHP 5.2
$document = SetaPDF_Core_Document::load(...);
$pages = $document->getCatalog()->getPages();
...
$document->save()->finish();

// unset all other variables in the current scope that reference related objects
unset($pages);
// then call the cleanUp() method:
$document->cleanUp();
unset($document);

Common Memory Management

PHP 5.3 introduced a new garbage collector that is able to collect circular references.

For the garbage collector to clean up circular references, no variable in the current variable scope may still point to the related objects.

To gain the full benefit of the garbage collector, the preferred way is to encapsulate memory-intensive tasks in a method or function scope. This way the garbage collector can clean up the references once the scope is left, and the variables do not have to be unset manually.

The following PHP code shows this behavior:

PHP
<?php
require_once('library/SetaPDF/Autoload.php');

function createPdfInSeparateScope()
{
    $document = new \SetaPDF_Core_Document();
    $catalog = $document->getCatalog();
    $pages = $catalog->getPages();
    $tmpPages = array();
    for ($i = 200; $i > 0; $i--) {
        $tmpPages[] = $pages->create(\SetaPDF_Core_PageFormats::A4);
    }
}

echo 'First of all we create the variables in the function scope: <br />';
createPdfInSeparateScope();
echo 'Memory usage afterwards is: ' . memory_get_usage() . '<br />' .
     'The memory consumption is increased and cycles are not collected, ' .
     'because the possible root buffer is not full yet.<br />';

$cycles = gc_collect_cycles();
echo $cycles . ' cycles were collected.<br />' .
     'Memory usage afterwards: ' . memory_get_usage() . '<br />';

echo '<br />';
echo 'Now we create the same variables in the global scope. And we try to ' .
     'collect the cycles without unsetting the variables:<br />';

$document = new \SetaPDF_Core_Document();
// let's create some memory usage
$catalog = $document->getCatalog();
$pages = $catalog->getPages();
$tmpPages = array();
for ($i = 200; $i > 0; $i--) {
    $tmpPages[] = $pages->create(\SetaPDF_Core_PageFormats::A4);
}

$cycles = gc_collect_cycles();
echo $cycles . ' cycles were collected.<br />';
echo 'Memory usage afterwards: ' . memory_get_usage() . '<br />';

echo '<br />';
echo 'Now we unset the variables and try to collect the cycles again:<br />';
unset($document, $catalog, $pages, $tmpPages);

$cycles = gc_collect_cycles();
echo $cycles . ' cycles were collected.<br />';
echo 'Memory usage afterwards: ' .  memory_get_usage() . '<br />';

By default, PHP's garbage collector is turned on. There is, however, a php.ini setting that allows you to change this: zend.enable_gc.
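
The collector can also be inspected and toggled at runtime with the built-in functions gc_enabled(), gc_disable() and gc_enable(), which read and change the same zend.enable_gc setting. The following is a minimal sketch, not a SetaPDF recommendation. The idea of pausing the collector around a block of work is only an illustrative assumption:

```php
<?php
// Remember the initial state so that it can be restored afterwards.
$wasEnabled = gc_enabled();

// Turn the cycle collector off (sets zend.enable_gc to 0 for this request).
gc_disable();
$stateWhileDisabled = gc_enabled(); // false

// ... work that should not be interrupted by automatic collection runs ...

// Turn the cycle collector on again (sets zend.enable_gc to 1).
gc_enable();
$stateAfterEnable = gc_enabled(); // true

// Explicitly collect any cycles that piled up in the meantime.
gc_collect_cycles();

// Restore the state the script started with.
if (!$wasEnabled) {
    gc_disable();
}
```

A call to gc_collect_cycles() triggers a collection run on demand, independent of the automatic runs that the setting controls.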

More information about PHP's garbage collector and how to control its behavior can be found in the PHP manual.