Original URL: http://www.theregister.co.uk/2008/08/15/processing_xml_php/

Harness XML with PHP 5 extensions

A time to query

By Deepak Vohra

Posted in Developer, 15th August 2008 15:02 GMT

PHP is one of the most commonly used languages for developing web sites, while XML has become an industry standard for exchanging data. Increasingly, web sites use XML to transfer data through web feeds such as RSS and Atom, or through web services.

PHP 5 XML extensions provide support for parsing, transformation, XPath navigation, and schema validation of XML documents. The SimpleXML extension in PHP 5 simplifies node selection by converting an XML document to a PHP object that may be accessed with property selectors and array iterators. The XSL extension in PHP 5 is used to transform an XML document.

In this article, I'll show you how to process an example XML document, catalog.xml, using the PHP 5 XML extensions.

First things first, though. Before you go anywhere, you'll need to install PHP 5 with Apache HTTP Server and activate the XSL extension in the php.ini configuration file. On Windows, the line to add is:

extension=php_xsl.dll

Restart Apache Server after modifying php.ini.
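
Before going further, it can help to confirm that the extensions are actually loaded. The following is a minimal sketch using PHP's extension_loaded(); the helper name xmlExtensionStatus() is my own and not part of PHP.

```php
<?php
// Report which of the XML-related extensions used in this article are loaded.
// xmlExtensionStatus() is a hypothetical helper name, not a built-in function.
function xmlExtensionStatus()
{
    $status = array();
    foreach (array('dom', 'libxml', 'SimpleXML', 'xsl') as $name) {
        $status[$name] = extension_loaded($name);
    }
    return $status;
}

foreach (xmlExtensionStatus() as $name => $loaded) {
    echo $name, ': ', ($loaded ? 'loaded' : 'missing'), "\n";
}
```

If xsl is reported as missing, check the php.ini entry and restart the server again.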

Create your XML

To create an XML document with the PHP 5 DOM extension, create a PHP file, createXML.php, in the C:/Program Files/Apache Group/Apache2/htdocs directory, the document root directory of the Apache server. An XML document in PHP 5 is represented with the DOMDocument class, so start by creating a DOMDocument object. Specify the XML version and encoding in the DOMDocument constructor.

$domDocument = new DOMDocument('1.0','utf-8');

An element is represented with the DOMElement class. Create the root element, catalog, with createElement(). Add the root element to the DOMDocument object with appendChild().

$catalog = $domDocument->createElement("catalog");
$domDocument->appendChild($catalog);

An attribute in a DOMElement object is represented with the DOMAttr class. Create attribute title with createAttribute(). Set the value of the title attribute using the value property. Add the title attribute to catalog element using setAttributeNode().

$titleAttribute = $domDocument->createAttribute("title");
$titleAttribute->value = "XML Zone";
$catalog->setAttributeNode($titleAttribute);

Create a journal element, including the date attribute, within the catalog element. Add an article element, including the sub-elements title and author, within the journal element. A text node in an element is represented with the DOMText class. Create a text node with createTextNode() to set the text of the title element.

// $title is the title DOMElement created with createElement("title")
$titleText = $domDocument->createTextNode("The Java XPath API");
$title->appendChild($titleText);
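
The steps so far can be put together as a rough sketch. The article does not give the date or author values, so the ones below are placeholders, and buildCatalog() is a helper name of my own. For brevity the sketch uses setAttribute() rather than createAttribute() with setAttributeNode().

```php
<?php
// Sketch: build one journal/article entry of the catalog with the DOM extension.
// The date and author values are placeholders, not taken from the article.
function buildCatalog()
{
    $domDocument = new DOMDocument('1.0', 'utf-8');

    $catalog = $domDocument->createElement("catalog");
    $catalog->setAttribute("title", "XML Zone");
    $domDocument->appendChild($catalog);

    $journal = $domDocument->createElement("journal");
    $journal->setAttribute("date", "placeholder-date");
    $catalog->appendChild($journal);

    $article = $domDocument->createElement("article");
    $journal->appendChild($article);

    $title = $domDocument->createElement("title");
    $title->appendChild($domDocument->createTextNode("The Java XPath API"));
    $article->appendChild($title);

    $author = $domDocument->createElement("author");
    $author->appendChild($domDocument->createTextNode("placeholder author"));
    $article->appendChild($author);

    return $domDocument;
}

$domDocument = buildCatalog();
$domDocument->formatOutput = true; // indent the serialized XML for readability
echo $domDocument->saveXML();
```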

Output the XML document to the browser using saveXML(), which returns the document as a string.

echo $domDocument->saveXML();

Run the PHP script with the URL http://localhost/createXML.php. The script generates the XML document that the rest of this article refers to as catalog.xml.
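
If you would rather have the script write the document to disk, DOMDocument also provides save(), which returns the number of bytes written or false on failure. The sketch below writes to the system temporary directory purely so that it runs anywhere; the target path and the saveCatalog() helper name are assumptions of mine, not part of the original script.

```php
<?php
// Sketch: write a DOMDocument to a file with save().
// saveCatalog() and the temp-directory path are illustrative assumptions.
function saveCatalog($path)
{
    $domDocument = new DOMDocument('1.0', 'utf-8');
    $catalog = $domDocument->createElement("catalog");
    $catalog->setAttribute("title", "XML Zone");
    $domDocument->appendChild($catalog);

    // save() returns the byte count on success, false on failure
    return $domDocument->save($path);
}

$path = sys_get_temp_dir() . DIRECTORY_SEPARATOR . "catalog.xml";
$bytes = saveCatalog($path);
echo $bytes === false ? "Could not write $path" : "Wrote $bytes bytes to $path";
```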

Parse the XML

To parse the XML document catalog.xml, create a PHP file, parseXML.php. Create a DOMDocument object. Load the example XML document using load(string filename). Another function, loadXML(string source), may be used to load an XML document from a string.

$domDocument->load("file://C:/PHP/catalog.xml");

As an example, obtain all the title nodes in the XML document using getElementsByTagName(string tagName). Iterate over the node list and output the title elements' values.

$titleNodeList = $domDocument->getElementsByTagName("title");
for ($i = 0; $i < $titleNodeList->length; $i++)
{
    echo $titleNodeList->item($i)->nodeValue;
    print "<br/>\n";
}

The output from the PHP script is shown below.

The Java XPath API
JAXP validation
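
For comparison, here is a sketch of the same lookup using loadXML() on an inline string, which avoids any dependency on a file path. The inline document is my reconstruction of catalog.xml from the outputs shown in this article; the date and author values are placeholders, and titlesOf() is a hypothetical helper name.

```php
<?php
// Inline stand-in for catalog.xml; dates and authors are placeholders.
$xml = <<<XML
<catalog title="XML Zone" publisher="IBM developerWorks">
  <journal date="placeholder-date-1">
    <article><title>The Java XPath API</title><author>placeholder author 1</author></article>
  </journal>
  <journal date="placeholder-date-2">
    <article><title>JAXP validation</title><author>placeholder author 2</author></article>
  </journal>
</catalog>
XML;

// Return the text of every title element in the given XML string.
function titlesOf($xml)
{
    $domDocument = new DOMDocument();
    if (!$domDocument->loadXML($xml)) {
        return array(); // not well-formed
    }
    $titles = array();
    // DOMNodeList can be traversed with foreach as well as by index
    foreach ($domDocument->getElementsByTagName("title") as $node) {
        $titles[] = $node->nodeValue;
    }
    return $titles;
}

foreach (titlesOf($xml) as $title) {
    echo $title, "<br/>\n";
}
```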

Navigate XML with XPath

To navigate the XML document using XPath, create a PHP script, xpath.php. Create a DOMDocument object and load the example XML document with load(string filename).

The DOMXPath class is used to evaluate an XPath expression in the context of an XML document node. Create a DOMXPath object.

$domXPath=new DOMXPath($domDocument);

XPath expressions are evaluated with query(). Its optional second parameter, of type DOMNode, sets the context node for relative expressions; by default, expressions are evaluated relative to the root of the document. As an example, retrieve the values of all the title elements.

$titles=$domXPath->query("/catalog/journal/article/title");

query() returns a DOMNodeList. Iterate over the DOMNodeList to output the values of the title elements.

foreach ($titles as $title) {

    echo 'Title: ', $title->firstChild->nodeValue;
    print "<br/>\n";
}

The output from the xpath.php script is shown below.

Title: The Java XPath API
Title: JAXP validation
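
The optional context-node parameter of query() is easiest to see with a relative expression. The sketch below selects each journal first, then evaluates article/title relative to it. The inline document is a stand-in for catalog.xml with placeholder dates and authors, and titlesByJournal() is a helper name of my own.

```php
<?php
// Inline stand-in for catalog.xml; dates and authors are placeholders.
$xml = <<<XML
<catalog title="XML Zone" publisher="IBM developerWorks">
  <journal date="placeholder-date-1">
    <article><title>The Java XPath API</title><author>placeholder author 1</author></article>
  </journal>
  <journal date="placeholder-date-2">
    <article><title>JAXP validation</title><author>placeholder author 2</author></article>
  </journal>
</catalog>
XML;

// Group title values by the date attribute of their journal element.
function titlesByJournal($xml)
{
    $domDocument = new DOMDocument();
    $domDocument->loadXML($xml);
    $domXPath = new DOMXPath($domDocument);

    $result = array();
    foreach ($domXPath->query("/catalog/journal") as $journal) {
        $date = $journal->getAttribute("date");
        $result[$date] = array();
        // Relative expression, evaluated with $journal as the context node
        foreach ($domXPath->query("article/title", $journal) as $title) {
            $result[$date][] = $title->nodeValue;
        }
    }
    return $result;
}

foreach (titlesByJournal($xml) as $date => $titles) {
    echo $date, ': ', implode(', ', $titles), "<br/>\n";
}
```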

Validate your XML

In this section we validate the example document with an XML schema, catalog.xsd. The schema validation functions, schemaValidate(string filename) and schemaValidateSource(string source), return a boolean value indicating whether an XML document is valid. The limitations of these functions are that only one schema may be specified for validation and that the return value carries no detail about a validation error.

Create a PHP script, validateXML.php. Create a DOMDocument object and load the XML document. Validate the XML document with the XML schema using schemaValidate().

$isValid = $domDocument->schemaValidate("file://C:/PHP/catalog.xsd");
echo $isValid ? "XML Document is valid" : "XML Document is not valid";

The output from the PHP script is as follows.

XML Document is valid
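
Although the boolean result carries no detail, the libxml functions libxml_use_internal_errors() and libxml_get_errors() can collect the underlying messages. The following is a sketch under some assumptions: the schema is my own guess at what catalog.xsd might contain, the sample documents use placeholder values, and validateCatalog() is a hypothetical helper name.

```php
<?php
// Guessed schema for the catalog structure; the real catalog.xsd is not shown.
$schema = <<<XSD
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="catalog">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="journal" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="article" maxOccurs="unbounded">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="title" type="xs:string"/>
                    <xs:element name="author" type="xs:string"/>
                  </xs:sequence>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
            <xs:attribute name="date" type="xs:string"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="title" type="xs:string"/>
      <xs:attribute name="publisher" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD;

$validXml = '<catalog title="XML Zone" publisher="IBM developerWorks">'
    . '<journal date="placeholder-date"><article><title>The Java XPath API</title>'
    . '<author>placeholder author</author></article></journal></catalog>';
// The author element is missing, so this document should fail validation.
$invalidXml = '<catalog><journal><article><title>No author</title></article></journal></catalog>';

// Return a list of error messages; an empty list means the document is valid.
function validateCatalog($xml, $schema)
{
    $previous = libxml_use_internal_errors(true); // collect errors instead of emitting warnings
    $errors = array();
    $domDocument = new DOMDocument();
    if (!$domDocument->loadXML($xml) || !$domDocument->schemaValidateSource($schema)) {
        foreach (libxml_get_errors() as $error) {
            $errors[] = "line {$error->line}: " . trim($error->message);
        }
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $errors;
}

foreach (array($validXml, $invalidXml) as $xml) {
    $errors = validateCatalog($xml, $schema);
    echo $errors ? "XML Document is not valid: " . implode("; ", $errors) : "XML Document is valid";
    echo "<br/>\n";
}
```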

Query XML with SimpleXML

PHP 5 provides the SimpleXML extension to select XML document nodes and node values with property selectors and array iterators. To select nodes and output node values in the XML document using the SimpleXML extension, create a PHP script, simpleXML.php. The SimpleXML extension provides the functions simplexml_load_file(string filename) and simplexml_load_string(string data) to load an XML document from a file or a string.

$simplexml=simplexml_load_file("file://C:/PHP/catalog.xml");

simplexml_load_file() converts the XML document to an object of type SimpleXMLElement and returns the root element of the XML document. As an example, output the value of the title element of the article in the first journal element.

print $simplexml->journal->article->title;

As the $simplexml object represents the root element in the XML document parsed, the value of the title attribute in the root element, catalog, is obtained as shown below.

print $simplexml['title'];

The attributes of an element may be obtained using attributes(). For example, obtain the attributes of the root element.

$attributes=$simplexml->attributes();

Output the publisher attribute.

print $attributes['publisher'];

The output is shown below.

The Java XPath API
XML Zone
IBM developerWorks
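
The property selectors also work as iterators, which is handy when there is more than one journal. Here is a small sketch using simplexml_load_string() on an inline stand-in for catalog.xml; the date and author values are placeholders, and listArticles() is a hypothetical helper name.

```php
<?php
// Inline stand-in for catalog.xml; dates and authors are placeholders.
$xml = <<<XML
<catalog title="XML Zone" publisher="IBM developerWorks">
  <journal date="placeholder-date-1">
    <article><title>The Java XPath API</title><author>placeholder author 1</author></article>
  </journal>
  <journal date="placeholder-date-2">
    <article><title>JAXP validation</title><author>placeholder author 2</author></article>
  </journal>
</catalog>
XML;

// Return one "date: title" string per article in the catalog.
function listArticles($xml)
{
    $simplexml = simplexml_load_string($xml);
    $lines = array();
    foreach ($simplexml->journal as $journal) {
        foreach ($journal->article as $article) {
            // Cast to string: SimpleXML returns SimpleXMLElement objects
            $lines[] = (string) $journal['date'] . ': ' . (string) $article->title;
        }
    }
    return $lines;
}

foreach (listArticles($xml) as $line) {
    echo $line, "<br/>\n";
}
```

The explicit string casts matter when the values are compared or stored, since SimpleXML otherwise hands back objects rather than plain strings.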

Transform XML

To transform the example XML document with an XSLT, catalog.xslt, using the XSL extension, create a PHP script, transformXML.php. Create a DOMDocument object and load the example XML document. Create another DOMDocument for the stylesheet and load the XSLT file.

$xslFile=new DOMDocument();
$xslFile->load("file://C:/PHP/catalog.xslt");

An XML document is transformed using an XSLTProcessor. Therefore, create an XSLTProcessor.

$xsltProcessor=new XSLTProcessor();

An XSLT stylesheet is imported into the XSLT processor for transformations with importStylesheet(DOMDocument).

$xsltProcessor->importStylesheet($xslFile);

Transform the XML document with the XSLT file using transformToXML(), and output the result of the transformation.

echo $xsltProcessor->transformToXML($domDocument);

The output from the XSLT transformation gets displayed in the browser.
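
Since catalog.xslt itself is not listed, here is a self-contained sketch of the whole transformation with an inline stylesheet of my own that renders the titles as an HTML list. The stylesheet, the sample document, and the transformCatalog() helper name are all assumptions; the sketch requires the XSL extension to be loaded.

```php
<?php
// Inline stand-in for catalog.xml; authors are placeholders.
$xml = '<catalog title="XML Zone">'
    . '<journal><article><title>The Java XPath API</title><author>placeholder author 1</author></article></journal>'
    . '<journal><article><title>JAXP validation</title><author>placeholder author 2</author></article></journal>'
    . '</catalog>';

// Hypothetical stylesheet: the real catalog.xslt is not shown in this article.
$xsl = <<<XSL
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/catalog">
    <ul>
      <xsl:for-each select="journal/article">
        <li><xsl:value-of select="title"/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
XSL;

// Apply the stylesheet to the XML and return the result as a string.
function transformCatalog($xml, $xsl)
{
    $domDocument = new DOMDocument();
    $domDocument->loadXML($xml);

    $xslFile = new DOMDocument();
    $xslFile->loadXML($xsl);

    $xsltProcessor = new XSLTProcessor();
    $xsltProcessor->importStylesheet($xslFile);
    return $xsltProcessor->transformToXML($domDocument);
}

echo transformCatalog($xml, $xsl);
```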

There you have it: you have just parsed, navigated, transformed and validated an XML document using the PHP 5 XML extensions. As noted, my PHP 5 scripts were run in Apache but you can use any web server that supports PHP.®