
Parsing and Processing HTML/XML in PHP

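PHP ships with two extensions that cover most HTML and XML parsing tasks: SimpleXML, which reads well-formed XML through a simple object interface, and DOM (the DOMDocument class), which can parse and manipulate both XML and HTML.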

Using SimpleXML

What is SimpleXML?

SimpleXML is a PHP extension, enabled by default, that parses well-formed XML into SimpleXMLElement objects. Child elements are read as object properties, and attributes are read with array syntax.

How to Use SimpleXML

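// Load the XML from a string (use simplexml_load_file() to read from a file).
// Both functions return false if the XML is not well-formed.
$xml = simplexml_load_string($xmlContent);

// Child elements are accessed as properties, attributes with array syntax
echo $xml->elementName;
echo $xml->elementName['attributeName'];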
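Because simplexml_load_string() returns false on malformed input, it is worth checking the result before reading from it. The following is a minimal sketch of one way to do that; the helper name parseXml() and the sample documents are assumptions made for illustration, not part of the original article.

```php
<?php
// Hypothetical helper: returns the parsed document, or null when the input
// is not well-formed XML. Parser messages are collected in $errors.
function parseXml(string $xmlContent, array &$errors = []): ?SimpleXMLElement
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $xml = simplexml_load_string($xmlContent);

    $errors = [];
    if ($xml === false) {
        foreach (libxml_get_errors() as $error) {
            // Each $error is a LibXMLError with message and line properties.
            $errors[] = trim($error->message) . ' (line ' . $error->line . ')';
        }
    }

    // Restore the previous error-handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $xml === false ? null : $xml;
}

$ok = parseXml('<note><to>Ann</to></note>');
$bad = parseXml('<note><to>Ann</note>', $errors);

echo $ok !== null ? 'Parsed: ' . $ok->to . "\n" : "Failed\n";
if ($bad === null) {
    echo "Second input was rejected:\n";
    foreach ($errors as $message) {
        echo '  ' . $message . "\n";
    }
}
```

Returning null rather than false lets callers use a nullable return type; throwing an exception would be an equally reasonable design.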

Example

$xmlContent = '<bookstore>
    <book>
        <title>PHP Basics</title>
        <author>John Doe</author>
        <price>19.99</price>
    </book>
</bookstore>';

// Load the XML content
$xml = simplexml_load_string($xmlContent);

// Access child elements as properties (this sample has no attributes)
echo "Title: " . $xml->book->title . "\n";
echo "Author: " . $xml->book->author . "\n";
echo "Price: $" . $xml->book->price . "\n";
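The example above reads a single book and no attributes. As a sketch of the more common case, the snippet below loops over several book elements and reads an attribute from each. The category attribute and the second book are assumptions added for illustration.

```php
<?php
// Sample data: the "category" attribute and the second book are illustrative.
$xmlContent = '<bookstore>
    <book category="programming">
        <title>PHP Basics</title>
        <price>19.99</price>
    </book>
    <book category="databases">
        <title>SQL Basics</title>
        <price>24.50</price>
    </book>
</bookstore>';

$xml = simplexml_load_string($xmlContent);
if ($xml === false) {
    exit("Could not parse XML\n");
}

$total = 0.0;

// $xml->book is iterable when several sibling elements share the same name.
foreach ($xml->book as $book) {
    // Values are SimpleXMLElement objects, so cast them before use.
    $category = (string) $book['category'];
    $title = (string) $book->title;
    $price = (float) (string) $book->price;

    $total += $price;
    printf("%s [%s]: %.2f\n", $title, $category, $price);
}

printf("Total: %.2f\n", $total);
```

The explicit casts matter: without them, comparisons and arithmetic operate on SimpleXMLElement objects rather than on plain strings and numbers.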

Using DOMDocument

What is DOMDocument?

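DOMDocument is the main class of PHP's DOM extension, which is also enabled by default. It represents a document as a tree of nodes, which makes it more verbose than SimpleXML but able to parse and modify both XML and HTML, including HTML that is not well-formed.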

How to Use DOMDocument

// Create a new DOMDocument object
$dom = new DOMDocument();

// Load HTML from a string. Use loadXML() for XML, or loadHTMLFile() / load() to read from a file.
$dom->loadHTML($htmlContent);

// Find elements by tag name; the result is a DOMNodeList of DOMElement objects
$elements = $dom->getElementsByTagName('elementName');
foreach ($elements as $element) {
    // Process each element
}
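The same pattern works for XML if loadXML() is used in place of loadHTML(). The sketch below builds a lookup of titles keyed by an id attribute; the sample document and the $titlesById variable are assumptions made for illustration.

```php
<?php
// Sample data: the "id" attributes are illustrative.
$xmlContent = '<bookstore>
    <book id="b1"><title>PHP Basics</title></book>
    <book id="b2"><title>SQL Basics</title></book>
</bookstore>';

$dom = new DOMDocument();

// loadXML() returns false when the input is not well-formed XML.
if (!$dom->loadXML($xmlContent)) {
    exit("Could not parse XML\n");
}

$titlesById = [];
foreach ($dom->getElementsByTagName('book') as $book) {
    // getAttribute() returns an empty string if the attribute is missing.
    $id = $book->getAttribute('id');

    // item(0) returns null when no matching child exists, so check it.
    $titleNode = $book->getElementsByTagName('title')->item(0);
    $titlesById[$id] = $titleNode !== null ? $titleNode->textContent : '';
}

print_r($titlesById);
```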

Example

$htmlContent = '<html>
    <body>
        <h1>Welcome to PHP</h1>
        <p>This is a paragraph.</p>
    </body>
</html>';

// Create a new DOMDocument object
$dom = new DOMDocument();

// Load the HTML content
$dom->loadHTML($htmlContent);

// Read the text of the first <h1> and <p> elements
$heading = $dom->getElementsByTagName('h1')->item(0)->nodeValue;
$paragraph = $dom->getElementsByTagName('p')->item(0)->nodeValue;

echo "Heading: " . $heading . "\n";
echo "Paragraph: " . $paragraph . "\n";
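Real-world HTML is often less tidy than this sample, and depending on the libxml version, loadHTML() may emit warnings for markup it does not recognize, such as HTML5 elements. A possible approach, sketched below, is to silence those warnings with libxml_use_internal_errors() and to select nodes with DOMXPath instead of walking the tree by tag name. The sample markup, the intro class, and the /docs link are assumptions made for illustration.

```php
<?php
// Sample markup: the <section> element, class name and link are illustrative.
$htmlContent = '<html><body>
    <section>
        <h1>Welcome to PHP</h1>
        <p class="intro">This is a paragraph.</p>
        <p>See the <a href="/docs">documentation</a>.</p>
    </section>
</body></html>';

// Keep parser warnings out of the output while loading.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($htmlContent);

libxml_clear_errors();
libxml_use_internal_errors($previous);

// DOMXPath runs XPath 1.0 queries against the loaded document.
$xpath = new DOMXPath($dom);

// First paragraph with class="intro"; item(0) is null if nothing matches.
$intro = $xpath->query('//p[@class="intro"]')->item(0);

// Collect the href of every link that has one.
$links = [];
foreach ($xpath->query('//a[@href]') as $anchor) {
    $links[] = $anchor->getAttribute('href');
}

echo 'Intro: ' . ($intro !== null ? $intro->textContent : '(none)') . "\n";
echo 'Links: ' . implode(', ', $links) . "\n";
```

XPath keeps the selection logic in one expression, which tends to be easier to maintain than nested getElementsByTagName() loops once the conditions involve attributes or nesting.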
