futureplc/html-dom-document
===========================

A drop-in replacement for DOMDocument that handles HTML5 documents.

Latest release: v1.0.0 (1y ago). License: MIT. Language: PHP. Requires: PHP ^8.3. [1 PRs](https://github.com/futureplc/html-dom-document/pulls)


HTML5 DOM Document
==================

[![Latest Version on Packagist](https://camo.githubusercontent.com/70e62783da5fbbf4f75619f016b3f4177a21fb7b8d53add19e4b6d0c4cacf88f/68747470733a2f2f696d672e736869656c64732e696f2f7061636b61676973742f762f667574757265706c632f68746d6c2d646f6d2d646f63756d656e742e7376673f7374796c653d666c61742d737175617265)](https://packagist.org/packages/futureplc/html-dom-document)[![Tests](https://camo.githubusercontent.com/ca13484a1eca231df9ba47845c5b99e15e17b8381ab65c66941beef9de6924e7/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f616374696f6e732f776f726b666c6f772f7374617475732f667574757265706c632f68746d6c2d646f6d2d646f63756d656e742f72756e2d74657374732e796d6c3f6272616e63683d6d61696e266c6162656c3d7465737473267374796c653d666c61742d737175617265)](https://github.com/futureplc/html-dom-document/actions/workflows/run-tests.yml)[![Total Downloads](https://camo.githubusercontent.com/a30eb5279f1b0c9aed36bba1fd3a6c7797631c0344ff518cc64fb866a78917ad/68747470733a2f2f696d672e736869656c64732e696f2f7061636b61676973742f64742f667574757265706c632f68746d6c2d646f6d2d646f63756d656e742e7376673f7374796c653d666c61742d737175617265)](https://packagist.org/packages/futureplc/html-dom-document)

The HTMLDocument package has one primary purpose: to act as a stand-in replacement for the core `DOMDocument` and related DOM classes that come with PHP.

> ⚠️ If you just need to crawl the DOM and not manipulate it in-place, consider using a package like the [Symfony DOM Crawler component](https://symfony.com/doc/current/components/dom_crawler.html).

The DOM classes built into PHP are a great way to parse XML, but they quickly fall apart when parsing modern HTML5 markup. This package makes HTML more intuitive to work with and handles some of the quirks behind the scenes.

This package provides a series of classes to replace the DOM ones in a backward-compatible fashion but with a tighter interface and additional utilities bundled in to make working with HTML a breeze. These classes will return instances of the equivalent `HTML*` class instead of the `DOM*` one:

- `DOMDocument` -> `HTMLDocument`
- `DOMElement` -> `HTMLElement`
- `DOMNode` -> `HTMLElement`
- `DOMText` -> `HTMLText`
- `DOMNodeList` -> `HTMLNodeList`
- `DOMXPath` -> `HTMLXPath`

Installation
------------

You can install the package via Composer:

```
composer require futureplc/html-dom-document
```

Features
--------

### Sensible return values

There's nothing more annoying than having to check union types on every operation because of PHP's legacy of using falsey return types. We've sorted this by making sure there are sensible defaults:

- If a method would natively return `DOMNodeList` or `false`, we return an empty `DOMNodeList` when there are no values to return
- If a method would natively return a `string` or `false`, we either throw an exception on failure or return an empty string
- No more differentiating between `DOMNode` and `DOMElement`; a single `HTMLElement` class handles every scenario the two cover

You'll notice this philosophy throughout the interface - if there's a sensible type to return, we'll ensure you get that instead of dealing with unions.
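To make the problem concrete, here is a small sketch of the kind of union handling the native classes require, using only built-in PHP DOM calls. The `queryOrEmpty()` helper is hypothetical and written for this example; it is not part of the package and only approximates the idea of always returning a usable collection.

```php
<?php

// Hypothetical helper (not part of the package): normalise the native
// DOMNodeList|false union into a plain array so callers never branch on false.
function queryOrEmpty(DOMXPath $xpath, string $expression): array
{
    // @ suppresses the warning PHP raises for a malformed expression.
    $result = @$xpath->query($expression);

    return $result instanceof DOMNodeList ? iterator_to_array($result) : [];
}

$dom = new DOMDocument();
$dom->loadHTML('<!DOCTYPE html><html><body><p>one</p><p>two</p></body></html>');
$xpath = new DOMXPath($dom);

$paragraphs = queryOrEmpty($xpath, '//p'); // the two <p> elements
$broken = queryOrEmpty($xpath, '//p[');    // malformed expression, so an empty array
```

Without a wrapper like this, every call site has to check for `false` before iterating.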

### Easily create HTML documents and elements

`DOMDocument` has a terse, antiquated interface that requires a lot of setup and repetition for even basic, commonly needed tasks, such as creating a `DOMElement` from a plain HTML string.

All the old `DOMDocument` style methods still work, so you can drop this package in as a replacement for existing `DOMDocument` implementations. However, we have added new ways to create HTML documents and elements without the verbosity usually required for some operations.

```
$dom = new HTMLDocument(); $dom->loadHTML($html);
$dom = HTMLDocument::fromHTML($html);
$dom = HTMLDocument::loadFromFile($filePath);

$element = HTMLElement::fromNode($domNode);
$element = HTMLElement::fromHTML($html);

$element = $dom->createElement('p', 'This is a paragraph.');
$element = $dom->createElementFromNode($domNode);
$element = $dom->createElementFromHTML('This is a paragraph.');
```
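For comparison, this is roughly what creating an element from an HTML string involves with the native classes alone. The sketch uses only built-in `DOMDocument` calls; the variable names and markup strings are assumptions made for the example.

```php
<?php

// The document we actually want to add an element to.
$dom = new DOMDocument();
$dom->loadHTML('<!DOCTYPE html><html><body></body></html>');

// Parse the HTML string in a throwaway document...
$fragmentDoc = new DOMDocument();
$fragmentDoc->loadHTML('<!DOCTYPE html><html><body><p>This is a paragraph.</p></body></html>');
$parsed = $fragmentDoc->getElementsByTagName('p')->item(0);

// ...then import a deep copy into the target document and attach it.
$paragraph = $dom->importNode($parsed, true);
$dom->getElementsByTagName('body')->item(0)->appendChild($paragraph);
```

The `fromHTML()` and `createElementFromHTML()` helpers shown above are meant to collapse this kind of boilerplate into a single call.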

### Additional behaviour to support HTML5

The majority of the custom behaviour to allow DOMDocument to parse any HTML string comes from a series of "middleware" classes that manipulate the HTML before it's loaded and before it's emitted as a plain HTML string again.

These middleware do various things, such as:

- Assuming HTML5 behaviour if no `<!DOCTYPE html>` is present, by adding one
- Ignoring LibXML errors (as LibXML complains about certain HTML5 tags even though it can parse them properly)
- Treating `` and `` tags as verbatim so their contents aren't changed by the rest of the document

These will be enabled by default if you use the `HTMLDocument` class, but you can disable them as needed.

- Calling `->withoutMiddleware()` without any arguments before loading the HTML will result in no middleware applying, essentially resulting in just the additional utility methods with none of the extra HTML5 support
- Calling `->withoutMiddleware(MiddlewareName::class)`, using the class name of a middleware, will disable that specific one
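As a rough idea of what two of the behaviours listed above amount to, here is a plain-PHP approximation built only on native calls. The `loadHtml5()` function name is hypothetical, and this is a sketch of the concept rather than the package's actual middleware code.

```php
<?php

// Illustrative only: approximates "assume HTML5" and "ignore LibXML errors".
// The function name is hypothetical; this is not the package's implementation.
function loadHtml5(string $html): DOMDocument
{
    // Assume HTML5 when no doctype is present by prepending one.
    if (stripos(ltrim($html), '<!doctype') !== 0) {
        $html = "<!DOCTYPE html>\n" . $html;
    }

    // Collect LibXML errors internally instead of emitting PHP warnings
    // for HTML5 tags such as <section>.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Discard the collected errors and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $dom;
}

$dom = loadHtml5('<section><p>Hello</p></section>');
$sections = $dom->getElementsByTagName('section');
```

Disabling the middleware means taking responsibility for steps like these yourself.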

Getting a plain HTML string back out of `DOMDocument` can be tricky, particularly when you only want the markup for a single element, so we have added some options to make it easier.

```
$html = (string) $dom; // Cast the HTMLDocument to a string
$html = $dom->saveHTML();

$html = (string) $element; // Cast the HTMLElement to a string
$html = $element->saveHTML();
$html = $element->getInnerHTML(); // Gets the HTML of the element without the wrapping node
$html = $element->getOuterHTML(); // Gets the HTML of the element with the wrapping node
```

### Check if HTML5

If you need to know whether you're working with an HTML5 document or not, the `isHtml5()` method will tell you.

```
$dom->isHtml5(); // true
```

### Void elements

If working with HTML5, you may want to know if a given node is a "void element", meaning it needs no closing tag. This can be checked with the `isVoidElement()` method.

```
$element->isVoidElement(); // true
```

Normally when saving the HTML, `DOMDocument` would output void elements as ``, but this package will output them as ``, even for custom elements, maintaining how they were input originally.

### Working with attributes

The `HTMLElement` class has a series of methods to help you work with attributes on elements.

```
$element->getAttributes(); // Returns an array of all attributes
$element->getAttribute('class'); // Returns the value of the class attribute

$element->setAttribute('class', 'foo'); // Sets the class attribute to "foo"
$element->addAttribute('class', 'foo'); // Adds the "foo" value as a space-separated value to the class attribute, appending it if the attribute already exists

$element->removeAttribute('rel'); // Removes the rel attribute entirely
$element->removeAttribute('rel', 'noreferrer'); // Removes the "noreferrer" value from the rel attribute if it exists - if the attribute is now empty, it will be removed entirely

$element->toggleAttribute('checked'); // Toggles the "checked" attribute
```

As we often work with CSS classes in HTML, there are also some methods to help with this.

```
$element->getClassList(); // Returns an array of CSS classes
$element->setClassList(['foo', 'bar']); // Sets the CSS classes
$element->hasClass('foo'); // Returns true if the element has the class "foo"
$element->addClass('baz'); // Adds the class "baz"
$element->removeClass('bar'); // Removes the class "bar"
```

### Removing parts of a document

There are some helpful utilities for quickly removing parts of a document as required.

```
$element->withoutSelector('p'); // Removes all child `<p>` elements
$element->withoutComments(); // Removes all HTML comments
```

### Utility methods

There are a couple of additional utility methods to help build attribute strings from PHP arrays.

`Utility::attribute()` will take a single key/value pair and turn it into an HTML attribute, regardless of whether the value is a string, array, or boolean. A boolean value can be used to conditionally add attributes.

```
Utility::attribute('class', ['foo', 'bar']); // class="foo bar"
Utility::attribute('id', 'baz'); // id="baz"
Utility::attribute('required', true); // required
```

`Utility::attributes()` will take this further by doing the same with an array of key/value pairs, turning them into an HTML attribute string altogether.

```
Utility::attributes([
    'class' => ['foo', 'bar'],
    'id' => 'baz',
    'required' => true,
    'checked' => false,
]);

// class="foo bar" id="baz" required
```
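The rules described above (arrays become space-separated values, `true` emits a bare attribute, `false` omits it) can be sketched in a few lines of plain PHP. The `buildAttributes()` function below is a hypothetical re-implementation for illustration only; the package's own `Utility::attributes()` may differ, and the escaping via `htmlspecialchars()` is an assumption.

```php
<?php

// Hypothetical re-implementation for illustration; not the package's code.
function buildAttributes(array $attributes): string
{
    $parts = [];

    foreach ($attributes as $name => $value) {
        if ($value === false) {
            continue; // false drops the attribute entirely
        }

        if ($value === true) {
            $parts[] = $name; // true emits a bare boolean attribute
            continue;
        }

        if (is_array($value)) {
            $value = implode(' ', $value); // arrays become space-separated values
        }

        // Escaping here is an assumption made for this sketch.
        $parts[] = sprintf('%s="%s"', $name, htmlspecialchars((string) $value, ENT_QUOTES));
    }

    return implode(' ', $parts);
}

$result = buildAttributes([
    'class' => ['foo', 'bar'],
    'id' => 'baz',
    'required' => true,
    'checked' => false,
]);

// class="foo bar" id="baz" required
```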

`Utility::nodeMapRecursive()` gives the ability to run a callback on every node in a document, including all child nodes. You can use this callback to inspect the nodes, modify them, replace one node with another entirely, or remove them from the document.

This is also available on `HTMLElement` and `HTMLDocument` objects through the `mapRecursive` method.

```
$dom = HTMLDocument::fromHTML('<p>foo</p>');

// Make sure every element has a class of "bar"
$dom->mapRecursive(function ($node) {
    if ($node instanceof HTMLElement) {
        $node->setAttribute('class', 'bar');
    }
});

// <p class="bar">foo</p>
```

`Utility::countRootNodes()` will tell you how many root nodes are in a document.

```
Utility::countRootNodes('<p>foo</p>'); // 1
Utility::countRootNodes('<p>foo</p><p>bar</p>'); // 2
```

If working with source HTML that contains multiple root nodes, you can use the `Utility::wrap($html)` and `Utility::unwrap($html)` methods to ensure a single root node or remove the root node, respectively.

### Working with CSS classes

The `HTMLElement` class has several methods to help you work with CSS classes.

```
$element->setClassList(['foo', 'bar']);
$element->getClassList(); // ['foo', 'bar']
$element->hasClass('foo'); // true
$element->addClass('baz'); // ['foo', 'bar', 'baz']
$element->removeClass('baz'); // ['foo', 'bar']
```

### Toggling boolean attributes

In the case where you need to toggle some boolean attributes on or off, the `toggleAttribute()` method is available.

```
$element = HTMLElement::fromString('');
$element->toggleAttribute('checked'); //
$element->toggleAttribute('checked'); //
```

### Querying on CSS selectors and XPath

Most people working with HTML know how to use most CSS selectors, but many have never touched XPath. We've added handy `querySelector()` and `querySelectorAll()` methods to the `HTMLDocument` and `HTMLElement` classes, allowing you to use CSS selectors directly to get the needed elements, courtesy of the [Symfony CSS Selector](https://github.com/symfony/css-selector) package.

```
$dom->querySelector('head > title'); // Returns the first `<title>` element
$dom->querySelectorAll('.foo'); // Returns all elements with the class `foo`
```

If you still need to work with XPath, there is a convenient `xpath()` method on both `HTMLDocument` and `HTMLElement` classes.

```
$dom->xpath('//a'); // Returns all `<a>` elements
```
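If XPath is unfamiliar, it may help to see the two CSS selectors above written by hand as XPath against the native classes. These expressions are hand-written equivalents for illustration and are not necessarily what the Symfony CSS Selector component generates; the sample markup is an assumption.

```php
<?php

$dom = new DOMDocument();
$dom->loadHTML(
    '<!DOCTYPE html><html><head><title>Demo</title></head>'
    . '<body><p class="foo bar">a</p><p class="foobar">b</p></body></html>'
);
$xpath = new DOMXPath($dom);

// CSS "head > title": a title element that is a direct child of head.
$title = $xpath->query('//head/title')->item(0);

// CSS ".foo": match "foo" as a whole space-separated token of the class
// attribute, so class="foobar" does not match.
$foo = $xpath->query("//*[contains(concat(' ', normalize-space(@class), ' '), ' foo ')]");
```

The class expression shows why CSS selectors are usually the more readable option for everyday queries.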

### Working with text nodes

Working with text nodes can be tricky if you ever want to change something in the text to another node entirely. The `replaceTextWithNode()` method on `HTMLText` lets you do just that.

This is particularly useful if you use the `Utility::nodeMapRecursive()` function, which will traverse through text nodes.

```
$textNode->replaceTextWithNode('example', HTMLElement::fromHTML('example'));
```

### Other Notes

`HTMLDocument` also has some other benefits over `DOMDocument`:

- Tags with an XML-style namespace are maintained, whereas `DOMDocument` would typically keep only the last part of the tag name. This is useful when working with standards such as Edge Side Includes, which use namespaced tags such as `<esi:include>`
- Attributes starting with `@` are maintained, whereas `DOMDocument` would typically remove them. This is useful when working with HTML that has Alpine.js or Vue.js markup, such as an `@click` attribute
- Any void tags on the input HTML will also be output as void tags

Drawbacks
---------

Because of all the extra checks and type conversions, this package is a bit slower than the native `DOMDocument` classes. However, the difference is negligible in most cases, and the benefits of the additional features and ease of use far outweigh the performance hit unless you are processing millions of large HTML documents at once.

Testing
-------

```
composer test
```

Changelog
---------

Please see [CHANGELOG](CHANGELOG.md) for more information on what has changed recently.

Contributing
------------

Please see [CONTRIBUTING](https://github.com/spatie/.github/blob/main/CONTRIBUTING.md) for details.

Security Vulnerabilities
------------------------

Please review [our security policy](../../security/policy) on how to report security vulnerabilities.

Credits
-------

- [Future PLC](https://github.com/futureplc)
- [Liam Hammett](https://github.com/imliam)
- [Chris Powell](https://github.com/ampedweb)
- [All Contributors](../../contributors)

License
-------

The MIT License (MIT). Please see [License File](LICENSE.md) for more information.
