daniesy/dominator
=================

A simple HTML parser for PHP.

Latest version: v0.0.24 · License: MIT · Requires PHP ^8.4

[Source](https://github.com/daniesy/DOMinator) · [Packagist](https://packagist.org/packages/daniesy/dominator)

DOMinator
=========

A robust, fast, and fully-featured HTML5 parser and query engine for PHP. Parse, traverse, manipulate, and query HTML documents with ease, supporting all modern HTML5 features, error recovery, namespaces, and more.

Features
--------

- **Full HTML5 parsing**: Handles all standard HTML5 elements, void/self-closing tags, comments, CDATA, script/style content, and multiple doctype variants.
- **XML declaration support**: Preserves and exports XML declarations (e.g., `<?xml version="1.0" encoding="UTF-8"?>`) at the start of the document.
- **Error recovery**: Gracefully parses malformed or broken HTML, just like browsers do.
- **Entity decoding**: Decodes HTML entities in text and attributes.
- **Whitespace normalization**: Optionally normalizes whitespace in text nodes.
- **Namespaces**: Supports XML/HTML namespaces (e.g., `svg:rect`).
- **Attribute handling**: Handles quoted/unquoted and boolean attributes.
- **Query engine**: Powerful CSS-like selectors for finding nodes (call `querySelectorAll`, `querySelector`, or `getElementsByTagName` directly on any `Node`).
- **Node manipulation**: Add, remove, set attributes, change text, and more (`Node`).
- **Performance**: Optimized for large and deeply nested documents.
- **Comprehensive tests**: Includes extensive tests for all features and edge cases.
- **Robust CSS parser**: Parses CSS rules and at-rules (including nested and simple at-rules like `@media` and `@font-face`).
- **CSS inlining**: Optionally inlines CSS styles as `style` attributes when exporting HTML (`Node::toInlinedHtml`).
- **Pretty-print and minify HTML**: Export HTML minified (the default, `Node::toHtml()`) or pretty-printed with indentation and newlines (`Node::toHtml(false)`).

Installation
------------

Install via Composer:

```
composer require daniesy/dominator
```

Or include the `src/` files directly in your project.

Usage
-----

### Basic Parsing

```
use Daniesy\DOMinator\DOMinator;

// Sample markup for illustration
$html = '<div class="foo"><span>Hello World</span></div>';
$root = DOMinator::read($html);
```

### Traversing the DOM

```
foreach ($root->children as $child) {
    echo $child->tag; // e.g., 'div'
}
```

### Querying with CSS Selectors (DOM-like)

```
// Find all elements with class="foo"
$nodes = $root->querySelectorAll('.foo');
// Attribute selectors:
// Exact match
$adminNodes = $root->querySelectorAll('[data-role="admin"]');
// Space-separated word match
$adminWordNodes = $root->querySelectorAll('[data-role~="admin"]');
// Substring match
$adminSubstringNodes = $root->querySelectorAll('[data-role*="admin"]');
// Attribute presence
$withPlaceholder = $root->querySelectorAll('[placeholder]');
// Comma-separated (OR) selectors
$iconLinks = $root->querySelectorAll('link[rel="shortcut icon"], link[rel="icon"]');

// Access by index using item() method
echo $nodes->item(0)->innerText;
// Get the number of nodes
echo $nodes->length;
// Iterate over nodes
foreach ($nodes as $node) {
    echo $node->innerText;
}

// Find the first <span> element
$span = $root->querySelector('span');
if ($span) {
    echo $span->innerText;
}

// Find all <div> elements (case-insensitive)
$divs = $root->getElementsByTagName('div');
// Access by index
echo $divs->item(0)->innerText;
// Iterate over nodes
foreach ($divs as $div) {
    echo $div->innerText;
}
```
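
The three attribute operators used above (`=`, `~=`, `*=`) differ only in how the attribute value is compared. The sketch below illustrates the standard CSS semantics in plain PHP. It is not DOMinator's internal code, and `attributeMatches()` is a hypothetical helper introduced only for this illustration.

```php
<?php
// Illustrative only: standard CSS attribute-operator semantics,
// not DOMinator's implementation. attributeMatches() is hypothetical.
function attributeMatches(string $operator, string $actual, string $expected): bool
{
    return match ($operator) {
        // [attr="x"]: the whole value must be equal
        '='  => $actual === $expected,
        // [attr~="x"]: one of the whitespace-separated words must be equal
        '~=' => $expected !== ''
            && in_array($expected, preg_split('/\s+/', trim($actual)), true),
        // [attr*="x"]: the value must contain the substring
        '*=' => $expected !== '' && str_contains($actual, $expected),
        default => false,
    };
}

$value = 'super-admin editor';
var_dump(attributeMatches('=', $value, 'editor'));   // bool(false)
var_dump(attributeMatches('~=', $value, 'editor'));  // bool(true)
var_dump(attributeMatches('~=', $value, 'admin'));   // bool(false)
var_dump(attributeMatches('*=', $value, 'admin'));   // bool(true)
```

In practice this means `[data-role~="admin"]` skips a value like `super-admin`, while `[data-role*="admin"]` matches it.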

### Manipulating Nodes

```
$node = $nodes->item(0);
$node->setAttribute('id', 'new-id');
$node->innerText = 'Updated text';
$node->remove(); // Remove from parent
```

### CSS Parsing and Selector Matching

```
use Daniesy\DOMinator\CssParser;

// Double quotes so that \n is a real newline (single quotes would keep it literal)
$css = "@media (max-width:600px) { body { background: #fff; } }\n@font-face { font-family: test; src: url(test.woff); }\ndiv.foo#bar { color: green; }";
$rules = CssParser::parse($css);
// $rules[0]['type'] === 'at'   for @media
// $rules[1]['type'] === 'at'   for @font-face
// $rules[2]['type'] === 'rule' for div.foo#bar

// Selector matching:
use Daniesy\DOMinator\Nodes\Node;
$node = new Node('div', ['class' => 'foo bar', 'id' => 'bar']);
CssParser::matches('div.foo#bar', $node); // true
CssParser::matches('.baz', $node); // false
```

- `CssParser::parse($css)` parses a CSS string into an array of rules and at-rules (including nested and simple at-rules).
- `CssParser::matches($selector, $node)` checks if a node matches a CSS selector (supports tag, class, id, compound, and descendant selectors).
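
To make "compound selector" concrete: every part of `div.foo#bar` (tag, class, id) has to match the same node. The following is a simplified sketch of that idea in plain PHP, covering only tag, class, and id parts. It is an assumption-laden illustration, not how `CssParser::matches` is implemented, and `matchesCompound()` is a hypothetical helper.

```php
<?php
// Simplified sketch, not DOMinator's implementation: matchesCompound() is a
// hypothetical helper that handles only tag, .class, and #id parts.
function matchesCompound(string $selector, string $tag, array $attributes): bool
{
    // Split 'div.foo#bar' into parts: ['', 'div'], ['.', 'foo'], ['#', 'bar']
    preg_match_all('/([.#]?)([\w-]+)/', $selector, $parts, PREG_SET_ORDER);
    $classes = preg_split('/\s+/', trim($attributes['class'] ?? ''), -1, PREG_SPLIT_NO_EMPTY);

    foreach ($parts as [, $prefix, $name]) {
        $ok = match ($prefix) {
            '.' => in_array($name, $classes, true),
            '#' => ($attributes['id'] ?? null) === $name,
            default => strcasecmp($tag, $name) === 0, // tag names compare case-insensitively
        };
        if (!$ok) {
            return false; // every part must match
        }
    }
    return $parts !== [];
}

$attributes = ['class' => 'foo bar', 'id' => 'bar'];
var_dump(matchesCompound('div.foo#bar', 'div', $attributes)); // bool(true)
var_dump(matchesCompound('.baz', 'div', $attributes));        // bool(false)
```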

### Exporting Back to HTML

```
// Minified (default)
$html = $root->toHtml();
// Pretty-printed (indented, human-readable)
$prettyHtml = $root->toHtml(false);
// Inline CSS styles (simple selectors only)
$inlinedHtml = $root->toInlinedHtml();
$prettyInlinedHtml = $root->toInlinedHtml(false);
```

### Handling Namespaces

```
$html = '<svg:rect width="100" height="100"></svg:rect>';
$root = DOMinator::read($html);
$svg = $root->children->item(0); // Direct access to children array is still available
echo $svg->namespace; // 'svg'
echo $svg->tag;       // 'rect'

// Alternatively, you can use querySelector
$svg = $root->querySelector('svg\\:rect');
echo $svg->namespace; // 'svg'
echo $svg->tag;       // 'rect'
```

### Parsing Options

- `DOMinator::read($html, $normalizeWhitespace = false, $preprocess = null)`
    - Parses the given HTML string into a DOM tree.
    - Supports input with an XML declaration (e.g., `<?xml version="1.0" encoding="UTF-8"?>`).
    - **New:** Accepts an optional `$preprocess` callback. If provided, this function will be called with the HTML string before parsing. You can use this to preprocess, sanitize, or transform the HTML as needed.
    - Example:

        ```
        $root = DOMinator::read($html, false, function($input) {
            // Preprocess or sanitize $input
            return str_replace('foo', 'bar', $input);
        });
        ```
- `Node::toHtml($minify = true)`
    - If `$minify` is `false`, outputs pretty-printed HTML with indentation and newlines.
    - If `$minify` is `true` (default), outputs minified HTML.
- `Node::toInlinedHtml($minify = true)`
    - Inlines simple CSS rules from `<style>` tags as inline `style` attributes.
    - Supports only tag, class, and id selectors (no combinators or advanced selectors).
    - Removes `<style>` tags from the output.
    - Use `$minify = false` for pretty-printed output.
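
As a further illustration of the `$preprocess` option, the sketch below defines a callback that strips a UTF-8 byte-order mark and normalizes line endings. The callback and the sample input are hypothetical. The snippet runs without the library, and the commented last line shows where the callback would be passed, assuming the `read()` signature documented above.

```php
<?php
// Hypothetical preprocess callback: strips a UTF-8 byte-order mark and
// normalizes line endings before the HTML reaches the parser.
$preprocess = function (string $input): string {
    // Remove a leading UTF-8 BOM (EF BB BF) if present
    if (str_starts_with($input, "\xEF\xBB\xBF")) {
        $input = substr($input, 3);
    }
    // Convert CRLF and lone CR to LF
    return str_replace(["\r\n", "\r"], "\n", $input);
};

$raw = "\xEF\xBB\xBF<p>one</p>\r\n<p>two</p>\r";
$clean = $preprocess($raw);
var_dump($clean === "<p>one</p>\n<p>two</p>\n"); // bool(true)

// With DOMinator installed, the callback is passed as the third argument:
// $root = DOMinator::read($raw, false, $preprocess);
```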

API Reference
-------------

### `DOMinator`

- `DOMinator::read(string $html, bool $normalizeWhitespace = false, $preprocess = null): Node`
    - Parses HTML and returns the root node.

### `Node`

- Properties:
    - `tag`: Tag name (e.g., 'div')
    - `namespace`: Namespace prefix (e.g., 'svg')
    - `attributes`: Associative array of attributes
    - `children`: Array of child nodes
    - `innerText`: Text content (for text, comment, or CDATA nodes)
    - `isComment`, `isCdata`: Node type flags
    - `parent`: Parent node
    - `xmlDeclaration`: XML declaration string if present (e.g., `<?xml version="1.0" encoding="UTF-8"?>`)
- Methods:
    - `setAttribute($name, $value)`: Set or update attribute
    - `removeAttribute($name)`: Remove attribute
    - `remove()`: Remove node from parent
    - `toHtml($minify = true)`: Export node and children as HTML (minified by default, pretty-printed when `$minify` is `false`)
    - `toInlinedHtml($minify = true)`: Export HTML with inlined CSS styles
    - `querySelectorAll($selector)`: Returns a NodeList of matching nodes (CSS selector)
    - `querySelector($selector)`: Returns the first matching node or null
    - `getElementsByTagName($tag)`: Returns a NodeList of nodes with the given tag name (case-insensitive)

### `NodeList`

- Properties:
    - `length`: The number of nodes in the list
- Methods:
    - `item($index)`: Returns the node at the specified index, or null if the index is out of range
    - `count()`: Returns the number of nodes in the list (implements Countable)
    - `getIterator()`: Returns an iterator for the nodes in the list (implements IteratorAggregate)
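
Because `NodeList` implements `Countable` and `IteratorAggregate`, it works with `count()` and `foreach` like any other PHP collection. The class below is a minimal stand-in that shows what those two interfaces require. `SimpleList` is hypothetical and is not DOMinator's `NodeList`; it only mirrors the members listed above.

```php
<?php
// Minimal stand-in for illustration; SimpleList is hypothetical and is not
// DOMinator's NodeList class.
final class SimpleList implements Countable, IteratorAggregate
{
    private array $items;
    public readonly int $length;

    public function __construct(array $items)
    {
        $this->items = array_values($items);
        $this->length = count($this->items);
    }

    public function item(int $index): mixed
    {
        return $this->items[$index] ?? null; // null when out of range
    }

    public function count(): int
    {
        return $this->length; // called by count($list)
    }

    public function getIterator(): Iterator
    {
        return new ArrayIterator($this->items); // used by foreach
    }
}

$list = new SimpleList(['a', 'b', 'c']);
echo count($list), "\n";    // 3
echo $list->length, "\n";   // 3
var_dump($list->item(5));   // NULL
foreach ($list as $index => $value) {
    echo "$index=$value\n"; // 0=a, then 1=b, then 2=c
}
```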

Testing
-------

Run all tests with PHPUnit:

```
vendor/bin/phpunit tests
```

Examples
--------

See `tests/DOMinatorTest.php`, `tests/NodeTest.php`, and `tests/QueryTest.php` for comprehensive usage and edge cases.

License
-------

MIT License. See LICENSE file.

---

**Author:** Daniesy

Contributions and issues welcome!
