hexydec/htmldoc
===============

A token-based HTML document parser and minifier. Minify HTML documents, including inline CSS, JavaScript, and SVGs, on the fly. Extract document text, attributes, and fragments. Full test suite.

HTMLDoc: PHP HTML Document Parser and Minifier
==============================================

A tokeniser-based HTML document parser and minifier, written in PHP.

[![Licence](https://camo.githubusercontent.com/1be08aa7893603196bc3465a2235df43f54fab679741e5cdb07b9f513c5dd5d7/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4c6963656e63652d4d49542d6c69676874677265792e737667)](LICENCE)[![Status: Stable](https://camo.githubusercontent.com/af83a8dcabf200b08620cd9f1e69904403bdef64b5962d6ff3126c88446bf9e2/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f5374617475732d537461626c652d477265656e2e737667)](https://camo.githubusercontent.com/af83a8dcabf200b08620cd9f1e69904403bdef64b5962d6ff3126c88446bf9e2/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f5374617475732d537461626c652d477265656e2e737667)[![Tests Status](https://github.com/hexydec/htmldoc/actions/workflows/tests.yml/badge.svg)](https://github.com/hexydec/htmldoc/actions/workflows/tests.yml)[![Code Coverage](https://camo.githubusercontent.com/4b3df4567ed0eab93e3864346ca0d9fd6bf252c4b9968b5176f7bcede54dc8ff/68747470733a2f2f636f6465636f762e696f2f67682f686578796465632f68746d6c646f632f6272616e63682f6d61737465722f67726170682f62616467652e737667)](https://app.codecov.io/gh/hexydec/htmldoc)

Description
-----------

HTMLdoc is an HTML parser designed primarily for minifying HTML documents. It also lets you query the document structure to extract attribute and text node values.

The parser is built around a tokeniser, which makes document processing more reliable than regex-based minifiers. Those are blunt instruments that can cause problems when a pattern matches in the wrong place.
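To illustrate the difference, here is a minimal, self-contained sketch. It is not HTMLdoc's implementation: `regexMinify()` and `tokenMinify()` are hypothetical helpers written for this example, and the "tokeniser" is a toy that only distinguishes tags from text. It shows how a blanket whitespace regex damages the contents of a `<pre>` element, while a token-aware pass can leave that content alone.

```php
<?php
declare(strict_types=1);

// Illustration only: hypothetical helpers, NOT part of HTMLdoc's API.

/** Naive regex minifier: collapses every whitespace run, wherever it occurs. */
function regexMinify(string $html): string
{
    return preg_replace('/\s+/', ' ', $html) ?? $html;
}

/**
 * Toy token-aware minifier: splits the input into tag and text tokens and
 * only collapses whitespace in text that sits outside a <pre> element.
 */
function tokenMinify(string $html): string
{
    // With PREG_SPLIT_DELIM_CAPTURE the result alternates text, tag, text, ...
    // so even indexes hold text and odd indexes hold the captured tags.
    $tokens = preg_split('/(<[^>]+>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
    $inPre = false;
    $out = '';
    foreach ($tokens as $i => $token) {
        if ($i % 2 === 1) {
            // Track whether we are inside a <pre> element
            if (preg_match('/^<pre[\s>]/i', $token)) {
                $inPre = true;
            } elseif (preg_match('/^<\/pre[\s>]/i', $token)) {
                $inPre = false;
            }
            $out .= $token;
        } else {
            // Preformatted text is copied through untouched
            $out .= $inPre ? $token : (preg_replace('/\s+/', ' ', $token) ?? $token);
        }
    }
    return $out;
}

$html = "<p>Hello   world</p>\n<pre>line 1\n  line 2</pre>";

var_dump(regexMinify($html)); // line break and indent inside <pre> are lost
var_dump(tokenMinify($html)); // <pre> content is preserved
```

Both functions collapse the extra spaces inside the paragraph, but only the token-aware version keeps the line break and indentation inside `<pre>`. A real tokeniser handles far more cases (attributes, comments, scripts, styles) than this sketch.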

The software is also capable of processing and minifying SVG documents.

Usage
-----

To minify an HTML document:

```
use hexydec\html\htmldoc;

$doc = new htmldoc();

// load from a variable
if ($doc->load($html)) {

	// minify the document
	$doc->minify();

	// compile back to HTML
	echo $doc->save();
}
```

You can also test out the minifier online, or run the supplied `index.php` file after installation.

To extract data from an HTML document:

```
use hexydec\html\htmldoc;

$doc = new htmldoc();

// load from a URL this time
if ($doc->open($url)) {

	// extract text
	$text = $doc->find('.article__body')->text();

	// extract attribute
	$attr = $doc->find('.article__author-image')->attr('src');

	// extract HTML
	$html = $doc->find('.article__body')->html();
}
```

Installation
------------

The easiest way to get up and running is to use Composer:

```
$ composer require hexydec/htmldoc
```

HTMLdoc requires [\\hexydec\\token\\tokenise](https://github.com/hexydec/tokenise) to run, which you can install manually if you are not using Composer. Optionally, you can also install [CSSdoc](https://github.com/hexydec/cssdoc) and [JSlite](https://github.com/hexydec/jslite) to perform inline CSS and JavaScript minification respectively.

If you use Composer, all of these dependencies are installed for you.

Test Suite
----------

You can run the test suite like this:

### Linux

```
$ vendor/bin/phpunit
```

### Windows

```
> vendor\bin\phpunit
```

Documentation
-------------

- [How it works](docs/how-it-works.md)
- [How to use and examples](docs/how-to-use.md)
- [API Reference](docs/api/readme.md)
- [Mitigating Side Effects of Minification](docs/mitigating-side-effects.md)
- [About Document Recycling](docs/recycling.md)
- [Object Performance](docs/performance.md)

Support
-------

HTMLdoc supports PHP version 8.1+.

Contributing
------------

If you find an issue with HTMLdoc, please create an issue in the tracker.

If you wish to fix an issue yourself, please fork the code, fix the issue, and create a pull request. I will then evaluate your submission.

Licence
-------

The MIT License (MIT). Please see [License File](LICENCE) for more information.

