

tacman/php-readability
======================

Automatic article extraction from HTML, fork of j0k3r/php-readability

Version 2.0.3 · Apache-2.0 · PHP ^8.2


[Source](https://github.com/tacman/php-readability) · [Packagist](https://packagist.org/packages/tacman/php-readability)


    composer config repositories.readability '{"type": "path", "url": "~/g/tacman/php-readability"}'
    composer req tacman/php-readability:"*@dev"

    composer config repositories.graby '{"type": "path", "url": "~/g/tacman/graby"}'
    composer req tacman/graby:"*@dev"

Readability
===========


[![CI](https://github.com/j0k3r/php-readability/workflows/CI/badge.svg)](https://github.com/j0k3r/php-readability/workflows/CI/badge.svg)[![Coverage Status](https://camo.githubusercontent.com/c92d010c7997dbe711a107d4100ec05a02e9e31ca48dd8850dfc7ec907f7cc53/68747470733a2f2f636f766572616c6c732e696f2f7265706f732f6a306b33722f7068702d726561646162696c6974792f62616467652e7376673f6272616e63683d6d617374657226736572766963653d676974687562)](https://coveralls.io/github/j0k3r/php-readability/?branch=master)[![Total Downloads](https://camo.githubusercontent.com/6233cf1a62b478cca6ebe79cd700aaa531eced0973b6712e87f13c99d9984eb4/68747470733a2f2f706f7365722e707567782e6f72672f6a306b33722f7068702d726561646162696c6974792f646f776e6c6f616473)](https://packagist.org/packages/j0k3r/php-readability)[![License](https://camo.githubusercontent.com/7407e48deac60fba0b4765edf26eb8e1217e5bc57699ae77b81f74db96130e2d/68747470733a2f2f706f7365722e707567782e6f72672f6a306b33722f7068702d726561646162696c6974792f6c6963656e7365)](https://packagist.org/packages/j0k3r/php-readability)

This is the Readability class extracted from [this full-text-rss fork](https://github.com/Dither/full-text-rss). It can be seen as an improved version of the original [php-readability](https://bitbucket.org/fivefilters/php-readability).

Differences
-----------

The default php-readability lib is really old and needs to be improved. I found a great fork of full-text-rss from [@Dither](https://github.com/Dither/full-text-rss) which improves the Readability class.

- I've extracted the class from its fork so it can be used out of the box
- I've added some simple tests
- I've changed the coding style, run `php-cs-fixer`, and added a namespace

**But** the code is still really hard to read and understand.

Requirements
------------

By default, this lib will use the [Tidy extension](https://github.com/htacg/tidy-html5) if it's available. Tidy is only used to clean up the given HTML and avoid problems caused by bad HTML structure. Composer will suggest it.

If you run into problems parsing content without Tidy installed, please install it and try again.
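Before reporting a parsing problem, it can help to confirm whether Tidy is actually loaded in your PHP runtime. The snippet below is a minimal sketch that uses only PHP built-ins; `tidyIsAvailable()` is a hypothetical helper name and is not part of this library.

```php
<?php

/**
 * Hypothetical helper (not part of the library): reports whether the
 * Tidy extension is loaded in the current PHP runtime.
 */
function tidyIsAvailable(): bool
{
    return extension_loaded('tidy') && function_exists('tidy_parse_string');
}

echo tidyIsAvailable()
    ? "Tidy is loaded; it can be used to clean up HTML.\n"
    : "Tidy is not loaded; consider installing the tidy extension.\n";
```

Run it with the same PHP binary and configuration (CLI or web server) that runs your application, since the two can load different extensions.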

Usage
-----


```
use Readability\Readability;

$url = 'http://www.medialens.org/index.php/alerts/alert-archive/alerts-2013/729-thatcher.html';

// you can use whatever you want to retrieve the html content (Guzzle, Buzz, cURL ...)
$html = file_get_contents($url);

$readability = new Readability($html, $url);
// or without Tidy
// $readability = new Readability($html, $url, 'libxml', false);
$result = $readability->init();

if ($result) {
    // display the title of the page
    echo $readability->getTitle()->textContent;
    // display the *readability* content
    echo $readability->getContent()->textContent;
} else {
    echo 'Looks like we couldn\'t find the content. :(';
}
```
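The comment in the example above notes that any HTTP client can supply the HTML. As one possibility, here is a minimal sketch that fetches a page with PHP's cURL extension and returns `null` on failure, so the caller can skip extraction when there is nothing to parse. `fetchHtml()`, its timeout default, and the user agent string are illustrative assumptions, not part of this library.

```php
<?php

/**
 * Hypothetical helper (not part of the library): fetch a page with cURL.
 * Returns the response body, or null on a transport error or a non-2xx status.
 */
function fetchHtml(string $url, int $timeoutSeconds = 10): ?string
{
    $handle = curl_init($url);
    if ($handle === false) {
        return null;
    }

    curl_setopt_array($handle, [
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,   // follow HTTP redirects
        CURLOPT_TIMEOUT        => $timeoutSeconds,
        CURLOPT_USERAGENT      => 'readability-example/1.0', // assumed value
    ]);

    $body = curl_exec($handle);
    $status = curl_getinfo($handle, CURLINFO_RESPONSE_CODE);

    if (!is_string($body) || $status < 200 || $status >= 300) {
        return null;
    }

    return $body;
}
```

The result can then be passed to `new Readability($html, $url)` as in the example above, after checking that it is not `null`.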

If you want to debug it, or check what's going on, you can inject a logger. It must implement `Psr\Log\LoggerInterface`; Monolog is one example:

```
use Readability\Readability;
use Monolog\Logger;
use Monolog\Handler\StreamHandler;

$url = 'http://www.medialens.org/index.php/alerts/alert-archive/alerts-2013/729-thatcher.html';
$html = file_get_contents($url);

$logger = new Logger('readability');
$logger->pushHandler(new StreamHandler('path/to/your.log', Logger::DEBUG));

$readability = new Readability($html, $url);
// attach the logger before running the extraction
$readability->setLogger($logger);
$result = $readability->init();
```
