
j0k3r/php-readability
=====================

Automatic article extraction from HTML

- Latest version: 2.0.8
- License: Apache-2.0
- Requires: PHP >=7.4.0
- [Source](https://github.com/j0k3r/php-readability)
- [Packagist](https://packagist.org/packages/j0k3r/php-readability)

Readability
===========


[![CI](https://github.com/j0k3r/php-readability/workflows/CI/badge.svg)](https://github.com/j0k3r/php-readability/workflows/CI/badge.svg)[![Coverage Status](https://camo.githubusercontent.com/c92d010c7997dbe711a107d4100ec05a02e9e31ca48dd8850dfc7ec907f7cc53/68747470733a2f2f636f766572616c6c732e696f2f7265706f732f6a306b33722f7068702d726561646162696c6974792f62616467652e7376673f6272616e63683d6d617374657226736572766963653d676974687562)](https://coveralls.io/github/j0k3r/php-readability/?branch=master)[![Total Downloads](https://camo.githubusercontent.com/6233cf1a62b478cca6ebe79cd700aaa531eced0973b6712e87f13c99d9984eb4/68747470733a2f2f706f7365722e707567782e6f72672f6a306b33722f7068702d726561646162696c6974792f646f776e6c6f616473)](https://packagist.org/packages/j0k3r/php-readability)[![License](https://camo.githubusercontent.com/7407e48deac60fba0b4765edf26eb8e1217e5bc57699ae77b81f74db96130e2d/68747470733a2f2f706f7365722e707567782e6f72672f6a306b33722f7068702d726561646162696c6974792f6c6963656e7365)](https://packagist.org/packages/j0k3r/php-readability)

This is an extract of the Readability class from this [full-text-rss](https://github.com/Dither/full-text-rss) fork. It can be considered an improved version of the original [php-readability](https://bitbucket.org/fivefilters/php-readability).

Differences
-----------


The original php-readability lib is really old and needs to be improved. I found a great fork of full-text-rss from [@Dither](https://github.com/Dither/full-text-rss) which improves the Readability class.

- I've extracted the class from that fork so it can be used out of the box
- I've added some simple tests
- I've changed the coding style, run `php-cs-fixer`, and added a namespace

**But** the code is still really hard to read and understand.

Requirements
------------


By default, this lib will use the [Tidy extension](https://github.com/htacg/tidy-html5) if it's available. Tidy is only used to clean up the given HTML and avoid problems caused by bad HTML structure. Composer will suggest it.

If you run into problems parsing content without Tidy installed, please install it and try again.
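
If you would rather decide explicitly whether Tidy is used, a check along these lines may help. This is only a sketch: `shouldUseTidy()` is a hypothetical helper that is not part of the library, and it assumes the fourth constructor argument toggles Tidy, as the commented-out call in the Usage section suggests. `extension_loaded()` is a PHP built-in.

```php
<?php
// Hypothetical helper (not part of php-readability):
// reports whether the PHP Tidy extension is loaded.
function shouldUseTidy(): bool
{
    return extension_loaded('tidy');
}

$useTidy = shouldUseTidy();

// Assumption: the flag would be passed as the fourth constructor argument,
// mirroring the call shown under Usage:
// $readability = new Readability($html, $url, 'libxml', $useTidy);
echo $useTidy
    ? "Tidy is available\n"
    : "Tidy is not installed, the HTML will not be cleaned up first\n";
```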

Usage
-----


```
use Readability\Readability;

$url = 'http://www.medialens.org/index.php/alerts/alert-archive/alerts-2013/729-thatcher.html';

// you can use whatever you want to retrieve the html content (Guzzle, Buzz, cURL ...)
$html = file_get_contents($url);

$readability = new Readability($html, $url);
// or without Tidy
// $readability = new Readability($html, $url, 'libxml', false);
$result = $readability->init();

if ($result) {
    // display the title of the page
    echo $readability->getTitle()->textContent;
    // display the *readability* content
    echo $readability->getContent()->textContent;
} else {
    echo 'Looks like we couldn\'t find the content. :(';
}
```
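
Because `file_get_contents()` returns `false` when the request fails, it may be worth guarding the fetch before handing the HTML to Readability. The following is a minimal sketch, not something the library provides: `fetchHtml()` and its `$timeoutSeconds` parameter are hypothetical names, and the timeout value is an arbitrary choice.

```php
<?php
// Hypothetical helper (not part of php-readability):
// returns the fetched HTML, or null when nothing usable came back.
function fetchHtml(string $source, int $timeoutSeconds = 10): ?string
{
    // The 'http' context options also apply to https:// URLs.
    $context = stream_context_create([
        'http' => ['timeout' => $timeoutSeconds],
    ]);

    // Suppress the warning and rely on the return value instead.
    $html = @file_get_contents($source, false, $context);

    return ($html === false || $html === '') ? null : $html;
}

// Intended use with the example above (left as comments because it
// needs the library to be installed):
// $html = fetchHtml($url);
// if ($html === null) {
//     echo 'Could not retrieve the page.';
// } else {
//     $readability = new Readability($html, $url);
// }
```

Returning `null` keeps the failure case separate from the case where the page was fetched but `init()` could not find any content.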

If you want to debug it, or check what's going on, you can inject a logger (anything that implements `Psr\Log\LoggerInterface`, Monolog for example):

```
use Readability\Readability;
use Monolog\Logger;
use Monolog\Handler\StreamHandler;

$url = 'http://www.medialens.org/index.php/alerts/alert-archive/alerts-2013/729-thatcher.html';
$html = file_get_contents($url);

$logger = new Logger('readability');
$logger->pushHandler(new StreamHandler('path/to/your.log', Logger::DEBUG));

$readability = new Readability($html, $url);
$readability->setLogger($logger);
// then run the extraction as in the previous example
$result = $readability->init();
```

###  Health Score

62 (Fair). Better than 99% of packages.

- Maintenance: 74 (regular maintenance activity)
- Popularity: 56 (moderate usage in the ecosystem)
- Community: 28 (small or concentrated contributor base)
- Maturity: 76 (established project with proven stability)
- Bus factor: 1 (the top contributor holds 73.5% of commits, a single point of failure)

How is this calculated?

- **Maintenance (25%)**: Last commit recency, latest release date, and issue-to-star ratio. Uses a 2-year decay window.
- **Popularity (30%)**: Total and monthly downloads, GitHub stars, and forks. Logarithmic scaling prevents top-heavy scores.
- **Community (15%)**: Contributors, dependents, forks, watchers, and maintainers. Measures real ecosystem engagement.
- **Maturity (30%)**: Project age, version count, PHP version support, and release stability.

###  Release Activity

- Cadence: every ~89 days (recently: every ~74 days)
- Total: 47
- Last release: 52d ago

Major versions

- 1.2.9 → 2.0.0 (2022-02-15)
- 1.2.10 → 2.0.1 (2022-06-13)
- 1.2.11 → 2.0.6 (2025-03-04)
- 1.2.12 → 2.0.7 (2025-06-03)
- 1.x-dev → 2.0.8 (2026-03-27)

PHP version history (6 changes)

- v1.0: PHP >=5.4
- v1.0.2: PHP >=5.3.3
- 1.2.0: PHP >=5.6.0
- 2.0.0: PHP >=7.2.0
- 2.0.4: PHP >=7.4.0
- 1.3.0: PHP >=7.2

### Community

Maintainers

- [j0k3r](https://github.com/j0k3r)

---

Top Contributors

- [j0k3r](https://github.com/j0k3r) (139 commits)
- [jtojnar](https://github.com/jtojnar) (35 commits)
- [Kdecherf](https://github.com/Kdecherf) (10 commits)
- [Simounet](https://github.com/Simounet) (3 commits)
- [nicofrand](https://github.com/nicofrand) (1 commit)
- [peter279k](https://github.com/peter279k) (1 commit)

---

Tags

content, extract-website, hacktoberfest, php, php-library, readability, text-rss, tidy, html, extraction, article, article extraction, content extraction

###  Code Quality

- Static analysis: PHPStan, Rector
- Code style: PHP CS Fixer
- Type coverage: Yes


