markuspoerschke/extractum
=========================

Extract information from web pages.

Latest release: 1.0.3. License: MIT. Requires PHP ^7.4 || ^8.0.

[Source](https://github.com/markuspoerschke/extractum) | [Packagist](https://packagist.org/packages/markuspoerschke/extractum)

Extractum
=========

*Extractum* is a PHP library that extracts information from web pages.

Getting Started
---------------

### Installation

```
composer require markuspoerschke/extractum
```

### Usage

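```
require 'vendor/autoload.php'; // Composer autoloader

$uri = 'https://www.example.com/';
$html = file_get_contents($uri);
if ($html === false) {
    throw new RuntimeException("Failed to fetch {$uri}");
}

$extractor = new Extractum\Extractor();
$essence = $extractor->extract($html, $uri); // returns an Extractum\Essence
```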
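
The usage snippet loads the page with a plain `file_get_contents` call. In practice a request can hang or fail, so it may help to wrap the download in a small function. The following is a minimal sketch using only PHP built-ins; `fetchHtml`, its default timeout, and the user agent string are hypothetical and not part of Extractum.

```php
<?php

declare(strict_types=1);

/**
 * Download a page body, or throw if the request fails.
 * Hypothetical helper for illustration; not part of Extractum.
 */
function fetchHtml(string $uri, float $timeoutSeconds = 10.0): string
{
    // HTTP stream options: read timeout in seconds and a user agent header.
    $context = stream_context_create([
        'http' => [
            'timeout' => $timeoutSeconds,
            'user_agent' => 'extractum-example/1.0', // assumed value
        ],
    ]);

    // Suppress the PHP warning; failure is reported by the exception below.
    $html = @file_get_contents($uri, false, $context);
    if ($html === false) {
        throw new RuntimeException("Could not fetch {$uri}");
    }

    return $html;
}
```

The returned string can then be passed to the extractor exactly as in the usage snippet, for example `$extractor->extract(fetchHtml($uri), $uri)`.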

Extracted Information
---------------------

The extracted information is returned as an object of type `Extractum\Essence`.

| Property | Description |
| --- | --- |
| `date` | The date when the web page was published. |
| `description` | Normally the meta description or any other excerpt. |
| `image` | The URL of the preview image. Normally defined as an Open Graph attribute. |
| `language` | The two-character language code of the HTML tag. |
| `links` | All links within the main content. |
| `parsedDate` | A `DateTimeImmutable` object if `date` |
| `text` | Unformatted text of the main content. All new lines and unnecessary spaces are removed. |
| `title` | The web page's title. This is normally the content of the first `h1` tag. |

License
-------

This package is released under the [MIT license](LICENSE).
