
xatham/text-extraction
======================

Easy text extraction for many different file types

Version 0.0.2, MIT license, PHP >= 7.4

[Source](https://github.com/xatham/text-extraction) | [Packagist](https://packagist.org/packages/xatham/text-extraction)

[![PHP Composer](https://github.com/xatham/text-extraction/workflows/PHP%20Composer/badge.svg)](https://github.com/xatham/text-extraction/workflows/PHP%20Composer/badge.svg)

text-extraction
===============

About
-----

This PHP library lets you extract plain text from various document types.

The following MIME types are currently supported for extraction:

`text/plain`

`text/csv`

`application/vnd.ms-excel`

`application/vnd.oasis.opendocument.text`

`application/pdf`

`application/msword`

Install
-------

```
composer require xatham/text-extraction
```

Usage
-----

```
/**
 * Extract text from PDF files only, with OCR disabled.
 */
$textExtractor = (new TextExtractionBuilder())->buildTextExtractor(
    [
        'withOcr' => false,
        'validMimeTypes' => ['application/pdf'],
    ],
);

$target = dirname(__DIR__) . '/examples/sample.pdf';
$plainTextDocument = $textExtractor->extractByFilePath($target);

// A null result means no text could be extracted from the file.
if ($plainTextDocument === null) {
    exit('Could not extract any data');
}
$texts = $plainTextDocument->getTextItems();

// Dump each extracted text item.
foreach ($texts as $text) {
    var_dump($text);
}
```
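Before handing a file to the extractor, it can help to check whether its MIME type is one of the supported types listed in the About section. The following is a minimal sketch, not part of the library: the constant `SUPPORTED_MIME_TYPES` and the helper `isSupportedForExtraction()` are hypothetical names introduced for illustration, and the check relies on PHP's `fileinfo` extension. How the library itself detects MIME types is not documented here, and detection results (for example `text/csv` versus `text/plain`) may vary with the local libmagic database.

```php
<?php

declare(strict_types=1);

// MIME types this README lists as supported (copied from the About section).
const SUPPORTED_MIME_TYPES = [
    'text/plain',
    'text/csv',
    'application/vnd.ms-excel',
    'application/vnd.oasis.opendocument.text',
    'application/pdf',
    'application/msword',
];

/**
 * Hypothetical helper (an assumption, not a library function):
 * detects the MIME type of a file with ext-fileinfo and checks it
 * against the supported list above.
 */
function isSupportedForExtraction(string $path): bool
{
    if (!is_file($path)) {
        return false;
    }

    // finfo::file() returns the detected MIME type, or false on failure.
    $mimeType = (new finfo(FILEINFO_MIME_TYPE))->file($path);

    return $mimeType !== false
        && in_array($mimeType, SUPPORTED_MIME_TYPES, true);
}
```

A caller could skip unsupported files up front and pass only the remaining paths to `extractByFilePath()`, which keeps the null check in the example above for genuine extraction failures.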

License
-------

text-extraction is licensed under [MIT](https://github.com/xatham/text-extraction/blob/main/LICENSE).

### Community

Maintainer: SKirejewski ([@xatham](https://github.com/xatham))

Top contributor: [xatham](https://github.com/xatham) (38 commits)

###  Code Quality

Tests: PHPUnit

Static analysis: PHPStan

Code style: PHP CS Fixer

Type coverage: yes
