
silverstripe/textextraction
===========================

Text Extraction API for SilverStripe CMS (mostly used with the 'fulltextsearch' module)

Latest version: 5.0.1 · License: BSD-3-Clause · Requires PHP ^8.3 · [2 issues](https://github.com/silverstripe/silverstripe-textextraction/issues)

[Source](https://github.com/silverstripe/silverstripe-textextraction) · [Packagist](https://packagist.org/packages/silverstripe/textextraction) · [Docs](http://silverstripe.org)

Text extraction module
======================


[![CI](https://github.com/silverstripe/silverstripe-textextraction/actions/workflows/ci.yml/badge.svg)](https://github.com/silverstripe/silverstripe-textextraction/actions/workflows/ci.yml)[![Silverstripe supported module](https://camo.githubusercontent.com/9b7e93d393a01f6d3091fb30983b870aa863ef076858115faaa1c74b995854ec/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f73696c7665727374726970652d737570706f727465642d3030373143342e737667)](https://www.silverstripe.org/software/addons/silverstripe-commercially-supported-module-list/)

Provides a text extraction API for file content that can hook into different extractor engines, depending on which engines are available and on the format of the file being parsed. The output is always a string containing the file's content.

Via the `FileTextExtractable` extension, this logic can be used to cache the extracted content on a `DataObject` subclass (usually `File`).

The module supports text extraction from the following file formats:

- HTML (built-in)
- PDF (with XPDF or Solr)
- Microsoft Word, Excel, PowerPoint (Solr)
- OpenOffice (Solr)
- CSV (Solr)
- RTF (Solr)
- EPub (Solr)
- Many others (Tika)
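
The engine selection described above (choose an extractor based on availability and file format) can be illustrated with a minimal, self-contained sketch. Every name here (`SketchExtractor`, `SketchHtmlExtractor`, `sketchExtractorFor`) is hypothetical and exists only for illustration; this is not the module's actual API, which is covered in the linked documentation.

```php
<?php
declare(strict_types=1);

// Hypothetical interface for illustration only; not the module's real API.
interface SketchExtractor
{
    /** Whether the engine can run in this environment (binary or service present). */
    public function isAvailable(): bool;

    /** Whether the engine handles files with the given lowercase extension. */
    public function supportsExtension(string $extension): bool;

    /** Returns the file's content as a plain string. */
    public function getContent(string $path): string;
}

// A pure-PHP HTML engine, standing in for a "built-in" extractor.
final class SketchHtmlExtractor implements SketchExtractor
{
    public function isAvailable(): bool
    {
        return true; // needs no external binary or service
    }

    public function supportsExtension(string $extension): bool
    {
        return in_array($extension, ['html', 'htm'], true);
    }

    public function getContent(string $path): string
    {
        $html = file_get_contents($path);
        if ($html === false) {
            throw new RuntimeException("Cannot read {$path}");
        }
        // Strip markup, decode entities, then collapse runs of whitespace.
        $text = html_entity_decode(strip_tags($html), ENT_QUOTES | ENT_HTML5, 'UTF-8');
        return trim((string) preg_replace('/\s+/', ' ', $text));
    }
}

/**
 * Returns the first engine that is available and supports the file's
 * extension, or null if none matches.
 *
 * @param SketchExtractor[] $extractors Engines in order of preference.
 */
function sketchExtractorFor(string $path, array $extractors): ?SketchExtractor
{
    $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    foreach ($extractors as $extractor) {
        if ($extractor->isAvailable() && $extractor->supportsExtension($extension)) {
            return $extractor;
        }
    }
    return null;
}
```

In this sketch, engines earlier in the array win, so the array order expresses preference when several engines support the same format (for example, a local binary before a remote service for PDF).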

Read more in the [documentation](https://docs.silverstripe.org/en/optional_features/text-extraction).

Installation
------------


```
composer require silverstripe/textextraction
```

Bugtracker
----------


Bugs are tracked in the issues section of this repository. Before submitting an issue, please read over the existing issues to make sure yours has not already been reported.

If the issue does look like a new bug:

- Create a new issue
- Describe the steps required to reproduce your issue, and the expected outcome. Unit tests, screenshots and screencasts can help here.
- Describe your environment in as much detail as possible: Silverstripe version, browser, PHP version, operating system, and any installed Silverstripe modules.

Please report security issues to  directly. Please don't file security issues in the bugtracker.

Development and contribution
----------------------------


If you would like to contribute to the module, please raise a pull request and discuss it with the module maintainers.

### Release Activity

- Cadence: every ~70 days (recently every ~33 days)
- Total releases: 56
- Last release: 213 days ago

Major versions

| From | To | Date |
|------|----|------|
| 3.4.x-dev | 4.0.0-beta1 | 2023-01-22 |
| 3.5.0 | 4.0.0-rc1 | 2023-03-29 |
| 3.x-dev | 4.0.0 | 2023-04-27 |
| 4.1.1 | 5.0.0-alpha1 | 2024-12-02 |
| 4.x-dev | 5.0.1 | 2025-07-23 |

PHP version history (5 changes)

| Module version | PHP requirement |
|----------------|-----------------|
| 2.0.0 | >=5.3.2 |
| 3.3.0 | ^7.3 \|\| ^8.0 |
| 3.4.0 | ^7.4 \|\| ^8.0 |
| 4.0.0-beta1 | ^8.1 |
| 5.0.0-alpha1 | ^8.3 |

### Community

Maintainers

- Aaron Carlino ([@unclecheese](https://github.com/unclecheese))
- GuySartorelli
- Ingo Schommer ([@chillu](https://github.com/chillu))
- emteknetnz
- silverstripe-machine01
- sminnee
- Maxime Rainville ([@maxime-rainville](https://github.com/maxime-rainville))

Top Contributors

- [robbieaverill](https://github.com/robbieaverill) (49 commits)
- [emteknetnz](https://github.com/emteknetnz) (43 commits)
- [GuySartorelli](https://github.com/GuySartorelli) (38 commits)
- [chillu](https://github.com/chillu) (30 commits)
- [dhensby](https://github.com/dhensby) (13 commits)
- github-actions[bot] (12 commits)
- [NightJar](https://github.com/NightJar) (4 commits)
- [assertchris](https://github.com/assertchris) (4 commits)
- [michalkleiner](https://github.com/michalkleiner) (3 commits)
- [camfindlay](https://github.com/camfindlay) (2 commits)
- [ichaber](https://github.com/ichaber) (2 commits)
- [sabina-talipova](https://github.com/sabina-talipova) (2 commits)
- [ScopeyNZ](https://github.com/ScopeyNZ) (1 commit)
- [ishannz](https://github.com/ishannz) (1 commit)
- [jakedaleweb](https://github.com/jakedaleweb) (1 commit)
- [jnv](https://github.com/jnv) (1 commit)
- [lozcalver](https://github.com/lozcalver) (1 commit)
- [martinhipp](https://github.com/martinhipp) (1 commit)

Tags

hacktoberfest, pdf, silverstripe, fulltext

### Code Quality

- Tests: PHPUnit
- Code style: PHP_CodeSniffer
