Abandoned; replaced by [yooper/php-text-analysis](/?search=yooper%2Fphp-text-analysis). Library, [Utility & Helpers](/categories/utility).

php-text-analysis/php-text-analysis
===================================

PHP Text Analysis is a library for performing Information Retrieval (IR) and Natural Language Processing (NLP) tasks using the PHP language

1.9.2 (1y ago) · 5328.1k ↓80% 91 · [8 issues](https://github.com/yooper/php-text-analysis/issues) · MIT · PHP · PHP >=7.4 · CI failing

Since Sep 26 · Pushed 1y ago · 40 watchers

[Source](https://github.com/yooper/php-text-analysis) · [Packagist](https://packagist.org/packages/php-text-analysis/php-text-analysis) · [RSS](/packages/php-text-analysis-php-text-analysis/feed) · Branch: master · Synced 1w ago

php-text-analysis
=================

[![alt text](https://camo.githubusercontent.com/58ce8a87d57fb5b31505520c7bbdfad17b0f397f5a2d756aad14ce17a4d5aef4/68747470733a2f2f7472617669732d63692e6f72672f796f6f7065722f7068702d746578742d616e616c797369732e7376673f6272616e63683d6d6173746572 "Build status")](https://camo.githubusercontent.com/58ce8a87d57fb5b31505520c7bbdfad17b0f397f5a2d756aad14ce17a4d5aef4/68747470733a2f2f7472617669732d63692e6f72672f796f6f7065722f7068702d746578742d616e616c797369732e7376673f6272616e63683d6d6173746572)

[![Latest Stable Version](https://camo.githubusercontent.com/5f50fcaf3548eaff505aec684cef1dd0baeef6fe7e61f3dfe8c8f8e4a21b5558/68747470733a2f2f706f7365722e707567782e6f72672f796f6f7065722f7068702d746578742d616e616c797369732f762f737461626c65)](https://packagist.org/packages/yooper/php-text-analysis)

[![Total Downloads](https://camo.githubusercontent.com/0b554ab1fda829dcbe74c8fddb0865cae894ac8f1cc99c0a3b869161cb6b6179/68747470733a2f2f706f7365722e707567782e6f72672f796f6f7065722f7068702d746578742d616e616c797369732f646f776e6c6f616473)](https://packagist.org/packages/yooper/php-text-analysis)

PHP Text Analysis is a library for performing Information Retrieval (IR) and Natural Language Processing (NLP) tasks using the PHP language. There are tools in this library that can perform:

- document classification
- sentiment analysis
- document comparison
- frequency analysis
- tokenization
- stemming
- collocations with Pointwise Mutual Information
- lexical diversity
- corpus analysis
- text summarization

All the documentation for this project can be found in the book and wiki.

PHP Text Analysis Book & Wiki
=============================

A book is in the works and your contributions are needed. You can find the book at

Documentation for the library also resides in the wiki.

Installation Instructions
=========================

Add PHP Text Analysis to your project:

```
composer require yooper/php-text-analysis
```

### Tokenization

```
$tokens = tokenize($text);
```

To use a different tokenizer, pass the name of the tokenizer class as the second argument:

```
$tokens = tokenize($text, \TextAnalysis\Tokenizers\PennTreeBankTokenizer::class);
```

The default tokenizer is **\\TextAnalysis\\Tokenizers\\GeneralTokenizer::class**. Some tokenizers require parameters to be set upon instantiation.

### Normalization

By default, **normalize\_tokens** uses the function **strtolower** to lowercase all the tokens. To customize normalization, pass a callable as the second argument, either a closure or a function name as a string; it is applied to each token with array\_map.

```
$normalizedTokens = normalize_tokens($tokens);
```

```
// pass a function name as a string
$normalizedTokens = normalize_tokens($tokens, 'mb_strtolower');

// or pass a closure
$normalizedTokens = normalize_tokens($tokens, function($token){ return mb_strtoupper($token); });
```
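The mechanism described above can be sketched in plain PHP. This is only an illustration of applying a callable with `array_map`; the function name `normalize_tokens_sketch` is hypothetical and is not the library's implementation.

```php
<?php
// Hypothetical stand-in for illustration; not the library's normalize_tokens().
function normalize_tokens_sketch(array $tokens, ?callable $normalizer = null): array
{
    // Fall back to strtolower when no normalizer is given.
    return array_map($normalizer ?? 'strtolower', $tokens);
}

$tokens = ['The', 'QUICK', 'Fox'];

// Default behaviour: lowercase every token.
$lower = normalize_tokens_sketch($tokens);

// Custom behaviour: any callable works, here a closure that uppercases.
$upper = normalize_tokens_sketch($tokens, function ($token) {
    return strtoupper($token);
});

print_r($lower);
print_r($upper);
```

Because the normalizer is just a callable, the same pattern covers trimming, accent stripping, or any other per-token transformation.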

### Frequency Distributions

The call to **freq\_dist** returns a [FreqDist](https://github.com/yooper/php-text-analysis/blob/master/src/Analysis/FreqDist.php) instance.

```
$freqDist = freq_dist(tokenize($text));
```
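As a rough picture of what a frequency distribution holds, the sketch below counts tokens with PHP's built-in `array_count_values` and derives relative weights. It does not use the FreqDist class, and the variable names are illustrative assumptions.

```php
<?php
// Plain-PHP illustration of a frequency distribution; not the FreqDist class itself.
$tokens = ['the', 'cat', 'and', 'the', 'hat', 'and', 'the', 'bat'];

// Count how often each token occurs.
$counts = array_count_values($tokens);

// Sort by count, highest first, keeping the token as the key.
arsort($counts);

// Convert raw counts to relative frequencies.
$total = count($tokens);
$weights = array_map(fn ($count) => $count / $total, $counts);

print_r($counts);
print_r($weights);
```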

### Ngram Generation

By default, bigrams are generated.

```
$bigrams = ngrams($tokens);
```

To customize the ngrams, pass the ngram size and the delimiter:

```
// create trigrams with a pipe delimiter in between each word
$trigrams = ngrams($tokens, 3, '|');
```
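To make the output shape concrete, here is a minimal sliding-window sketch in plain PHP. The helper name `ngrams_sketch` is hypothetical, and the single-space default delimiter is an assumption; the library's own `ngrams` function may differ in details.

```php
<?php
// Hypothetical helper for illustration; not the library's ngrams() implementation.
function ngrams_sketch(array $tokens, int $n = 2, string $separator = ' '): array
{
    if ($n < 1) {
        return [];
    }

    $tokens = array_values($tokens); // ensure 0-based sequential keys
    $ngrams = [];
    $lastStart = count($tokens) - $n;

    // Slide a window of size $n across the token list.
    for ($i = 0; $i <= $lastStart; $i++) {
        $ngrams[] = implode($separator, array_slice($tokens, $i, $n));
    }

    return $ngrams;
}

$tokens = ['one', 'two', 'three', 'four'];

$bigrams = ngrams_sketch($tokens);          // windows of two tokens
$trigrams = ngrams_sketch($tokens, 3, '|'); // windows of three, pipe-delimited

print_r($bigrams);
print_r($trigrams);
```

A list of k tokens yields k - n + 1 ngrams, so inputs shorter than n produce an empty array.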

### Stemming

By default, the **stem** function uses the Porter Stemmer.

```
$stemmedTokens = stem($tokens);
```

To use a different stemmer, pass the name of the stemmer class as the second argument:

```
$stemmedTokens = stem($tokens, \TextAnalysis\Stemmers\MorphStemmer::class);
```

### Keyword Extract with Rake

There is a shortcut function for using the Rake algorithm. You will need to clean your data before using it. The second parameter is the ngram size of the keywords to extract.

```
$rake = rake($tokens, 3);
$results = $rake->getKeywordScores();
```

### Sentiment Analysis with Vader

Need sentiment analysis with PHP? Use Vader. The PHP implementation can be invoked easily; just normalize your data beforehand.

```
$sentimentScores = vader($tokens);
```

### Document Classification with Naive Bayes

Need to do some document classification with PHP? Try the Naive Bayes implementation. An example of classifying movie reviews can be found in the unit tests.

```
$nb = naive_bayes();
$nb->train('mexican', tokenize('taco nacho enchilada burrito'));
$nb->train('american', tokenize('hamburger burger fries pop'));
$nb->predict(tokenize('my favorite food is a burrito'));
```
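For readers curious why the burrito sentence would lean toward one label, the following is a small multinomial Naive Bayes sketch with add-one smoothing. The class `TinyNaiveBayes`, its method signatures, and its return format are assumptions made for illustration; the library's classifier may be implemented and scored differently.

```php
<?php
// Illustrative multinomial Naive Bayes with add-one (Laplace) smoothing.
// Class and method names are hypothetical, not the library's API.
final class TinyNaiveBayes
{
    private array $wordCounts = []; // label => [word => count]
    private array $docCounts = [];  // label => number of training documents
    private array $vocabulary = []; // word => true

    public function train(string $label, array $tokens): void
    {
        $this->docCounts[$label] = ($this->docCounts[$label] ?? 0) + 1;
        foreach ($tokens as $token) {
            $this->wordCounts[$label][$token] = ($this->wordCounts[$label][$token] ?? 0) + 1;
            $this->vocabulary[$token] = true;
        }
    }

    // Returns label => log score, best label first.
    public function predict(array $tokens): array
    {
        $totalDocs = array_sum($this->docCounts);
        $vocabSize = count($this->vocabulary);
        $scores = [];

        foreach ($this->docCounts as $label => $docs) {
            $labelTotal = array_sum($this->wordCounts[$label] ?? []);
            $score = log($docs / $totalDocs); // log prior
            foreach ($tokens as $token) {
                $count = $this->wordCounts[$label][$token] ?? 0;
                // Smoothed log likelihood; unseen words get a small non-zero probability.
                $score += log(($count + 1) / ($labelTotal + $vocabSize));
            }
            $scores[$label] = $score;
        }

        arsort($scores);
        return $scores;
    }
}

$nb = new TinyNaiveBayes();
$nb->train('mexican', explode(' ', 'taco nacho enchilada burrito'));
$nb->train('american', explode(' ', 'hamburger burger fries pop'));

$scores = $nb->predict(explode(' ', 'my favorite food is a burrito'));
$best = array_key_first($scores);

print_r($scores);
```

Words never seen in training contribute equally to both labels here, so the single known word "burrito" decides the outcome.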

###  Health Score

Overall: 51 (Fair), better than 96% of packages

- Maintenance: 40. Moderate activity, may be stable
- Popularity: 46. Moderate usage in the ecosystem
- Community: 29. Small or concentrated contributor base
- Maturity: 76. Established project with proven stability
- Bus Factor: 1. Top contributor holds 80.3% of commits, a single point of failure

How is this calculated?

- **Maintenance (25%)**: Last commit recency, latest release date, and issue-to-star ratio. Uses a 2-year decay window.
- **Popularity (30%)**: Total and monthly downloads, GitHub stars, and forks. Logarithmic scaling prevents top-heavy scores.
- **Community (15%)**: Contributors, dependents, forks, watchers, and maintainers. Measures real ecosystem engagement.
- **Maturity (30%)**: Project age, version count, PHP version support, and release stability.

###  Release Activity

- Cadence: every ~71 days (recently: every ~329 days)
- Total: 43
- Last Release: 506d ago

PHP version history (7 changes):

- v1.0: PHP >=5.5
- v1.2: PHP >=7.0
- 1.3.2: PHP >=7
- 1.4.9: PHP >=7.1
- 1.6: PHP ~7.4
- 1.7: PHP ~7.4|~8.0
- 1.9: PHP >=7.4

### Community

Maintainers

![](https://www.gravatar.com/avatar/4bdeb60011e17f1d758f5cade0da9dc005b51fefbba3902a55c28ff9ed3fa5ef?d=identicon)[yooper](/maintainers/yooper)

---

Top Contributors

- [yooper](https://github.com/yooper) (192 commits)
- [Euak](https://github.com/Euak) (28 commits)
- [carbon-cloud-deploy](https://github.com/carbon-cloud-deploy) (6 commits)
- [thiagogomesverissimo](https://github.com/thiagogomesverissimo) (4 commits)
- [ace411](https://github.com/ace411) (2 commits)
- [evertharmeling](https://github.com/evertharmeling) (1 commit)
- [maxguru](https://github.com/maxguru) (1 commit)
- [NeoBlack](https://github.com/NeoBlack) (1 commit)
- [nielsriekert](https://github.com/nielsriekert) (1 commit)
- [repat](https://github.com/repat) (1 commit)
- [elievischel](https://github.com/elievischel) (1 commit)
- [cicnavi](https://github.com/cicnavi) (1 commit)

---

Tags

nlp, php, php-language, php-text-analysis, text-analysis, tokenization, natural language processing, text classification, ir, text analysis

###  Code Quality

Tests: PHPUnit

