nlp-tools/nlp-tools
===================

NlpTools is a set of PHP 5.3+ classes for beginner to semi-advanced natural language processing work.

v0.1.3 · WTFPL · PHP >=5.3

[Source](https://github.com/angeloskath/php-nlp-tools) · [Packagist](https://packagist.org/packages/nlp-tools/nlp-tools)

[PHP NlpTools](http://php-nlp-tools.com/)
=========================================

Documentation
-------------

You can find documentation and code examples at the project's [homepage](http://php-nlp-tools.com/documentation/).

Contents
--------

### Classification Models

1. [Multinomial Naive Bayes](http://php-nlp-tools.com/documentation/bayesian-model.html)
2. [Maximum Entropy (Conditional Exponential model)](http://php-nlp-tools.com/documentation/maximum-entropy-model.html)

### Topic Modeling

LDA is still experimental and quite slow, but it works. [See an example](http://php-nlp-tools.com/posts/introducing-latent-dirichlet-allocation.html).

1. [Latent Dirichlet Allocation](http://php-nlp-tools.com/documentation/api/#NlpTools/Models/Lda)

### Clustering

1. [K-Means](http://php-nlp-tools.com/documentation/clustering.html)
2. [Hierarchical Agglomerative Clustering](http://php-nlp-tools.com/documentation/clustering.html)
    - SingleLink
    - CompleteLink
    - GroupAverage

### Tokenizers

1. [WhitespaceTokenizer](http://php-nlp-tools.com/documentation/api/#NlpTools/Tokenizers/WhitespaceTokenizer)
2. [WhitespaceAndPunctuationTokenizer](http://php-nlp-tools.com/documentation/api/#NlpTools/Tokenizers/WhitespaceAndPunctuationTokenizer)
3. [PennTreebankTokenizer](http://php-nlp-tools.com/documentation/api/#NlpTools/Tokenizers/PennTreebankTokenizer)
4. [RegexTokenizer](http://php-nlp-tools.com/documentation/api/#NlpTools%5CTokenizers%5CRegexTokenizer)
5. [ClassifierBasedTokenizer](http://php-nlp-tools.com/documentation/api/#NlpTools/Tokenizers/ClassifierBasedTokenizer): allows building far more complex tokenizers than the previous ones.
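
As a rough illustration of how the first two tokenizers in this list differ, here is a minimal, library-independent sketch in plain PHP. The function names `whitespace_tokenize` and `whitespace_and_punctuation_tokenize` are hypothetical and the regular expressions are assumptions made for this example; they are not NlpTools' own implementation, whose exact splitting rules may differ.

```php
<?php
// Hypothetical helpers for illustration only; not part of NlpTools.

/**
 * Split a string on runs of whitespace.
 * Punctuation stays attached to the neighbouring word.
 */
function whitespace_tokenize($text)
{
    return preg_split('/\s+/u', $text, -1, PREG_SPLIT_NO_EMPTY);
}

/**
 * Split a string into word tokens and single punctuation tokens.
 * A "word" here is a run of letters, digits or underscores (an assumption);
 * every other non-whitespace character becomes its own token.
 */
function whitespace_and_punctuation_tokenize($text)
{
    preg_match_all('/[\p{L}\p{N}_]+|[^\s\p{L}\p{N}_]/u', $text, $matches);
    return $matches[0];
}
```

With these definitions, `whitespace_tokenize("Hello, world!")` returns `["Hello,", "world!"]`, while `whitespace_and_punctuation_tokenize("Hello, world!")` returns `["Hello", ",", "world", "!"]`.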

### Documents

1. [TokensDocument](http://php-nlp-tools.com/documentation/api/#NlpTools/Documents/TokensDocument): represents a bag-of-words model for a document.
2. [WordDocument](http://php-nlp-tools.com/documentation/api/#NlpTools/Documents/WordDocument): represents a single word with the context of a larger document.
3. [TrainingDocument](http://php-nlp-tools.com/documentation/api/#NlpTools/Documents/TrainingDocument): represents a document whose class is known.
4. [TrainingSet](http://php-nlp-tools.com/documentation/api/#NlpTools/Documents/TrainingSet): a collection of TrainingDocuments.

### Feature factories

1. [FunctionFeatures](http://php-nlp-tools.com/documentation/api/#NlpTools/FeatureFactories/FunctionFeatures): allows the creation of a feature factory from a number of callables.
2. [DataAsFeatures](http://php-nlp-tools.com/documentation/api/#NlpTools/FeatureFactories/DataAsFeatures): simply returns the data as features.

### Similarity

1. [Jaccard Index](http://php-nlp-tools.com/documentation/api/#NlpTools/Similarity/JaccardIndex)
2. [Cosine similarity](http://php-nlp-tools.com/documentation/api/#NlpTools/Similarity/CosineSimilarity)
3. [Simhash](http://php-nlp-tools.com/documentation/api/#NlpTools/Similarity/Simhash)
4. [Euclidean](http://php-nlp-tools.com/documentation/api/#NlpTools/Similarity/Euclidean)
5. [HammingDistance](http://php-nlp-tools.com/documentation/api/#NlpTools/Similarity/HammingDistance)
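
For readers unfamiliar with the first two measures, the following is a minimal sketch in plain PHP of the Jaccard index and cosine similarity computed over token arrays. It does not use NlpTools; the function names `jaccard_index` and `cosine_similarity` are hypothetical, and returning `0.0` for empty input is a convention chosen for this example only.

```php
<?php
// Hypothetical helpers for illustration only; not part of NlpTools.

/**
 * Jaccard index: |A ∩ B| / |A ∪ B| over the sets of distinct tokens.
 */
function jaccard_index(array $a, array $b)
{
    $a = array_unique($a);
    $b = array_unique($b);
    $union = count(array_unique(array_merge($a, $b)));
    if ($union === 0) {
        return 0.0; // both inputs empty: convention assumed here
    }
    return count(array_intersect($a, $b)) / $union;
}

/**
 * Cosine similarity between the term-count vectors of two token arrays.
 */
function cosine_similarity(array $a, array $b)
{
    $va = array_count_values($a); // token => number of occurrences
    $vb = array_count_values($b);

    $dot = 0;
    foreach ($va as $term => $count) {
        if (isset($vb[$term])) {
            $dot += $count * $vb[$term];
        }
    }

    $square = function ($x) { return $x * $x; };
    $normA = sqrt(array_sum(array_map($square, $va)));
    $normB = sqrt(array_sum(array_map($square, $vb)));
    if ($normA == 0 || $normB == 0) {
        return 0.0; // an empty document has no direction: convention assumed here
    }
    return $dot / ($normA * $normB);
}
```

For example, `jaccard_index(["a", "b", "c"], ["b", "c", "d"])` yields `0.5`, because the two sets share two of four distinct tokens.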

### Stemmers

1. [PorterStemmer](http://php-nlp-tools.com/documentation/api/#NlpTools/Stemmers/PorterStemmer)
2. [RegexStemmer](http://php-nlp-tools.com/documentation/api/#NlpTools/Stemmers/RegexStemmer)
3. [LancasterStemmer](http://php-nlp-tools.com/documentation/api/#NlpTools/Stemmers/LancasterStemmer)
4. [GreekStemmer](http://php-nlp-tools.com/documentation/api/#NlpTools/Stemmers/GreekStemmer)

### Optimizers (MaxEnt only)

1. [A gradient descent optimizer](http://php-nlp-tools.com/documentation/api/#NlpTools/Optimizers/MaxentGradientDescent) (written in PHP) for educational use. It is a simple implementation for anyone wanting to know a bit more about either gradient descent (GD) or MaxEnt models.
2. A fast (faster than nltk-scipy), parallel gradient descent optimizer written in [Go](http://golang.org/). This optimizer resides in a separate [repo](https://github.com/angeloskath/nlp-maxent-optimizer) and is used via the [external optimizer](http://php-nlp-tools.com/documentation/api/#NlpTools/Optimizers/ExternalMaxentOptimizer). TODO: At least write a readme for the optimizer written in Go.

### Other

1. IDF (inverse document frequency)
2. Stop words
3. Language based normalizers
4. Classifier based transformation for creating flexible preprocessing pipelines
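
To make the first item concrete, here is a minimal sketch of inverse document frequency in plain PHP. It assumes the common definition idf(t) = ln(N / df(t)), where N is the number of documents and df(t) is the number of documents containing term t; NlpTools' own class may use a different formula or smoothing. The function name `inverse_document_frequency` is hypothetical.

```php
<?php
// Hypothetical helper for illustration only; not part of NlpTools.

/**
 * Compute idf(t) = ln(N / df(t)) for every term in a corpus.
 *
 * @param array $documents list of token arrays, one per document
 * @return array map of term => idf value
 */
function inverse_document_frequency(array $documents)
{
    $n = count($documents);

    // Document frequency: in how many documents does each term appear?
    $df = array();
    foreach ($documents as $tokens) {
        foreach (array_unique($tokens) as $term) {
            $df[$term] = isset($df[$term]) ? $df[$term] + 1 : 1;
        }
    }

    $idf = array();
    foreach ($df as $term => $count) {
        $idf[$term] = log($n / $count); // natural logarithm
    }
    return $idf;
}
```

Under this definition, a term that appears in every document gets an IDF of zero, while rarer terms get larger weights.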




