xcalder/languagedetector
========================

simple library to classify texts

v0.1.1 (12y ago) · BSD-4-Clause · PHP

[Source](https://github.com/xcalder/LanguageDetector) · [Packagist](https://packagist.org/packages/xcalder/languagedetector)

LanguageDetector
================


A PHP class that detects the language of any free text.

It follows the approach described in the [paper](http://scholar.google.com.py/scholar?q=N-Gram-Based+Text+Categorization): a given text is tokenized into [N-Grams](http://en.wikipedia.org/wiki/N-gram) (whitespace is cleaned up before this step). The `tokens` are then sorted and compared against a language `model`.
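To make the tokenization step concrete, here is a minimal sketch of building a ranked N-Gram profile from a text. It is not the library's code: `ngramProfile()`, its parameters, and the underscore used as a word-boundary marker are assumptions made for illustration. It also ranks N-Grams by plain frequency, as the paper does, rather than with the library's default PageRank-based sorting.

```php
<?php
// Hypothetical helper for illustration; NOT part of the LanguageDetector API.
// Builds a list of character N-Grams ordered from most to least frequent.
function ngramProfile(string $text, int $maxN = 3, int $limit = 300): array
{
    // Clean up whitespace: collapse every run into one "_" and pad both ends
    // (the "_" boundary marker is an assumption of this sketch).
    $clean = '_' . preg_replace('/\s+/u', '_', trim($text)) . '_';

    // Split into UTF-8 characters so multi-byte letters stay intact.
    $chars = preg_split('//u', $clean, -1, PREG_SPLIT_NO_EMPTY);
    $total = count($chars);

    // Count every N-Gram of length 1..$maxN.
    $counts = [];
    for ($n = 1; $n <= $maxN; $n++) {
        for ($i = 0; $i + $n <= $total; $i++) {
            $gram = implode('', array_slice($chars, $i, $n));
            $counts[$gram] = ($counts[$gram] ?? 0) + 1;
        }
    }

    // Sort by frequency (descending); break ties bytewise so the
    // result is deterministic.
    $grams = array_map('strval', array_keys($counts));
    usort($grams, function (string $a, string $b) use ($counts): int {
        return $counts[$b] <=> $counts[$a] ?: strcmp($a, $b);
    });

    return array_slice($grams, 0, $limit);
}

print_r(ngramProfile("ab ab", 2));
```

The position of each N-Gram in the returned list is its rank, which is what a rank-based comparison against a language model would consume.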

How it works
------------


The first thing we need is a `language model` (which looks like [this file](https://github.com/crodas/LanguageDetector/blob/master/example/datafile.php)); texts are compared against it at classification time. This step must be done *before* anything else, and the model can be generated with a script similar to [this file](https://github.com/crodas/LanguageDetector/blob/master/example/learn.php).

```
// register the autoloader
require 'lib/LanguageDetector/autoload.php';

// it could use a little bit of memory, but it's fine
// because this process runs once.
ini_set('memory_limit', '1G');

// we load the configuration (which will be serialized
// later into our language model file)
$config = new LanguageDetector\Config;

$c = new LanguageDetector\Learn($config);
foreach (glob(__DIR__ . '/samples/*') as $file) {
    // feed with examples ('language', 'text');
    $c->addSample(basename($file), file_get_contents($file));
}

// some callback so we know where the process is
$c->addStepCallback(function($lang, $status) {
    echo "Learning {$lang}: $status\n";
});

// save it in `datafile`.
// we currently support the `php` serialization, but it's trivial
// to add other formats: just extend `\LanguageDetector\Format\AbstractFormat`.
// You can check an example at https://github.com/crodas/LanguageDetector/blob/master/lib/LanguageDetector/Format/PHP.php
$c->save(LanguageDetector\Format\AbstractFormat::initFormatByPath('language.php'));
```

Once we have our language model file (in this case `language.php`) we're ready to classify texts by their language.

```
// register the autoloader
require 'lib/LanguageDetector/autoload.php';

// we load the language model; this also creates
// the $config object for us.
$detect = LanguageDetector\Detect::initByPath('language.php');

$lang = $detect->detect("Agricultura (-ae, f.), sensu latissimo,
est summa omnium artium et scientiarum et technologiarum quae de
terris colendis et animalibus creandis curant, ut poma, frumenta,
charas, carnes, textilia, et aliae res e terra bene producantur.
Specialius, agronomia est ars et scientia quae terris colendis student,
agricultio autem animalibus creandis.");

var_dump($lang);
```

And that's it.

Algorithms
----------


The project is designed to work with modules, which means you can provide your own algorithms for `sorting` and `comparing` the N-Grams. By default the library uses [PageRank](http://en.wikipedia.org/wiki/PageRank) as the `sorting` algorithm and *out of place* (described in the paper) as the `comparing` algorithm.

To supply your own algorithms, change the `$config` at the *learning stage* to load your own classes (which, by the way, should implement some interfaces).
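For readers unfamiliar with the *out of place* measure from the paper, the sketch below shows the idea: each N-Gram in the text's ranked profile is looked up in a language's ranked profile, and the rank differences are summed. An N-Gram missing from the language profile gets a fixed maximum penalty. The function names `outOfPlace()` and `closestLanguage()` are hypothetical and are not the library's interfaces. The choice of the profile length as the maximum penalty is also an assumption of this sketch.

```php
<?php
// Hypothetical helpers for illustration; NOT part of the LanguageDetector API.

// Sum of rank differences between a text profile and a language profile.
// Both arguments are lists of N-Grams ordered from best to worst rank.
function outOfPlace(array $textProfile, array $modelProfile): int
{
    // Map each N-Gram in the model to its rank (0 = top).
    $modelRank = array_flip(array_values($modelProfile));
    // Penalty for N-Grams the model has never seen (assumed: profile length).
    $maxPenalty = count($modelProfile);

    $distance = 0;
    foreach (array_values($textProfile) as $rank => $gram) {
        $distance += isset($modelRank[$gram])
            ? abs($rank - $modelRank[$gram])
            : $maxPenalty;
    }

    return $distance;
}

// Pick the language whose profile is closest to the text profile.
function closestLanguage(array $textProfile, array $models): ?string
{
    $best = null;
    $bestDistance = PHP_INT_MAX;
    foreach ($models as $lang => $profile) {
        $distance = outOfPlace($textProfile, $profile);
        if ($distance < $bestDistance) {
            $best = (string) $lang;
            $bestDistance = $distance;
        }
    }

    return $best;
}

// Toy profiles, made up for the example.
$models = [
    'en' => ['t', 'h', 'e'],
    'es' => ['e', 'l', 'a'],
];
echo closestLanguage(['t', 'h', 'e'], $models), "\n";
```

A lower distance means a closer match, so a custom `comparing` algorithm in this style only needs to return a number that orders the candidate languages.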


### Community

Maintainer: [@xcalder](https://github.com/xcalder)

Top contributors: [crodas](https://github.com/crodas) (47 commits), [mente](https://github.com/mente) (7 commits), [adam-lynch](https://github.com/adam-lynch) (2 commits), [xcalder](https://github.com/xcalder) (2 commits), [pborreli](https://github.com/pborreli) (1 commit), [sasezaki](https://github.com/sasezaki) (1 commit)

