helturkey/dujana-arabic-nlp
===========================

A morphology-aware Arabic analysis package for PHP with Laravel support, providing normalization, tokenization, stemming, and root/pattern analysis.

v0.1.0 · MIT · PHP ^8.4 · CI passing · since Apr 30

[Source](https://github.com/helturkey/dujana-arabic-nlp) · [Packagist](https://packagist.org/packages/helturkey/dujana-arabic-nlp)

Dujana Arabic NLP
=================

**Dujana Arabic NLP** is a morphology-aware PHP package for Arabic text processing.
It is designed as a small Arabic NLP toolkit, not only as a stemmer.

It provides practical layers for:

- Tokenization
- Classification
- Light and moderate stemming
- Morphology-aware analysis
- Optional lexicon-backed root exploration

---

Installation
------------

### Plain PHP / Composer

```
composer require helturkey/dujana-arabic-nlp
```

```
use Dujana\ArabicNlp\ArabicAnalyzer;
use Dujana\ArabicNlp\ArabicClassifier;
use Dujana\ArabicNlp\ArabicTokenizer;
use Dujana\ArabicNlp\Enums\StemmerModeEnum;

$tokens = ArabicTokenizer::make()->tokenize('أَحْلَامُهُمْ كَثِيرَةٌ.');

$classification = ArabicClassifier::make()->classify('أَحْلَامَهُمْ');

$analysis = ArabicAnalyzer::make()->analyze('أَحْلَامَهُمْ', StemmerModeEnum::Root);
```
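
The sample words above carry full diacritics (tashkeel). As a rough illustration of why a normalization step matters before tokenizing or stemming such input, the sketch below strips the short-vowel marks with a regular expression. It is not Dujana's implementation: `stripTashkeel` is a hypothetical helper, and the package's own normalization rules are not described in this README and may differ.

```php
<?php

declare(strict_types=1);

/**
 * Illustrative only: removes Arabic diacritics and tatweel from a string.
 * This is NOT how Dujana normalizes text; it is a hypothetical stand-in.
 */
function stripTashkeel(string $text): string
{
    // U+064B..U+0652: tanween forms, fatha, damma, kasra, shadda, sukun
    // U+0670: superscript alef; U+0640: tatweel (kashida)
    // preg_replace() returns null on error, so fall back to the input.
    return preg_replace('/[\x{064B}-\x{0652}\x{0670}\x{0640}]/u', '', $text) ?? $text;
}

// Prints the word with its vowel marks removed.
echo stripTashkeel('أَحْلَامُهُمْ'), PHP_EOL;
```

A stripped form like this is what a simple search index would typically compare against; a morphology-aware analyzer may instead keep or use the diacritics, which is one reason to prefer the package's own pipeline over an ad hoc regex.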

### Laravel

Publish the configuration and resources:

```
php artisan vendor:publish --tag=dujana-arabic-nlp
```

Use the Laravel container so the package can read your published configuration and optional lexicon database path:

```
use Dujana\ArabicNlp\ArabicAnalyzer;
use Dujana\ArabicNlp\ArabicClassifier;
use Dujana\ArabicNlp\ArabicTokenizer;
use Dujana\ArabicNlp\Enums\StemmerModeEnum;

$tokens = app(ArabicTokenizer::class)->tokenize('أَحْلَامُهُمْ كَثِيرَةٌ.');

$classification = app(ArabicClassifier::class)->classify('أَحْلَامَهُمْ');

$analysis = app(ArabicAnalyzer::class)->analyze('أَحْلَامَهُمْ', StemmerModeEnum::Root);
```

Avoid using `ArabicAnalyzer::make()` inside Laravel application code unless you intentionally want to bypass Laravel configuration.
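
In application code, constructor injection is usually tidier than calling `app()` inline, and it resolves the analyzer through the same container. The sketch below assumes a hypothetical `App\Services\WordInspector` class in your own application; only `ArabicAnalyzer`, `analyze()`, and `StemmerModeEnum::Root` come from this README, and everything else is illustrative.

```php
<?php

declare(strict_types=1);

namespace App\Services; // hypothetical namespace in your application

use Dujana\ArabicNlp\ArabicAnalyzer;
use Dujana\ArabicNlp\Enums\StemmerModeEnum;

// Hypothetical application service; the class and method names are illustrative.
final class WordInspector
{
    // Laravel's container supplies the analyzer when it builds this class,
    // so the published configuration is respected.
    public function __construct(private readonly ArabicAnalyzer $analyzer)
    {
    }

    // The return type is left as mixed because this README does not describe
    // the shape of the analysis result.
    public function inspect(string $word): mixed
    {
        return $this->analyzer->analyze($word, StemmerModeEnum::Root);
    }
}
```

Resolving the service with `app(WordInspector::class)`, or type-hinting it in a controller, lets Laravel build the analyzer for you, so there is no need to call `ArabicAnalyzer::make()` yourself.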

---

Public API Layers
-----------------

Dujana exposes four main public layers:

```
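// $analyzer is assumed here to be an ArabicAnalyzer instance:
// ArabicAnalyzer::make() in plain PHP, or app(ArabicAnalyzer::class) in Laravel.
$analyzer = ArabicAnalyzer::make();

$analyzer->tokenize($text);                         // Tokenization
$analyzer->classify($word);                         // Classification
$analyzer->stem($word, StemmerModeEnum::Moderate);  // Stemming
$analyzer->analyze($word, StemmerModeEnum::Root);   // Full analysis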
```

Specialized APIs are also available:

```
ArabicTokenizer::make()->tokenize($text);
ArabicClassifier::make()->classify($word);
ArabicStemmer::make()->stem($word, StemmerModeEnum::Moderate);
ArabicAnalyzer::make()->analyze($word, StemmerModeEnum::Root);
```

---

Documentation
-------------

- [English documentation](docs/en/README.md)
- [Arabic documentation](docs/ar/README.md)
- [References, dictionaries, and morphology sources](docs/references/books-and-dictionaries.md)

---

Why “Dujana”?
-------------

The name **Dujana** is an authentic Arabic name with several meanings rooted in the Arabic language and in nature.
The meaning chosen for this package is **the great, abundant rain**: rain that falls heavily, spreads widely, and covers the earth.

This image fits the purpose of the package. Dujana was created to serve Arabic text broadly: to help normalize, tokenize, classify, stem, and analyze Arabic words in a way that can benefit many projects, not only one private codebase.

### Why the name “Dujana” (دُجانة)?

The name **Dujana** (دُجانة) is an authentic Arabic name with more than one meaning in the language. The meaning we chose as the basis for the name is **the great rain**: heavy, wide-reaching rain that drenches the earth and spreads its bounty.

This meaning suits the package's purpose. Dujana was created to serve Arabic text at scale, through normalization, tokenization, classification, stemming, and morphological analysis, so that more than one project can benefit from it and it does not stay locked inside private internal code.

---

Origin
------

Dujana Arabic NLP started as part of the core Arabic text-processing code behind [Poetspedia — موسوعة الشعراء](https://poetspedia.com), a platform dedicated to Arabic poetry, poets, rhyme, and literary exploration.

It was originally built to support real production needs in a large Arabic poetry platform: normalization, tokenization, stemming, classification, lexicon-backed analysis, and morphology-aware root exploration for classical and modern Arabic poetry.

After being tested and refined inside Poetspedia, Dujana was extracted into a standalone open-source package so Arabic developers, researchers, linguists, and NLP engineers can reuse it, improve it, and build on top of it.

Dujana is not a closed internal tool. It is a contribution from Poetspedia to the Arabic open-source ecosystem, with the hope that it helps make Arabic NLP tooling more accessible, practical, and production-ready.

### Origin (النشأة)

Dujana began as part of the core Arabic text-processing code in [Poetspedia (موسوعة الشعراء)](https://poetspedia.com), a platform specializing in Arabic poetry, poets, rhyme, and literary exploration.

The need for it arose inside a real Arabic project that handles Arabic poetry at scale: from normalization, tokenization, stemming, and classification to lexical analysis and root exploration, in a way that respects the nature of Arabic, its morphological patterns, and its contexts in classical and modern poetry.

After being used and developed inside Poetspedia, Dujana was split out into a standalone open-source package, available to developers, researchers, linguists, and anyone interested in the computational processing of Arabic.

Dujana is not a closed internal tool. It is a contribution from Poetspedia to enriching Arabic open-source software, in the hope of making Arabic-processing tools closer to real-world needs, easier to use, and better suited to production applications.


### Community

Maintainer and top contributor: [helturkey](https://github.com/helturkey)

Tags: arabic, arabic-language, arabic-morphology, arabic-nlp, dujana, laravel, morphological-analysis, morphology, nlp, normalization, root-extraction, stemmer, stemming, tokenization

### Code Quality

- Tests: Pest
- Static analysis: PHPStan
- Code style: Laravel Pint
- Type coverage: yes

