textualization/ropherta-tokenizer
=================================

GPT3Tokenizer (BPE) with Roberta-base vocabulary.

v0.0.7, Apache-2.0, PHP

[Source](https://github.com/Textualization/RophertaTokenizer) | [Packagist](https://packagist.org/packages/textualization/ropherta-tokenizer) | [Fund](https://ko-fi.com/textualization)

BPE Tokenizer for Ropherta (subclass of GPT3Tokenizer)
======================================================

This is just a wrapper around [GPT3Tokenizer](https://packagist.org/packages/gioni06/gpt3-tokenizer) using [the HuggingFace RoBERTa vocab and merge files](https://github.com/huggingface/transformers/blob/main/src/transformers/models/roberta/tokenization_roberta.py).

See the [GPT3Tokenizer documentation](https://github.com/Gioni06/GPT3Tokenizer/blob/main/README.md) for example usage, or the generated test case under `tests/`.
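
For readers unfamiliar with what a merges file contributes, the following is a minimal, self-contained sketch of ranked BPE merging. It is not this package's or GPT3Tokenizer's implementation: the function name `toyBpe` and the three-entry merge table are hypothetical, and the sketch works on plain ASCII characters and skips byte-level handling and pre-tokenization.

```
<?php
// Toy illustration of byte-pair-encoding (BPE) merging.
// ASSUMPTION: toyBpe() and the merge table below are hypothetical; a real
// tokenizer loads thousands of ranked merges from a merges file.

/**
 * Apply ranked BPE merges to a single word.
 *
 * @param string $word  Input word (ASCII only in this sketch).
 * @param array  $ranks Map of "left right" pair => rank (lower merges first).
 * @return string[]     Resulting sub-word symbols.
 */
function toyBpe(string $word, array $ranks): array
{
    $symbols = str_split($word); // start from single characters

    while (count($symbols) > 1) {
        // Find the leftmost adjacent pair with the lowest rank.
        $bestRank = null;
        $bestIndex = -1;
        for ($i = 0; $i < count($symbols) - 1; $i++) {
            $pair = $symbols[$i] . ' ' . $symbols[$i + 1];
            if (isset($ranks[$pair]) && ($bestRank === null || $ranks[$pair] < $bestRank)) {
                $bestRank = $ranks[$pair];
                $bestIndex = $i;
            }
        }
        if ($bestIndex < 0) {
            break; // no known pair left to merge
        }
        // Replace the two symbols with their concatenation.
        $merged = $symbols[$bestIndex] . $symbols[$bestIndex + 1];
        array_splice($symbols, $bestIndex, 2, [$merged]);
    }

    return $symbols;
}

// Hypothetical merge table: "l o" is applied first, then "lo w", then "e r".
$ranks = ['l o' => 0, 'lo w' => 1, 'e r' => 2];
print_r(toyBpe('lower', $ranks)); // ['low', 'er']
```

In a full tokenizer, each resulting symbol would then be looked up in the vocab file to obtain its integer token id.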

XLM Tokenizer
-------------

To use the multilingual version, initialize the [SentencePiece dependency](https://packagist.org/packages/textualization/sentencepiece) and download an additional model file:

```
composer exec -- php -r "require 'vendor/autoload.php'; Textualization\SentencePiece\Vendor::check();"
composer exec -- php -r "require 'vendor/autoload.php'; Textualization\Ropherta\Tokenizer\Vendor::check();"
```

Sponsors
--------

We thank our sponsor:

[![](https://camo.githubusercontent.com/fa0fa4bf60ec8eeb4d2a15311000fc2b3767fb96f788f6bf502d1117a734a221/68747470733a2f2f65766f6c75646174612e636f6d2f646973706c6179323038)](https://evoludata.com/)
