yethee/tiktoken
===============

PHP version of tiktoken

Version 1.1.1 | MIT | PHP ^8.1 | [4 issues](https://github.com/yethee/tiktoken-php/issues)

[Source](https://github.com/yethee/tiktoken-php) | [Packagist](https://packagist.org/packages/yethee/tiktoken)


tiktoken-php
============

This is a PHP port of [tiktoken](https://github.com/openai/tiktoken), OpenAI's BPE tokenizer.

Installation
------------

```
$ composer require yethee/tiktoken
```

Usage
-----

```
use Yethee\Tiktoken\EncoderProvider;

$provider = new EncoderProvider();

$encoder = $provider->getForModel('gpt-3.5-turbo-0301');
$tokens = $encoder->encode('Hello world!');
print_r($tokens);
// OUT: [9906, 1917, 0]

$encoder = $provider->get('p50k_base');
$tokens = $encoder->encode('Hello world!');
print_r($tokens);
// OUT: [15496, 995, 0]
```
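
A common use of the encoder is counting tokens before sending text to a model. The sketch below reuses only the `getForModel()` and `encode()` calls shown above; the `vendor/autoload.php` path and the `$maxTokens` budget are assumptions made for illustration, not values defined by the library.

```php
<?php

declare(strict_types=1);

// Assumes the package was installed with Composer in the current directory.
require __DIR__ . '/vendor/autoload.php';

use Yethee\Tiktoken\EncoderProvider;

$provider = new EncoderProvider();
$encoder = $provider->getForModel('gpt-3.5-turbo-0301');

// Hypothetical budget for illustration; not a limit defined by the library.
$maxTokens = 8;

$text = 'Hello world!';

// encode() returns a list of token ids, so its length is the token count.
$tokenCount = count($encoder->encode($text));

echo $tokenCount <= $maxTokens
    ? "Fits: {$tokenCount} tokens\n"
    : "Too long: {$tokenCount} tokens\n";
```

For `'Hello world!'` the usage example above lists three token ids, so the count here should be 3.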

Cache
-----

The encoders use external vocabularies, so caching is enabled by default to avoid performance issues.

By default, the [directory for temporary files](https://www.php.net/manual/en/function.sys-get-temp-dir.php) is used as the cache location. You can override the cache directory via the environment variable `TIKTOKEN_CACHE_DIR` or with `EncoderProvider::setVocabCache()`:

```
use Yethee\Tiktoken\EncoderProvider;

$encProvider = new EncoderProvider();
$encProvider->setVocabCache('/path/to/cache');

// Using the provider
```
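
As a slightly fuller sketch, the snippet below prepares a dedicated cache directory before handing it to `setVocabCache()`. The directory name `tiktoken-vocab-cache` is hypothetical, and creating the directory up front is only a precaution: whether the library creates a missing directory on its own is not stated here.

```php
<?php

declare(strict_types=1);

// Assumes the package was installed with Composer in the current directory.
require __DIR__ . '/vendor/autoload.php';

use Yethee\Tiktoken\EncoderProvider;

// Hypothetical location under the system temp directory, chosen for illustration.
$cacheDir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'tiktoken-vocab-cache';

// Create the directory if it is missing. The second is_dir() check covers the
// case where another process created it between the first check and mkdir().
if (!is_dir($cacheDir) && !mkdir($cacheDir, 0775, true) && !is_dir($cacheDir)) {
    throw new RuntimeException("Cannot create cache directory: {$cacheDir}");
}

$provider = new EncoderProvider();
$provider->setVocabCache($cacheDir);

// Encoders obtained from this provider now use the directory set above.
$encoder = $provider->get('p50k_base');
```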

Lib mode
--------

**Experimental**

You can use the [tiktoken-rs](https://github.com/zurawiki/tiktoken-rs) library via an FFI binding. This can improve performance when you need to encode medium or large texts. However, the overhead of data marshalling can lead to poor performance for small texts.

```
use Yethee\Tiktoken\Encoder\LibEncoder;
use Yethee\Tiktoken\EncoderProvider;

// LibEncoder::init('/path/to/lib');

$encProvider = new EncoderProvider(true); // Force using the lib encoder
```

You need to provide the path to the lib before using the provider. There are several ways to do this:

- Use the `Yethee\Tiktoken\Encoder\LibEncoder::init()` method.
- Use the `Yethee\Tiktoken\Encoder\LibEncoder::preload()` method inside an opcache preload script.
- Use the environment variable `TIKTOKEN_LIB_PATH` or `LD_LIBRARY_PATH`.

### Build lib

#### Requirements

- [Rust](https://www.rust-lang.org/) >= 1.85

```
git clone git@github.com:yethee/tiktoken-php.git
cd tiktoken-php
cargo build --release
```

Copy the binary from `target/release`:

- `libtiktoken_php.so` for Linux
- `libtiktoken_php.dylib` for macOS
- `tiktoken_php.dll` for Windows
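
If a deployment script has to pick the right file automatically, the mapping above can be sketched as a small helper. `tiktokenLibFileName()` is a hypothetical function and not part of the library; it only maps PHP's built-in `PHP_OS_FAMILY` constant to the binary names listed above.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): maps an OS family name,
 * as reported by PHP_OS_FAMILY, to the binary name listed above.
 */
function tiktokenLibFileName(string $osFamily = PHP_OS_FAMILY): string
{
    return match ($osFamily) {
        'Linux' => 'libtiktoken_php.so',
        'Darwin' => 'libtiktoken_php.dylib',
        'Windows' => 'tiktoken_php.dll',
        // Other families (for example BSD) are not covered by the list above.
        default => throw new RuntimeException("No known binary name for OS family: {$osFamily}"),
    };
}

// Path relative to the cloned repository after `cargo build --release`.
echo 'target/release/' . tiktokenLibFileName() . "\n";
```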

**NOTE:** See `.docker/Dockerfile` for an example.

### Benchmark

You can see the benchmark results in [#27](https://github.com/yethee/tiktoken-php/pull/27) or run the benchmark locally:

```
composer bench
```

### TODO

- Add an implementation of the `Yethee\Tiktoken\Encoder\LibEncoder::encodeInChunks()` method.

Limitations
-----------

- Encoding for GPT-2 is not supported.
- Special tokens are not supported.

License
-------

[MIT](./LICENSE)
