
PHP Unicode
===========

[![Releases](https://camo.githubusercontent.com/045418d8a65b8f868ef4f94b3173588955cc0ea983a10306569b7943d694d28d/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f72656c656173652f6379626572636f672f7068702d756e69636f64652e7376673f7374796c653d666c61742d737175617265)](https://github.com/cybercog/php-unicode/releases)[![Build](https://camo.githubusercontent.com/9afd9007afc7f68f27bcb099adeaecddc04c9068625fb934ff3297e5bf6a99f2/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f616374696f6e732f776f726b666c6f772f7374617475732f6379626572636f672f7068702d756e69636f64652f74657374732e796d6c3f7374796c653d666c61742d737175617265)](https://github.com/cybercog/php-unicode/actions/workflows/tests.yml)[![License](https://camo.githubusercontent.com/8a2eb02a9b747d1ca65f3b52ffd32dc58e26adbac0a4ccfb871782a4a18eb0d5/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f6c6963656e73652f6379626572636f672f7068702d756e69636f64652e7376673f7374796c653d666c61742d737175617265)](https://github.com/cybercog/php-unicode/blob/master/LICENSE)

Introduction
------------

Streamlines the manipulation of Unicode strings, code points, and grapheme clusters through an object-oriented API.

The library provides two levels of abstraction:

- **Code point level** (`CodePoint`, `UnicodeString`) — works with individual Unicode code points. Requires `ext-mbstring`.
- **Grapheme level** (`Grapheme`, `GraphemeString`) — works with user-perceived characters (grapheme clusters). Requires `ext-intl`.

Requirements
------------

| Class | Required Extensions |
| --- | --- |
| `CodePoint` | `ext-mbstring` |
| `UnicodeString` | `ext-mbstring` |
| `Grapheme` | `ext-mbstring`, `ext-intl` |
| `GraphemeString` | `ext-mbstring`, `ext-intl` |

PHP 8.1 or higher is required.
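
As a rough illustration of the requirements above, a script could check for the needed extensions before using the grapheme-level classes. The helper name `missingExtensions` is hypothetical and not part of the library; the sketch relies only on PHP's built-in `extension_loaded`.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of cybercog/php-unicode): returns the
 * names of the required extensions that are not loaded.
 *
 * @param list<string> $required
 * @return list<string>
 */
function missingExtensions(array $required): array
{
    return array_values(array_filter(
        $required,
        static fn (string $name): bool => !extension_loaded($name),
    ));
}

// Code point level needs mbstring; grapheme level additionally needs intl.
$missingForCodePoints = missingExtensions(['mbstring']);
$missingForGraphemes = missingExtensions(['mbstring', 'intl']);

if ($missingForGraphemes !== []) {
    echo 'Missing for grapheme support: ' . implode(', ', $missingForGraphemes) . PHP_EOL;
}
```

An empty result means every listed extension is available, so the corresponding classes can be used.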

Installation
------------

Pull in the package through Composer.

```
composer require cybercog/php-unicode
```

For grapheme cluster support, install the `intl` PHP extension.

Usage
-----

### Code Point

```
$codePoint = \Cog\Unicode\CodePoint::of('ÿ');

$codePoint = \Cog\Unicode\CodePoint::ofDecimal(255);

$codePoint = \Cog\Unicode\CodePoint::ofHexadecimal('U+00FF');

$codePoint = \Cog\Unicode\CodePoint::ofHtmlEntity('&yuml;');

$codePoint = \Cog\Unicode\CodePoint::ofXmlEntity('&#xff;');
```

### Represent Code Point in any format

```
$codePoint = \Cog\Unicode\CodePoint::of('ÿ');

echo strval($codePoint); // (string) "ÿ"

echo $codePoint->toDecimal(); // (int) 255

echo $codePoint->toHexadecimal(); // (string) "U+00FF"

echo $codePoint->toHtmlEntity(); // (string) "&yuml;"

echo $codePoint->toXmlEntity(); // (string) "&#xff;"
```

### Unicode String (code point level)

```
$string = \Cog\Unicode\UnicodeString::of('Hello');
```

A `UnicodeString` object contains a list of code points.

For example, the Unicode string "Hello" is represented by the code points:

- U+0048 (H)
- U+0065 (e)
- U+006C (l)
- U+006C (l)
- U+006F (o)

```
echo strval($string); // (string) "Hello"

$codePointList = $string->codePointList; // list<CodePoint>
```
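
The code point list for "Hello" can be cross-checked with plain `ext-mbstring` functions. This is a minimal sketch, not the library's implementation; the helper name `codePointNotations` is made up for illustration.

```php
<?php

declare(strict_types=1);

/**
 * Illustrative helper (not the library's implementation): formats each
 * code point of a UTF-8 string in U+XXXX notation using ext-mbstring.
 *
 * @return list<string>
 */
function codePointNotations(string $text): array
{
    $notations = [];
    foreach (mb_str_split($text, 1, 'UTF-8') as $char) {
        $notations[] = sprintf('U+%04X', mb_ord($char, 'UTF-8'));
    }

    return $notations;
}

echo implode(' ', codePointNotations('Hello')) . PHP_EOL;
// U+0048 U+0065 U+006C U+006C U+006F
```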

### Grapheme (grapheme cluster level)

Requires `ext-intl`.

```
$grapheme = \Cog\Unicode\Grapheme::of('👨‍👩‍👧‍👦');

echo strval($grapheme); // (string) "👨‍👩‍👧‍👦"

$codePointList = $grapheme->codePointList; // list<CodePoint>
```

### Grapheme String (grapheme cluster level)

Requires `ext-intl`.

```
$string = \Cog\Unicode\GraphemeString::of('Ае👨‍👩‍👧‍👦');

$graphemeList = $string->graphemeList; // list<Grapheme>
// 'А', 'е', '👨‍👩‍👧‍👦' — 3 graphemes (not 9 code points)

echo strval($string); // (string) "Ае👨‍👩‍👧‍👦"
```

### Real-world examples

#### Convert a character to all supported formats

```
$codePoint = \Cog\Unicode\CodePoint::of('©');

echo $codePoint->toDecimal();     // 169
echo $codePoint->toHexadecimal(); // "U+00A9"
echo $codePoint->toHtmlEntity();  // "&copy;"
echo $codePoint->toXmlEntity();   // "&#xA9;"
```

#### Round-trip between entity formats

```
$cp = \Cog\Unicode\CodePoint::ofHtmlEntity('&hearts;');

echo $cp->toXmlEntity(); // "&#x2665;"
echo $cp->toDecimal();   // 9829

$cp2 = \Cog\Unicode\CodePoint::ofDecimal($cp->toDecimal());
echo strval($cp2); // "♥"
```
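
For comparison, the same round trip can be approximated with PHP's built-in functions. This sketch only shows which values are involved; it is an assumption about equivalent behaviour, not a description of how the library works internally.

```php
<?php

declare(strict_types=1);

// Decode a named HTML entity to its character.
$char = html_entity_decode('&hearts;', ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Character -> decimal code point -> hexadecimal numeric entity.
$decimal = mb_ord($char, 'UTF-8');
$numericEntity = sprintf('&#x%X;', $decimal);

// Decimal code point -> character again.
$roundTripped = mb_chr($decimal, 'UTF-8');

echo $decimal . PHP_EOL;       // 9829
echo $numericEntity . PHP_EOL; // &#x2665;
echo $roundTripped . PHP_EOL;  // ♥
```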

#### Inspect code points in a string

```
$string = \Cog\Unicode\UnicodeString::of('café');

foreach ($string->codePointList as $cp) {
    echo $cp->toHexadecimal() . ' ';
}
// U+0063 U+0061 U+0066 U+00E9
```
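
The output above assumes that "é" is stored as the single precomposed code point U+00E9. The same visible text can also be written as "e" followed by the combining acute accent U+0301, which yields five code points instead of four. A small sketch with `ext-mbstring` only (the variable names are illustrative):

```php
<?php

declare(strict_types=1);

$precomposed = "caf\u{00E9}"; // "é" as a single code point
$decomposed = "cafe\u{0301}"; // "e" followed by a combining acute accent

// The two strings render the same but differ byte for byte.
var_dump($precomposed === $decomposed); // bool(false)

$precomposedLength = mb_strlen($precomposed, 'UTF-8');
$decomposedLength = mb_strlen($decomposed, 'UTF-8');

echo $precomposedLength . PHP_EOL; // 4
echo $decomposedLength . PHP_EOL;  // 5
```

When input may arrive in either form, PHP's `Normalizer` class from `ext-intl` can convert it to one normalization form before the code points are inspected.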

#### Code points vs. graphemes — why it matters

```
// Flag emoji: 2 code points, but 1 visible character
$flag = \Cog\Unicode\UnicodeString::of('🇦🇶');
echo count($flag->codePointList); // 2

$flag = \Cog\Unicode\GraphemeString::of('🇦🇶');
echo count($flag->graphemeList); // 1

// Family emoji: 7 code points (persons + ZWJ), 1 visible character
$family = \Cog\Unicode\GraphemeString::of('👨‍👩‍👧‍👦');
echo count($family->graphemeList); // 1

$familyGrapheme = $family->graphemeList[0];
echo count($familyGrapheme->codePointList); // 7
```
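
The counts above can be reproduced with the procedural functions that the two abstraction levels correspond to. This is a sketch under the assumption that `mb_strlen` and `grapheme_strlen` reflect the same counting rules; the grapheme part is skipped when `ext-intl` is not loaded.

```php
<?php

declare(strict_types=1);

$flag = "\u{1F1E6}\u{1F1F6}"; // two regional indicator symbols: 🇦🇶
$family = "\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}\u{200D}\u{1F466}"; // 👨‍👩‍👧‍👦

// Code point counts (ext-mbstring).
$flagCodePoints = mb_strlen($flag, 'UTF-8');     // 2
$familyCodePoints = mb_strlen($family, 'UTF-8'); // 7

echo $flagCodePoints . ' ' . $familyCodePoints . PHP_EOL; // 2 7

// Grapheme cluster counts (ext-intl); skipped when the extension is absent.
if (function_exists('grapheme_strlen')) {
    echo grapheme_strlen($flag) . ' ' . grapheme_strlen($family) . PHP_EOL;
    // Expected "1 1" with an ICU version that implements emoji ZWJ sequences.
}
```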

#### Detect combining marks

```
$acute = \Cog\Unicode\CodePoint::of("\u{0301}"); // combining acute accent
var_dump($acute->isCombining()); // bool(true)

$a = \Cog\Unicode\CodePoint::of('A');
var_dump($a->isCombining()); // bool(false)
```
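
One possible way to approximate such a check without the library is a PCRE Unicode property test. This is a sketch under the assumption that "combining" means membership in the Unicode Mark category; the library's `isCombining()` may use a different definition, and the function name `looksLikeCombiningMark` is hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Illustrative check (an assumption, not the library's implementation):
 * treats a character as combining when it falls in the Unicode Mark
 * category (Mn, Mc or Me), tested with PCRE's \p{M} property.
 */
function looksLikeCombiningMark(string $char): bool
{
    return preg_match('/^\p{M}$/u', $char) === 1;
}

var_dump(looksLikeCombiningMark("\u{0301}")); // bool(true)
var_dump(looksLikeCombiningMark('A'));        // bool(false)
```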

Why this library?
-----------------

PHP provides `mb_*` and `grapheme_*` functions, but they are procedural and return raw strings. This library wraps them in immutable, type-safe value objects with two key benefits:

- **Two levels of abstraction.** `CodePoint` / `UnicodeString` work with individual Unicode code points. `Grapheme` / `GraphemeString` work with user-perceived characters (grapheme clusters). Choose the right level for your use case instead of mixing `mb_strlen` and `grapheme_strlen` calls.
- **Format conversion.** `CodePoint` converts between character, decimal, hexadecimal (`U+XXXX`), HTML entity, and XML entity formats in a single object. No need to chain `mb_ord`, `dechex`, `htmlentities` manually.

```
// Procedural
$char = '©';
$dec = mb_ord($char);
$hex = 'U+' . strtoupper(sprintf('%04X', $dec));
$html = htmlentities($char, ENT_HTML5 | ENT_QUOTES);

// With this library
$cp = \Cog\Unicode\CodePoint::of('©');
$dec = $cp->toDecimal();
$hex = $cp->toHexadecimal();
$html = $cp->toHtmlEntity();
```

License
-------

- The `PHP Unicode` package is open-source software licensed under the [MIT license](LICENSE) by [Anton Komarev](https://komarev.com).

About CyberCog
--------------

[CyberCog](https://cybercog.su) is a Social Unity of enthusiasts. Researching the best solutions in product & software development is our passion.

- [Follow us on Twitter](https://twitter.com/cybercog)

[![CyberCog](https://cloud.githubusercontent.com/assets/1849174/18418932/e9edb390-7860-11e6-8a43-aa3fad524664.png)](https://cybercog.su)


