
wikimedia/utfnormal
===================

Contains Unicode normalization routines, including both pure PHP implementations and automatic use of the 'intl' PHP extension when present

Version 4.0.0. License: GPL-2.0-or-later. Requires PHP >=7.4.3.

[Source](https://github.com/wikimedia/utfnormal) | [Packagist](https://packagist.org/packages/wikimedia/utfnormal) | [Docs](https://www.mediawiki.org/wiki/utfnormal)

[![Latest Stable Version](https://camo.githubusercontent.com/4d0a7f99875a3268b826a92fde2d5117cd35babb742abe68509d6d18c56e2440/68747470733a2f2f706f7365722e707567782e6f72672f77696b696d656469612f7574666e6f726d616c2f762f737461626c652e737667)](https://packagist.org/packages/wikimedia/utfnormal) [![License](https://camo.githubusercontent.com/e029783a4a313eff2e09a30f03e9601b06d0e23c5412b7accc0c6a508b5fc79c/68747470733a2f2f706f7365722e707567782e6f72672f77696b696d656469612f7574666e6f726d616c2f6c6963656e73652e737667)](https://packagist.org/packages/wikimedia/utfnormal)

utfnormal
=========

utfnormal is a library that contains Unicode normalization routines, including both pure PHP implementations and automatic use of the 'intl' PHP extension when present.

The main function to care about is UtfNormal\Validator::cleanUp(). This will strip illegal UTF-8 sequences and characters that are illegal in XML, and if necessary convert to normalization form C.
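A minimal sketch of calling `cleanUp()` on untrusted input. It assumes the package was installed with Composer so that `vendor/autoload.php` sits next to the script; the helper name `cleanUserInput` and the sample string are illustrative and not part of the library.

```php
<?php
// Assumption: the library was installed with Composer next to this script.
$autoload = __DIR__ . '/vendor/autoload.php';
if (is_file($autoload)) {
    require_once $autoload;
}

/**
 * Hypothetical helper: sanitize untrusted text before storing or rendering it.
 */
function cleanUserInput(string $raw): string
{
    if (!class_exists(\UtfNormal\Validator::class)) {
        throw new RuntimeException('wikimedia/utfnormal is not installed');
    }
    // Handles invalid UTF-8 and converts to NFC where needed.
    return \UtfNormal\Validator::cleanUp($raw);
}

if (class_exists(\UtfNormal\Validator::class)) {
    // "\xFF" can never occur in valid UTF-8, so this input is corrupt.
    $clean = cleanUserInput("Cafe\u{301} \xFF");
    // The /u modifier makes PCRE reject invalid UTF-8, so this doubles as a validity check.
    var_dump(preg_match('//u', $clean) === 1);
}
```

Routing all external input through one helper like this keeps the validation step in a single place, so the rest of the code can assume well-formed UTF-8.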

If you know the string is already valid UTF-8, you can directly call UtfNormal\Validator::toNFC(), toNFD(), toNFKC(), or toNFKD(); this will convert a given UTF-8 string to Normalization Form C, D, KC, or KD if it's not already such. These functions assume that the input string is already valid UTF-8; if there are corrupt characters they may produce erroneous results.
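To make the effect of normalization concrete, the sketch below passes a decomposed sequence ("e" followed by a combining acute accent) through `toNFC()` and compares the result with the precomposed character. It assumes a Composer install with `vendor/autoload.php` next to the script; the variable names are illustrative.

```php
<?php
// Assumption: the library was installed with Composer next to this script.
$autoload = __DIR__ . '/vendor/autoload.php';
if (is_file($autoload)) {
    require_once $autoload;
}

$decomposed = "e\u{301}"; // "e" + U+0301 COMBINING ACUTE ACCENT (3 bytes)
$composed   = "\u{e9}";   // U+00E9 LATIN SMALL LETTER E WITH ACUTE (2 bytes)

// The two strings render identically but differ byte for byte.
var_dump($decomposed === $composed); // bool(false)

if (class_exists(\UtfNormal\Validator::class)) {
    // Input is known to be valid UTF-8, so the validation pass can be skipped.
    $nfc = \UtfNormal\Validator::toNFC($decomposed);
    var_dump($nfc === $composed); // bool(true)
}
```

Normalizing both sides before comparing or indexing strings avoids treating visually identical text as different.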

Performance is kind of stinky in absolute terms, though it should be speedy on pure ASCII text. ;) On text that can quickly be determined to already be in NFC it's not too awful, but otherwise it can get uncomfortably slow, particularly for Korean text (the Hangul decomposition/composition code is extra slow).
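For a sense of the per-character work involved with Korean text: the Unicode Standard defines Hangul syllable decomposition arithmetically rather than through per-character table entries. The sketch below shows that standard arithmetic for a single code point. It illustrates the Unicode algorithm only and is not this library's code; the function and constant names are hypothetical.

```php
<?php
// Constants from the Unicode Standard's Hangul syllable algorithm.
const HANGUL_S_BASE  = 0xAC00; // first precomposed syllable
const HANGUL_L_BASE  = 0x1100; // first leading consonant
const HANGUL_V_BASE  = 0x1161; // first vowel
const HANGUL_T_BASE  = 0x11A7; // one before the first trailing consonant
const HANGUL_T_COUNT = 28;
const HANGUL_N_COUNT = 588;    // 21 vowels * 28 trailing slots
const HANGUL_S_COUNT = 11172;  // 19 leading consonants * 588

/**
 * Decompose one code point into its jamo code points.
 * Code points that are not precomposed Hangul syllables are returned as-is.
 */
function decomposeHangulSyllable(int $codePoint): array
{
    $sIndex = $codePoint - HANGUL_S_BASE;
    if ($sIndex < 0 || $sIndex >= HANGUL_S_COUNT) {
        return [$codePoint];
    }
    $l = HANGUL_L_BASE + intdiv($sIndex, HANGUL_N_COUNT);
    $v = HANGUL_V_BASE + intdiv($sIndex % HANGUL_N_COUNT, HANGUL_T_COUNT);
    $t = HANGUL_T_BASE + $sIndex % HANGUL_T_COUNT;
    // A trailing index of 0 means the syllable has no final consonant.
    return $t === HANGUL_T_BASE ? [$l, $v] : [$l, $v, $t];
}

$parts = decomposeHangulSyllable(0xD55C); // the syllable U+D55C
echo implode(' ', array_map(fn ($c) => sprintf('U+%04X', $c), $parts)), "\n";
// prints "U+1112 U+1161 U+11AB"
```

Composition runs the same arithmetic in reverse, so every Hangul syllable in the input costs several integer operations in interpreted code.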

Bugs should be filed in [Wikimedia's Phabricator](https://phabricator.wikimedia.org/maniphest/task/create/?projects=utfnormal) under the "utfnormal" project.

Regenerating data tables
------------------------

UtfNormalData.inc and UtfNormalDataK.inc are generated from the Unicode Character Database by the script "generate.php". Run "composer generate" to rebuild the tables. To fetch updated Unicode data from the internet, run "composer generate -- --fetch".

Testing
-------

Running "composer test" runs a syntax checker, PHPUnit conformance tests, and some benchmarks using sample texts from Wikipedia. Take all benchmark numbers with large grains of salt.

PHP module extension
--------------------

If the 'intl' PHP extension is present, ICU library functions are used which are *MUCH* faster than doing this work in pure PHP code.

It is strongly recommended to enable this module if possible.

Older versions of this library supported a one-off custom PHP extension, which has been dropped. If you were using this, please migrate to the intl extension.
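To check whether a given PHP installation will benefit from the fast path, you can test for the intl extension yourself. The sketch below uses PHP's built-in `extension_loaded()` and intl's `Normalizer` class; the helper `nfcViaIntl` is hypothetical and does not reproduce the library's internal detection logic.

```php
<?php
/**
 * Hypothetical helper: normalize to NFC using intl when it is loaded.
 * Returns null when intl is unavailable or rejects the input
 * (for example, invalid UTF-8), so the caller can fall back to another path.
 */
function nfcViaIntl(string $text): ?string
{
    if (!extension_loaded('intl') || !class_exists(\Normalizer::class)) {
        return null;
    }
    $result = \Normalizer::normalize($text, \Normalizer::FORM_C);
    return $result === false ? null : $result;
}

echo extension_loaded('intl')
    ? "intl is loaded\n"
    : "intl is missing; expect the slower pure PHP code path\n";

$normalized = nfcViaIntl("e\u{301}");
var_dump($normalized === null ? 'intl unavailable' : bin2hex($normalized));
```

Running a check like this during deployment makes a missing extension visible before it shows up as slow page loads.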

History
-------

This library was first introduced in [MediaWiki 1.3](https://www.mediawiki.org/wiki/MediaWiki_1.3) ([r4965](https://www.mediawiki.org/wiki/Special:Code/MediaWiki/4965)). It was split out of the MediaWiki codebase and published as an independent library during the [MediaWiki 1.25](https://www.mediawiki.org/wiki/MediaWiki_1.25) development cycle.

### Release Activity

Cadence: every ~330 days (recently every ~469 days)

Total releases: 10

Last release: 1127d ago

Major versions:

- v1.1.0 → v2.0.0 (2018-02-27)
- v2.0.0 → 3.0.0 (2021-01-30)
- 3.0.2 → 4.0.0 (2023-04-17)

PHP version history (4 changes):

- v1.0.0: PHP >=5.3.3
- v2.0.0: PHP >=5.5.9
- 3.0.0: PHP >=7.2.9
- 4.0.0: PHP >=7.4.3

### Community

Maintainers: mediawiki, wikimedia

Top contributors: [legoktm](https://github.com/legoktm) (23 commits), [umherirrender](https://github.com/umherirrender) (15 commits), [jdforrester](https://github.com/jdforrester) (12 commits), [reedy](https://github.com/reedy) (10 commits), [Krinkle](https://github.com/Krinkle) (9 commits), [bvibber](https://github.com/bvibber) (4 commits), [cscott](https://github.com/cscott) (4 commits), [anomiex](https://github.com/anomiex) (2 commits), [Daimona](https://github.com/Daimona) (2 commits), [xSavitar](https://github.com/xSavitar) (1 commit), [MarcoAurelioWM](https://github.com/MarcoAurelioWM) (1 commit), [MaxSem](https://github.com/MaxSem) (1 commit), [paladox](https://github.com/paladox) (1 commit), [thiemowmde](https://github.com/thiemowmde) (1 commit)

