
> Note: This package is abandoned and archived. The suggested replacement is `phplrt/lexer`.

railt/lexer
===========

Fast implementation of the stateful and stateless lexers


 [![Railt](https://camo.githubusercontent.com/0b50c2bf674e95ba4fc92f5ac922a73a5b58bf284e5b44808fe5e30c674b2ca5/68747470733a2f2f7261696c742e6f72672f696d616765732f6c6f676f2d6461726b2e737667)](https://camo.githubusercontent.com/0b50c2bf674e95ba4fc92f5ac922a73a5b58bf284e5b44808fe5e30c674b2ca5/68747470733a2f2f7261696c742e6f72672f696d616765732f6c6f676f2d6461726b2e737667)

 [![Travis CI](https://camo.githubusercontent.com/66b6b21e01fc3aef50b69012cb1224dd101c664842efa8c96c54b57dc28178a9/68747470733a2f2f7472617669732d63692e6f72672f7261696c742f6c657865722e7376673f6272616e63683d312e342e78)](https://travis-ci.org/railt/lexer) [![](https://camo.githubusercontent.com/23b1b42a433c9717725f3789fbb9816dae471bab16165ae36ac115d4b9dee7f8/68747470733a2f2f6170692e636f6465636c696d6174652e636f6d2f76312f6261646765732f38663462306532383932386266326234343562322f746573745f636f766572616765)](https://codeclimate.com/github/railt/lexer/test_coverage) [![](https://camo.githubusercontent.com/fa9105789ce553d5cc02c70814c9015c07bdfe3fe3f2f3e32b6ee70e1b04580c/68747470733a2f2f6170692e636f6465636c696d6174652e636f6d2f76312f6261646765732f38663462306532383932386266326234343562322f6d61696e7461696e6162696c697479)](https://codeclimate.com/github/railt/lexer/maintainability)

 [![PHP 7.1+](https://camo.githubusercontent.com/82f44fe3f392d956a0a6c330c62d48cfd37cc0fb6e21c39dad8f86c6c356dd30/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f5048502d372e312b2d3666346361352e737667)](https://packagist.org/packages/railt/lexer) [![railt.org](https://camo.githubusercontent.com/e454d7e2800bf9ae8d0c4373f3312eb36201bb6987a75cf26c7afb56c121d2f3/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f6f6666696369616c2d736974652d3666346361352e737667)](https://railt.org) [![Discord](https://camo.githubusercontent.com/138a9e4e12a3ce4d76600b77fb52ee35a53342b8abaae49177c4db37f194465a/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f646973636f72642d636861742d3666346361352e737667)](https://discord.gg/ND7SpD4) [![Latest Stable Version](https://camo.githubusercontent.com/3d7453b17e549a1ba6e77e4aa516d55eeb0db5d94b1faafa766fd07ef6809051/68747470733a2f2f706f7365722e707567782e6f72672f7261696c742f6c657865722f76657273696f6e)](https://packagist.org/packages/railt/lexer) [![Total Downloads](https://camo.githubusercontent.com/c5f24eededac5e57df9dfa3abe916fb79e7a1686d5b6ca759049289a1556ca79/68747470733a2f2f706f7365722e707567782e6f72672f7261696c742f6c657865722f646f776e6c6f616473)](https://packagist.org/packages/railt/lexer) [![License MIT](https://camo.githubusercontent.com/f1e632711a3780114d0beb4309cca5a0f825a880c5da667bc05bcbf7bc370b8b/68747470733a2f2f706f7365722e707567782e6f72672f7261696c742f6c657865722f6c6963656e7365)](https://raw.githubusercontent.com/railt/lexer/1.4.x/LICENSE.md)

Lexer
=====


> Note: All questions and issues please send to

> Note: The tests may not always pass. This can happen when the PPA servers used to update gcc and g++ are unreachable, because the lexertl build requires a modern compiler inside Travis CI. In that case, a gray badge is displayed with the message "Build Error".

To get a quick feel for how it works, write about four lines of code:

```
$lexer = Railt\Component\Lexer\Factory::create(['T_WHITESPACE' => '\s+', 'T_DIGIT' => '\d+'], ['T_WHITESPACE']);

foreach ($lexer->lex(Railt\Component\Io\File::fromSources('23 42')) as $token) {
    echo $token . "\n";
}
```

This example reads the source text and returns the tokens it is composed of:

1. `T_DIGIT` with value "23"
2. `T_DIGIT` with value "42"

The second argument to `Factory::create` is the list of token names that are omitted from the result of the `lex` method, which is why only the two significant `T_DIGIT` tokens are listed. Strictly speaking, the result also contains a `T_EOI` (End Of Input) token; it can be removed from the output as well by adding its name to that second argument.
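To make the role of the skip list concrete, here is a minimal, self-contained sketch of the idea in plain PHP. It does not use or reproduce the library's internals: the function name `sketchLex` and the array-based token shape are hypothetical, and the sketch assumes that token patterns do not contain the `/` delimiter.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of railt/lexer): tokenizes $source with
 * "name => pcre" definitions and drops every token whose name is in $skip.
 *
 * @param array<string, string> $tokens
 * @param string[]              $skip
 * @return array<int, array{name: string, value: string, offset: int}>
 */
function sketchLex(array $tokens, array $skip, string $source): array
{
    $result = [];
    $offset = 0;
    $length = strlen($source);

    while ($offset < $length) {
        $matched = false;

        foreach ($tokens as $name => $pcre) {
            // \G anchors the match at the current byte offset.
            $found = preg_match('/\G(?:' . $pcre . ')/', $source, $m, 0, $offset);

            if ($found === 1 && $m[0] !== '') {
                if (!in_array($name, $skip, true)) {
                    $result[] = ['name' => $name, 'value' => $m[0], 'offset' => $offset];
                }

                $offset += strlen($m[0]);
                $matched = true;
                break;
            }
        }

        if (!$matched) {
            throw new RuntimeException("Unrecognized input at offset $offset");
        }
    }

    // The end-of-input marker is treated like any other token, so it can be skipped too.
    if (!in_array('T_EOI', $skip, true)) {
        $result[] = ['name' => 'T_EOI', 'value' => '', 'offset' => $offset];
    }

    return $result;
}

$tokens = ['T_WHITESPACE' => '\s+', 'T_DIGIT' => '\d+'];

foreach (sketchLex($tokens, ['T_WHITESPACE'], '23 42') as $token) {
    echo $token['name'], ' "', $token['value'], '" at ', $token['offset'], "\n";
}

// Prints:
// T_DIGIT "23" at 0
// T_DIGIT "42" at 3
// T_EOI "" at 5
```

Passing `['T_WHITESPACE', 'T_EOI']` as the skip list in this sketch leaves only the two `T_DIGIT` entries.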

...and now let's try to understand more!

The lexer contains two types of runtime:

1. [`Basic`](#basic) - Set of algorithms with one state.
2. [`Multistate`](#multistate) - Set of algorithms with the possibility of state transition between tokens.

> Because there was almost no difference in speed between the stateful and stateless implementations of the same algorithm, the immutable stateful lexers were dropped.

```
use Railt\Component\Lexer\Factory;

/**
 * List of available tokens in format "name => pcre"
 */
$tokens = ['T_DIGIT' => '\d+', 'T_WHITESPACE' => '\s+'];

/**
 * List of skipped tokens
 */
$skip   = ['T_WHITESPACE'];

/**
 * Options:
 *   0 - Nothing.
 *   2 - With PCRE lookahead support.
 *   4 - With multistate support.
 */
$flags = Factory::LOOKAHEAD | Factory::MULTISTATE;

/**
 * Create lexer and tokenize sources.
 */
$lexer = Factory::create($tokens, $skip, $flags);
```
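The options are combined with bitwise OR, so each one occupies its own bit. The following sketch shows how such a bitmask can be built and inspected. The constants `SKETCH_LOOKAHEAD` and `SKETCH_MULTISTATE` and the function `sketchDescribeFlags` are hypothetical stand-ins; their values (2 and 4) are taken from the option list in the preceding example, not read from the library.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins for the factory flags (assumed values: 2 and 4).
const SKETCH_LOOKAHEAD  = 2;
const SKETCH_MULTISTATE = 4;

/**
 * Returns the names of the options enabled in the given bitmask.
 *
 * @return string[]
 */
function sketchDescribeFlags(int $flags): array
{
    $enabled = [];

    if (($flags & SKETCH_LOOKAHEAD) === SKETCH_LOOKAHEAD) {
        $enabled[] = 'lookahead';
    }

    if (($flags & SKETCH_MULTISTATE) === SKETCH_MULTISTATE) {
        $enabled[] = 'multistate';
    }

    return $enabled;
}

$flags = SKETCH_LOOKAHEAD | SKETCH_MULTISTATE; // 2 | 4 === 6

echo implode(', ', sketchDescribeFlags($flags)), "\n";         // lookahead, multistate
echo implode(', ', sketchDescribeFlags(0)) ?: 'nothing', "\n"; // nothing
```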

To tokenize the source text, use the `->lex(...)` method, which returns an iterator of `TokenInterface` objects.

```
use Railt\Component\Io\File;

foreach ($lexer->lex(File::fromSources('23 42')) as $token) {
    echo $token . "\n";
}
```

A `TokenInterface` provides a convenient API to obtain information about a token:

```
interface TokenInterface
{
    public function getName(): string;
    public function getOffset(): int;
    public function getValue(int $group = 0): ?string;
    public function getGroups(): iterable;
    public function getBytes(): int;
    public function getLength(): int;
}
```
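As an illustration of how these accessors might relate to each other, here is a small stand-alone value object with the same method names. It is a hypothetical sketch, not the library's token class. In particular, reading `getBytes()` as the byte count and `getLength()` as the character count of the matched value is an assumption, as is the mapping of `getValue($group)` to regex capture groups.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical token sketch (NOT the library's implementation).
 * Assumption: getBytes() counts bytes, getLength() counts UTF-8 characters.
 */
final class SketchToken
{
    /** @var string */
    private $name;
    /** @var string[] */
    private $groups;
    /** @var int */
    private $offset;

    public function __construct(string $name, array $groups, int $offset)
    {
        $this->name   = $name;
        $this->groups = $groups;
        $this->offset = $offset;
    }

    public function getName(): string { return $this->name; }

    public function getOffset(): int { return $this->offset; }

    // Group 0 is the whole match; an unknown group yields null.
    public function getValue(int $group = 0): ?string { return $this->groups[$group] ?? null; }

    public function getGroups(): iterable { return $this->groups; }

    public function getBytes(): int { return strlen((string) $this->getValue()); }

    // Counts UTF-8 code points by matching every character once.
    public function getLength(): int { return (int) preg_match_all('/./su', (string) $this->getValue()); }
}

$source = "say 'caf\u{e9}'"; // the letter "é" takes two bytes in UTF-8
preg_match("/'([^']+)'/u", $source, $groups);

$token = new SketchToken('T_STRING', $groups, (int) strpos($source, $groups[0]));

echo $token->getName(), ' at ', $token->getOffset(), "\n";                       // T_STRING at 4
echo $token->getBytes(), ' bytes, ', $token->getLength(), " characters\n";       // 7 bytes, 6 characters
```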

Drivers
-------


The factory returns one of the available implementations; however, you can also instantiate a driver yourself.

### Basic


#### NativeRegex


The `NativeRegex` implementation is based on PHP's built-in PCRE functions.

```
use Railt\Component\Lexer\Driver\NativeRegex;
use Railt\Component\Io\File;

$lexer = new NativeRegex(['T_WHITESPACE' => '\s+', 'T_DIGIT' => '\d+'], ['T_WHITESPACE', 'T_EOI']);

foreach ($lexer->lex(File::fromSources('23 42')) as $token) {
    echo $token->getName() . ' -> ' . $token->getValue() . ' at ' . $token->getOffset() . "\n";
}

// Outputs:
// T_DIGIT -> 23 at 0
// T_DIGIT -> 42 at 3
```

#### Lexertl


An experimental lexer based on the [C++ lexertl library](https://github.com/BenHanson/lexertl). To use it, you need the [Parle extension](http://php.net/manual/en/book.parle.php).

```
use Railt\Component\Lexer\Driver\ParleLexer;
use Railt\Component\Io\File;

$lexer = new ParleLexer(['T_WHITESPACE' => '\s+', 'T_DIGIT' => '\d+'], ['T_WHITESPACE', 'T_EOI']);

foreach ($lexer->lex(File::fromSources('23 42')) as $token) {
    echo $token->getName() . ' -> ' . $token->getValue() . ' at ' . $token->getOffset() . "\n";
}

// Outputs:
// T_DIGIT -> 23 at 0
// T_DIGIT -> 42 at 3
```

> Be careful: The library is not fully compatible with the PCRE regex syntax. See the [official documentation](http://www.benhanson.net/lexertl.html).

### Multistate


This functionality is not yet implemented.
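For readers unfamiliar with the term, the sketch below illustrates what state transitions between tokens generally look like in a lexer. It is purely hypothetical and says nothing about how this library will implement the feature: the function `sketchMultistateLex`, the shape of the `$states` table, and all token names are invented for illustration, and patterns are assumed not to contain the `/` delimiter.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical multistate lexer sketch (not library code).
 *
 * $states maps a state name to its rules in the form
 * "token name => [pcre, next state or null to stay in the current state]".
 *
 * @return array<int, array{0: string, 1: string}> list of [token name, value]
 */
function sketchMultistateLex(array $states, string $initial, string $source): array
{
    $state  = $initial;
    $offset = 0;
    $length = strlen($source);
    $result = [];

    while ($offset < $length) {
        $matched = false;

        foreach ($states[$state] as $name => [$pcre, $next]) {
            $found = preg_match('/\G(?:' . $pcre . ')/', $source, $m, 0, $offset);

            if ($found === 1 && $m[0] !== '') {
                $result[] = [$name, $m[0]];
                $offset  += strlen($m[0]);
                $state    = $next ?? $state; // switch state only when a transition is defined
                $matched  = true;
                break;
            }
        }

        if (!$matched) {
            throw new RuntimeException("Unrecognized input at offset $offset in state $state");
        }
    }

    return $result;
}

$states = [
    'default' => [
        'T_QUOTE_OPEN' => ['"', 'string'],
        'T_DIGIT'      => ['\d+', null],
        'T_WHITESPACE' => ['\s+', null],
    ],
    'string' => [
        'T_QUOTE_CLOSE' => ['"', 'default'],
        'T_TEXT'        => ['[^"]+', null],
    ],
];

foreach (sketchMultistateLex($states, 'default', '42 "7 up"') as [$name, $value]) {
    echo $name, ' -> ', $value, "\n";
}
```

In this sketch the `7` inside the quotes is reported as part of `T_TEXT` rather than as `T_DIGIT`, because the opening quote moved the lexer into the `string` state, where the digit rule does not exist.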

