ryangjchandler/lexical
======================

Attribute-driven tokenisation for PHP.

v1.0.2 · MIT · PHP ^8.3 · [Source](https://github.com/ryangjchandler/lexical) · [Packagist](https://packagist.org/packages/ryangjchandler/lexical)


Lexical
=======

[![Latest Version on Packagist](https://camo.githubusercontent.com/fbf9ee3cac9310242383ff17ce9307cd217f8be9b7652250c828b7a798af18ce/68747470733a2f2f696d672e736869656c64732e696f2f7061636b61676973742f762f7279616e676a6368616e646c65722f6c65786963616c2e7376673f7374796c653d666c61742d737175617265)](https://packagist.org/packages/ryangjchandler/lexical)[![Tests](https://camo.githubusercontent.com/ed2597b0b1e585d307025fa99dab420c6f9d04805b6e223f330335bf44cdb2b1/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f616374696f6e732f776f726b666c6f772f7374617475732f7279616e676a6368616e646c65722f6c65786963616c2f72756e2d74657374732e796d6c3f6272616e63683d6d61696e266c6162656c3d7465737473267374796c653d666c61742d737175617265)](https://github.com/ryangjchandler/lexical/actions/workflows/run-tests.yml)[![Total Downloads](https://camo.githubusercontent.com/b35810243e5c6188e0877c279d245c0d70fb52aa01583e871af39fad075554bf/68747470733a2f2f696d672e736869656c64732e696f2f7061636b61676973742f64742f7279616e676a6368616e646c65722f6c65786963616c2e7376673f7374796c653d666c61742d737175617265)](https://packagist.org/packages/ryangjchandler/lexical)

Installation
------------

You can install the package via Composer:

```
composer require ryangjchandler/lexical
```

Usage
-----

Let's write a simple lexer for mathematical expressions. The expressions can contain numbers (only integers) and a handful of operators (`+`, `-`, `*`, `/`).

Begin by creating a new enumeration that describes the token types.

```
enum TokenType
{
    case Number;
    case Add;
    case Subtract;
    case Multiply;
    case Divide;
}
```

Lexical provides a set of attributes that can be added to each case in an enumeration:

- `Regex` - accepts a single regular expression.
- `Literal` - accepts a literal string of contiguous characters.
- `Error` - designates a specific enumeration case as the "error" type.

Using those attributes with `TokenType` looks like this.

```
enum TokenType
{
    #[Regex("[0-9]+")]
    case Number;

    #[Literal("+")]
    case Add;

    #[Literal("-")]
    case Subtract;

    #[Literal("*")]
    case Multiply;

    #[Literal("/")]
    case Divide;
}
```
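
As a rough illustration of what "attribute-driven" means, the sketch below reads attributes from enum cases using PHP's reflection API. `DemoLiteral` and `DemoTokenType` are hypothetical stand-ins defined only for this example. This is not Lexical's actual implementation, just one way such metadata can be discovered at runtime.

```php
<?php

// Hypothetical attribute, defined only for this sketch.
// Enum cases are class constants, so that is the attribute target.
#[Attribute(Attribute::TARGET_CLASS_CONSTANT)]
final class DemoLiteral
{
    public function __construct(public readonly string $value) {}
}

// Hypothetical enum annotated with the demo attribute.
enum DemoTokenType
{
    #[DemoLiteral('+')]
    case Add;

    #[DemoLiteral('-')]
    case Subtract;
}

// Walk every case and collect the literal attached to it.
$literals = [];

foreach ((new ReflectionEnum(DemoTokenType::class))->getCases() as $case) {
    foreach ($case->getAttributes(DemoLiteral::class) as $attribute) {
        $literals[$case->getName()] = $attribute->newInstance()->value;
    }
}

var_dump($literals);
```

A builder could use a map like `$literals` to decide which pattern belongs to which token type.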

With the attributes in place, we can start to build a lexer using the `LexicalBuilder`.

```
$lexer = (new LexicalBuilder)
    ->readTokenTypesFrom(TokenType::class)
    ->build();
```

The `readTokenTypesFrom()` method tells the builder where it should look for the various tokenising attributes. The `build()` method takes those attributes and returns an object that implements `LexerInterface`, configured to look for the specified token types.

Then it's just a case of calling the `tokenise()` method on the lexer object to retrieve an array of tokens.

```
$tokens = $lexer->tokenise('1+2'); // -> [[TokenType::Number, '1', Span(0, 1)], [TokenType::Add, '+', Span(1, 2)], [TokenType::Number, '2', Span(2, 3)]]
```

The `tokenise()` method returns a list of tuples, where the first item is the "type" (`TokenType` in this example), the second item is the "literal" (a string containing the matched characters) and the third item is the "span" of the token (the start and end positions in the original string).
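
The tuple shape lends itself to array destructuring. The sketch below uses a hand-written sample array that mirrors the documented output for `'1+2'` rather than calling the library. The `Span` class here is a stand-in, and its `start` and `end` property names are assumptions made for the example.

```php
<?php

enum TokenType
{
    case Number;
    case Add;
}

// Stand-in for the library's Span; the property names are assumptions.
final class Span
{
    public function __construct(
        public readonly int $start,
        public readonly int $end,
    ) {}
}

// Hand-written sample mirroring the documented result of tokenising '1+2'.
$tokens = [
    [TokenType::Number, '1', new Span(0, 1)],
    [TokenType::Add, '+', new Span(1, 2)],
    [TokenType::Number, '2', new Span(2, 3)],
];

// Destructure each [type, literal, span] tuple directly in the loop header.
$lines = [];

foreach ($tokens as [$type, $literal, $span]) {
    $lines[] = sprintf('%s "%s" @ %d..%d', $type->name, $literal, $span->start, $span->end);
}

echo implode(PHP_EOL, $lines), PHP_EOL;
```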

### Skipping whitespace and other patterns

Continuing with the example of a mathematical expression, the lexer currently understands `1+2` but fails to tokenise `1 + 2` (with added whitespace). By default, it expects every character in the input to match one of the defined patterns.

The whitespace is insignificant in this case, so it can be skipped safely. To do this, add a `Lexer` attribute to the `TokenType` enumeration and pass it a regular expression that matches the characters to skip.

```
#[Lexer(skip: "[ \t\n\f]+")]
enum TokenType
{
    // ...
}
```

Now the lexer will skip over any whitespace characters and successfully tokenise `1 + 2`.
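
To make the effect of a skip pattern concrete, here is a toy tokeniser written in plain PHP. It is a simplified sketch, not Lexical's implementation: `toyTokenise()` and its `[name, literal, start, end]` tuples are hypothetical, and the pattern map stands in for the attributes.

```php
<?php

/**
 * Toy tokeniser: at each offset, skip anything matching $skip, then try each
 * pattern in order. Returns [name, literal, start, end] tuples.
 */
function toyTokenise(string $input, array $patterns, ?string $skip = null): array
{
    $tokens = [];
    $offset = 0;
    $length = strlen($input);

    while ($offset < $length) {
        // \G anchors the match at $offset, so nothing earlier or later counts.
        if ($skip !== null
            && preg_match('~\G(?:' . $skip . ')~', $input, $m, 0, $offset) === 1
            && $m[0] !== ''
        ) {
            $offset += strlen($m[0]);
            continue;
        }

        foreach ($patterns as $name => $pattern) {
            if (preg_match('~\G(?:' . $pattern . ')~', $input, $m, 0, $offset) === 1 && $m[0] !== '') {
                $tokens[] = [$name, $m[0], $offset, $offset + strlen($m[0])];
                $offset += strlen($m[0]);
                continue 2; // Resume the outer while loop at the new offset.
            }
        }

        throw new RuntimeException("Unexpected character '{$input[$offset]}' at offset {$offset}");
    }

    return $tokens;
}

$patterns = ['Number' => '[0-9]+', 'Add' => '\+'];

// With a skip pattern, the spaces are stepped over.
$tokens = toyTokenise('1 + 2', $patterns, '[ \t\n\f]+');

// Without one, the first space is an unexpected character.
$error = null;

try {
    toyTokenise('1 + 2', $patterns);
} catch (RuntimeException $e) {
    $error = $e->getMessage();
}
```

Note that skipped characters still advance the offset, so the positions of later tokens refer to the original input.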

### Error handling

When a lexer encounters an unexpected character, it will throw an `UnexpectedCharacterException`.

```
try {
    $tokens = $lexer->tokenise('1 % 2');
} catch (UnexpectedCharacterException $e) {
    dd($e->character, $e->position);
}
```

As mentioned above, there is an `Error` attribute that can be used to mark an enum case as the "error" type.

```
enum TokenType
{
    // ...

    #[Error]
    case Error;
}
```

Now when the input is tokenised, the unrecognised character will be consumed like other tokens and will have a type of `TokenType::Error`.

```
$tokens = $lexer->tokenise('1 % 2'); // -> [[TokenType::Number, '1'], [TokenType::Error, '%'], [TokenType::Number, '2']] (spans omitted for brevity)
```
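
Once error tokens are part of the result, they can be collected after the fact for reporting. The sketch below works on a hand-written sample array shaped like the output above, not on a real lexer result. The local `TokenType` enum is trimmed down for the example.

```php
<?php

enum TokenType
{
    case Number;
    case Error;
}

// Hand-written sample shaped like the documented output for '1 % 2'.
$tokens = [
    [TokenType::Number, '1'],
    [TokenType::Error, '%'],
    [TokenType::Number, '2'],
];

// Keep only the tuples whose type is the error case.
$errors = array_values(array_filter(
    $tokens,
    fn (array $token): bool => $token[0] === TokenType::Error,
));

// Pull out the offending literals for a diagnostic message.
$unexpected = array_map(fn (array $token): string => $token[1], $errors);

echo 'Unexpected input: ', implode(', ', $unexpected), PHP_EOL;
```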

### Custom `Token` objects

If you prefer to work with dedicated objects instead of Lexical's default tuples, you can provide a callback that maps the matched token type, literal and span into a custom object.

```
class Token
{
    public function __construct(
        public readonly TokenType $type,
        public readonly string $literal,
        public readonly Span $span,
    ) {}
}

$lexer = (new LexicalBuilder)
    ->readTokenTypesFrom(TokenType::class)
    ->produceTokenUsing(fn (TokenType $type, string $literal, Span $span) => new Token($type, $literal, $span))
    ->build();

$lexer->tokenise('1 + 2'); // -> [Token { type: TokenType::Number, literal: "1" }, ...]
```

### Token Producers

Regular expressions and literal tokens can get you quite far when it comes to tokenisation, but there are scenarios where it would be easier to write "real" code to tokenise your input.

Lexical makes this possible by providing a Token Producer API. Token producers are regular PHP objects that implement the `RyanChandler\Lexical\Contracts\TokenProducerInterface` or `RyanChandler\Lexical\Contracts\TolerantTokenProducerInterface` interfaces.

They are attached to your token types using the `RyanChandler\Lexical\Attributes\Custom` attribute, passing through the fully-qualified name of the token producer class.

```
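use RyanChandler\Lexical\Attributes\Custom;
use RyanChandler\Lexical\Contracts\TokenProducerInterface;
use RyanChandler\Lexical\InputSource;

enum Literals
{
    #[Custom(StringTokenProducer::class)]
    case String;
}

class StringTokenProducer implements TokenProducerInterface
{
    public function produce(InputSource $source): ?string
    {
        // Return the matched text, or null if no token can be produced
        // at the current offset.
        return null;
    }
}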
```

The `InputSource` object provided to the `produce()` method can be used to determine whether or not a token can be produced at the current offset. It comes with a range of utility methods such as `current()`, `peek()` and `match()`.

If your token type has an `Error` case defined, your token producer will need to implement the `TolerantTokenProducerInterface` instead. This interface has an additional method, `canProduce()`, which is used to determine whether or not the token can be seen anywhere in the remaining input. In the example below, it returns the offset of the next possible match, or `false` if there is none.

Here's an example token producer that tokenises double-quoted strings.

```
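use RyanChandler\Lexical\Contracts\TolerantTokenProducerInterface;
use RyanChandler\Lexical\InputSource;

class StringTokenProducer implements TolerantTokenProducerInterface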
{
    public function canProduce(InputSource $source): int|false
    {
        $matches = $source->match('/"/', PREG_OFFSET_CAPTURE);

        if (! $matches) {
            return false;
        }

        return $matches[0][1];
    }

    public function produce(InputSource $source): ?string
    {
        // If we're not looking at a double quote, return since we can't produce a token here.
        if ($source->current() !== '"') {
            return null;
        }

        // Place an offset marker in case we need to rewind at any point.
        $source->mark();

        // Consume the " character.
        $token = $source->consume();

        while ($source->current() !== '"') {
            // If we reach the end of the file before we find a closing double-quote,
            // we can rewind to the marker and return early.
            if ($source->isEof()) {
                $source->rewind();

                return null;
            }

            // Consume the current character.
            $token .= $source->consume();
        }

        // If we reach this point, we must be at a double-quote character since the
        // loop above has finished and we haven't returned yet.
        $token .= $source->consume();

        // Return the consumed text and let the Lexer handle the rest.
        return $token;
    }
}
```
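
The mark-and-rewind flow above can be tried without the library using a small stand-in cursor. `ToyCursor` is hypothetical: it only mirrors the method names used in the example (`current()`, `consume()`, `isEof()`, `mark()`, `rewind()`), and its behaviour, including the extra `offset()` helper, is an assumption rather than a description of `InputSource`.

```php
<?php

// Hypothetical stand-in for InputSource; behaviour is assumed for this sketch.
final class ToyCursor
{
    private int $offset = 0;
    private int $marker = 0;

    public function __construct(private readonly string $input) {}

    public function current(): ?string
    {
        return $this->isEof() ? null : $this->input[$this->offset];
    }

    public function consume(): string
    {
        return $this->input[$this->offset++];
    }

    public function isEof(): bool
    {
        return $this->offset >= strlen($this->input);
    }

    public function mark(): void
    {
        $this->marker = $this->offset;
    }

    public function rewind(): void
    {
        $this->offset = $this->marker;
    }

    public function offset(): int
    {
        return $this->offset;
    }
}

// Same logic as the produce() method above, written against the stand-in.
function produceString(ToyCursor $source): ?string
{
    if ($source->current() !== '"') {
        return null;
    }

    $source->mark();
    $token = $source->consume();

    while ($source->current() !== '"') {
        if ($source->isEof()) {
            // Unterminated string: restore the offset and give up.
            $source->rewind();

            return null;
        }

        $token .= $source->consume();
    }

    return $token . $source->consume();
}

$closed = new ToyCursor('"abc" rest');
$unterminated = new ToyCursor('"abc');

$closedResult = produceString($closed);             // '"abc"', cursor left after the closing quote
$unterminatedResult = produceString($unterminated); // null, cursor rewound to the marker
```

Rewinding matters because a producer that returns `null` should leave the input where it found it, so that other token types can still be tried at the same offset.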

Testing
-------

```
composer test
```

Changelog
---------

Please see [CHANGELOG](CHANGELOG.md) for more information on what has changed recently.

Contributing
------------

Please see [CONTRIBUTING](https://github.com/spatie/.github/blob/main/CONTRIBUTING.md) for details.

Security Vulnerabilities
------------------------

Please review [our security policy](../../security/policy) on how to report security vulnerabilities.

Credits
-------

- [Ryan Chandler](https://github.com/ryangjchandler)
- [All Contributors](../../contributors)

License
-------

The MIT License (MIT). Please see [License File](LICENSE.md) for more information.
