

Abandoned · Archived · Library · [Utility & Helpers](/categories/utility)

codeinc/query-tokens-extractor
==============================

Extract tokens from a query

1.0.0 (2y ago) · MIT · PHP >=8.2

Since Aug 21 · Pushed 2y ago

[Source](https://github.com/codeinchq/query-tokens-extractor) · [Packagist](https://packagist.org/packages/codeinc/query-tokens-extractor)

Query tokens extractor
======================


Extract tokens from a query string using token types defined by regular expressions. The library is written in [PHP 8.2](https://www.php.net/releases/8.2/en.php).

Installation
------------


The package is [available on Packagist](https://packagist.org/packages/codeinc/query-tokens-extractor) and can be installed using Composer:

```
composer require codeinc/query-tokens-extractor
```

Usage
-----


```
use CodeInc\QueryTokensExtractor\QueryTokensExtractor;
use CodeInc\QueryTokensExtractor\Type\RegexType;
use CodeInc\QueryTokensExtractor\Type\WordType;
use CodeInc\QueryTokensExtractor\Type\FrenchPhoneNumberType;
use CodeInc\QueryTokensExtractor\Type\FrenchPostalCodeType;
use CodeInc\QueryTokensExtractor\Type\YearType;
use CodeInc\QueryTokensExtractor\Dto\QueryToken;

$tokensExtractor = new QueryTokensExtractor([
    new FrenchPhoneNumberType(),
    new FrenchPostalCodeType(),
    new YearType(),
    new RegexType('my_custom_token', '/^this a custom token/ui'),
    new WordType(),
]);

$tokens = $tokensExtractor->extract('paris (75001) these are words 01.00.00.00.00 this a custom token 2023');

/** @var QueryToken $token */
foreach ($tokens as $token) {
    echo "Position: " . $token->position . "\n"
        ."Class: " . get_class($token->type) . "\n"
        ."Name: " . $token->type->name . "\n"
        ."Value: " . $token->value . "\n\n"; // blank line between tokens
}
```

The above example generates the following output:

```
Position: 0
Class: CodeInc\QueryTokensExtractor\Type\WordType
Name: word
Value: paris

Position: 1
Class: CodeInc\QueryTokensExtractor\Type\FrenchPostalCodeType
Name: french_postal_code
Value: 75001

Position: 2
Class: CodeInc\QueryTokensExtractor\Type\WordType
Name: word
Value: these

Position: 3
Class: CodeInc\QueryTokensExtractor\Type\WordType
Name: word
Value: are

Position: 4
Class: CodeInc\QueryTokensExtractor\Type\WordType
Name: word
Value: words

Position: 5
Class: CodeInc\QueryTokensExtractor\Type\FrenchPhoneNumberType
Name: french_phone_number
Value: 01 00 00 00 00 (the original value without punctuation)

Position: 6
Class: CodeInc\QueryTokensExtractor\Type\RegexType
Name: my_custom_token
Value: this a custom token

Position: 7
Class: CodeInc\QueryTokensExtractor\Type\YearType
Name: year
Value: 2023

```

Token types
-----------


### Available token types


- `WordType`: extract words from the query
- `YearType`: extract years from the query
- `FrenchPhoneNumberType`: extract French phone numbers from the query
- `FrenchPostalCodeType`: extract French postal codes from the query
- `HashtagType`: extract hashtags from the query
- `RegexType`: extract tokens from the query using a regex

### Token type priority


The token type priority is determined by the order in which the token types are passed to the `QueryTokensExtractor` constructor: the first type in the array has the highest priority and the last one the lowest.

The priority determines the order in which tokens are extracted. The higher the priority of a type, the sooner its tokens are extracted.

⚠️ The `WordType` should always be used last as it will match any string.

```
use CodeInc\QueryTokensExtractor\QueryTokensExtractor;
use CodeInc\QueryTokensExtractor\Type\WordType;
use CodeInc\QueryTokensExtractor\Type\FrenchPhoneNumberType;
use CodeInc\QueryTokensExtractor\Type\FrenchPostalCodeType;
use CodeInc\QueryTokensExtractor\Type\YearType;

$tokensExtractor = new QueryTokensExtractor([
    new FrenchPhoneNumberType(), // highest priority
    new FrenchPostalCodeType(),
    new YearType(),
    new WordType(), // lowest priority
]);
```
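To make the effect of ordering concrete, here is a minimal standalone sketch that does not use the library. It assumes a simplified model in which each pattern is tried in order against the start of the remaining query and the first match wins; the function `sketchExtract()` and both patterns are hypothetical and are not the library's implementation.

```php
<?php
/**
 * Hypothetical stand-in for the extractor: tries each pattern in order
 * against the start of the remaining query and keeps the first match.
 *
 * @param array<string, string> $patterns name => regex, highest priority first
 * @return array<int, array{position: int, name: string, value: string}>
 */
function sketchExtract(string $query, array $patterns): array
{
    $tokens = [];
    $rest = ltrim($query);
    while ($rest !== '') {
        $matched = false;
        foreach ($patterns as $name => $regex) {
            if (preg_match($regex, $rest, $m) === 1 && $m[0] !== '') {
                $tokens[] = ['position' => count($tokens), 'name' => $name, 'value' => $m[0]];
                // Drop the matched text and any whitespace that follows it.
                $rest = ltrim(substr($rest, strlen($m[0])));
                $matched = true;
                break; // the first matching type wins
            }
        }
        if (!$matched) {
            // Nothing matched: skip one byte so the loop always terminates.
            $rest = ltrim(substr($rest, 1));
        }
    }
    return $tokens;
}

// Illustrative patterns only (assumptions, not the library's own regexes).
$year = '/^(19|20)\d{2}\b/';
$word = '/^\w+/u';

$yearFirst = sketchExtract('paris 2023', ['year' => $year, 'word' => $word]);
$wordFirst = sketchExtract('paris 2023', ['word' => $word, 'year' => $year]);

echo $yearFirst[1]['name'], "\n"; // year
echo $wordFirst[1]['name'], "\n"; // word
```

In the second call the catch-all word pattern comes first and shadows the year pattern, which is why a catch-all type belongs at the end of the list.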

### Creating custom token types


Custom token types can be created by instantiating or extending `RegexType`. The constructor of `RegexType` takes three arguments:

- `string $name`: the name of the token type
- `string $regexp`: the regex used to extract the token
- `\Closure $valueFormatter`: a closure used to format the extracted value (optional)

The regexp's named capturing group `value` is used as the extracted value (for instance, the `HashtagType` type uses the regex `'/^#(?<value>.[a-z0-9_]+)/ui'`). If no group named `value` is defined, the whole match is used as the token value.
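The fallback described above can be illustrated with plain `preg_match()`. This is a sketch only: `sketchTokenValue()` is a hypothetical helper, and the hashtag patterns are simplified illustrations rather than the ones the library ships.

```php
<?php
/**
 * Hypothetical helper: returns the `value` group when the pattern defines
 * one, otherwise the whole match; null when the pattern does not match.
 */
function sketchTokenValue(string $regex, string $query): ?string
{
    if (preg_match($regex, $query, $m) !== 1) {
        return null;
    }
    return $m['value'] ?? $m[0];
}

// With a named group, the leading "#" is left out of the value.
$withGroup = sketchTokenValue('/^#(?<value>[a-z0-9_]+)/ui', '#php tips');

// Without a named group, the whole match is the value.
$withoutGroup = sketchTokenValue('/^#[a-z0-9_]+/ui', '#php tips');

echo $withGroup, "\n";    // php
echo $withoutGroup, "\n"; // #php
```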

The regexp should always start with `^` and should not anchor the end of the string with `$`, because the query is split into tokens using the [`preg_replace_callback()`](https://www.php.net/manual/en/function.preg-replace-callback.php) function.

```
use CodeInc\QueryTokensExtractor\Type\RegexType;

class MyCustomTokenType extends RegexType
{
    public function __construct()
    {
        parent::__construct(
            name: 'my_custom_token',
            regexp: '/^this a custom token/ui'
        );
    }
}

// alternatively tokens can be defined directly using the RegexType class
$myCustomToken2 = new RegexType(
    name: 'my_custom_token',
    regexp: '/^this a custom token/ui'
);
```
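The anchoring rule from this section can be checked with plain `preg_match()`. The sketch below assumes that the extractor consumes the query from its start, so a pattern has to match the head of the string while leaving the rest untouched; the query and patterns are made up for illustration.

```php
<?php
$query = '#php tips for 2023';

// Anchored at the start only: matches the token at the head of the query.
$startOnly = preg_match('/^#[a-z0-9_]+/ui', $query);

// Anchored at both ends: fails because more text follows the hashtag.
$bothEnds = preg_match('/^#[a-z0-9_]+$/ui', $query);

// Not anchored at the start: matches "2023" in the middle of the query,
// even though the query does not begin with a year.
$unanchored = preg_match('/\d{4}/', $query, $m);

var_dump($startOnly);  // int(1)
var_dump($bothEnds);   // int(0)
var_dump($unanchored); // int(1)
var_dump($m[0]);       // string(4) "2023"
```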

### Token value formatting


The extracted token value can be formatted using the `valueFormatter` closure. The closure takes the extracted value as argument and must return the formatted value.

```
use CodeInc\QueryTokensExtractor\QueryTokensExtractor;
use CodeInc\QueryTokensExtractor\Type\RegexType;

$tokensExtractor = new QueryTokensExtractor([
    new RegexType(
        name: 'my_custom_token',
        regexp: '/^this a custom token/ui',
        // a simple closure called by QueryToken::getFormattedValue()
        valueFormatter: fn($value) => strtoupper($value)
    )
]);
$tokens = $tokensExtractor->extract('this a custom token');
$tokens->getByPosition(0)->getFormattedValue(); // THIS A CUSTOM TOKEN
```
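A formatter can do more than change case. As a hedged, library-independent sketch, the hypothetical closure below normalizes a phone number by keeping only its digits and regrouping them in pairs, similar in spirit to the phone number value shown in the usage output; how `FrenchPhoneNumberType` actually formats its value is not documented here.

```php
<?php
// Hypothetical formatter: keeps digits only and regroups them in pairs.
$formatPhone = static fn(string $value): string
    => implode(' ', str_split(preg_replace('/\D+/', '', $value), 2));

echo $formatPhone('01.00.00.00.00'), "\n"; // 01 00 00 00 00
echo $formatPhone('01-00-00-00-00'), "\n"; // 01 00 00 00 00
```

A closure of this shape could be passed as the `valueFormatter` argument, since it takes the extracted value and returns the formatted one.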

License
-------


This library is published under the MIT license (see the [LICENSE](https://github.com/codeinchq/query-tokens-extractor/blob/main/LICENSE) file).


