farafiri/php-parsing-tool
=========================

library for parsing

v2.0.0 · MIT · PHP >=7.0 · [2 issues](https://github.com/farafiri/PHP-parsing-tool/issues)

[Source](https://github.com/farafiri/PHP-parsing-tool) · [Packagist](https://packagist.org/packages/farafiri/php-parsing-tool)

This library:
-------------

- tries to be as easy to use as regular expressions (no compilation step and no separate grammar file required)
- is as powerful as other grammar compilers
- returns syntax tree objects that are easy to manipulate
- uses a convenient syntax that combines the best of BNF notation (the idea) and regular expressions (\* and + for repetition)
- lets you work with any context-free grammar

Example
-------

Let's say you have a string with dates in the format d.m.y or y-m-d, separated by commas.

```
  $dates = '2012-03-04,2013-02-08,23.06.2012';

  $parser = new \ParserGenerator\Parser('start     :=> datesList.
                                         datesList :=> date "," datesList
                                                   :=> date.
                                         date      :=> year "-" month "-" day
                                                   :=> day "." month "." year.
                                         year      :=> /\d{4}/.
                                         month     :=> /\d{2}/.
                                         day       :=> /\d{2}/.');

  $parsed = $parser->parse($dates);

  //and now you want to get list of years
  foreach($parsed->findAll('year') as $year) {
    echo $year;
  }

  //this time you want to print all months from year 2012
  foreach($parsed->findAll('date') as $date) {
    if ((string) $date->findFirst('year') === '2012') {
      echo $date->findFirst('month');
    }
  }
```

### Even more examples

Several example parsers are implemented in the `ParserGenerator\Examples` namespace, along with tests for them, e.g.:

- `\ParserGenerator\Examples\CSVParser` (and `\ParserGenerator\Tests\Examples\CSVParserTest`)
- `\ParserGenerator\Examples\JSONParser` (and `\ParserGenerator\Tests\Examples\JSONParserTest`)

Branch types
------------

Declaring the previous grammar as PEG would make parsing roughly 10 times faster. To do so, pass the option `['defaultBranchType' => 'PEG']` as the second argument to `Parser`. Note that not every grammar can be parsed with the PEG packrat algorithm.

```
  // by adding 'defaultBranchType' with 'PEG' value into options we declare grammar as PEG
  $parser = new \ParserGenerator\Parser('start :=> start "x"
                                               :=> "x".', array('defaultBranchType' => 'PEG'));
  // but a PEG grammar cannot be left-recursive; calling parse() would loop forever in this case
  // You have two solutions:
  // 1st: change the grammar slightly:
  $parser = new \ParserGenerator\Parser('start :=> "x" start
                                               :=> "x".', array('defaultBranchType' => 'PEG'));

  // 2nd: use the default branch type
  $parser = new \ParserGenerator\Parser('start :=> start "x"
                                               :=> "x".');
```

Error handling
--------------

If your input cannot be parsed, `\ParserGenerator\Parser::parse` will return `false`.

Use `\ParserGenerator\Parser::getErrorString()` and provide your input data to get a human-readable error description.

Alternatively you can use `\ParserGenerator\Parser::getError()` and directly work with error nodes.
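
A minimal sketch of this flow, using only the calls named above. The grammar and input are made up for illustration, and the Composer autoloader path is an assumption about your setup:

```php
<?php
// Assumes the package was installed with Composer; adjust the path if needed.
require __DIR__ . '/vendor/autoload.php';

// Illustrative grammar: the whole input must be a run of digits.
$parser = new \ParserGenerator\Parser('start :=> /\d+/.');

$input  = '12a4';
$result = $parser->parse($input);

if ($result === false) {
    // parse() returned false, so ask the parser for a readable description
    $message = $parser->getErrorString($input);
    echo $message, PHP_EOL;
} else {
    echo 'Parsed: ', $result, PHP_EOL;
}
```

Comparing with `=== false` rather than a loose truthiness check keeps the failure case explicit.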

Symbols
-------

##### "text"

Matches text. You can also use single quotes. Escape sequences are supported, so "\\n" matches a newline.

##### /regular|expression/

Matches the given regular expression. You can use pattern modifiers: a grammar like "start :=> /\[a-z\]/i." also matches upper-case letters. Regular expressions are not backtracked; the first match is treated as the only match. For example, with "start :=> /a+/ 'a'." and the input "aa", the regular expression captures both characters, nothing is left for 'a', and the string is not matched.
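
The sketch below contrasts the regex case described above with a repetition of literals. It assumes a Composer autoloader, and it assumes that a repetition such as `"a"+` is handled by the grammar itself and can therefore give a character back under the default branch type:

```php
<?php
// Assumes the package was installed with Composer; adjust the path if needed.
require __DIR__ . '/vendor/autoload.php';

// The regex grabs every "a", so nothing is left for the literal "a".
$regexFirst = new \ParserGenerator\Parser('start :=> /a+/ "a".');
$failed = $regexFirst->parse('aa');

// Assumption: a repetition of literals is an ordinary grammar construct,
// so the parser can leave the last "a" for the trailing literal.
$repetition = new \ParserGenerator\Parser('start :=> "a"+ "a".');
$matched = $repetition->parse('aa');

var_dump($failed === false, $matched !== false);
```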

##### symbolName

Matches the defined symbol. The following example matches a single word character (/\\w/) followed by a single digit.

```
start  :=> letter digit.
letter :=> /\w/.
digit  :=> /\d/.

```

##### whiteSpace, space, newLine, tab

whiteSpace matches a space, tab, or newline character. If ignoreWhitespaces mode is off, these symbols work the same as /\\s/, " ", /\\t/, /\\n/. When ignoreWhitespaces mode is on, /\\s/, " ", "\\t", "\\n" won't work and you must use whiteSpace, space, etc. In ignoreWhitespaces mode these symbols check the context and do not consume characters from the input. For example, the sequence 'a' newLine space space 'b' matches the characters 'a' and 'b' separated by at least one space and at least one newline.

##### text

Matches any text.

##### symbol+

Tries to match symbol one or more times. For example, start :=> "a"+. matches "a", "aa", and "aaa" but not "".

##### symbol?

The symbol is optional. For example, start :=> "a"?. matches "a" and "" but not "aa".

##### symbol\*

Tries to match symbol zero or more times. For example, start :=> "a"\*. matches "a", "aa", "aaa", and "".

##### symbol++, symbol\*\*, symbol??

Same as the corresponding symbol+, symbol\* and symbol?, but they consume input greedily. Example:

```
$nonGreedy = new \ParserGenerator\Parser('start :=> "a"* "a"*.');
$nonGreedy->parse("aaa")->getSubnode(0)->toString(); // "" first "a"* takes nothing
$nonGreedy->parse("aaa")->getSubnode(1)->toString(); // "aaa" so second must consume all left

$greedy = new \ParserGenerator\Parser('start :=> "a"** "a"**.');
$greedy->parse("aaa")->getSubnode(0)->toString(); // "aaa" first "a"** takes all
$greedy->parse("aaa")->getSubnode(1)->toString(); // "" so nothing left for second

$greedy = new \ParserGenerator\Parser('start :=> "a"** "a"+.');
// "aa": "a"** tries to take everything, but then parsing would fail, so it must leave the last char for "a"+
$greedy->parse("aaa")->getSubnode(0)->toString();
$greedy->parse("aaa")->getSubnode(1)->toString(); // "a"
```

If 'defaultBranchType' is set to 'PEG', then symbol\* is equivalent to symbol\*\* (always greedy). The same applies to "+" and "?". In this mode the last case fails (PEG cannot parse it).

##### ?symbol

Lookahead. Checks whether symbol can be parsed but does not capture it. For example, "start :=> 'a' ?/.{3}/ integer. integer :=> /\\d+/." matches "a" followed by a number with at least 3 digits.
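
As a quick illustration, the grammar from the sentence above can be tried on one input that satisfies the lookahead and one that does not. The inputs are made up, and the autoloader path is an assumption:

```php
<?php
// Assumes the package was installed with Composer; adjust the path if needed.
require __DIR__ . '/vendor/autoload.php';

$parser = new \ParserGenerator\Parser('start   :=> "a" ?/.{3}/ integer.
                                       integer :=> /\d+/.');

// At least 3 characters follow "a", so the lookahead can succeed.
$long = $parser->parse('a1234');

// Only 2 characters follow "a", so the lookahead has nothing to match.
$short = $parser->parse('a12');

var_dump($long !== false, $short !== false);
```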

##### !symbol

Negative lookahead. Similar to ?symbol, but parsing continues only if symbol cannot be matched.
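
A small sketch of how this might be used to reject numbers with a leading zero. The `zero` and `integer` symbols are hypothetical names chosen for this example, and the autoloader path is an assumption:

```php
<?php
// Assumes the package was installed with Composer; adjust the path if needed.
require __DIR__ . '/vendor/autoload.php';

// "zero" and "integer" are illustrative symbol names, not part of the library.
$parser = new \ParserGenerator\Parser('start   :=> !zero integer.
                                       zero    :=> "0".
                                       integer :=> /\d+/.');

$plain  = $parser->parse('42');  // does not start with "0"
$padded = $parser->parse('042'); // starts with "0", so !zero should stop the parse

var_dump($plain !== false, $padded !== false);
```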

##### symbol1+symbol2

Several occurrences of symbol1 separated by symbol2 (the same works for \*, ++, \*\*).

```
$parser = new \ParserGenerator\Parser('start :=> word+",".
                                       word  :=> /\w+/.');
foreach($parser->parse("a,bc,d")->getSubnode(0)->getSubnodes() as $subnode) {
  echo $subnode . ' ';
} //prints "a , bc , d "

foreach($parser->parse("a,bc,d")->getSubnode(0)->getMainNodes() as $subnode) {
  echo $subnode . ' ';
} //prints "a bc d "
```

Note that symbol1+ symbol2 is different from symbol1+symbol2; the space between + and symbol2 is crucial. "a"+ "b" matches "aaaab" but not "ababa", while "a"+"b" matches "ababa" but not "aaaab".

##### (symbol1 | symbol2)

Choice: matches symbol1 or symbol2. For example, "start :=> ('a' | 'b') 'c'." parses the strings "ac" and "bc".
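
The same grammar as a runnable sketch, checked against both accepted inputs and one that neither alternative covers. The third input and the autoloader path are assumptions added for illustration:

```php
<?php
// Assumes the package was installed with Composer; adjust the path if needed.
require __DIR__ . '/vendor/autoload.php';

$parser = new \ParserGenerator\Parser('start :=> ("a" | "b") "c".');

// Record for each input whether parse() produced a syntax tree.
$results = [];
foreach (['ac', 'bc', 'cc'] as $input) {
    $results[$input] = $parser->parse($input) !== false;
    echo $input, ': ', $results[$input] ? 'matched' : 'not matched', PHP_EOL;
}
```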

##### string

Syntactic sugar for a regex like /"(\[^\\\\\]|\\.)\*"/. Matches quoted strings. Example:

```
$parser = new \ParserGenerator\Parser('start :=> string.');
$stringNode = $parser->parse('"a\tb\"c"')->getSubnode(0);
echo (string) $stringNode; //prints:"a\tb\"c"
echo $stringNode->getValue(); //prints:a    b"c
```

By default, string may be quoted with quotation marks or apostrophes. The following variants restrict this:

- string/apostrophe: can be quoted only with apostrophes
- string/quotation: can be quoted only with quotation marks
- string/simple: can be quoted only with quotation marks; there is no escaping with a backslash, and a quotation character is written by doubling it (the style used in Pascal or CSV)

##### numbers

Of course you can use /\\d+/, but the built-in toolkit for numbers is much easier to use and more readable.

```
//parser matching only integers from 3 to 17 (inclusive)
$parser = new \ParserGenerator\Parser('start :=> 3..17 .');
$parser->parse('2'); //false
$parser->parse('18'); //false
$parser->parse('12'); //syntax tree object

//parser matching only integers > 0
$parser = new \ParserGenerator\Parser('start :=> 1..infinity .');

//parser matching integers in hexadecimal, decimal and octal
$parser = new \ParserGenerator\Parser('start :=> -inf..inf/hdo .');
$parser->parse('0x21')->getSubnode(0)->getValue(); // 33
$parser->parse('21')->getSubnode(0)->getValue(); //21
$parser->parse('021')->getSubnode(0)->getValue(); //17

//matching month number with leading 0 for < 10
$parser = new \ParserGenerator\Parser('start :=> 01..12 .');
$parser->parse('4'); //false
$parser->parse('04'); //syntax tree object
```

##### time()

Matching time in the given format:

```
$parser = new \ParserGenerator\Parser('start :=> (time(Y-m-d) | time(d.m.Y)) .');
$parser->parse('2017-01-02')->getSubnode(0)->getValue(); // equal to new \DateTime('2017-01-02')
$parser->parse('03.05.2014')->getSubnode(0)->getValue(); // equal to new \DateTime('2014-05-03')
```

##### contain, is

Sometimes you may want to run extra checks on a parsed node. With these constructs you can check whether a node contains some text or matches a pattern:

```
$parser = new \ParserGenerator\Parser('start   :=> word not is keyword.
                                       word    :=> /\w+/.
                                       keyword :=> ("do" | "while" | "if").');
$parser->parse('do'); //false
$parser->parse('doSomething'); // syntax tree object
```

You can combine checks with basic logical operations and group them with parentheses:

```
$parser = new \ParserGenerator\Parser('start   :=> word not(is keyword or
                                                            is ("p" text /* we don`t want words starting with "p" */) or
                                                            is /./ /* we don`t want one letter words */ ).
                                       word    :=> /\w+/.
                                       keyword :=> ("do" | "while" | "if").');
$parser->parse('do'); //false
$parser->parse('d'); //false
$parser->parse('post'); //false
$parser->parse('doSomething'); // syntax tree object
```

##### unorder(separator, choice1, choice2...)

Use unorder when you expect several elements in any order.

```
x :=> unorder(s, A, B, C).
/* is equivalent to */
x :=> A s B s C
  :=> A s C s B
  :=> B s A s C
  :=> B s C s A
  :=> C s A s B
  :=> C s B s A.

```

By default each element is expected exactly once but you can change it:

```
x :=> unorder(s, ?A, *B, +C).
A is optional
B may be used multiple (or zero) times
C may be used multiple (at least once) times

```

At least one element is required:

```
$parser = new \ParserGenerator\Parser('start   :=> unorder("", ?"a", ?"b").');

$parser->parse('a'); //syntax tree object
$parser->parse('b'); //syntax tree object
$parser->parse(''); //false
```
