
opencat/srx
===========

SRX 2.0 segmentation rule parser for the [OpenCAT Framework](https://github.com/shaikhammar/opencat-framework).

Parses `.srx` files into a `SegmentationRuleSet` that the [`opencat/segmentation`](https://github.com/shaikhammar/opencat-framework/tree/main/packages/segmentation) engine uses to split text into sentences. You only need this package directly if you want to load custom SRX files; the segmentation engine loads the bundled default automatically.

Installation
------------

```
composer require opencat/srx
```

Requires `ext-dom` and `ext-libxml`.

Usage
-----

```
use CatFramework\Srx\SrxParser;

$parser = new SrxParser();
$ruleSet = $parser->parse('/path/to/rules.srx');

// Look up rules for a given BCP 47 language code
$languageRule = $ruleSet->rulesFor('en-US');

foreach ($languageRule->rules as $rule) {
    // One line per rule: its type, then the patterns on either side of the break
    echo $rule->break ? 'break' : 'no-break';
    echo '  before: ' . $rule->before;
    echo '  after: '  . $rule->after . PHP_EOL;
}
```

Bundled default SRX
-------------------

The package ships a `data/default.srx` file covering:

- English (`EN.*`)
- Hindi (`HI.*`) — Devanagari Purna Viram `।`
- Urdu (`UR.*`) — Arabic Full Stop `۔`
- Arabic (`AR.*`)
- French (`FR.*`)
- German (`DE.*`)
- Spanish (`ES.*`)
- Chinese / Japanese (`ZH.*`, `JA.*`)
- `default` fallback rule (period followed by space and uppercase)

Get its path via the static helper:

```
$path = SrxParser::defaultSrxPath();
```

SRX format overview
-------------------

SRX 2.0 is an XML format. A rule set contains:

1. **`<languagerule>`** blocks: named sets of break/no-break rules for a language
2. **`<languagemap>`** entries: regex patterns for BCP 47 language codes, each mapped to a rule name

The parser resolves a language code by scanning `<languagemap>` entries in document order and returning the rule of the first entry whose pattern matches. If no entry matches, an empty `LanguageRule` is returned (no segmentation).
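
The first-match lookup described above can be sketched in plain PHP. This is only an illustration, not the package's code: `resolveRuleName()` and the rule names are hypothetical, and the sketch assumes patterns are matched against the whole code, case-insensitively (so that `EN.*` accepts `en-US`). It returns `null` where the parser would return an empty `LanguageRule`.

```php
<?php
// Hypothetical helper, not part of opencat/srx: returns the rule name of the
// first map entry whose pattern matches the whole language code.
function resolveRuleName(array $languageMaps, string $languageCode): ?string
{
    foreach ($languageMaps as [$pattern, $ruleName]) {
        // Assumption: patterns are anchored and case-insensitive.
        $regex = '~^(?:' . str_replace('~', '\~', $pattern) . ')$~iu';
        if (preg_match($regex, $languageCode) === 1) {
            return $ruleName; // document order decides: first match wins
        }
    }

    return null; // no entry matched
}

// Entries in document order: [languagepattern, languagerulename].
// The rule names here are made up for the example.
$languageMaps = [
    ['EN.*', 'English'],
    ['HI.*', 'Hindi'],
    ['.*',   'default'],
];

echo resolveRuleName($languageMaps, 'en-US'), PHP_EOL; // English
echo resolveRuleName($languageMaps, 'hi'), PHP_EOL;    // Hindi
echo resolveRuleName($languageMaps, 'fr-FR'), PHP_EOL; // default
```

Because the scan stops at the first hit, a catch-all pattern such as `.*` only works as a fallback when it comes last.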

Each `<rule>` inside a `<languagerule>` has:

- `break="yes|no"`: whether this position is a sentence boundary
- `<beforebreak>`: regex that must match the text *before* the candidate break position
- `<afterbreak>`: regex that must match the text *after* it

Rules are evaluated in order — the first matching rule wins.
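
To make the "first matching rule wins" behaviour concrete, here is a minimal sketch of how ordered break/no-break rules could be applied to a string. It is not the `opencat/segmentation` implementation: `splitSentences()` is a hypothetical helper, rules are plain arrays rather than `SegmentationRule` objects, and the patterns are assumed not to contain the `~` delimiter.

```php
<?php
// Hypothetical helper. Each rule is [break, beforePattern, afterPattern].
function splitSentences(string $text, array $rules): array
{
    $segments = [];
    $start = 0;
    $length = strlen($text);

    // Treat every position between two bytes as a candidate break.
    for ($pos = 1; $pos < $length; $pos++) {
        $before = substr($text, 0, $pos);
        $after  = substr($text, $pos);

        foreach ($rules as [$break, $beforePattern, $afterPattern]) {
            // "before" must end exactly at the position, "after" must start there.
            if (preg_match('~(?:' . $beforePattern . ')\z~u', $before) === 1
                && preg_match('~^(?:' . $afterPattern . ')~u', $after) === 1) {
                if ($break) {
                    $segments[] = substr($text, $start, $pos - $start);
                    $start = $pos;
                }
                break; // first matching rule wins, whether break or no-break
            }
        }
    }

    $segments[] = substr($text, $start);

    return $segments;
}

$rules = [
    [false, '\b(Mr|Mrs|Dr|Prof)\.', '\s'],  // exception: no break after a title
    [true,  '[.!?]',                '\s+[A-Z]'],
];

foreach (splitSentences('Dr. Smith arrived. He sat down.', $rules) as $segment) {
    echo '[' . $segment . ']' . PHP_EOL;
}
// [Dr. Smith arrived.]
// [ He sat down.]
```

The no-break rule has to come before the general break rule; with the order reversed, the period after `Dr` would be treated as a boundary. In this sketch the whitespace after a break stays at the start of the next segment.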

Classes
-------

| Class | Purpose |
| --- | --- |
| `SrxParser` | Parses an SRX file into a `SegmentationRuleSet` |
| `SegmentationRuleSet` | Holds all language rules and maps a BCP 47 code to a `LanguageRule` |
| `LanguageRule` | A named list of `SegmentationRule` objects for one language |
| `SegmentationRule` | A single break/no-break rule with `before` and `after` patterns |

Writing custom SRX rules
------------------------

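```
<!-- No break after a common title such as "Dr." -->
<rule break="no">
  <beforebreak>\b(Mr|Mrs|Dr|Prof)\.</beforebreak>
  <afterbreak>\s</afterbreak>
</rule>
<!-- Break after sentence-final punctuation followed by whitespace and a capital -->
<rule break="yes">
  <beforebreak>[.!?]</beforebreak>
  <afterbreak>\s+[A-Z]</afterbreak>
</rule>
```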
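
Before handing a custom file to `SrxParser`, it can help to check with `ext-dom` that the XML is well formed and contains the rules you expect. The sketch below is independent of this package. The rule name `Sample` is made up, the `<header>` element is left out for brevity, and treating a missing `break` attribute as `yes` is an assumption of the sketch.

```php
<?php
// Illustrative SRX 2.0 document wrapping two custom rules.
$srx = <<<'XML'
<srx xmlns="http://www.lisa.org/srx20" version="2.0">
  <body>
    <languagerules>
      <languagerule languagerulename="Sample">
        <rule break="no">
          <beforebreak>\b(Mr|Mrs|Dr|Prof)\.</beforebreak>
          <afterbreak>\s</afterbreak>
        </rule>
        <rule break="yes">
          <beforebreak>[.!?]</beforebreak>
          <afterbreak>\s+[A-Z]</afterbreak>
        </rule>
      </languagerule>
    </languagerules>
    <maprules>
      <languagemap languagepattern=".*" languagerulename="Sample"/>
    </maprules>
  </body>
</srx>
XML;

$doc = new DOMDocument();
if (!$doc->loadXML($srx)) {
    throw new RuntimeException('SRX document is not well-formed XML');
}

// Collect every <rule> as a plain array, in document order.
$rules = [];
foreach ($doc->getElementsByTagName('rule') as $ruleNode) {
    $before = $ruleNode->getElementsByTagName('beforebreak')->item(0);
    $after  = $ruleNode->getElementsByTagName('afterbreak')->item(0);
    $rules[] = [
        'break'  => $ruleNode->getAttribute('break') !== 'no', // assumed default: yes
        'before' => $before ? $before->textContent : '',
        'after'  => $after ? $after->textContent : '',
    ];
}

foreach ($rules as $rule) {
    echo $rule['break'] ? 'break' : 'no-break';
    echo '  before: ' . $rule['before'];
    echo '  after: '  . $rule['after'] . PHP_EOL;
}
```

The loop prints the rules in the same shape as the Usage example, which makes it easy to compare against what `SrxParser` reports for the same file.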

Related packages
----------------

- [`opencat/core`](https://github.com/shaikhammar/opencat-framework/tree/main/packages/core) — `SegmentationException` thrown on parse failure
- [`opencat/segmentation`](https://github.com/shaikhammar/opencat-framework/tree/main/packages/segmentation) — consumes `SegmentationRuleSet` to split `Segment` objects

