
opencat/translation-memory
==========================

SQLite-backed translation memory with fuzzy matching for the OpenCAT Framework


[Source](https://github.com/shaikhammar/opencat-translation-memory) | [Packagist](https://packagist.org/packages/opencat/translation-memory)


SQLite-backed translation memory with exact and fuzzy matching for the [OpenCAT Framework](https://github.com/shaikhammar/opencat-framework).

Stores `TranslationUnit` objects, looks them up by similarity against a source `Segment`, and imports/exports via TMX. A PostgreSQL backend is also available for multi-user deployments.

Installation
------------


```
composer require opencat/translation-memory
```

Requires `ext-pdo`, `ext-pdo_sqlite`, `ext-intl`, and `ext-mbstring`.

For PostgreSQL: install `ext-pdo_pgsql` and enable the `pg_trgm` extension in the database.

SQLite TM
---------


```
use CatFramework\TranslationMemory\SqliteTranslationMemory;

$pdo = new PDO('sqlite:project.db');
$tm  = new SqliteTranslationMemory($pdo);
// Schema is created automatically on first instantiation
```

Storing translation units
-------------------------


```
use CatFramework\Core\Model\TranslationUnit;

$tm->store(new TranslationUnit(
    source: $sourceSegment,
    target: $targetSegment,
    sourceLanguage: 'en-US',
    targetLanguage: 'fr-FR',
    createdAt: new DateTimeImmutable(),
    createdBy: 'translator@example.com',
));
```

Duplicate entries (same language pair and normalised source text) are silently overwritten with the new translation.
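
The overwrite behaviour can be pictured as an upsert keyed on the language pair and the normalised source text. The sketch below is an illustration under stated assumptions, not the package's code: the `tm_units` table, its columns, and the `upsertUnit()` helper are hypothetical, and the normalisation is a simplified lowercase-and-whitespace version. It needs `ext-pdo_sqlite`, `ext-mbstring`, and SQLite 3.24 or later for `ON CONFLICT ... DO UPDATE`.

```php
<?php
// Hypothetical schema for illustration; the package's real tables may differ.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE tm_units (
    source_lang TEXT NOT NULL,
    target_lang TEXT NOT NULL,
    source_norm TEXT NOT NULL,
    target_text TEXT NOT NULL,
    UNIQUE (source_lang, target_lang, source_norm)
)');

// Hypothetical helper: insert a unit, or replace the target if the key already exists.
function upsertUnit(PDO $pdo, string $srcLang, string $tgtLang, string $source, string $target): void
{
    // Simplified normalisation (assumption): lowercase, collapse whitespace, trim.
    $norm = trim((string) preg_replace('/\s+/u', ' ', mb_strtolower($source, 'UTF-8')));

    $stmt = $pdo->prepare(
        'INSERT INTO tm_units (source_lang, target_lang, source_norm, target_text)
         VALUES (:sl, :tl, :norm, :target)
         ON CONFLICT (source_lang, target_lang, source_norm)
         DO UPDATE SET target_text = excluded.target_text'
    );
    $stmt->execute([':sl' => $srcLang, ':tl' => $tgtLang, ':norm' => $norm, ':target' => $target]);
}

upsertUnit($pdo, 'en-US', 'fr-FR', 'Save file', 'Enregistrer le fichier');
upsertUnit($pdo, 'en-US', 'fr-FR', '  save   FILE ', 'Sauvegarder le fichier');
```

Both sources normalise to `save file`, so the second call replaces the first translation instead of adding a second row.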

Looking up matches
------------------


```
$matches = $tm->lookup(
    source: $segment,
    sourceLanguage: 'en-US',
    targetLanguage: 'fr-FR',
    minScore: 0.7,    // 0.0–1.0, default 0.7
    maxResults: 5,    // default 5
);

foreach ($matches as $match) {
    echo round($match->score * 100) . '%  ' . $match->type->name . PHP_EOL;
    echo $match->translationUnit->target->getPlainText() . PHP_EOL;
}
```

Results are sorted by score descending. `$match->type` is one of:

| Score | Type | Meaning |
| --- | --- | --- |
| 1.0 | `EXACT` | Identical text and identical inline codes |
| 1.0 | `EXACT_TEXT` | Identical plain text, but inline codes differ |
| < 1.0 | `FUZZY` | Character-level similarity above `$minScore` |

Importing and exporting TMX
---------------------------


```
$count = $tm->import('memory.tmx');   // returns number of units imported
$count = $tm->export('backup.tmx');   // returns number of units exported
```

Import uses the streaming TMX reader, so large files are processed without loading everything into memory.
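
To illustrate what streaming means here, the following sketch reads a TMX file one node at a time with PHP's built-in `XMLReader` and yields each `<tu>` as soon as it is complete. It is not the package's `TmxReader`: the `streamTmxUnits()` function is hypothetical, it keeps only the plain text of each `<seg>`, and it ignores TMX headers and inline codes.

```php
<?php
/**
 * Hypothetical helper: yield one array per <tu>, keyed by xml:lang,
 * without building the whole document in memory.
 */
function streamTmxUnits(string $path): Generator
{
    $reader = new XMLReader();
    if (!$reader->open($path)) {
        throw new RuntimeException("Cannot open TMX file: $path");
    }

    $unit = null; // segments collected for the <tu> currently being read
    $lang = null; // language of the <tuv> currently being read

    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT) {
            switch ($reader->name) {
                case 'tu':
                    $unit = [];
                    break;
                case 'tuv':
                    $lang = $reader->getAttribute('xml:lang');
                    break;
                case 'seg':
                    if ($unit !== null && $lang !== null) {
                        // readString() flattens inline markup to plain text.
                        $unit[$lang] = $reader->readString();
                    }
                    break;
            }
        } elseif ($reader->nodeType === XMLReader::END_ELEMENT && $reader->name === 'tu') {
            yield $unit; // hand the finished unit to the caller
            $unit = null;
        }
    }

    $reader->close();
}
```

A caller would iterate with `foreach (streamTmxUnits('memory.tmx') as $unit) { ... }`, so only one unit is held in memory at a time.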

How fuzzy matching works
------------------------


1. **Normalisation** — source text is normalised through a pipeline before storage and again at lookup: NFC Unicode → lowercase → collapse whitespace → trim. This makes matching robust to capitalisation and whitespace differences.
2. **Length pre-filter** — only candidates whose character count falls within `[sourceLen × minScore, sourceLen ÷ minScore]` are retrieved from the database. This is a fast index scan that avoids running Levenshtein on the entire TM.
3. **Levenshtein similarity** — for ASCII text, PHP's native `levenshtein()` is used. For multibyte text (Hindi, Urdu, Arabic, CJK), `ext-intl` grapheme-cluster arrays are used so that multi-byte characters count as single edit operations.
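
The three steps above can be sketched as follows. This is a minimal illustration, not the package's implementation: all function names are hypothetical, the score is assumed to be `1 - distance / max(length)`, and PCRE's `\X` is used to split grapheme clusters as a stand-in for the `ext-intl` grapheme functions so that the sketch runs with only `ext-mbstring` (NFC is applied only when `ext-intl` is loaded).

```php
<?php
/** Step 1: NFC (when ext-intl is loaded), lowercase, collapse whitespace, trim. */
function normaliseText(string $text): string
{
    if (class_exists(Normalizer::class)) {
        $nfc = Normalizer::normalize($text, Normalizer::FORM_C);
        if ($nfc !== false) {
            $text = $nfc;
        }
    }
    $text = mb_strtolower($text, 'UTF-8');
    return trim((string) preg_replace('/\s+/u', ' ', $text));
}

/** Step 2: candidate lengths outside this window cannot reach $minScore. */
function lengthWindow(int $sourceLen, float $minScore): array
{
    if ($minScore <= 0.0 || $minScore > 1.0) {
        throw new InvalidArgumentException('minScore must be in (0, 1]');
    }
    return [(int) ceil($sourceLen * $minScore), (int) floor($sourceLen / $minScore)];
}

/** Split into grapheme clusters with PCRE \X (stand-in for ext-intl). */
function graphemes(string $text): array
{
    preg_match_all('/\X/u', $text, $m);
    return $m[0];
}

/** Levenshtein distance over two lists, one list element per edit unit. */
function levenshteinArrays(array $a, array $b): int
{
    $n = count($b);
    $prev = range(0, $n);
    foreach ($a as $i => $ca) {
        $curr = [$i + 1];
        foreach ($b as $j => $cb) {
            $curr[] = min(
                $prev[$j + 1] + 1,                  // deletion
                $curr[$j] + 1,                      // insertion
                $prev[$j] + ($ca === $cb ? 0 : 1)   // substitution
            );
        }
        $prev = $curr;
    }
    return $prev[$n];
}

/** Step 3: score in [0, 1], assuming 1 - distance / longer length. */
function fuzzyScore(string $a, string $b): float
{
    $a = normaliseText($a);
    $b = normaliseText($b);
    if ($a === $b) {
        return 1.0;
    }
    if (preg_match('/[^\x00-\x7F]/', $a . $b) === 0) {
        // Pure ASCII: bytes and characters coincide, so native levenshtein() is safe.
        return 1.0 - levenshtein($a, $b) / max(strlen($a), strlen($b));
    }
    $ga = graphemes($a);
    $gb = graphemes($b);
    return 1.0 - levenshteinArrays($ga, $gb) / max(count($ga), count($gb));
}
```

Under the assumed score formula the window in step 2 is safe because the edit distance is never smaller than the difference in length: a candidate shorter than `sourceLen × minScore` or longer than `sourceLen ÷ minScore` scores below `minScore` even if every character it shares with the source lines up.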

Custom normaliser pipeline
--------------------------


```
use CatFramework\TranslationMemory\Normalizer\NormalizerInterface;

class MyNormalizer implements NormalizerInterface
{
    public function normalize(string $text): string
    {
        return mb_strtolower($text);  // custom logic
    }
}

$tm->setNormalizers([new MyNormalizer()]);
```
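
As a rough picture of how a list of normalisers might be combined, the sketch below chains them so that each one receives the previous one's output. The source does not say how `setNormalizers()` orders or applies its entries, so the array-order behaviour is an assumption, and `TextNormalizer`, the two example classes, and `applyNormalizers()` are hypothetical stand-ins declared locally so the example runs without the package.

```php
<?php
// Hypothetical stand-in for the package's NormalizerInterface.
interface TextNormalizer
{
    public function normalize(string $text): string;
}

final class LowercaseNormalizer implements TextNormalizer
{
    public function normalize(string $text): string
    {
        return mb_strtolower($text, 'UTF-8');
    }
}

final class WhitespaceNormalizer implements TextNormalizer
{
    public function normalize(string $text): string
    {
        // Collapse runs of whitespace to one space, then trim the ends.
        return trim((string) preg_replace('/\s+/u', ' ', $text));
    }
}

/**
 * Assumption: normalisers run in array order, each fed the previous result.
 *
 * @param TextNormalizer[] $normalizers
 */
function applyNormalizers(array $normalizers, string $text): string
{
    return array_reduce(
        $normalizers,
        fn (string $carry, TextNormalizer $n): string => $n->normalize($carry),
        $text
    );
}
```

Because the same pipeline is applied at storage and at lookup, changing it on an existing TM would likely leave previously stored entries normalised under the old rules.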

PostgreSQL TM
-------------


For multi-user or large-scale deployments:

```
use CatFramework\TranslationMemory\PostgresTranslationMemory;

$pdo = new PDO('pgsql:host=localhost;dbname=catdb', 'user', 'pass');
$tm  = new PostgresTranslationMemory($pdo);
```

Requires the `pg_trgm` extension enabled in PostgreSQL (`CREATE EXTENSION IF NOT EXISTS pg_trgm`). The PostgreSQL backend uses trigram similarity for fuzzy matching instead of Levenshtein, which scales better for large TMs.
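
For intuition about what trigram similarity measures, here is a pure-PHP approximation of the rule documented for `pg_trgm`: each word is lowercased and padded with two leading spaces and one trailing space, three-character windows are collected into a set, and the score is the number of shared trigrams divided by the number of distinct trigrams overall. The function names are hypothetical, and the real extension computes this inside PostgreSQL with index support, so treat this only as a model of the scoring.

```php
<?php
/** Distinct trigrams of a string, following pg_trgm's documented word padding. */
function trigrams(string $text): array
{
    $set = [];
    // Non-alphanumeric characters act as word separators.
    $words = preg_split('/[^\p{L}\p{N}]+/u', mb_strtolower($text, 'UTF-8'), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $word) {
        $padded = '  ' . $word . ' ';
        $len = mb_strlen($padded, 'UTF-8');
        for ($i = 0; $i + 3 <= $len; $i++) {
            $set[mb_substr($padded, $i, 3, 'UTF-8')] = true;
        }
    }
    // Array keys that look like integers are cast by PHP, so cast back to strings.
    return array_map('strval', array_keys($set));
}

/** Shared trigrams divided by all distinct trigrams (0.0 when both sets are empty). */
function trigramSimilarity(string $a, string $b): float
{
    $ta = trigrams($a);
    $tb = trigrams($b);
    $union = count(array_unique(array_merge($ta, $tb)));
    if ($union === 0) {
        return 0.0;
    }
    return count(array_intersect($ta, $tb)) / $union;
}
```

Unlike Levenshtein, this score depends only on which trigrams occur, not on where, which is what lets a database answer it from an index instead of comparing every candidate pair.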

Related packages
----------------


- [`opencat/core`](https://github.com/shaikhammar/opencat-framework/tree/main/packages/core) — `TranslationUnit`, `Segment`, `MatchResult`, `TranslationMemoryInterface`
- [`opencat/tmx`](https://github.com/shaikhammar/opencat-framework/tree/main/packages/tmx) — `TmxReader` used by `import()`, `TmxWriter` used by `export()`
- [`opencat/workflow`](https://github.com/shaikhammar/opencat-framework/tree/main/packages/workflow) — uses `SqliteTranslationMemory` in the processing pipeline


