matchory/data-pipe
==================

A data processing pipeline framework


Data Pipe
=========


> An opinionated framework for building data enrichment pipelines in PHP

Data Pipe is a framework to create data enrichment pipelines in PHP. Such an application works by taking a piece of information, enriching it with additional data, and then refining that data by applying transformations to it.

As a more tangible example, take a *customer* pipeline: It ingests the name of a customer, retrieves their *shopping history* and *age*, then enhances the record by removing old items from the shopping history, and assigning a targeting group to the customer.

While that, of course, merely describes some arbitrary business logic, Data Pipe helps you to describe this process with a set of reusable, composable, and encapsulated steps!

Preface
-------


Please note that this package is still **under active development** and **NOT ready** to be used in production environments yet. We're still building our own workflow on top of data-pipe, so everything is subject to change until the 1.0 release. If you're interested in shaping the future of this library, you're very welcome to jump in!

Installation
------------


Install the library as a dependency using Composer:

```
composer require matchory/data-pipe
```

### Symfony Usage


This package includes a Symfony integration. Please [read the instructions](./src/Integration/Symfony/README.md) to get started.
The integration will add fully automatic pipeline configuration to your app.

### Laravel Usage


This package includes an **incomplete** Laravel integration. Please [read the instructions](./src/Integration/Laravel/README.md) to get started.

> **Note:** We haven't implemented Laravel support yet, because we don't currently need it. If you're interested in using `data-pipe` within a Laravel application, and would like to have automatic pipeline configuration as with Symfony, please [open an issue](https://github.com/matchory/data-pipe/issues).

Usage
-----


> **Note:** Before getting started with Data Pipe, you should familiarize yourself with [its core concepts](#core-concepts).

Data Pipe works by setting up pipelines with a pre-configured set of inter-dependent nodes. There are currently two types of nodes: [Collector nodes](#collector-nodes) and [Transformer nodes](#transformer-nodes) (which are both variants of generic pipeline nodes).
Nodes take a payload object, modify it, and return it. Collector nodes enrich the payload with new data; transformer nodes post-process existing values. This distinction might seem irrelevant, but it allows for many runtime optimizations.

### Creating nodes


In its simplest form, an enriching node might look like this:

```
use Matchory\DataPipe\Nodes\AbstractCollector as Node;
use Matchory\DataPipe\PipelineContext;

class MyNode extends Node
{
    public function __construct(protected $yourInternalAgeApi) {}

    public function pipe(PipelineContext $context): PipelineContext
    {
        // Work with the data payload
        $email = $context->getPayload()->getAttribute('email');

        // Perform domain-specific work
        $age = $this->yourInternalAgeApi->query($email);

        // Update the payload
        if ($age) {
            $context->proposeChange($this, 'age', $age);
        }

        return $context;
    }
}
```

### Proposing changes


Note that you cannot update the payload directly: every node receives only a clone of the actual payload. Instead, you *propose* a change to it. Data Pipe provides a simple algorithm for [best-fit change application](#best-fit-change-application), which lets it keep and compare multiple values for a single attribute.
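To make the clone-and-propose idea more concrete, here is a minimal, self-contained sketch. It does not use Data Pipe's own classes: `SketchPayload` and `SketchContext` are hypothetical stand-ins, and this `proposeChange()` omits the node argument used by the real method shown earlier.

```php
<?php

// Hypothetical stand-in for a payload; not part of Data Pipe
final class SketchPayload
{
    public function __construct(public array $attributes) {}
}

// Hypothetical stand-in for the pipeline context; not part of Data Pipe
final class SketchContext
{
    /** @var list<array{attribute: string, value: mixed}> */
    public array $proposals = [];

    public function __construct(private SketchPayload $payload) {}

    // Nodes only ever see a copy, so direct writes never reach the original
    public function getPayload(): SketchPayload
    {
        return clone $this->payload;
    }

    // Changes are recorded so they can be compared before any is applied
    public function proposeChange(string $attribute, mixed $value): void
    {
        $this->proposals[] = ['attribute' => $attribute, 'value' => $value];
    }
}

$context = new SketchContext(new SketchPayload(['email' => 'foo@bar.com']));

$copy = $context->getPayload();
$copy->attributes['age'] = 42;      // lost: only the clone changes
$context->proposeChange('age', 42); // kept: recorded as a proposal
$context->proposeChange('age', 43); // a second candidate for the same attribute

$original = $context->getPayload();
```

After this runs, `$original` still has no `age` attribute, while both candidate values are available in `$context->proposals` for later comparison.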

### Creating pipelines


Now that we have a node, let's create a pipeline to add it to:

```
use Matchory\DataPipe\Payload\Payload;
use Matchory\DataPipe\Pipeline;
use Symfony\Component\EventDispatcher\EventDispatcher;

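$nodes = [
    // MyNode expects its age API client as a constructor argument
    new MyNode($yourInternalAgeApi),
];
$eventDispatcher = new EventDispatcher();
$pipeline = new Pipeline($nodes, $eventDispatcher);

// Yields the payloads the pipeline should process
function fetchNextPayload(): Generator {
    yield new Payload([
        'email' => 'foo@bar.com',
    ]);
}

$pipeline->process(fetchNextPayload());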
```

### DI usage


This is a contrived example, of course; in reality, a dependency-injection container would handle almost everything for you:

```
use Matchory\DataPipe\Pipeline;

class EntryPoint {
    public function main(Pipeline $pipeline, Generator $recordFetcher): void
    {
        // As in the previous example, the pipeline consumes the generator itself
        $pipeline->process($recordFetcher);
    }
}
```

Core Concepts
-------------


Data Pipe uses a few building blocks to structure your pipelines.

### Pipeline nodes


Nodes are the stages that form a pipeline. A node can depend on other nodes having run before it; Data Pipe resolves these dependencies before the pipeline runs, so you don't have to define an order manually. Every payload processed by the pipeline is piped to all of its nodes, and each node has the option to suggest changes to the data.
There are currently two types of nodes, collector nodes and transformer nodes, described in the sections that follow.
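The dependency resolution mentioned above can be pictured as a topological sort. The following is only a sketch of that general technique, not Data Pipe's actual implementation: the function `resolveExecutionOrder()` and the node names are hypothetical, and dependencies are given as a plain array instead of node objects.

```php
<?php

/**
 * Orders node names so that each node comes after the nodes it depends on.
 * Hypothetical helper for illustration; not part of Data Pipe.
 *
 * @param array<string, list<string>> $dependencies node name => names it depends on
 * @return list<string>
 */
function resolveExecutionOrder(array $dependencies): array
{
    $order = [];
    $state = []; // node name => 'visiting' | 'done'

    $visit = function (string $node) use (&$visit, &$order, &$state, $dependencies): void {
        if (($state[$node] ?? null) === 'done') {
            return;
        }
        if (($state[$node] ?? null) === 'visiting') {
            throw new LogicException("Circular dependency involving {$node}");
        }

        // Depth-first: place all dependencies before the node itself
        $state[$node] = 'visiting';
        foreach ($dependencies[$node] ?? [] as $dependency) {
            $visit($dependency);
        }
        $state[$node] = 'done';
        $order[] = $node;
    };

    foreach (array_keys($dependencies) as $node) {
        $visit($node);
    }

    return $order;
}

$order = resolveExecutionOrder([
    'targeting_group' => ['age', 'shopping_history'],
    'age' => [],
    'shopping_history' => [],
]);
```

Here `targeting_group` is declared first but ends up last in `$order`, because both of its dependencies have to run before it. A cycle between nodes raises a `LogicException` instead of looping forever.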

#### Collector nodes


Nodes that enhance a record with additional information are called *collector nodes*. These nodes may optionally define a *cost*, which is used to order them and to determine whether executing additional nodes is even necessary.
Imagine you have two data sources: your own internal database, and an external system that charges per API call. The node for your database will have a lower cost than that of the external API. Now, if we're looking for a piece of information, we'll first execute the "cheaper" node (your internal database), then, *only if it can't satisfy our request*, we'll also execute the more expensive node.

The more nodes you have, the more apparent the advantage of granular costs will be: Information will always be acquired with the cheapest means possible.
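As a rough illustration of cost-ordered collection, the sketch below sorts collectors by cost and stops as soon as one of them returns a value. It is written in plain PHP under stated assumptions: `collectCheapest()`, the array shape, the cost numbers, and the example addresses are all hypothetical and do not reflect how Data Pipe models costs internally.

```php
<?php

/**
 * Runs collectors from cheapest to most expensive and stops at the first result.
 * Hypothetical helper for illustration; not part of Data Pipe.
 *
 * @param list<array{name: string, cost: int, collect: callable(string): ?int}> $collectors
 * @return array{value: ?int, executed: list<string>}
 */
function collectCheapest(array $collectors, string $email): array
{
    // Cheapest source first
    usort($collectors, fn (array $a, array $b): int => $a['cost'] <=> $b['cost']);

    $executed = [];
    foreach ($collectors as $collector) {
        $executed[] = $collector['name'];
        $value = $collector['collect']($email);

        if ($value !== null) {
            // Request satisfied: more expensive collectors are skipped
            return ['value' => $value, 'executed' => $executed];
        }
    }

    return ['value' => null, 'executed' => $executed];
}

$collectors = [
    // Paid external API: always answers, but is expensive
    ['name' => 'external_api', 'cost' => 100, 'collect' => fn (string $email): ?int => 42],
    // Internal database: cheap, but only knows some customers
    ['name' => 'internal_db', 'cost' => 1, 'collect' => fn (string $email): ?int => $email === 'known@example.com' ? 37 : null],
];

$hit = collectCheapest($collectors, 'known@example.com');
$miss = collectCheapest($collectors, 'unknown@example.com');
```

For the known customer, only `internal_db` is executed. For the unknown one, the lookup falls through to `external_api`, which is the only case where the paid call is made.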

#### Transformer nodes


Transformer nodes allow you to refine, modify, or compare previously gathered information. They differ from collector nodes in that they typically run *after* the collectors have gathered their data.
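To illustrate the kind of work a transformer does, here is a sketch based on the customer example from the introduction: it drops old purchases and derives a targeting group. It is plain PHP and deliberately avoids Data Pipe's classes; `transformCustomer()`, the record shape, and the age threshold are assumptions made up for this example.

```php
<?php

/**
 * Refines an already collected customer record.
 * Hypothetical helper for illustration; not part of Data Pipe.
 *
 * @param array{age: int, shopping_history: list<array{item: string, bought_at: string}>} $record
 */
function transformCustomer(array $record, DateTimeImmutable $cutoff): array
{
    // Drop purchases older than the cutoff date
    $record['shopping_history'] = array_values(array_filter(
        $record['shopping_history'],
        fn (array $purchase): bool => new DateTimeImmutable($purchase['bought_at']) >= $cutoff,
    ));

    // Derive a targeting group from data gathered earlier (arbitrary threshold)
    $record['targeting_group'] = $record['age'] < 30 ? 'young' : 'established';

    return $record;
}

$customer = transformCustomer(
    [
        'age' => 27,
        'shopping_history' => [
            ['item' => 'tent', 'bought_at' => '2015-03-01'],
            ['item' => 'boots', 'bought_at' => '2021-06-15'],
        ],
    ],
    new DateTimeImmutable('2020-01-01'),
);
```

Note that the function adds no information from outside: everything it produces is derived from values that collectors would have gathered beforehand.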

### Best-Fit change application


The more data sources you have, the more variants of pieces of information you will collect. What's problematic is determining the *best* of those variants - think of an email address for example:

-
-
-
-
-

By applying a few rules, you can probably infer which variant is closest to what you're looking for. To keep a sequence of nodes from overriding each other's results, nodes don't set attributes on the payload directly; they *suggest changes* instead:

```
$context->proposeChange($this, 'attribute_name', 42);
```

All nodes may propose changes to existing data, along with an optional *confidence score*. In the email case above, for example, we'd probably keep a grey-list of trashmail domains and assign any address from such a domain a low confidence score. The idea is: *take that email if nothing better can be found later on*.
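One simple way to picture confidence-based selection is to keep, for every attribute, the proposal with the highest score. The sketch below does exactly that and nothing more. It is an assumption about the general idea, not Data Pipe's actual algorithm: `applyBestFit()`, the proposal array shape, the scores, and the example addresses are all hypothetical.

```php
<?php

/**
 * Picks, for every attribute, the proposed value with the highest confidence.
 * Earlier proposals win ties. Hypothetical helper; not part of Data Pipe.
 *
 * @param list<array{attribute: string, value: mixed, confidence: float}> $proposals
 * @return array<string, mixed>
 */
function applyBestFit(array $proposals): array
{
    $best = [];
    foreach ($proposals as $proposal) {
        $attribute = $proposal['attribute'];

        // Replace the current candidate only if the new one is strictly better
        if (!isset($best[$attribute]) || $proposal['confidence'] > $best[$attribute]['confidence']) {
            $best[$attribute] = $proposal;
        }
    }

    // Reduce each winning proposal to its value, keyed by attribute name
    return array_map(fn (array $proposal) => $proposal['value'], $best);
}

$payload = applyBestFit([
    // Trashmail domain: usable, but only as a last resort
    ['attribute' => 'email', 'value' => 'jane@trashmail.example', 'confidence' => 0.2],
    ['attribute' => 'email', 'value' => 'jane@company.example', 'confidence' => 0.9],
    ['attribute' => 'age', 'value' => 34, 'confidence' => 0.5],
]);
```

The low-confidence trashmail address is kept as a candidate until a better one turns up, which matches the "take it if nothing better is found" idea.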

