
mkcg/php-query-model
====================

Agnostic model to efficiently query and scroll any kind of data (SQL, search engine, HTTP API, CSV, ...) and push it anywhere with an ETL.

Latest version: 0.9.14 · License: GPL-3.0-only · Requires PHP ^7.2.0

[Source](https://github.com/MKCG/php-query-model) · [Packagist](https://packagist.org/packages/mkcg/php-query-model)

Presentation
============


Simple multi-database library to search content on different engines and aggregate those results into document-oriented structures.

The library defines the class `MKCG\Model\DBAL\QueryEngine` to build documents using different `Drivers`.

It also defines a simple `ETL` to be able to easily and efficiently synchronize content between different datasources.

Engine API
==========


The `QueryEngine` API defines two methods: `query()` and `scroll()`.

Both fetch and build documents from the provided `\MKCG\Model\Model` and the matching `\MKCG\Model\DBAL\QueryCriteria`. The `scroll()` method, however, returns a `\Generator` and internally fetches in multiple batches so that large collections can be scrolled efficiently.
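The batching behind a generator-based scroll can be pictured with plain PHP. The sketch below is not the library's implementation: `scrollInBatches()` and the `$fetchBatch` callback are hypothetical names, and it assumes a datasource that can be read by offset and limit.

```php
<?php

/**
 * Hypothetical helper: yields items one by one while fetching them in batches.
 * $fetchBatch(int $offset, int $limit): array stands in for a real datasource call.
 */
function scrollInBatches(callable $fetchBatch, int $batchSize): \Generator
{
    if ($batchSize < 1) {
        throw new \InvalidArgumentException('Batch size must be at least 1.');
    }

    $offset = 0;

    do {
        $batch = $fetchBatch($offset, $batchSize);

        foreach ($batch as $item) {
            yield $item;
        }

        $offset += count($batch);
        // A short (or empty) batch means the collection is exhausted.
    } while (count($batch) === $batchSize);
}

// In-memory stand-in for a large collection.
$rows = range(1, 25);
$calls = 0;

$fetch = function (int $offset, int $limit) use ($rows, &$calls): array {
    $calls++;

    return array_slice($rows, $offset, $limit);
};

$seen = [];
foreach (scrollInBatches($fetch, 10) as $row) {
    $seen[] = $row;
}
```

Only one batch is held in memory at a time, which is the reason a generator suits large collections better than a method returning the full result set.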

Examples
--------


```
$model = Schema\User::make('default', 'user')
    ->with(Schema\Address::make())
    ->with(Schema\Post::make());

$criteria = (new QueryCriteria())
    ->forCollection('user')
        ->addFilter('status', FilterInterface::FILTER_IN, [ 2 , 3 , 5 , 7 ])
        ->addFilter('registered_at', FilterInterface::FILTER_GREATER_THAN_EQUAL, '2000-01-01')
        ->addSort('firstname', 'ASC')
        ->addSort('lastname', 'ASC')
        ->setLimit(10)
    ->forCollection('addresses')
        ->setLimitByParent(2)
    ->forCollection('posts')
        ->addFilter('title', FilterInterface::FILTER_FULLTEXT_MATCH, 'ab')
;

$users = $engine->query($model, $criteria);

echo json_encode($users->getContent(), JSON_PRETTY_PRINT) . "\n";
echo "\nFound : " . $users->getCount() . " users\n";

$iterator = $engine->scroll($model, $criteria);

foreach ($iterator as $user) {
    echo json_encode($user, JSON_PRETTY_PRINT) . "\n";
}
```

[![Example](./query_engine.svg)](./query_engine.svg)

Drivers definition
------------------


Each `Driver` is responsible for performing queries on a single `datasource` (database, HTTP API, local files, ...) and must:

- implement the `MKCG\Model\DBAL\Drivers\DriverInterface`
- be registered inside the `QueryEngine`

Example
-------


```
use MKCG\Model\DBAL\QueryEngine;
use MKCG\Model\DBAL\Drivers;

$mongoClient = new MongoDB\Client('mongodb://root:password@mongodb');

$redisClient = new \Predis\Client([
    'scheme' => 'tcp',
    'host' => 'redisearch',
    'port' => 6379
]);

$sqlConnection = \Doctrine\DBAL\DriverManager::getConnection([
    'user' => 'root',
    'password' => 'root',
    'host' => 'mysql',
    'driver' => 'pdo_mysql',
]);

$engine = (new QueryEngine('mysql'))
    ->registerDriver(new Drivers\Doctrine($sqlConnection), 'mysql')
    ->registerDriver(new Drivers\CsvReader($fixturePath), 'csv')
    ->registerDriver(new Drivers\RssReader(new Adapters\Guzzle), 'rss')
    ->registerDriver(new Drivers\SitemapReader(new Adapters\Guzzle), 'sitemap')
    ->registerDriver(new Drivers\Http(new Adapters\Guzzle), 'http')
    ->registerDriver(new Drivers\HttpRobot(new Adapters\Guzzle), 'http_robot')
    ->registerDriver(new Drivers\MongoDB($mongoClient), 'mongodb');
```

Runtime behaviors
-----------------


Drivers
=======


| Type | Name | Description |
|------|------|-------------|
| Document-oriented database | MongoDB | Driver for MongoDB 3.6+ |
| Relational database | Doctrine | Doctrine DBAL Adapter (MySQL, MariaDB are supported, other might not) |
| Search engine | Elasticsearch | Driver for Elasticsearch 5+ (Work in progress) |
| Search engine | Redisearch | Driver for redisearch |
| File reader | CsvReader | |
| HTTP | Http | Interact with remote url (Work in progress) |
| HTTP | HttpRobot | Parse robots.txt from remote url (Work in progress) |
| RSS | RssReader | Extract RSS from remote url |
| Sitemap | SitemapReader | Extract Sitemap urlset from remote url |

Features supported by driver
----------------------------

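| Driver | Scrollable | Filterable | Sortable | Aggregatable | Count |
|--------|------------|------------|----------|--------------|-------|
| CsvReader | YES | YES | NO | YES | YES |
| Doctrine | YES | YES | YES | YES | YES |
| Http | YES | YES | NO | NO | NO |
| HttpRobot | YES | NO | NO | NO | NO |
| MongoDB | YES | YES | YES | YES | YES |
| Redisearch | YES | YES | YES | YES | YES |
| RssReader | YES | YES | NO | NO | NO |
| SitemapReader | YES | YES | NO | NO | NO |
| Algolia | | | | | |
| Cassandra | | | | | |
| Elasticsearch | | | | | |
| PostgreSQL | | | | | |
| ScyllaDB | | | | | |
| Solr | | | | | |

Query criteria options
----------------------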

HTTP-based drivers :

- HTTP
- HttpRobot
- RssReader
- SitemapReader
- Elasticsearch

Result-based filterable drivers :

- CsvReader
- RssReader
- SitemapReader

| Option | Drivers | Description |
|--------|---------|-------------|
| case\_sensitive | MongoDB, Result-based filterable drivers | Perform case sensitive `FILTER_FULLTEXT_MATCH` search, default: `false` |
| filepath | CsvReader | Absolute or relative filepath of the CSV |
| delimiter | CsvReader | Field delimiter |
| json\_formatter | HTTP | Format JSON response body using a callback |
| multiple\_requests | none, used by the QueryEngine | Disable sub-requests batching when including sub-models |
| url | HTTP-based drivers | Define the URL to use to query |
| url\_generator | HTTP-based drivers | Use a callback to generate the URL to use based on the Query |
| max\_query\_time | HTTP-based drivers, MongoDB | Max query time in milliseconds, default: `5000` (5 seconds) |
| allow\_partial | MongoDB, Elasticsearch | Allow partial results to be returned, default: `false` |
| readPreference | MongoDB | |
| readConcern | MongoDB | |
| batchSize | MongoDB | |
| diacriticSensitive | MongoDB | |
| name | all | Overrides the name defined into the SchemaInterface |

When both `url_generator` and `url` are provided, then only `url_generator` is used.
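The precedence between `url_generator` and `url` can be sketched in a few lines. This is an illustration only: `resolveUrl()` is a hypothetical helper, the options are passed as a plain array, and the query is represented by an array rather than the library's `Query` object.

```php
<?php

/**
 * Hypothetical helper mirroring the documented precedence:
 * `url_generator` wins over `url` when both are set.
 */
function resolveUrl(array $options, array $query): string
{
    if (isset($options['url_generator']) && is_callable($options['url_generator'])) {
        return (string) $options['url_generator']($query);
    }

    if (isset($options['url'])) {
        return (string) $options['url'];
    }

    throw new \InvalidArgumentException('Either "url" or "url_generator" is required.');
}

$options = [
    'url' => 'https://example.com/static',
    // Assumed callback shape: receives the query, returns the URL to call.
    'url_generator' => function (array $query): string {
        return 'https://example.com/search?' . http_build_query($query);
    },
];

// Both options are set, so the generator is used and `url` is ignored.
$resolved = resolveUrl($options, ['q' => 'php', 'page' => 2]);

// Only `url` is set, so it is returned unchanged.
$static = resolveUrl(['url' => 'https://example.com/static'], []);
```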

Filters
=======


| Name | Constant name | Description |
|------|---------------|-------------|
| IN | FILTER\_IN | |
| NOT IN | FILTER\_NOT\_IN | |
| GT | FILTER\_GREATER\_THAN | |
| GTE | FILTER\_GREATER\_THAN\_EQUAL | |
| LT | FILTER\_LESS\_THAN | |
| LTE | FILTER\_LESS\_THAN\_EQUAL | |
| MATCH | FILTER\_FULLTEXT\_MATCH | Text search |
| CUSTOM | FILTER\_CUSTOM | Allow to use a callable to apply complex filters |

Constants are defined by the interface **MKCG\\Model\\DBAL\\FilterInterface**

Filters supported by driver
---------------------------


| Driver | IN | NOT IN | GT | GTE | LT | LTE | MATCH | CUSTOM |
|--------|----|--------|----|-----|----|-----|-------|--------|
| Http | NO | NO | NO | NO | NO | NO | NO | NO |
| HttpRobot | NO | NO | NO | NO | NO | NO | NO | NO |
| Doctrine | YES | YES | YES | YES | YES | YES | Interpreted as LIKE "%value%" | YES |
| Elasticsearch | YES | YES | YES | YES | YES | YES | YES, using elasticsearch `match` filter | WIP |
| MongoDB | YES | YES | YES | YES | YES | YES | YES, using mongodb `$text` operator | YES |
| Redisearch | YES | YES | YES | YES | YES | YES | YES, using redisearch `search` syntax | NO |
| CsvReader | YES | YES | YES | YES | YES | YES | Interpreted as LIKE "%value%" | YES |
| RssReader | YES | YES | YES | YES | YES | YES | Interpreted as LIKE "%value%" | YES |
| SitemapReader | YES | YES | YES | YES | YES | YES | Interpreted as LIKE "%value%" | YES |

CUSTOM filter type
------------------

Custom filters can be applied by providing a `callable` to the `QueryCriteria` instance :

```
(new QueryCriteria())
    ->forCollection('order')
        ->addCallableFilter(function(Query $query, ...$arguments) {
            // do something
        })
```

The first argument of the `callable` SHOULD always be the `Query` instance. Other arguments might change depending on the driver.

Some drivers apply filters to the fetched results and expect the callable to return `false` when an item does not match. Internally, they run `array_filter` over each set of fetched results:

- CsvReader
- RssReader
- SitemapReader
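For these result-based drivers, the behaviour described above can be approximated as follows. This is a sketch with made-up data: a `stdClass` object stands in for the `Query` instance, and the way the driver wraps the callable is an assumption rather than the library's actual code.

```php
<?php

// Raw items as a result-based driver might hold them after reading a CSV (made-up data).
$items = [
    ['id' => 1, 'status' => 'paid',    'total' => 120.0],
    ['id' => 2, 'status' => 'pending', 'total' => 80.0],
    ['id' => 3, 'status' => 'paid',    'total' => 35.5],
];

// Stand-in for the Query instance passed as first argument.
$query = new \stdClass();

// Custom filter: keep paid orders of at least 100. Returning false drops the item.
$filter = function ($query, array $item): bool {
    return $item['status'] === 'paid' && $item['total'] >= 100;
};

// Assumed driver-side wrapping: every raw item goes through the callable.
$kept = array_values(array_filter($items, function (array $item) use ($query, $filter): bool {
    return $filter($query, $item) !== false;
}));
```

Because the filter runs in PHP after the data has been read, it can express any condition, at the cost of fetching items that are later discarded.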

`callable` arguments by Driver
------------------------------


| Driver | First argument | Second argument |
|--------|----------------|-----------------|
| Doctrine | \\MKCG\\Model\\DBAL\\Query | \\Doctrine\\DBAL\\Query\\QueryBuilder |
| CsvReader | \\MKCG\\Model\\DBAL\\Query | `array` representing a raw item |
| RssReader | \\MKCG\\Model\\DBAL\\Query | `array` representing a raw item |
| SitemapReader | \\MKCG\\Model\\DBAL\\Query | `array` representing a raw item |
| MongoDB | \\MKCG\\Model\\DBAL\\Query | `array` representing the filters passed as first argument of `\MongoDB\Collection::find()` |

ETL
===

A dead-simple ETL is defined as a single class, `\MKCG\Model\ETL`. It can be combined with the QueryEngine `scroll` API to transform content and then push it to different loaders.

Example
-------


```
function pipelineEtl(QueryEngine $engine)
{
    $model = Schema\Product::make('default', 'products');
    $criteria = (new QueryCriteria())
        ->forCollection('products')
            ->addFilter('sku.color', FilterInterface::FILTER_IN, ['aqua', 'purple'])
        ;

    $iterator = $engine->scroll($model, $criteria, 100);

    $pushed = ETL::extract($iterator, 1000, 500)
        ->transform(function($item) {
            return [
                'id' => $item['_id'],
                'sku' => $item['sku']
            ];
        })
        ->transform(function($item) {
            return $item + [
                'sku_count' => count($item['sku'] ?? [])
            ];
        })
        ->load(function(iterable $bulk) {
            echo sprintf("[ETL] Loader 1 - Loading %d elements\n", count($bulk));
        })
        ->load(function(iterable $bulk) {
            echo sprintf("[ETL] Loader 2 - Loading %d elements\n", count($bulk));
        })
        ->load(function(iterable $bulk) {
            echo sprintf("[ETL] Loader 3 - Loading %d elements\n", count($bulk));
        })
        ->run();

    echo sprintf("[ETL] Pushed %d elements\n", $pushed);
}
```
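The extract, transform and load steps chained above can be pictured with a small standalone pipeline. This is a sketch, not the `ETL` class itself: `runPipeline()` is a hypothetical function, and since the meaning of the two numeric arguments of `ETL::extract()` is not documented here, it takes a single bulk size instead.

```php
<?php

/**
 * Hypothetical pipeline: applies each transformer to every item, buffers the
 * results and hands each buffer to every loader. Returns the number of items pushed.
 */
function runPipeline(iterable $source, array $transformers, array $loaders, int $bulkSize): int
{
    if ($bulkSize < 1) {
        throw new \InvalidArgumentException('Bulk size must be at least 1.');
    }

    $bulk = [];
    $pushed = 0;

    $flush = function () use (&$bulk, &$pushed, $loaders): void {
        if ($bulk === []) {
            return;
        }
        foreach ($loaders as $loader) {
            $loader($bulk);
        }
        $pushed += count($bulk);
        $bulk = [];
    };

    foreach ($source as $item) {
        foreach ($transformers as $transform) {
            $item = $transform($item);
        }
        $bulk[] = $item;
        if (count($bulk) >= $bulkSize) {
            $flush();
        }
    }

    $flush(); // push the last, possibly partial, bulk

    return $pushed;
}

// Generator standing in for the result of a scroll (made-up documents).
$source = (function (): \Generator {
    yield ['_id' => 1, 'sku' => ['a', 'b']];
    yield ['_id' => 2, 'sku' => []];
    yield ['_id' => 3, 'sku' => ['c']];
})();

$bulkSizes = [];
$pushed = runPipeline(
    $source,
    [
        function (array $item): array { return ['id' => $item['_id'], 'sku' => $item['sku']]; },
        function (array $item): array { return $item + ['sku_count' => count($item['sku'])]; },
    ],
    [
        function (array $bulk) use (&$bulkSizes): void { $bulkSizes[] = count($bulk); },
    ],
    2
);
```

Passing plain arrays to the loaders keeps `count($bulk)` safe; with an arbitrary `iterable`, `count()` only works when the value is an array or implements `Countable`.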

Aggregations
============


| Name | Description |
|------|-------------|
| TERMS | Number of distinct elements by field, with the field filters considered by the aggregation |
| FACET | Number of distinct elements by field, with the field filters excluded by the aggregation |
| AVERAGE | Average value of a numeric field |
| MIN | Min value of a field |
| MAX | Max value of a field |
| QUANTILE | Quantile value of a field |

Aggregations supported
----------------------
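The difference between TERMS and FACET is easiest to see on a small in-memory data set. The sketch below reflects one reading of the descriptions given for these two aggregations (counts per value, with the filter on the aggregated field either kept or dropped); `countByField()` and the sample data are hypothetical and not part of the library.

```php
<?php

$products = [
    ['color' => 'aqua',   'size' => 'M'],
    ['color' => 'aqua',   'size' => 'L'],
    ['color' => 'purple', 'size' => 'M'],
    ['color' => 'red',    'size' => 'M'],
];

// Active filters: field => accepted values.
$filters = ['color' => ['aqua'], 'size' => ['M']];

/** Counts documents per value of $field among those accepted by every filter. */
function countByField(array $docs, array $filters, string $field): array
{
    $counts = [];

    foreach ($docs as $doc) {
        foreach ($filters as $name => $accepted) {
            if (!in_array($doc[$name], $accepted, true)) {
                continue 2; // document rejected by a filter
            }
        }
        $counts[$doc[$field]] = ($counts[$doc[$field]] ?? 0) + 1;
    }

    return $counts;
}

// TERMS: every filter applies, including the one on `color`.
$terms = countByField($products, $filters, 'color');

// FACET: the filter on the aggregated field is left out; the others still apply.
$facetFilters = $filters;
unset($facetFilters['color']);
$facet = countByField($products, $facetFilters, 'color');
```

Under this reading, FACET is the variant suited to search interfaces, where the user still needs to see the other selectable values of a field that is already filtered.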

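| Driver | TERMS | FACET | AVERAGE | MIN | MAX | QUANTILE |
|--------|-------|-------|---------|-----|-----|----------|
| CsvReader | NO | NO | YES | YES | YES | NO |
| Doctrine | YES | YES | YES | YES | YES | YES |
| MongoDB | YES | YES | YES | YES | YES | YES |
| Redisearch | YES | YES | YES | YES | YES | YES |

Test and examples
=================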

No tests are provided yet, although some will be written with `Behat` for the release of version 1.0.0. However, a fully functional example is provided in `examples/`; it builds documents using different kinds of `Drivers`.

From : ./examples

```
docker-compose up --build -d
docker exec -it php_query_model sh -c "cd /home/php-query-model/examples && composer install"
docker exec -it php_query_model sh -c "php /home/php-query-model/examples/index.php"
```

By default this will run only two functions (located in `index.php`)

```
pipelineEtl($engine);
searchOrder($engine);
// searchProducts($engine);
// searchGithubRobot($engine);
// searchSitemaps($engine);
// searchPackages($engine);
// searchUsers($engine);
// searchHackerNews($engine);
```

`pipelineEtl` uses the Engine `scroll` API to iterate over a list of `Product` documents stored in MongoDB and applies several `transformations` before pushing the content to three `loaders` through the `ETL` component.

`searchOrder` uses the Engine `scroll` API to:

- scan a `CSV` file containing e-commerce `Order` records
    - then inject their corresponding `Product` stored in `MongoDB`
    - then inject their corresponding customers stored as `User` in `MySQL`
        - with their first two defined `Address` entries, also stored in `MySQL`
        - and all their `Post` entries stored in `MySQL`

You might want to uncomment the other search functions to execute HTTP queries and fetch :

-  GitHub `robots.txt` with `searchGithubRobot`
-  top stories with `searchHackerNews`
-  RSS feed with `searchPackages`
-  Sitemap with `searchSitemaps`

Roadmap
=======


Expected features
-----------------

Work In progress
----------------

| Feature | Description |
|---------|-------------|
| Elasticsearch Driver | |
| Callable validation | Validate callable arguments using Reflection and PHP tokens |

Backlog
-------

| Feature | Description |
|---------|-------------|
| Async HTTP requests | Perform non-blocking HTTP requests |
| Lazy requests | Only perform requests when the content is manipulated |
| Cacheable requests | Cache results and detect what to invalidate using surrogate keys |
| Content synchronizer | Use streamed eventlog to synchronize content between each datasource |
| Error handling strategies | Allow to apply different strategy in case of a failure: crash, retry, fallback... |
| Generate schema classes | Generate Schema classes by analyzing each database schema |
| Content lifecycle | Allow to create / update / delete content |

Drivers "nice to have"
----------------------

Database Drivers
================


- Algolia
- ArangoDB
- Cassandra
- \\Illuminate\\Eloquent (library used by Laravel)
- Neo4J
- PostgreSQL
- ScyllaDB
- Solr

Streaming
=========


- Kafka
- MySQL binlog
- RabbitMQ

Storage
=======


- AWS S3
- File system
- OpenIO

Infrastructure
==============


- AWS
- OVH

Service
=======


- Cloudinary
- Sendinblue

Social Network
==============


- Facebook
- LinkedIn
- Twitter

Contribution
============


Feel free to open a merge request for any suggestion or to contribute to this project.
