
eftec/documentstoreone
======================

A flat document store for PHP that supports concurrent access.

Version 1.28, LGPL-3.0-only, PHP >=7.4

[Source](https://github.com/EFTEC/DocumentStoreOne), [Packagist](https://packagist.org/packages/eftec/documentstoreone)

DocumentStoreOne
================


A document store for PHP that supports concurrent access. It is a minimalist alternative to MongoDB or CouchDB without the overhead of installing a new service.

It also works as a small footprint database.

[![Packagist](https://camo.githubusercontent.com/56ebf3020ac9be5d1c3651bc973c62137a11f7ab803fdede39d15bc26db58096/68747470733a2f2f696d672e736869656c64732e696f2f7061636b61676973742f762f65667465632f646f63756d656e7473746f72656f6e652e737667)](https://packagist.org/packages/eftec/documentstoreone)[![Total Downloads](https://camo.githubusercontent.com/acf0f3ffffe81e846eaf14a95599d1f74ed28a954aff51832e4fe749fd1ce49f/68747470733a2f2f706f7365722e707567782e6f72672f65667465632f646f63756d656e7473746f72656f6e652f646f776e6c6f616473)](https://packagist.org/packages/eftec/documentstoreone)![License](https://camo.githubusercontent.com/9328acede8c358746921eddd20e852416a85061fec434a0099834b44e6c35f2a/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f6c6963656e73652d4c47504c56332d626c75652e737667)![Maintenance](https://camo.githubusercontent.com/0c8f829897840ac35cb3daf181a719612c0f64c0ed5fca3c7b90ed7591169162/68747470733a2f2f696d672e736869656c64732e696f2f6d61696e74656e616e63652f7965732f323032352e737667)![composer](https://camo.githubusercontent.com/7a6cce75e3353cd615b111f2f4ff50dec30cf814dddb88b2613f656cec298330/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f636f6d706f7365722d253345322e302d626c75652e737667)![php](https://camo.githubusercontent.com/59558613d05bebac3748d4f75f0c94435dec5fb11d059b448c2d172e25d82120/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f7068702d372e342d677265656e2e737667)![php](https://camo.githubusercontent.com/5cd91a78fb469ca20b235b6951fb6dd77bda78ac4633eb432e93699bcb141589/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f7068702d382e342d677265656e2e737667)![Doc](https://camo.githubusercontent.com/0b437c9be7db2f82951f4196f45fd7a738ddd6cb242938189322e6250f90d0a4/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f646f63732d36322532352d677265656e2e737667)

- [DocumentStoreOne](#documentstoreone)
    - [Key features](#key-features)
    - [Test](#test-)
    - [Concurrency test](#concurrency-test)
    - [Usage](#usage)
    - [Methods](#methods)
        - [Constructor($baseFolder,$collection,$strategy=DocumentStoreOne::DSO\_AUTO,$server="",$serializeStrategy = false,$keyEncryption = '')](#constructorbasefoldercollectionstrategydocumentstoreonedso_autoserverserializestrategy--falsekeyencryption--)
        - [isCollection($collection)](#iscollectioncollection)
        - [collection($collection)](#collectioncollection)
        - [autoSerialize($value=true,$strategy='php')](#autoserializevaluetruestrategyphp-)
        - [createCollection($collection)](#createcollectioncollection-)
        - [insertOrUpdate($id,$document,\[$tries=-1\])](#insertorupdateiddocumenttries-1)
        - [insert($id,$document,\[$tries=-1\])](#insertiddocumenttries-1)
        - [update($id,$document,\[$tries=-1\])](#updateiddocumenttries-1)
        - [get($id,\[$tries=-1\],$default=false)](#getidtries-1defaultfalse)
        - [getFiltered($id,\[$tries=-1\],$default=false,$condition=\[\],$reindex=true)](#getfilteredidtries-1defaultfalseconditionreindextrue)
        - [public function appendValue($name,$addValue,$tries=-1)](#public-function-appendvaluenameaddvaluetries-1)
        - [getNextSequence($name="seq",$tries=-1,$init=1,$interval=1,$reserveAdditional=0)](#getnextsequencenameseqtries-1init1interval1reserveadditional0)
        - [getSequencePHP()](#getsequencephp)
        - [ifExist($id,\[$tries=-1\])](#ifexistidtries-1)
        - [delete($id,\[$tries=-1\])](#deleteidtries-1)
        - [select($mask="\*")](#selectmask)
        - [copy($idorigin,$iddestination,\[$tries=-1\])](#copyidoriginiddestinationtries-1)
        - [rename($idorigin,$iddestination,\[$tries=-1\])](#renameidoriginiddestinationtries-1)
        - [fixCast (util class)](#fixcast-util-class)
    - [DocumentStoreOne Fields](#documentstoreone-fields)
    - [MapReduce](#mapreduce)
    - [Limits](#limits)
- [Strategy of Serialization](#strategy-of-serialization)
    - [NONE](#none)
    - [PHP](#php)
    - [PHP\_ARRAY](#php_array)
    - [JSON\_ARRAY and JSON\_OBJECT](#json_array-and-json_object)
- [Control of Error](#control-of-error)
- [Working with CSV](#working-with-csv)
- [Version list](#version-list)
    - [Pending](#pending)

Key features
------------

- Single key based.
- Fast. However, it's not an alternative to a relational database. It's optimized to store a moderate number of documents instead of millions of rows.
- **Allows concurrent access by locking and unlocking a document**. If the document is locked, then it retries until the document is unlocked, or it fails after a number of retries.
- One single class with no dependencies.
- Automatic unlocking of documents that were left locked (by default, after 2 minutes).
- It can use **MapReduce**. See the [example](https://github.com/EFTEC/DocumentStoreOne/blob/master/examples/4_example_read_mapreduce.php).
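
The lock-and-retry idea from the list above can be sketched with native PHP functions. This is only an illustration under stated assumptions: `acquireLock()` and `releaseLock()` are hypothetical helpers, and the lock folder, the retry count and the timings are placeholders rather than the library's actual implementation.

```php
<?php
// Illustrative only: a lock based on an atomic mkdir(), retried a fixed number of times.
// acquireLock() and releaseLock() are hypothetical helpers, not part of the library.
function acquireLock(string $lockDir, int $tries = 100, int $intervalUs = 100000, int $maxLockTime = 120): bool
{
    for ($i = 0; $i < $tries; $i++) {
        // mkdir() fails if the folder already exists, so only one process can win.
        if (@mkdir($lockDir)) {
            return true;
        }
        clearstatcache(true, $lockDir);
        $created = @filemtime($lockDir);
        // A lock older than $maxLockTime seconds is considered abandoned and is released.
        if ($created !== false && time() - $created > $maxLockTime) {
            @rmdir($lockDir);
            continue;
        }
        usleep($intervalUs); // wait before the next attempt
    }
    return false; // still locked after all the retries
}

function releaseLock(string $lockDir): bool
{
    return @rmdir($lockDir);
}

$lock = sys_get_temp_dir() . '/dso_sketch_' . uniqid() . '.lock';
$first = acquireLock($lock);           // the lock is free, so it is acquired
$second = acquireLock($lock, 3, 1000); // already locked, so it fails after 3 short retries
$released = releaseLock($lock);
```

The default arguments mirror the values documented for the public fields (100 retries, 0.1 seconds between retries, 120 seconds of maximum lock time).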

Test
----

On average, an SMB generates 100 invoices per month. So, let's say that an SMB generates 12000 invoices per decade.

The test generates 12000 invoices with customer, details (around 1-5 lines per detail) and date on an i7/SSD/16GB/Windows 64-bit machine.

- Store 12000 invoices 45.303 seconds (reserving a sequence range)
- Store 12000 invoices 73.203 seconds (reading a sequence for every new invoice)
- Store 12000 invoices 49.0286 seconds (reserving a sequence range and using igbinary)
- Reading all invoices 60.2332 seconds. (only reading)
- MapReduce all invoices per customers 64.0569 seconds.
- MapReduce all invoices per customers 32.9869 seconds (igbinary)
- Reading all invoices from a customer **0.3 seconds.** (including rendering the result, see image)
- Adding a new invoice without recalculating all the MapReduce 0.011 seconds.

[![mapreduce example](https://github.com/EFTEC/DocumentStoreOne/raw/master/doc/mapreduce.jpg "mapreduce on php")](https://github.com/EFTEC/DocumentStoreOne/blob/master/doc/mapreduce.jpg)

Concurrency test
----------------

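A test with 100 concurrent operations (write and read), repeated 10 times.

| N° | Reads | (ms) | Reads | Error |
|----|-------|------|-------|-------|
| 1  | 100   | 7471 | 100   | 0     |
| 2  | 100   | 7751 | 100   | 0     |
| 3  | 100   | 7490 | 100   | 0     |
| 4  | 100   | 7480 | 100   | 0     |
| 5  | 100   | 8199 | 100   | 0     |
| 6  | 100   | 7451 | 100   | 0     |
| 7  | 100   | 7476 | 100   | 0     |
| 8  | 100   | 7244 | 100   | 0     |
| 9  | 100   | 7573 | 100   | 0     |
| 10 | 100   | 7818 | 100   | 0     |

Usage
-----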


```
include "lib/DocumentStoreOne.php";
use eftec\DocumentStoreOne\DocumentStoreOne;
try {
    $flatcon = new DocumentStoreOne("base", 'tmp');
    // or you could use:
    // $flatcon = new DocumentStoreOne(__DIR__ . "/base", 'tmp');
} catch (Exception $e) {
    die("Unable to create document store. Please, check the folder");
}
$flatcon->insertOrUpdate("somekey1",json_encode(array("a1"=>'hello',"a2"=>'world'))); // or you could use serialize/igbinary_serialize
$doc=$flatcon->get("somekey1");
$listKeys=$flatcon->select();
$flatcon->delete("somekey1");
```

```
include "lib/DocumentStoreOne.php";
use eftec\DocumentStoreOne\DocumentStoreOne;
$doc=new DocumentStoreOne("base","task",'folder');
//also: $doc=new DocumentStoreOne(__DIR__."/base","task",'folder');
$doc->serializeStrategy='php'; // it sets the strategy of serialization to php
$doc->autoSerialize(true); // autoserialize

$doc->insertOrUpdate("somekey1",array("a1"=>'hello',"a2"=>'world')); // the array is serialized automatically
```

Methods
-------


### Constructor($baseFolder,$collection,$strategy=DocumentStoreOne::DSO\_AUTO,$server="",$serializeStrategy = false,$keyEncryption = '')


It creates the DocumentStoreOne instance.

- **$baseFolder**: should be a folder
- **$collection**: (a subfolder) is optional.
- **$strategy**: It is the strategy used to determine if the file is in use or not.

| strategy | type | server | benchmark |
|----------|------|--------|-----------|
| DSO\_AUTO | It sets the best available strategy (default) | depends | - |
| DSO\_FOLDER | It uses a folder for lock/unlock a document | - | 0.3247 |
| DSO\_APCU | It uses APCU for lock/unlock a document | - | 0.1480 |
| DSO\_REDIS | It uses REDIS for lock/unlock a document | localhost:6379 | 2.5403 (worst) |
| DSO\_NONE | It uses nothing to lock/unlock a document. It is the fastest method but it is unsafe for multiple users | | 0 |

- **$server**: It is used by REDIS. You can set the server used by the strategy.
- **$serializeStrategy**: If false then it does not serialize the information.

| strategy | type |
|----------|------|
| php | it serializes using the serialize() function |
| php\_array | it serializes using the include()/var\_export() functions. The result could be cached on OpCache because the result is a PHP code file. |
| json\_object | it is serialized using json (as object) |
| json\_array | it is serialized using json (as array) |
| csv | it serializes using a csv file. |
| igbinary | it serializes using an igbinary file. |
| **none** (default value) | it is not serialized. Information must be serialized/de-serialized manually |

Examples:

```
$flatcon = new DocumentStoreOne(__DIR__ . "/base"); // new instance, using the folder /base, without serialization and with the default data

$flatcon = new DocumentStoreOne(__DIR__ . "/base", '','auto','','php_array'); // new instance and serializing using php_array
```

The benchmark column shows how much time (in seconds) it takes to perform 100 inserts.

```
use eftec\DocumentStoreOne\DocumentStoreOne;
include "lib/DocumentStoreOne.php";
try {
    $flatcon = new DocumentStoreOne(__DIR__ . "/base", 'tmp');
} catch (Exception $e) {
    die("Unable to create document store.".$e->getMessage());
}
```

```
use eftec\DocumentStoreOne\DocumentStoreOne;
include "lib/DocumentStoreOne.php";
try {
    $flatcon = new DocumentStoreOne("/base", 'tmp',DocumentStoreOne::DSO_APCU);
} catch (Exception $e) {
    die("Unable to create document store.".$e->getMessage());
}
```

### isCollection($collection)


Returns true if collection is valid (a sub-folder).

```
$ok=$flatcon->isCollection('tmp');
```

### collection($collection)

It sets the current collection.

```
$flatcon->collection('newcollection'); // it sets a collection.
```

This method can be chained.

```
$flatcon->collection('newcollection')->select(); // it sets the collection and returns the list of keys
```

> Note, it doesn't validate if the collection is correct or exists. You must use **isCollection()** to verify if it's right.

### autoSerialize($value=true,$strategy='php')

It sets whether the information is serialized automatically and how it is serialized. You can also set it using the constructor.

| strategy | type |
|----------|------|
| php | it serializes using the serialize() function. |
| php\_array | it serializes using the include()/var\_export() functions. The result could be cached on OpCache because the result is a php file |
| json\_object | it is serialized using json (as object) |
| json\_array | it is serialized using json (as array) |
| csv | it serializes using a csv file. |
| igbinary | it serializes using an igbinary file. |
| **none** (default value) | it is not serialized. Information must be serialized/de-serialized manually |

### createCollection($collection)

It creates a collection (a new folder inside the base folder). It returns false if the operation fails; otherwise, it returns true.

```
$flatcon->createCollection('newcollection');
$flatcon->createCollection('/folder1/folder2');
```

### insertOrUpdate($id,$document,\[$tries=-1\])

Inserts a new document (string) with the **$id** indicated. If the document exists, then it's updated.
**$tries** indicates the number of tries. The default value is -1 (default number of attempts).

```
// if we are not using auto serialization
$doc=json_encode(["a1"=>'hello',"a2"=>'world']);
$flatcon->insertOrUpdate("1",$doc); // it will create a document called 1.dson in the base folder.

// if we are using auto serialization
$flatcon->insertOrUpdate("1",["a1"=>'hello',"a2"=>'world']);
```

> If the document is locked, then it retries until the document is available or until it reaches the maximum number of tries (by default, 100 tries, which is equivalent to 10 seconds).

> It's faster than insert() or update().

### insert($id,$document,\[$tries=-1\])


Inserts a new document (string) in the **$id** indicated. If the document exists, then it returns false.
**$tries** indicates the number of tries. The default value is -1 (default number of attempts).

```
// if we are not using auto serialization
$doc=json_encode(array("a1"=>'hello',"a2"=>'world'));
$flatcon->insert("1",$doc);

// if we are using auto serialization
$flatcon->insert("1",["a1"=>'hello',"a2"=>'world']);
```

> If the document is locked, then it retries until the document is available or until it reaches the maximum number of tries (by default, 100 tries, which is equivalent to 10 seconds).

### update($id,$document,\[$tries=-1\])

Updates a document (string) with the **$id** indicated. If the document doesn't exist, then it returns false.
**$tries** indicates the number of tries. The default value is -1 (default number of attempts).

```
// if we are not using auto serialization
$doc=json_encode(["a1"=>'hello',"a2"=>'world']);
$flatcon->update("1",$doc);
// if we are using auto serialization
$flatcon->update("1",["a1"=>'hello',"a2"=>'world']);
```

> If the document is locked, then it retries until the document is available or until it reaches the maximum number of tries (by default, 100 tries, which is equivalent to 10 seconds).

### get($id,\[$tries=-1\],$default=false)

It reads the document **$id**. If the document doesn't exist, or it's unable to read it, then it returns **$default** (false by default).
**$tries** indicates the number of tries. The default value is -1 (default number of attempts).

```
$doc=$flatcon->get("1"); // the default value is false

$doc=$flatcon->get("1",-1,'empty');
```

> If the document is locked, then it retries until the document is available or until it reaches the maximum number of tries (by default, 100 tries, which is equivalent to 10 seconds).

### getFiltered($id,\[$tries=-1\],$default=false,$condition=\[\],$reindex=true)

It reads the document **$id** and returns only the rows that match **$condition**. If the document doesn't exist, or it's unable to read it, then it returns **$default** (false by default).
**$tries** indicates the number of tries. The default value is -1 (default number of attempts).

```
// data in rows [['id'=>1,'cat'=>'vip'],['id'=>2,'cat'=>'vip'],['id'=>3,'cat'=>'normal']];
$data=$flatcon->getFiltered('rows',-1,false,['cat'=>'normal']); // reindexed: [['id'=>3,'cat'=>'normal']]
$data=$flatcon->getFiltered('rows',-1,false,['cat'=>'normal'],false); // original keys: [2=>['id'=>3,'cat'=>'normal']]
```

> If the document is locked, then it retries until the document is available or until it reaches the maximum number of tries (by default, 100 tries, which is equivalent to 10 seconds).
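
The effect of the condition and of the reindex flag can be reproduced with plain PHP arrays. The following is a minimal sketch, assuming that a row matches when every field of the condition is loosely equal to the row's value; `filterRows()` is a hypothetical helper and not the library's code.

```php
<?php
// Hypothetical helper: keeps the rows whose fields match every key/value of $condition.
function filterRows(array $rows, array $condition, bool $reindex = true): array
{
    $found = array_filter($rows, function (array $row) use ($condition): bool {
        foreach ($condition as $field => $expected) {
            if (!array_key_exists($field, $row) || $row[$field] != $expected) {
                return false;
            }
        }
        return true;
    });
    // array_filter() preserves the original keys; array_values() renumbers them from 0.
    return $reindex ? array_values($found) : $found;
}

$rows = [['id' => 1, 'cat' => 'vip'], ['id' => 2, 'cat' => 'vip'], ['id' => 3, 'cat' => 'normal']];
$reindexed = filterRows($rows, ['cat' => 'normal']);        // [0 => ['id' => 3, 'cat' => 'normal']]
$original  = filterRows($rows, ['cat' => 'normal'], false); // [2 => ['id' => 3, 'cat' => 'normal']]
```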

### public function appendValue($name,$addValue,$tries=-1)

It appends a value to the document named **$name**. The new value is added at the end, so the whole document doesn't need to be rewritten. It is useful, for example, for a log file.

a) If the document doesn't exist, then it's created with **$addValue**, and it returns true.
b) If the document exists, then **$addValue** is appended, and it returns true.
c) Otherwise, it returns false.

```
$seq=$flatcon->appendValue("log",date('c')." new log");
```
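
For comparison, the same append-only pattern can be written with PHP's native file functions. This is a sketch of the general technique and not necessarily how the library implements it; the temporary file name is a placeholder.

```php
<?php
// Illustrative only: appends a line to a file without rewriting the existing content.
$file = sys_get_temp_dir() . '/dso_sketch_' . uniqid() . '.log';

// FILE_APPEND adds to the end of the file (creating it if needed);
// LOCK_EX takes an exclusive lock for the duration of the write.
$ok1 = file_put_contents($file, date('c') . " first entry\n", FILE_APPEND | LOCK_EX);
$ok2 = file_put_contents($file, date('c') . " second entry\n", FILE_APPEND | LOCK_EX);

$lines = file($file, FILE_IGNORE_NEW_LINES); // one array element per log line
unlink($file);
```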

### getNextSequence($name="seq",$tries=-1,$init=1,$interval=1,$reserveAdditional=0)

It reads or generates a new sequence.

a) If the sequence exists, then it's incremented by **$interval** and this value is returned.
b) If the sequence doesn't exist, then it's created with **$init**, and this value is returned.
c) If the library is unable to create a sequence, is unable to lock it, or the sequence exists but can't be read, then it returns false.

```
$seq=$flatcon->getNextSequence();
```

> You could peek a sequence with $id=get('genseq\_') however it's not recommended.

> If the sequence is corrupt then it's reset to $init

> If you need to reserve a list of sequences, you could use **$reserveAdditional**

```
$seq=$flatcon->getNextSequence("seq",-1,1,1,100); // if $seq=1, then it's reserved up to the 101. The next value will be 102.
```
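
The reservation arithmetic can be modeled in memory. The sketch below is an assumption about the behaviour described in the comment above (the reserved range is skipped by the next call); `SequenceSketch` is a hypothetical class, and it assumes that the reserved range scales with the interval.

```php
<?php
// Hypothetical in-memory model of a sequence; the library persists it in a document instead.
final class SequenceSketch
{
    private ?int $last = null; // last value handed out or reserved

    public function next(int $init = 1, int $interval = 1, int $reserveAdditional = 0): int
    {
        // First call: start at $init. Later calls: advance by $interval.
        $value = $this->last === null ? $init : $this->last + $interval;
        // Skip over the reserved values so that no other caller receives them.
        $this->last = $value + $reserveAdditional * $interval;
        return $value;
    }
}

$seq = new SequenceSketch();
$first = $seq->next(1, 1, 100); // 1; values 1..101 now belong to this caller
$second = $seq->next();         // 102
```

Reserving a range means one locked write covers many identifiers, which is consistent with the faster "reserving a sequence range" timing in the test section.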

### getSequencePHP()

It returns a unique sequence (64-bit integer) based on time, a random value and a serverId.

> The chance of a collision (generating the same value twice) is 1/4095 (per two operations executed every 0.0001 second).

```
$this->nodeId=1; // if it is not set then it uses a random value each time.
$unique=$flatcon->getSequencePHP();
```
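
One common way to combine a timestamp, a node id and a random value into a single 64-bit integer is bit packing. The layout below is purely an assumption chosen for illustration (it is not taken from the library), and `uniqueIdSketch()` is a hypothetical function. It requires a 64-bit build of PHP.

```php
<?php
// Hypothetical bit layout (an assumption, not necessarily the library's):
// [41 bits: milliseconds since the Unix epoch][10 bits: node id][12 bits: random]
function uniqueIdSketch(int $nodeId): int
{
    $millis = (int) floor(microtime(true) * 1000);
    $random = random_int(0, 4095); // 12 random bits
    return ($millis << 22) | (($nodeId & 0x3FF) << 12) | $random;
}

$id = uniqueIdSketch(1);
$node = ($id >> 12) & 0x3FF; // recovers the node id (1)
```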

### ifExist($id,\[$tries=-1\])


It checks if the document **$id** exists. It returns true if the document exists. Otherwise, it returns false.
**$tries** indicates the number of tries. The default value is -1 (default number of tries).

> The validation only happens if the document is fully unlocked.

```
$found=$flatcon->ifExist("1");
```

> If the document is locked, then it retries until the document is available or until it reaches the maximum number of tries (by default, 100 tries, which is equivalent to 10 seconds).

### delete($id,\[$tries=-1\])


It deletes the document **$id**. If the document doesn't exist, or it's unable to delete, then it returns false.
**$tries** indicates the number of tries. The default value is -1 (default number of tries).

```
$doc=$flatcon->delete("1");
```

> If the document is locked, then it retries until the document is available or until it reaches the maximum number of tries (by default, 100 tries, which is equivalent to 10 seconds).

### select($mask="\*")


It returns all the IDs stored on a collection.

```
$listKeys=$flatcon->select();
$listKeys=$flatcon->select("invoice_*");
```

> It includes locked documents.

### copy($idorigin,$iddestination,\[$tries=-1\])

Copies the document **$idorigin** to **$iddestination**.

```
$bool=$flatcon->copy(20,30);
```

> If the destination document exists, then it's replaced.

### rename($idorigin,$iddestination,\[$tries=-1\])

Renames the document **$idorigin** as **$iddestination**.

```
$bool=$flatcon->rename(20,30);
```

> If the document destination exists then the operation fails.

### fixCast (util class)

It converts a stdClass object into an instance of a specific class.

```
$inv=new Invoice();
$invTmp=$doc->get('someid'); //$invTmp is a stdClass();
DocumentStoreOne::fixCast($inv,$invTmp);
```

> It doesn't work with members that are arrays of objects. The elements of the array are kept as stdClass.
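
The idea of the conversion can be sketched by copying properties from a stdClass into a typed object. `InvoiceSketch` and `copyIntoSketch()` are hypothetical names used only for this illustration; the library's fixCast may work differently.

```php
<?php
// Hypothetical target class used only for this illustration.
final class InvoiceSketch
{
    public $id;
    public $customer;
}

// Hypothetical helper that copies the public properties of a stdClass into a typed object.
function copyIntoSketch(object $target, stdClass $source): void
{
    foreach (get_object_vars($source) as $name => $value) {
        if (property_exists($target, $name)) {
            $target->$name = $value; // nested values are copied as they are
        }
    }
}

$raw = json_decode('{"id":14,"customer":"john"}'); // json_decode() returns a stdClass
$invoice = new InvoiceSketch();
copyIntoSketch($invoice, $raw);
```

Because nested values are copied unchanged, an array of objects would still hold stdClass elements, which matches the limitation noted above.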

DocumentStoreOne Fields
-----------------------

The following fields are public, and they can be changed at runtime.

| field | Type |
|-------|------|
| $database | string root folder of the database |
| $collection | string Current collection (subfolder) of the database |
| $maxLockTime=120 | int Maximum duration of the lock (in seconds). By default it's 2 minutes |
| $defaultNumRetry=100 | int Default number of retries. By default it tries 100x0.1sec=10 seconds |
| $intervalBetweenRetry=100000 | int Interval (in microseconds) between retries. 100000 means 0.1 seconds |
| $docExt=".dson" | string Default extension (with dot) of the document |
| $keyEncryption="" | string Indicates if the key is encrypted or not when it's stored (the file name). Empty means no encryption. You could use md5,sha1,sha256,.. |

Example:

```
$ds=new DocumentStoreOne();
$ds->maxLockTime=300;
```

```
$ds=new DocumentStoreOne();
$ds->insert('1','hello'); // it stores the document 1.dson
$ds->keyEncryption='SHA256';
$ds->insert('1','hello'); // it stores the document 6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.dson
```
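
The file name in the last comment is the SHA-256 digest of the key followed by the default extension, which can be checked with PHP's native `hash()` function. How the library maps the `keyEncryption` value to a hash algorithm internally is an assumption here.

```php
<?php
// hash() is a native PHP function; '.dson' is the default extension listed in the fields table.
$key = '1';
$fileName = hash('sha256', $key) . '.dson';
// $fileName is 6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.dson
```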

MapReduce
---------

It could be done manually. The system allows storing a pre-calculated value that can be accessed easily (instead of reading all the values).

Consider the following exercise. We have a list of purchases:

| id | customer | age | sex | productpurchase | amount |
|----|----------|-----|-----|-----------------|--------|
| 14 | john | 33 | m | 33 | 3 |
| 25 | anna | 22 | f | 32 | 1 |

And a list of products:

| productcode | unitprice |
|-------------|-----------|
| 32 | 30 |
| 33 | 23.3 |

John purchased 3 units of the product with the code 33. Product 33 costs $23.3 per unit.

Question: how much did every customer pay?

> It's a simple exercise, and it's more suitable for a relational database (select \* from purchases inner join products). However, if the document is too long or complex to store in a database, then this is where a document store shines.

```
// 1) open the store
$ds=new DocumentStoreOne('base','purchases'); // we open the document store and selected the collection purchase.
$ds->autoSerialize(true,'auto');
// 2) reading all products
// if the list of products holds in memory then, we could store the whole list in a single document (listproducts key)
$products=$ds->collection('products')->get('listproducts');
// 3) we read the keys of every purchase. It could be slow, so it should be a limited set
$purchases=$ds->collection('purchases')->select(); // they are keys such as 14,15...

$customerXPurchase=[];
// 4) We read every purchase. It is also slow.  Then we merge the result and obtained the final result
foreach($purchases as $k) {
    $purchase=$ds->get($k);
    @$customerXPurchase[$purchase->customer]+=($purchase->amount * @$products[$purchase->productpurchase]); // we add the amount
}
// 5) Finally, we store the result.
$ds->collection('total')->insertOrUpdate('customerXPurchase',$customerXPurchase); // we store the result ($id first, then the document).
```

| customer | value |
|----------|-------|
| john | 69.9 |
| anna | 30 |

Since it's done in code, it's possible to create a hybrid system (relational database + store + memory cache).
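
The arithmetic of this exercise can be checked with a small in-memory sketch that uses no document store at all. It uses the unit price stated in the text (23.3 for product 33) and assumes 30 for product 32, which is the value that yields the totals shown.

```php
<?php
// In-memory copy of the exercise data; no document store is involved.
$products = [33 => 23.3, 32 => 30.0]; // product code => unit price
$purchases = [
    14 => ['customer' => 'john', 'productpurchase' => 33, 'amount' => 3],
    25 => ['customer' => 'anna', 'productpurchase' => 32, 'amount' => 1],
];

$totals = [];
foreach ($purchases as $purchase) {
    $price = $products[$purchase['productpurchase']] ?? 0.0;
    $customer = $purchase['customer'];
    // Accumulate amount * unit price per customer.
    $totals[$customer] = ($totals[$customer] ?? 0.0) + $purchase['amount'] * $price;
}
// Round to cents to hide floating-point noise.
$totals = array_map(function (float $v): float { return round($v, 2); }, $totals);
```

Using the null coalescing operator avoids the error-suppression operator (`@`) when a customer or a product is seen for the first time.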

Limits
------

- Keys should contain only letters (A-Z, a-z) and digits (0-9). On Windows, keys are not case-sensitive.
- The limit of documents that a collection can hold depends on the file system used. NTFS allows 2 million documents per collection.

Strategy of Serialization
=========================

Let's say we want to serialize the following data:

```
$input=[['a1'=>1,'a2'=>'a'],['a1'=>2,'a2'=>'b']];
```

NONE
----


The values are not serialized, so it is not possible to serialize an object, array or other structure. It only works with strings.

How values are stored

```
helloworld

```

How values are returned

```
"helloworld"
```

PHP
---

PHP serialization is one of the fastest ways to serialize and de-serialize, and it always returns the same value with the same structure (classes, arrays, fields).

However, the value stored could be long.

How the values are stored:

```
a:2:{i:0;a:2:{s:2:"a1";i:1;s:2:"a2";s:1:"a";}i:1;a:2:{s:2:"a1";i:2;s:2:"a2";s:1:"b";}}

```

How the values are returned:

```
[['a1'=>1,'a2'=>'a'],['a1'=>2,'a2'=>'b']]

```
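
The stored string above is what PHP's native `serialize()` produces for the sample input, and `unserialize()` restores the original array. A short check:

```php
<?php
$input = [['a1' => 1, 'a2' => 'a'], ['a1' => 2, 'a2' => 'b']];
$stored = serialize($input);      // the string written to the document
$returned = unserialize($stored); // the original array, with the same keys and types
```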

PHP\_ARRAY
----------

This serialization generates PHP code. This code is verbose; however, it has some nice features:

- It could be cached by PHP's OPcache.
- It's fast to load.

How the values are stored:

```
