ganglio/pds
===========

Probabilistic Data Structures to efficiently analyze and mine big datasets

v1.0.11 (10y ago) · MIT · PHP >=5.4.0

PDS
===

[![Latest Stable Version](https://camo.githubusercontent.com/bd8761f5f0b7cec33e69111869f2c038182e872c40b01b73aea2e374483be7e4/68747470733a2f2f706f7365722e707567782e6f72672f67616e676c696f2f7064732f762f737461626c65)](https://packagist.org/packages/ganglio/pds)[![Build Status](https://camo.githubusercontent.com/8256c9c56078a2d8ca94ccc71da600a95e4fa58e99ef24eb3e850cc033a42ea7/68747470733a2f2f7472617669732d63692e6f72672f67616e676c696f2f5044532e7376673f6272616e63683d6d6173746572)](https://travis-ci.org/ganglio/PDS)[![codecov.io](https://camo.githubusercontent.com/ff8f23e51f27b0835ea428ea73110b1044a91357f64ca4826f8ad0d3cef563cc/687474703a2f2f636f6465636f762e696f2f6769746875622f67616e676c696f2f5044532f636f7665726167652e7376673f6272616e63683d6d6173746572)](http://codecov.io/github/ganglio/PDS?branch=master)[![Code Climate](https://camo.githubusercontent.com/958f4ba5c3297e4216a93704a165eea2938b994b716f6f71ff26713a174e89d3/68747470733a2f2f636f6465636c696d6174652e636f6d2f6769746875622f67616e676c696f2f5044532f6261646765732f6770612e737667)](https://codeclimate.com/github/ganglio/PDS)[![License](https://camo.githubusercontent.com/7c2140b9aa862beba28e7539498dc5f2efad9a86763f40cce86fd64d56f232b5/68747470733a2f2f706f7365722e707567782e6f72672f67616e676c696f2f7064732f6c6963656e7365)](https://packagist.org/packages/ganglio/pds)

This package contains a collection of data structures and tools for analyzing large amounts of data in a memory-efficient way.

Table of Contents
-----------------

1. [Installation](#user-content-installation)
2. [Namespaces](#user-content-namespaces)
3. [Interfaces](#user-content-interfaces)
4. [Classes](#user-content-classes)
5. [Examples](#user-content-examples)

### Installation

Install via [Composer](https://getcomposer.org/) (make sure Composer is available on your path or in your project).

Put the following in your composer.json:

```
{
    "require": {
        "ganglio/pds": "*"
    }
}
```

and then run `composer install`, or simply run:

```
composer require ganglio/pds
```

### Namespaces

The library defines the following namespaces:

- `\ganglio\PDS\Bloom`
- `\ganglio\PDS\Estimators`
- `\ganglio\PDS\Hash`
- `\ganglio\PDS\Storage`

### Interfaces

#### Estimator

This interface is the basis for cardinality estimators. It defines two methods:

- `add($key)` - adds a key to the estimator
- `count()` - returns the number of distinct keys added to the estimator

Depending on the implementation, the class might return an exact count, like the [`Exact`](#user-content-exact) class, or an approximation, like the [`HyperLogLog`](#user-content-hyperloglog) class.
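
As a rough illustration of this contract, here is a minimal, standalone sketch. The names `SketchEstimator` and `SketchExactCounter` are hypothetical stand-ins and are not the package's own classes; only the two method names come from the description above.

```php
<?php
// Hypothetical stand-in for the Estimator interface described above.
interface SketchEstimator
{
    public function add($key);
    public function count();
}

// Exact counting: remembers every distinct key, so memory grows with the data.
class SketchExactCounter implements SketchEstimator
{
    private $seen = array();

    public function add($key)
    {
        // Array keys are unique, so repeated keys collapse into one entry.
        $this->seen[(string) $key] = true;
    }

    public function count()
    {
        return count($this->seen);
    }
}

$counter = new SketchExactCounter();
foreach (array('a', 'b', 'a', 'c', 'b') as $key) {
    $counter->add($key);
}
echo $counter->count(), PHP_EOL; // prints 3
```

An approximate estimator would keep the same two methods but replace the array with a fixed-size summary.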

#### Hash

This interface is the basis for the various hashing classes offered by the package. It defines one method and a constant:

- `hash($str)` - performs the actual hashing of the string provided
- `UPPERBOUND` - a 32-bit mask (`0xffffffff`) to be used by the hashing functions
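
The sketch below shows one way such an interface could look and how the mask might be applied. `SketchHash` and `SketchCrc32Hash` are hypothetical names, and wrapping PHP's built-in `crc32()` is only an illustrative choice, not something the package is known to do.

```php
<?php
// Hypothetical stand-in for the Hash interface described above.
interface SketchHash
{
    const UPPERBOUND = 0xffffffff; // 32-bit mask

    public function hash($str);
}

// Illustrative implementation based on PHP's built-in crc32().
class SketchCrc32Hash implements SketchHash
{
    public function hash($str)
    {
        // The mask keeps the result within 0 .. 2^32 - 1 (assumes 64-bit PHP).
        return crc32($str) & self::UPPERBOUND;
    }
}

$hasher = new SketchCrc32Hash();
$value  = $hasher->hash('hello');
printf("%08x\n", $value);
```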

#### Storage

This interface is the basis for the storage classes. It defines four methods:

- `set($key, $value)` - stores a value under the key in the storage system
- `get($key)` - gets the value stored under the key
- `flush()` - flushes the storage system
- `size()` - returns the number of keys stored in the storage system
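
A minimal array-backed sketch of these four methods follows. The names `SketchStorage` and `SketchArrayStorage` are hypothetical, and two behaviours are assumptions: `flush()` is taken to mean "empty the store", and `get()` returns `null` for a missing key.

```php
<?php
// Hypothetical stand-in for the Storage interface described above.
interface SketchStorage
{
    public function set($key, $value);
    public function get($key);
    public function flush();
    public function size();
}

// Simple in-memory implementation backed by a PHP array.
class SketchArrayStorage implements SketchStorage
{
    private $data = array();

    public function set($key, $value)
    {
        $this->data[$key] = $value;
    }

    public function get($key)
    {
        // Assumption: missing keys yield null.
        return isset($this->data[$key]) ? $this->data[$key] : null;
    }

    public function flush()
    {
        // Assumption: flushing discards every stored key.
        $this->data = array();
    }

    public function size()
    {
        return count($this->data);
    }
}

$store = new SketchArrayStorage();
$store->set('alpha', 1);
$store->set('beta', 2);
echo $store->size(), PHP_EOL; // prints 2
```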

### Classes

#### BitArray (implements [`Storage`](#user-content-storage))

Implements a single bit array. It's used to implement the [`Bloom Filter`](#user-content-bloom), and its `set` method only accepts a boolean as `$value`.
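
To illustrate the idea, here is a standalone sketch that packs eight flags into each byte of a binary string. `SketchBitArray` and its `byteLength()` helper are hypothetical, and the internal layout is an assumption rather than the package's actual storage format.

```php
<?php
// Hypothetical bit array: 8 boolean flags per byte of a binary string.
class SketchBitArray
{
    private $bytes;
    private $length;

    public function __construct($length)
    {
        $this->length = $length;
        $this->bytes  = str_repeat("\0", (int) ceil($length / 8));
    }

    public function set($index, $value)
    {
        if (!is_bool($value)) {
            throw new InvalidArgumentException('Only boolean values can be stored');
        }
        $this->check($index);
        $byte = ord($this->bytes[$index >> 3]); // byte holding the bit
        $mask = 1 << ($index & 7);              // position inside that byte
        $byte = $value ? ($byte | $mask) : ($byte & ~$mask);
        $this->bytes[$index >> 3] = chr($byte);
    }

    public function get($index)
    {
        $this->check($index);
        return (ord($this->bytes[$index >> 3]) & (1 << ($index & 7))) !== 0;
    }

    public function byteLength()
    {
        return strlen($this->bytes);
    }

    private function check($index)
    {
        if (!is_int($index) || $index < 0 || $index >= $this->length) {
            throw new OutOfRangeException('Bit index out of range');
        }
    }
}

$bits = new SketchBitArray(20);
$bits->set(3, true);
$bits->set(17, true);
$bits->set(3, false);
var_dump($bits->get(3), $bits->get(17)); // bool(false), bool(true)
```

Twenty flags fit in three bytes here, which is where the memory saving over a plain PHP array comes from.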

#### HyperLogLog (implements [`Estimator`](#user-content-estimator))

Implements the HyperLogLog cardinality estimator algorithm. The implementation uses HyperLogLog for large cardinalities and falls back to Linear Counting for small ones, where Linear Counting gives a better approximation.
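
The combination can be sketched as follows. This is a textbook-style illustration and not the package's code: the class name `SketchHyperLogLog`, the use of the first 32 bits of MD5 as the hash, the default of 2^10 registers, and the `2.5 * m` switch-over threshold are all assumptions. It also assumes a 64-bit PHP build.

```php
<?php
// Hypothetical HyperLogLog with a Linear Counting fallback for small sets.
class SketchHyperLogLog
{
    private $p;         // number of index bits
    private $m;         // number of registers (2^p)
    private $registers; // max rank seen per register

    public function __construct($p = 10)
    {
        if ($p < 7 || $p > 16) {
            throw new InvalidArgumentException('p must be between 7 and 16');
        }
        $this->p = $p;
        $this->m = 1 << $p;
        $this->registers = array_fill(0, $this->m, 0);
    }

    public function add($key)
    {
        // Assumption: first 32 bits of MD5 as the hash value.
        $hash  = hexdec(substr(md5((string) $key), 0, 8));
        $width = 32 - $this->p;
        $index = $hash >> $width;              // top p bits pick a register
        $rest  = $hash & ((1 << $width) - 1);  // remaining bits
        // Rank = 1-based position of the first 1 bit in the remaining bits.
        $rank = 1;
        for ($bit = $width - 1; $bit >= 0 && (($rest >> $bit) & 1) === 0; $bit--) {
            $rank++;
        }
        if ($rank > $this->registers[$index]) {
            $this->registers[$index] = $rank;
        }
    }

    public function count()
    {
        $sum = 0.0;
        $zeros = 0;
        foreach ($this->registers as $rank) {
            $sum += pow(2, -$rank);
            if ($rank === 0) {
                $zeros++;
            }
        }
        $alpha    = 0.7213 / (1 + 1.079 / $this->m); // bias constant, m >= 128
        $estimate = $alpha * $this->m * $this->m / $sum;
        // Small range: Linear Counting on the number of empty registers.
        if ($estimate <= 2.5 * $this->m && $zeros > 0) {
            $estimate = $this->m * log($this->m / $zeros);
        }
        return (int) round($estimate);
    }
}

$hll = new SketchHyperLogLog();
for ($i = 0; $i < 1000; $i++) {
    $hll->add("key-$i");
    $hll->add("key-$i"); // duplicates do not change the estimate
}
echo $hll->count(), PHP_EOL; // an estimate close to 1000
```

With 1024 registers the sketch holds a fixed number of small integers regardless of how many keys are added, at the price of a few percent of error.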

#### Exact (implements [`Estimator`](#user-content-estimator))

Implements an exact counter. It's primarily a toy class to show how to use the [`Estimator`](#user-content-estimator) interface.

#### Trivial (implements [`Hash`](#user-content-hash))

Implements a trivial hashing algorithm. For each character of the input string it adds the ASCII code shifted right by the character position, and then takes the lower 32 bits of the sum. It's a toy class to show how to use the [`Hash`](#user-content-hash) interface.

#### Pearson (implements [`Hash`](#user-content-hash))

Implements the Pearson non-cryptographic hashing function.
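
For readers unfamiliar with Pearson hashing, the sketch below shows the core loop. The function names are hypothetical, the lookup table is generated with a small fixed-seed shuffle purely so the example is reproducible, and the widening from 8 to 32 bits (four passes with different start values) is an assumption about how a 32-bit result could be built, not a description of the package's code.

```php
<?php
// Builds a fixed permutation of 0..255 using an LCG-driven Fisher-Yates shuffle.
function sketch_pearson_table()
{
    $table = range(0, 255);
    $state = 1;
    for ($i = 255; $i > 0; $i--) {
        $state = ($state * 1103515245 + 12345) & 0x7fffffff;
        $j = ($state >> 16) % ($i + 1);
        $tmp = $table[$i];
        $table[$i] = $table[$j];
        $table[$j] = $tmp;
    }
    return $table;
}

// Classic 8-bit Pearson hash: h = T[h XOR byte] for every input byte.
function sketch_pearson8($str, array $table, $start = 0)
{
    $h = $start & 0xff;
    for ($i = 0, $n = strlen($str); $i < $n; $i++) {
        $h = $table[$h ^ ord($str[$i])];
    }
    return $h;
}

// Assumed widening: concatenate four 8-bit passes into one 32-bit value.
function sketch_pearson32($str, array $table)
{
    $result = 0;
    for ($round = 0; $round < 4; $round++) {
        $result = ($result << 8) | sketch_pearson8($str, $table, $round);
    }
    return $result;
}

$table = sketch_pearson_table();
printf("%08x\n", sketch_pearson32('hello', $table));
```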

#### FVNHash (implements [`Hash`](#user-content-hash))

Implements the Fowler-Noll-Vo non-cryptographic hashing function. The actual algorithm is the FNV-1 hash.
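
For reference, 32-bit FNV-1 multiplies the running hash by the FNV prime and then XORs in the next byte. The function below is a standalone sketch with a hypothetical name, not the package's class, and it assumes a 64-bit PHP build so the multiplication does not overflow. PHP's built-in `hash('fnv132', ...)` implements the same algorithm and can be used to cross-check it.

```php
<?php
// Standalone 32-bit FNV-1 sketch (multiply first, then XOR).
function sketch_fnv1_32($str)
{
    $hash = 0x811c9dc5; // FNV-1 32-bit offset basis
    for ($i = 0, $n = strlen($str); $i < $n; $i++) {
        $hash = ($hash * 0x01000193) & 0xffffffff; // multiply by the FNV prime
        $hash ^= ord($str[$i]);                    // mix in the next byte
    }
    return $hash;
}

printf("%08x\n", sketch_fnv1_32('a')); // prints 050c5d7e
echo hash('fnv132', 'a'), PHP_EOL;     // built-in equivalent, same digest
```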

#### Generic (implements [`Hash`](#user-content-hash))

This class is basically a wrapper around the standard PHP `hash()` function. The constructor accepts the name of the algorithm to use, as listed by PHP's `hash_algos()` function. If an unknown algorithm is specified it raises an exception; if none is specified, MD5 is used by default.
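
A wrapper with this behaviour could look roughly like the sketch below. The class name `SketchGenericHash`, the exception type, and the choice to keep the first 32 bits of the digest are assumptions made for illustration; only `hash()` and `hash_algos()` are real PHP functions. It assumes a 64-bit PHP build.

```php
<?php
// Hypothetical wrapper around PHP's hash() with MD5 as the default algorithm.
class SketchGenericHash
{
    private $algo;

    public function __construct($algo = 'md5')
    {
        if (!in_array($algo, hash_algos(), true)) {
            throw new InvalidArgumentException("Unknown hash algorithm: $algo");
        }
        $this->algo = $algo;
    }

    public function hash($str)
    {
        // Assumption: reduce the hex digest to its first 32 bits.
        $digest = hash($this->algo, $str);
        return hexdec(substr($digest, 0, 8)) & 0xffffffff;
    }
}

$md5  = new SketchGenericHash();       // defaults to MD5
$sha1 = new SketchGenericHash('sha1');
printf("%08x %08x\n", $md5->hash('abc'), $sha1->hash('abc'));
```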

#### MultiHash (implements [`Hash`](#user-content-hash))

This class calculates multiple hashes using the different algorithms specified as arguments of the constructor. It's primarily used in conjunction with the [`BitArray`](#user-content-bitarray) class to implement the [`Bloom Filter`](#user-content-bloom).
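
The idea can be sketched with a plain function: hash the same input with several algorithms and map each result to a position in a bit array. `sketch_multi_hash` is a hypothetical helper, and the three algorithms and the array size of 1024 are arbitrary choices for the example. It assumes a 64-bit PHP build.

```php
<?php
// Hashes one string with several algorithms, returning one 32-bit value each.
function sketch_multi_hash($str, array $algos)
{
    $hashes = array();
    foreach ($algos as $algo) {
        // Keep the first 32 bits of each hex digest.
        $hashes[] = hexdec(substr(hash($algo, $str), 0, 8));
    }
    return $hashes;
}

$algos  = array('md5', 'sha1', 'crc32b');
$hashes = sketch_multi_hash('apple', $algos);

// Map every hash to a slot in a 1024-bit array, as a Bloom filter would.
$positions = array_map(function ($h) {
    return $h % 1024;
}, $hashes);
print_r($positions);
```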

#### Bloom

This class implements a [Bloom Filter](https://en.wikipedia.org/wiki/Bloom_filter), a probabilistic data structure that tests whether an element is a member of a set while using very little memory. Lookups can return false positives but never false negatives.
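
The mechanism can be sketched in a few lines. `SketchBloomFilter` and its method names are hypothetical and do not reflect the package's API; a plain PHP array of booleans stands in for a packed bit array to keep the example short, and the filter size and hash algorithms are arbitrary. It assumes a 64-bit PHP build.

```php
<?php
// Hypothetical Bloom filter: k hash functions set and test k bits per item.
class SketchBloomFilter
{
    private $bits;
    private $size;
    private $algos;

    public function __construct($size, array $algos)
    {
        $this->size  = $size;
        $this->algos = $algos;
        $this->bits  = array_fill(0, $size, false);
    }

    // One bit position per configured hash algorithm.
    private function positions($item)
    {
        $positions = array();
        foreach ($this->algos as $algo) {
            $positions[] = hexdec(substr(hash($algo, $item), 0, 8)) % $this->size;
        }
        return $positions;
    }

    public function add($item)
    {
        foreach ($this->positions($item) as $pos) {
            $this->bits[$pos] = true;
        }
    }

    // False means "definitely absent"; true means "possibly present".
    public function mightContain($item)
    {
        foreach ($this->positions($item) as $pos) {
            if (!$this->bits[$pos]) {
                return false;
            }
        }
        return true;
    }
}

$filter = new SketchBloomFilter(4096, array('md5', 'sha1', 'crc32b'));
$filter->add('apple');
$filter->add('banana');
var_dump($filter->mightContain('apple'));  // bool(true)
var_dump($filter->mightContain('cherry')); // false unless a collision occurs
```

Larger bit arrays and a suitable number of hash functions lower the false-positive rate for a given number of stored items.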

### Examples

**TODO**
