centamiv/vektor
===============

A native PHP Vector Database implementation with strict binary storage and Zero-RAM overhead.

[Source](https://github.com/centamiv/vektor) | [Packagist](https://packagist.org/packages/centamiv/vektor)

Vektor - Native PHP Vector Database
===================================

**Vektor** is a high-performance, purely file-based, embedded **Vector Database** written entirely in native PHP. It is designed for **Zero-RAM Overhead**, meaning it does not require loading your entire dataset into memory to function.

**Each Vektor instance operates as a standalone database**, with data stored by default in the `data/` directory.

Instead of memory-heavy indexes, Vektor utilizes strict binary file layouts and optimized disk-seeking strategies to perform **Approximate Nearest Neighbor (ANN)** searches using the **HNSW (Hierarchical Navigable Small World)** algorithm.

---

Features
--------

- **Pure PHP**: No external dependencies or C extensions required. Runs on any standard PHP 8.2+ environment.
- **Zero-RAM Overhead**: Data is read directly from disk. Memory usage is constant regardless of dataset size.
- **HNSW Algorithm**: Efficient graph-based index for fast approximate nearest neighbor search.
- **Binary Storage**: Compact binary file formats for Vectors, Graph connections, and Metadata.
- **Embedded or Server**: Use it directly in your PHP code or run it as a standalone HTTP API server.
- **Thread-Safety**: Implements file locking (flock) to safely handle concurrent reads and writes.
- **Cosine Similarity**: Distance metric suited to high-dimensional embeddings. The vector dimension is configurable (default: 1536).
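
For readers unfamiliar with the metric, here is a minimal sketch of cosine similarity in plain PHP. The function name `cosineSimilarity()` and the zero-vector convention are assumptions made for illustration; this is not Vektor's internal implementation.

```php
<?php

/**
 * Cosine similarity between two equal-length float vectors.
 * Returns a value in [-1, 1]; 1.0 means the vectors point the same way.
 * Hypothetical helper for illustration, not part of the Vektor package.
 */
function cosineSimilarity(array $a, array $b): float
{
    if (count($a) !== count($b) || $a === []) {
        throw new InvalidArgumentException('Vectors must be non-empty and of equal length.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot   += $value * $b[$i];
        $normA += $value * $value;
        $normB += $b[$i] * $b[$i];
    }

    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0; // Convention for zero vectors, chosen here for illustration.
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

// Parallel vectors score (approximately) 1, orthogonal vectors score 0.
$same = cosineSimilarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]);
$orthogonal = cosineSimilarity([1.0, 0.0], [0.0, 1.0]);
```

Because the score depends only on direction, scaling a vector does not change its similarity to another vector.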

---

Requirements
------------

- **PHP**: 8.2 or higher
- **Composer**: For dependency management

---

Upgrading to v2.0.0
-------------------

⚠️ **Important Breaking Change**: Version 2.0.0 introduces a change to the binary record length in `meta.bin`. If you are upgrading from a previous version, your existing data files will be **incompatible**. You must **delete your existing `data/` directory** and re-index (rebuild) your dataset.

---

Installation
------------

### 1. Installation via Composer

To use Vektor in your existing PHP project:

```
composer require centamiv/vektor
```

### 2. Standalone Installation

To run Vektor as a standalone API server:

```
git clone https://github.com/centamiv/vektor.git
cd vektor
composer install --no-dev
```

Ensure the `data/` directory is writable by your web server or script user:

```
mkdir -p data
chmod -R 775 data
```

---

Configuration
-------------

Vektor uses a `.env` file for configuration when running as a server.

1. Copy the example environment file:

    ```
    cp .env.example .env
    ```
2. Open `.env` and configure your API Token:

    ```
    # .env
    VEKTOR_API_TOKEN=your_secure_random_string_here
    VEKTOR_DIMENSIONS=1536
    ```

- **VEKTOR\_API\_TOKEN**: If set, all API requests (except `/up`) must send this token as a Bearer token in the `Authorization` header. If left empty, the API is open to the public.
- **VEKTOR\_DIMENSIONS**: The dimension of your vectors (default: 1536). **Important**: Changing this requires a fresh database, so delete the `data/` directory and re-index.

---

Usage
-----

Vektor is designed for flexibility, allowing you to either integrate it directly into your PHP projects as a library or deploy it as a standalone REST API server.

---

### Usage: HTTP API Server

Vektor includes a built-in Controller to run as a REST API. You can serve this using Apache, Nginx, or the PHP built-in server.

#### Starting the Server

For testing/development:

```
# Serves the public/ directory on port 8000
php -S 0.0.0.0:8000 -t public
```

#### Authentication

If `VEKTOR_API_TOKEN` is set in your `.env`, you must include the header in all requests:

```
Authorization: Bearer <your_api_token>
```

#### API Endpoints

##### `GET /up`

Health check endpoint.

- **Auth Required**: No
- **Response**:

```
{
  "status": "up"
}
```

##### `GET /info`

Returns database statistics.

- **Response**:

```
{
  "storage": {
    "vector_file_bytes": 1048576,
    "graph_file_bytes": 524288,
    "meta_file_bytes": 2048,
    "payload_file_bytes": 4096
  },
  "records": {
    "vectors_total": 150,
    "graph_nodes": 150
  },
  "config": {
    "dimension": 1536,
    "max_levels": 4
  }
}
```

##### `POST /insert`

Insert a vector.

- **Body**:

```
{
  "id": "my-doc-id",
  "vector": [0.1, 0.2, 0.3, ...],
  "metadata": {
    "source": "docs/intro.md",
    "chunk": 3
  }
}
```

- **Response**:

```
{
  "status": "success",
  "id": "my-doc-id"
}
```

##### `POST /search`

Search for nearest neighbors.

- **Body**:

```
{
  "vector": [0.1, 0.2, 0.3, ...],
  "k": 5
}
```

- **Response**:

```
{
  "results": [
    { "id": "my-doc-id", "distance": 0.95 },
    { "id": "another-id", "distance": 0.88 }
  ]
}
```

Pass `"include_vector": true` to also return the vector of each matching document, and `"include_metadata": true` to also return the metadata stored with it. Both flags are optional.

- **Body**:

```
{
  "vector": [0.1, 0.2, 0.3, ...],
  "include_vector": true,
  "include_metadata": true,
  "k": 5
}
```

- **Response**:

```
{
  "results": [
    { "id": "my-doc-id", "distance": 0.95, "vector": [0.5, 1.0, 0.3, ...], "metadata": { "source": "docs/intro.md", "chunk": 3 } },
    { "id": "another-id", "distance": 0.88, "vector": [0.5, 1.1, 0.3, ...], "metadata": { "source": "docs/faq.md", "chunk": 1 } }
  ]
}
```

##### `POST /delete`

Delete a vector.

- **Body**:

```
{
  "id": "my-doc-id"
}
```

- **Response**:

```
{
  "status": "success",
  "message": "..."
}
```

##### `POST /optimize`

Trigger database optimization.

- **Response**:

```
{
  "status": "success",
  "message": "..."
}
```
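
The endpoints above can also be called from PHP. The following is a minimal sketch, assuming the development server from "Starting the Server" is listening on `http://localhost:8000`. `buildVektorRequest()` and `sendVektorRequest()` are hypothetical helpers written for this example and are not part of the package; the three-element vector is shortened for readability and would not match the default dimension of 1536.

```php
<?php

/**
 * Builds the URL, headers, and JSON body for a request to a Vektor server.
 * Hypothetical helper for illustration, not part of the Vektor package.
 */
function buildVektorRequest(string $baseUrl, string $path, array $body, ?string $token = null): array
{
    $headers = ['Content-Type: application/json'];
    if ($token !== null && $token !== '') {
        // Only needed when VEKTOR_API_TOKEN is set on the server.
        $headers[] = 'Authorization: Bearer ' . $token;
    }

    return [
        'url'     => rtrim($baseUrl, '/') . '/' . ltrim($path, '/'),
        'headers' => $headers,
        'body'    => json_encode($body, JSON_THROW_ON_ERROR),
    ];
}

/** Sends a request built above with cURL and decodes the JSON response. */
function sendVektorRequest(array $request): array
{
    $ch = curl_init($request['url']);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST           => true,
        CURLOPT_HTTPHEADER     => $request['headers'],
        CURLOPT_POSTFIELDS     => $request['body'],
    ]);

    $raw = curl_exec($ch);
    if ($raw === false) {
        throw new RuntimeException('Request failed: ' . curl_error($ch));
    }

    return json_decode($raw, true, 512, JSON_THROW_ON_ERROR);
}

$request = buildVektorRequest(
    'http://localhost:8000',
    '/search',
    ['vector' => [0.1, 0.2, 0.3], 'k' => 5, 'include_metadata' => true],
    'your_secure_random_string_here'
);

// Uncomment once a server is running:
// $response = sendVektorRequest($request);
```

Separating request construction from sending keeps the authentication logic testable without a live server.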

---

### Usage: Embedded Library

You can use Vektor directly in your PHP scripts without running an HTTP server. This is the fastest way to interact with the database.

#### Configuration

By default, Vektor stores data in the `data/` directory relative to the package root. You can change this path using the `Config` class:

```
use Centamiv\Vektor\Core\Config;

Config::setDataDir(__DIR__ . '/my_custom_data_dir');
```

You can also set the vector dimensions (default 1536):

```
Config::setDimensions(768);
// Note: This must be called BEFORE initializing Indexer/Searcher
```

#### Initialization

```
use Centamiv\Vektor\Services\Indexer;
use Centamiv\Vektor\Services\Searcher;
use Centamiv\Vektor\Services\Optimizer;

// The Indexer handles writing (Insert, Delete)
$indexer = new Indexer();

// The Searcher handles reading (Search)
$searcher = new Searcher();
```

#### 1. Inserting Vectors

Vectors must be arrays of floats whose length matches the configured dimension (**1536 by default**).

```
$id = "doc-123"; // String ID (max 36 chars)
$vector = [0.0123, -0.5231, ...]; // Array of 1536 floats

// Insert (or update if ID exists - NOTE: Updates are essentially Appends with pointer updates)
$metadata = ['source' => 'docs/intro.md', 'chunk' => 3];
$indexer->insert($id, $vector, $metadata);
```

#### 2. Searching

Find the `k` nearest neighbors to a query vector.

```
$queryVector = [0.0123, ...];
$k = 5; // Number of results

$results = $searcher->search($queryVector, $k, includeMetadata: true);

// Output:
// [
//   ['id' => 'doc-123', 'score' => 0.9823, 'metadata' => ['source' => 'docs/intro.md', 'chunk' => 3]],
//   ['id' => 'doc-456', 'score' => 0.8912, 'metadata' => ['source' => 'docs/faq.md', 'chunk' => 1]],
//   ...
// ]
```
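
A search always returns up to `k` neighbors, even when some are only loosely related. One possible way to drop weak matches is a post-filter on the score, sketched below. It assumes the result shape shown above (each item carries `id` and `score`); `filterByScore()` and the `0.9` threshold are illustrative choices, not part of the package.

```php
<?php

/**
 * Keeps only results whose score reaches $minScore.
 * Hypothetical helper; assumes each result has 'id' and 'score' keys.
 */
function filterByScore(array $results, float $minScore): array
{
    return array_values(array_filter(
        $results,
        static fn (array $r): bool => $r['score'] >= $minScore
    ));
}

// Sample data copied from the output shown above.
$results = [
    ['id' => 'doc-123', 'score' => 0.9823, 'metadata' => ['source' => 'docs/intro.md', 'chunk' => 3]],
    ['id' => 'doc-456', 'score' => 0.8912, 'metadata' => ['source' => 'docs/faq.md', 'chunk' => 1]],
];

$strong = filterByScore($results, 0.9);
$ids = array_column($strong, 'id');
```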

#### 3. Deleting Vectors

Deletes a document by its ID. This performs a "soft delete" in the vector file and updates the metadata mapping.

```
$success = $indexer->delete("doc-123");

if ($success) {
    echo "Document deleted.";
} else {
    echo "Document not found.";
}
```

#### 4. Getting Statistics

Retrieve current database stats, including file sizes and node counts.

```
$stats = $indexer->getStats();
print_r($stats);
```

#### 5. Optimizing (Vacuum)

Since deletions are "soft", the file size can grow over time. Run the optimizer to rebuild the index and reclaim space. **Note**: This is a blocking operation.

```
$optimizer = new Optimizer();
$optimizer->run();
```
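
To see why soft deletes make storage grow and what a vacuum pass reclaims, consider the toy in-memory model below. It is only a sketch of the general idea: the `AppendOnlyStore` class and its methods are hypothetical and do not mirror Vektor's actual file handling.

```php
<?php

/**
 * Toy append-only store with soft deletes. Illustrative only: the class and
 * its methods are hypothetical and do not mirror Vektor's internals.
 */
final class AppendOnlyStore
{
    /** @var list<array{id: string, vector: array, deleted: bool}> */
    private array $records = [];

    public function insert(string $id, array $vector): void
    {
        $this->delete($id); // In this model, an update is a soft delete plus an append.
        $this->records[] = ['id' => $id, 'vector' => $vector, 'deleted' => false];
    }

    public function delete(string $id): bool
    {
        foreach ($this->records as $i => $record) {
            if ($record['id'] === $id && !$record['deleted']) {
                $this->records[$i]['deleted'] = true; // Mark only; space is not freed.
                return true;
            }
        }
        return false;
    }

    /** Rewrites the store without tombstones and returns how many were dropped. */
    public function vacuum(): int
    {
        $before = count($this->records);
        $this->records = array_values(array_filter(
            $this->records,
            static fn (array $r): bool => !$r['deleted']
        ));
        return $before - count($this->records);
    }

    /** Number of records physically held, including tombstones. */
    public function physicalCount(): int
    {
        return count($this->records);
    }
}

$store = new AppendOnlyStore();
$store->insert('doc-1', [0.1, 0.2]);
$store->insert('doc-2', [0.3, 0.4]);
$store->insert('doc-1', [0.5, 0.6]); // Update: the old doc-1 becomes a tombstone.
$store->delete('doc-2');

$beforeVacuum = $store->physicalCount();
$reclaimed = $store->vacuum();
$afterVacuum = $store->physicalCount();
```

In this model only one live record remains, yet three are held until the vacuum runs, which is the kind of growth the optimizer is meant to undo.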

---

Database File Structure
-----------------------

Vektor achieves its performance and low memory footprint through four specialized binary files located in the `data/` directory.

- **`vector.bin`**: Stores raw vector data in an append-only structure.
- **`meta.bin`**: Maps external string IDs to internal file offsets using a disk-based Binary Search Tree (BST) for efficient lookups without loading maps into RAM.
- **`payload.bin`**: Stores serialized metadata (JSON) in an append-only structure referenced by `meta.bin`.
- **`graph.bin`**: Stores the HNSW Graph structure to enable fast navigation and approximate nearest neighbor searches.
- **Concurrency**: Implements advisory file locking to manage simultaneous shared reads and exclusive write operations safely.
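
The combination of fixed-width binary records and advisory locking can be sketched as follows. This is an illustration under stated assumptions, not Vektor's real on-disk format: the 4-float record, the `DIM` and `RECORD_BYTES` constants, and the `appendVector()` / `readVector()` helpers are all hypothetical. The point is that a record's byte offset can be computed from its index, so a single `fseek()` reaches it without loading the rest of the file.

```php
<?php

const DIM = 4;                 // Tiny dimension for the example.
const RECORD_BYTES = DIM * 4;  // 4 bytes per 32-bit float.

/** Appends one vector under an exclusive lock and returns its record index. */
function appendVector(string $file, array $vector): int
{
    $handle = fopen($file, 'c+b'); // Create if missing, never truncate.
    flock($handle, LOCK_EX);       // Writers exclude everyone else.
    try {
        fseek($handle, 0, SEEK_END);
        $index = intdiv(ftell($handle), RECORD_BYTES);
        fwrite($handle, pack('g*', ...$vector)); // 'g' = little-endian 32-bit float.
        fflush($handle);
        return $index;
    } finally {
        flock($handle, LOCK_UN);
        fclose($handle);
    }
}

/** Reads one vector by index under a shared lock, seeking straight to it. */
function readVector(string $file, int $index): array
{
    $handle = fopen($file, 'rb');
    flock($handle, LOCK_SH);       // Readers may share the lock.
    try {
        fseek($handle, $index * RECORD_BYTES);
        $bytes = fread($handle, RECORD_BYTES);
        return array_values(unpack('g*', $bytes)); // unpack() keys start at 1.
    } finally {
        flock($handle, LOCK_UN);
        fclose($handle);
    }
}

$file = tempnam(sys_get_temp_dir(), 'vec');
appendVector($file, [0.5, 0.25, 0.0, 1.0]);
$second = appendVector($file, [1.5, -2.0, 0.125, 4.0]);
$readBack = readVector($file, $second);
unlink($file);
```

The sample values are exactly representable as 32-bit floats, so they round-trip unchanged; arbitrary values would lose some precision at this width.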

---

How to generate embeddings from a document?
-------------------------------------------

Vektor stores vectors, but it does not generate them. You need an embedding model for that. A great local option is [Ollama](https://ollama.com/).

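1. Install Ollama and pull an embedding model (e.g., `nomic-embed-text`). Make sure Vektor's configured dimension matches the model's output size (`nomic-embed-text` typically produces 768-dimensional vectors, so the 1536 default would need changing).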
2. Use the Ollama API to generate the vector for your text.
3. Pass that vector to Vektor:

```
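function getEmbedding(string $text): array {
    $ch = curl_init('http://localhost:11434/api/embeddings');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
    curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode([
        'model' => 'nomic-embed-text',
        'prompt' => $text
    ]));

    $raw = curl_exec($ch);
    curl_close($ch);

    // curl_exec() returns false on transport errors (e.g. Ollama is not running).
    if ($raw === false) {
        throw new RuntimeException('Could not reach the Ollama API.');
    }

    $response = json_decode($raw, true);
    if (!isset($response['embedding'])) {
        throw new RuntimeException('Ollama response did not contain an embedding.');
    }

    return $response['embedding'];
}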

// IMPORTANT: Vektor stores the ID and the Vector, but NOT the original content.
// You are responsible for storing the actual text (in files, S3, etc.).

// 1. Read your document
$id = "doc-hello";
$text = file_get_contents("{$id}.txt");

// 2. Generate vector
$vector = getEmbedding($text);

// 3. Insert into Vektor using the filename/ID as the reference
$indexer->insert($id, $vector);
```

---

Troubleshooting
---------------

### 1. Permission Denied Errors

Ensure your PHP process (e.g., `www-data`) has read/write access to the `data/` folder and the files inside it.

```
chown -R www-data:www-data data/
chmod -R 775 data/
```

### 2. "Invalid Vector Dimensions"

Vektor defaults to **1536 dimensions**. If you send a vector with a different number of dimensions, it will be rejected. To change the expected size, set `VEKTOR_DIMENSIONS` in your `.env` or call `Config::setDimensions(N)` in your code. **Important**: If you change dimensions, you must start with an empty data directory, as the binary file structure depends on the dimension size.
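
To catch a mismatch before it reaches the database, you could validate vectors in your own code first. The sketch below is one way to do that; `assertVectorDimension()` is a hypothetical helper, and its error messages are not the ones Vektor itself produces.

```php
<?php

/**
 * Throws if $vector does not have exactly $expected numeric components.
 * Hypothetical helper for illustration, not part of the Vektor package.
 */
function assertVectorDimension(array $vector, int $expected = 1536): void
{
    $actual = count($vector);
    if ($actual !== $expected) {
        throw new InvalidArgumentException(
            "Expected {$expected} dimensions, got {$actual}."
        );
    }

    foreach ($vector as $component) {
        if (!is_int($component) && !is_float($component)) {
            throw new InvalidArgumentException('Vector components must be numeric.');
        }
    }
}

// A 768-dimensional vector passes when 768 is the expected size.
assertVectorDimension(array_fill(0, 768, 0.0), 768);

// A short vector fails against the 1536 default.
$message = null;
try {
    assertVectorDimension([0.1, 0.2, 0.3]);
} catch (InvalidArgumentException $e) {
    $message = $e->getMessage();
}
```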

### 3. Slow Performance?

- **Disk I/O**: Since Vektor is disk-based, SSDs are highly recommended. HDDs will result in slow seek times.
- **Opcache**: Ensure PHP Opcache is enabled for production.

---

Contributing
------------

Contributions are welcome! Please run the test suite before submitting a PR.

```
composer test
```

The test suite includes Unit tests for storage engines and Feature tests for the HNSW logic.

---

License
-------

This project is licensed under the **MIT License**.
