dgtlss/semantica
================

A Laravel package for semantic search using vector embeddings

Version 1.0.0 · MIT license · PHP ^8.4

[Source](https://github.com/dgtlss/semantica) · [Packagist](https://packagist.org/packages/dgtlss/semantica)

Semantica
=========

A Laravel package that enables semantic search using vector embeddings for better relevance in content-heavy applications like blogs, e-commerce, or knowledge bases. Supports multiple AI providers including OpenAI, Google Gemini, and local Ollama models, with comprehensive security features and static analysis.

Features
--------

- Generate text embeddings using multiple AI providers (OpenAI, Gemini, Ollama)
- Automatic embedding generation for Eloquent models with `HasEmbeddings` trait
- Semantic search with configurable similarity metrics (cosine, euclidean, dot product)
- Configurable similarity thresholds and result caching
- Batch processing for performance optimization
- Artisan commands for indexing and reindexing existing data
- Comprehensive security features and input validation
- Static analysis with PHPStan and automated code quality tools
- Support for both cloud and local embedding models
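
To make the three similarity metrics concrete, here is a minimal plain-PHP sketch of how each one can be computed for two equal-length vectors. The function names are hypothetical and are not part of Semantica's API; the package's own implementation may differ.

```php
<?php

/**
 * Illustrative helpers only; these names are assumptions, not Semantica's API.
 *
 * @param float[] $a
 * @param float[] $b
 */
function dotProduct(array $a, array $b): float
{
    if (count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must have the same length.');
    }

    $sum = 0.0;
    foreach ($a as $i => $value) {
        $sum += $value * $b[$i];
    }

    return $sum;
}

/** Cosine similarity: dot product divided by the product of the vector norms. */
function cosineSimilarity(array $a, array $b): float
{
    $normA = sqrt(dotProduct($a, $a));
    $normB = sqrt(dotProduct($b, $b));

    // A zero vector has no direction, so report a similarity of 0.
    if ($normA === 0.0 || $normB === 0.0) {
        return 0.0;
    }

    return dotProduct($a, $b) / ($normA * $normB);
}

/** Euclidean distance: smaller values mean the vectors are closer. */
function euclideanDistance(array $a, array $b): float
{
    if (count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must have the same length.');
    }

    $sum = 0.0;
    foreach ($a as $i => $value) {
        $diff = $value - $b[$i];
        $sum += $diff * $diff;
    }

    return sqrt($sum);
}
```

Note that cosine similarity and dot product grow with similarity, while Euclidean distance shrinks, so a threshold has to be interpreted according to the metric in use.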

Installation
------------

Install via Composer:

```
composer require dgtlss/semantica
```

Publish the configuration and migration:

```
php artisan vendor:publish --provider="Dgtlss\Semantica\Providers\SemanticaServiceProvider" --tag="semantica-config"
php artisan vendor:publish --provider="Dgtlss\Semantica\Providers\SemanticaServiceProvider" --tag="semantica-migrations"
```

Run the migration:

```
php artisan migrate
```

Configuration
-------------

Choose your embedding provider and set the appropriate API keys in your `.env` file:

### OpenAI (Default)

```
SEMANTICA_PROVIDER=openai
OPENAI_API_KEY=your-openai-api-key-here
SEMANTICA_EMBEDDING_MODEL=text-embedding-3-small
```

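### Anthropic

Anthropic's API does not provide an embedding endpoint, so this provider is a placeholder implementation (see Supported Providers) and cannot generate embeddings. The settings are listed for completeness:
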
```
SEMANTICA_PROVIDER=anthropic
ANTHROPIC_API_KEY=your-anthropic-api-key-here
SEMANTICA_ANTHROPIC_MODEL=claude-3-sonnet-20240229
```

### Gemini (Google)

```
SEMANTICA_PROVIDER=gemini
GEMINI_API_KEY=your-gemini-api-key-here
SEMANTICA_GEMINI_MODEL=text-embedding-004
```

### Ollama (Local Models)

```
SEMANTICA_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
SEMANTICA_OLLAMA_MODEL=nomic-embed-text
```

### Additional Configuration

```
SEMANTICA_AUTO_EMBED=false  # Disabled by default for security
SEMANTICA_CACHE_ENABLED=true
SEMANTICA_CACHE_TTL=3600
SEMANTICA_BATCH_SIZE=100
SEMANTICA_SIMILARITY_THRESHOLD=0.7
```
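
As a rough illustration of what a batch size controls, the sketch below splits a list of texts into chunks of at most 100 items (the `SEMANTICA_BATCH_SIZE` value above), so that each chunk could be processed in one pass. The `chunkTextsForEmbedding` helper is hypothetical and does not reflect the package's internals.

```php
<?php

/**
 * Hypothetical helper: split texts into batches of at most $batchSize items.
 *
 * @param string[] $texts
 * @return array<int, string[]>
 */
function chunkTextsForEmbedding(array $texts, int $batchSize = 100): array
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }

    // array_chunk re-indexes every batch from 0 by default.
    return array_chunk($texts, $batchSize);
}

// 250 example texts become three batches: 100, 100, and 50 items.
$texts = array_map(fn (int $i): string => "Document {$i}", range(1, 250));
$batches = chunkTextsForEmbedding($texts, 100);
```

Smaller batches keep memory use bounded at the cost of more round trips.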

Usage
-----

### Automatic Embedding

Models that use the `HasEmbeddings` trait generate embeddings automatically when saved, but **only if** `SEMANTICA_AUTO_EMBED=true` is set in your environment file. Auto-embedding is **disabled by default** for security reasons.

```
// With SEMANTICA_AUTO_EMBED=true
$post = Post::create([
    'title' => 'Laravel Tips',
    'content' => 'Here are some useful Laravel tips...',
]);
// Embedding is automatically generated

// With SEMANTICA_AUTO_EMBED=false (default)
$post = Post::create([
    'title' => 'Laravel Tips',
    'content' => 'Here are some useful Laravel tips...',
]);
// No embedding generated - use manual embedding or artisan commands
```

### Manual Embedding

Use the service directly:

```
use Dgtlss\Semantica\Services\EmbeddingService;

$embeddingService = app(EmbeddingService::class);
$embeddingService->embed($post);
```

### Semantic Search

Use the facade for searching:

```
use Dgtlss\Semantica\Facades\Semantica;

$results = Semantica::search('PHP framework tutorials', App\Models\Post::class, 10, 0.8);

// Or get models directly
$posts = Semantica::searchModels('PHP framework tutorials', App\Models\Post::class);
```
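
Conceptually, a semantic search scores stored embeddings against the query embedding, discards weak matches, and returns the best few. The plain-PHP sketch below shows that idea, assuming cosine similarity, a result limit, and a minimum score. The `rankBySimilarity` function and its parameters are hypothetical; this README does not show how Semantica implements ranking internally.

```php
<?php

/**
 * Hypothetical ranking routine (not Semantica's API): score stored vectors
 * against a query vector, drop scores below the threshold, keep the top N.
 *
 * @param float[] $query
 * @param array<int|string, float[]> $stored  record id => embedding
 * @return array<int|string, float>           record id => score, best first
 */
function rankBySimilarity(array $query, array $stored, int $limit = 10, float $threshold = 0.7): array
{
    $cosine = static function (array $a, array $b): float {
        $dot = $normA = $normB = 0.0;
        foreach ($a as $i => $value) {
            $dot += $value * $b[$i];
            $normA += $value * $value;
            $normB += $b[$i] * $b[$i];
        }

        return ($normA > 0.0 && $normB > 0.0) ? $dot / sqrt($normA * $normB) : 0.0;
    };

    $scores = [];
    foreach ($stored as $id => $embedding) {
        $score = $cosine($query, $embedding);
        if ($score >= $threshold) {
            $scores[$id] = $score;
        }
    }

    arsort($scores);                              // highest score first, keys preserved
    return array_slice($scores, 0, $limit, true); // keep record ids as keys
}

$results = rankBySimilarity(
    [1.0, 0.0],
    ['a' => [1.0, 0.0], 'b' => [0.6, 0.8], 'c' => [0.0, 1.0]],
    10,
    0.5
);
```

Here record `c` is orthogonal to the query and falls below the threshold, so only `a` and `b` are returned, in that order.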

### Commands

Index existing records:

```
php artisan semantica:index App\\Models\\Post
```

Reindex models:

```
php artisan semantica:reindex App\\Models\\Post
php artisan semantica:reindex --all
```

Model Trait
-----------

To enable automatic embeddings for a model, use the `HasEmbeddings` trait:

```
use Dgtlss\Semantica\Traits\HasEmbeddings;
use Illuminate\Database\Eloquent\Model;

class Post extends Model
{
    use HasEmbeddings;

    // Customize embedding fields (optional)
    public function getEmbeddingFields(): array
    {
        return ['title', 'excerpt', 'body'];
    }
}
```
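
One plausible way the fields returned by `getEmbeddingFields()` could be turned into a single string for embedding is to join the non-empty values. The sketch below works on a plain attribute array rather than an Eloquent model; `buildEmbeddingText` and the newline separator are assumptions for illustration, not the package's documented behavior.

```php
<?php

/**
 * Hypothetical helper: join the configured fields of a record into one string.
 *
 * @param array<string, mixed> $attributes
 * @param string[] $fields
 */
function buildEmbeddingText(array $attributes, array $fields): string
{
    $parts = [];
    foreach ($fields as $field) {
        $value = $attributes[$field] ?? null;

        // Skip missing or blank fields so they do not add empty separators.
        if (is_string($value) && trim($value) !== '') {
            $parts[] = trim($value);
        }
    }

    return implode("\n", $parts);
}

$text = buildEmbeddingText(
    ['title' => 'Laravel Tips', 'excerpt' => '', 'body' => 'Use route caching.'],
    ['title', 'excerpt', 'body']
);
```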

Security Considerations
-----------------------

### API Keys and Authentication

- API keys are stored securely in environment variables and never logged
- The package validates API key presence at service initialization
- Supports multiple providers with proper key validation

### Data Privacy and Protection

- **Auto-embedding is disabled by default** - must be explicitly enabled via `SEMANTICA_AUTO_EMBED=true`
- Text content is sanitized before sending to external APIs (HTML tags removed, whitespace normalized)
- Input validation prevents empty or malicious text from being processed
- Embeddings are hidden from model JSON serialization by default
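
The sanitization step described above (tags removed, whitespace normalized, empty input rejected) can be sketched in plain PHP as follows. `sanitizeForEmbedding` is a hypothetical name, and the exact rules the package applies may differ.

```php
<?php

/**
 * Hypothetical sanitizer: remove HTML tags, collapse whitespace, and reject
 * input that is empty after cleaning.
 */
function sanitizeForEmbedding(string $text): string
{
    $clean = strip_tags($text);

    // Collapse any run of whitespace (including newlines) into one space.
    // preg_replace returns null on invalid UTF-8, which is treated as empty.
    $clean = trim(preg_replace('/\s+/u', ' ', $clean) ?? '');

    if ($clean === '') {
        throw new InvalidArgumentException('Text is empty after sanitization.');
    }

    return $clean;
}
```

For example, `"<p>Hello   <b>world</b></p>\n\n"` is reduced to `Hello world`, while input consisting only of markup raises an exception instead of being sent to the provider.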

### Input Validation and Sanitization

- Search queries are trimmed and validated for emptiness
- Model class names are validated to prevent class injection attacks
- Configured models are verified to be valid Eloquent classes
- Text length is limited to prevent abuse (8KB max)
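
A minimal sketch of these checks in plain PHP is shown below. It assumes that "8KB" means 8192 bytes and that over-long input is rejected rather than truncated; both are assumptions, as are the function names. `ExampleBaseModel` and `ExamplePost` are stand-ins so the sketch runs without Laravel; in an application the base class would be Eloquent's `Model`.

```php
<?php

// Stand-in classes so the sketch is self-contained (hypothetical names).
abstract class ExampleBaseModel {}
final class ExamplePost extends ExampleBaseModel {}

/** Trim the query, then reject empty or over-long input. */
function assertValidQuery(string $query, int $maxBytes = 8192): string
{
    $query = trim($query);

    if ($query === '') {
        throw new InvalidArgumentException('Query must not be empty.');
    }

    // strlen counts bytes, which matches a byte-based limit.
    if (strlen($query) > $maxBytes) {
        throw new LengthException("Query exceeds {$maxBytes} bytes.");
    }

    return $query;
}

/** Accept only existing classes that extend the expected base class. */
function assertModelClass(string $class, string $baseClass): string
{
    // is_a() with $allow_string = true checks the hierarchy without instantiating.
    if (!class_exists($class) || !is_a($class, $baseClass, true)) {
        throw new InvalidArgumentException("{$class} is not a valid {$baseClass} subclass.");
    }

    return $class;
}
```

Checking the class hierarchy before using a user-supplied class name is what prevents arbitrary classes from being instantiated or queried.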

### Performance and Abuse Prevention

- Embedding generation is bounded by the rate limits of the external provider's API
- Batch processing limits prevent memory exhaustion
- Search results are capped (max 100 results)
- Similarity thresholds are clamped between 0.0 and 1.0
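
Clamping and capping of this kind can be expressed with `min` and `max`. The sketch below uses the limits stated above (0.0 to 1.0 for thresholds, at most 100 results); the function names and the lower bound of 1 result are assumptions.

```php
<?php

/** Force a similarity threshold into the range 0.0 to 1.0. */
function clampThreshold(float $threshold): float
{
    return max(0.0, min(1.0, $threshold));
}

/** Force a requested result count into the range 1 to $max (assumed lower bound of 1). */
function capResultLimit(int $limit, int $max = 100): int
{
    return max(1, min($max, $limit));
}
```

With these helpers, a request for 500 results at a threshold of 1.5 would be treated as 100 results at a threshold of 1.0.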

### Network Security

- HTTPS is enforced for API communications
- HTTP client includes retry logic for resilience
- Timeouts prevent hanging requests
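
Retry logic of this kind is often implemented as a bounded loop with a growing delay. The plain-PHP sketch below illustrates the pattern with exponential backoff; `retryWithBackoff` is a hypothetical helper, not the package's code. In a Laravel application the HTTP client's own `timeout` and `retry` methods would normally cover this.

```php
<?php

/**
 * Hypothetical helper: run $operation until it succeeds or $maxAttempts is
 * reached, doubling the delay after each failed attempt.
 */
function retryWithBackoff(callable $operation, int $maxAttempts = 3, int $baseDelayMs = 100): mixed
{
    if ($maxAttempts < 1) {
        throw new InvalidArgumentException('At least one attempt is required.');
    }

    $attempt = 0;
    while (true) {
        $attempt++;
        try {
            return $operation();
        } catch (RuntimeException $e) {
            // Give up and rethrow once the attempt budget is spent.
            if ($attempt >= $maxAttempts) {
                throw $e;
            }

            // Delay sequence: base, 2x base, 4x base, ... (usleep takes microseconds).
            usleep($baseDelayMs * 1000 * (2 ** ($attempt - 1)));
        }
    }
}
```

Only `RuntimeException` is retried here, on the assumption that it represents a transient failure; other errors propagate immediately.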

### Configuration Security

- Sensitive configuration values are properly typed and validated
- Unsupported providers throw exceptions instead of falling back silently
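
Failing loudly on an unknown provider can be done with a `match` expression whose default arm throws. The sketch below maps the provider names from the Configuration section to the example models shown there. `defaultModelFor` is hypothetical, and Anthropic is left out only because this README describes it as a placeholder; how the package treats that value is not specified here.

```php
<?php

/**
 * Hypothetical lookup: return an example model for a provider name, or throw
 * for anything that is not explicitly listed.
 */
function defaultModelFor(string $provider): string
{
    return match (strtolower(trim($provider))) {
        'openai' => 'text-embedding-3-small',
        'gemini' => 'text-embedding-004',
        'ollama' => 'nomic-embed-text',
        // No silent fallback: an unknown name is a configuration error.
        default => throw new InvalidArgumentException("Unsupported embedding provider: {$provider}"),
    };
}
```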

### Logging and Monitoring

- Errors are logged without exposing sensitive information
- API failures include status codes but not response bodies in logs
- Performance metrics (text length, provider, model) are logged for monitoring

### Best Practices

- Regularly rotate API keys
- Monitor API usage and costs
- Use caching to reduce external API calls
- Test with mock providers in development
- Keep dependencies updated for security patches
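
For the mock-provider suggestion, one simple approach is a deterministic fake that derives a vector from a hash of the text, so tests are repeatable and make no API calls. The `fakeEmbedding` function below is a hypothetical sketch, not something shipped with the package, and its vectors carry no semantic meaning.

```php
<?php

/**
 * Hypothetical test double: derive a repeatable pseudo-embedding from a hash.
 * The same text always yields the same vector; values fall within [-1, 1].
 *
 * @return float[]
 */
function fakeEmbedding(string $text, int $dimensions = 8): array
{
    $vector = [];
    for ($i = 0; $i < $dimensions; $i++) {
        // First 8 hex digits of a SHA-256 digest give a 32-bit unsigned value.
        $hash = hexdec(substr(hash('sha256', $text . ':' . $i), 0, 8));
        $vector[] = ($hash / 0xFFFFFFFF) * 2.0 - 1.0;
    }

    return $vector;
}
```

Because the output depends only on the input text, assertions about caching, storage, and ranking plumbing stay stable across runs.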

Supported Providers
-------------------

- **OpenAI**: High-quality embeddings with multiple models available
- **Anthropic**: Not currently supported for embeddings, because the Anthropic API does not provide an embedding endpoint; the provider is a placeholder implementation
- **Gemini**: Google's embedding models via Generative AI API
- **Ollama**: Run embedding models locally using Ollama

API Reference
-------------

### Facade Methods

```
use Dgtlss\Semantica\Facades\Semantica;

// Search for similar content
$results = Semantica::search('query text', App\Models\Post::class, 10, 0.8);

// Get models directly
$posts = Semantica::searchModels('query text', App\Models\Post::class);

// Embed a model manually
Semantica::embed($model);
```

### Service Methods

```
use Dgtlss\Semantica\Services\EmbeddingService;
use Dgtlss\Semantica\Services\SearchService;

$embeddingService = app(EmbeddingService::class);
$searchService = app(SearchService::class);

// Generate embedding for text
$embedding = $embeddingService->generateEmbedding('text');

// Embed a model
$embeddingService->embed($model);

// Search
$results = $searchService->search('query', App\Models\Post::class);
$models = $searchService->searchModels('query', App\Models\Post::class);
```

Extending Providers
-------------------

To add support for additional embedding providers, implement the `EmbeddingProviderInterface`:

```
