
eze/elasticsearch-pdf-importer
==============================

PDF importer for elasticsearch

Version 0.0.4 (7y ago), MIT license, PHP >=7.0


[Source](https://github.com/caminoezequiel/elasticsearch-pdf-importer) | [Packagist](https://packagist.org/packages/eze/elasticsearch-pdf-importer)


Elasticsearch PDF importer
==========================


It lets you import PDF files into Elasticsearch and search their contents.

Requirements
------------


- Elasticsearch (version 6)
- ingest-attachment plugin (see the [doc](https://www.elastic.co/guide/en/elasticsearch/plugins/master/ingest-attachment.html))

If you haven't installed the `ingest-attachment` plugin yet, run this on your server from the Elasticsearch home directory:

```
sudo bin/elasticsearch-plugin install ingest-attachment

```

Installation
------------


##### Installing composer package


```
composer require eze/elasticsearch-pdf-importer

```

##### Installing the Attachment Processor in a Pipeline


You need to create a pipeline with the attachment processor. You can do this in one of the following ways:

- Create a Symfony command ([see here](examples/SetupCommand.php))
- Create a PHP file and run it ([see here](examples/setup.php))
- Or send the following request to Elasticsearch, for example with `curl` or from the Kibana console:

```
PUT _ingest/pipeline/attachment
{
  "description" : "Extract attachment information",
  "processors" : [
    {
      "attachment" : {
        "field" : "data",
        "indexed_chars": -1
      }
    }
  ]
}

```
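
The same pipeline can also be created from PHP. The sketch below is a minimal example, not the contents of the linked `examples/setup.php`: it only builds the request parameters. It assumes that `$client` is an `Elasticsearch\Client` from the official elasticsearch-php library, whose `ingest()->putPipeline()` call accepts an array of this shape. The helper name `buildAttachmentPipelineParams` is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Build the parameters for an "attachment" ingest pipeline.
 * Hypothetical helper for illustration; not part of this package.
 */
function buildAttachmentPipelineParams(string $pipelineId = 'attachment'): array
{
    return [
        'id' => $pipelineId,
        'body' => [
            'description' => 'Extract attachment information',
            'processors' => [
                [
                    'attachment' => [
                        'field' => 'data',      // field holding the base64-encoded file
                        'indexed_chars' => -1,  // -1 removes the extraction length limit
                    ],
                ],
            ],
        ],
    ];
}

$params = buildAttachmentPipelineParams();

// Assumption: $client is an Elasticsearch\Client (elasticsearch-php).
// $client->ingest()->putPipeline($params);

// Print the request body so it can be compared with the JSON request.
echo json_encode($params['body'], JSON_PRETTY_PRINT), PHP_EOL;
```

Keeping the definition in a function makes it easy to reuse from a setup script or a console command.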

How to use
----------


The basic flow is to create an `Index` and a `Document`, then pass the document to the importer.

```
// Build the client and the readers that load the file from a URL or a local path.
$client = (new \Eze\Elastic\Factory())->getClient('localhost:9200');
$resolver = new \Eze\Elastic\Importer\Reader\ReaderResolver([
    new \Eze\Elastic\Importer\Reader\UrlReader(),
    new \Eze\Elastic\Importer\Reader\FileReader()
]);
$importer = new \Eze\Elastic\Importer\AttachmentImporter($client, $resolver);

$file = 'PATH_TO_PDF_FILE.pdf';

// The third argument (the document ID) is optional.
$index = new \Eze\Elastic\Model\Index('INDEX', 'TYPE', 'ID:OPTIONAL');
$document = new \Eze\Elastic\Model\Document();
$document->setFile($file)->setIndex($index);

// import() returns the ID of the indexed document.
$id = $importer->import($document);

```

You can add more fields by calling `addField`:

```
$document->addField('FIELD-NAME-ONE', 'VALUE')
    ->addField('FIELD-NAME-TWO', 'VALUE')
    ->addField('FIELD-NAME-THREE', 'VALUE');

```
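
For context, the ingest attachment processor reads a base64-encoded copy of the file from the field named in the pipeline (`data`), and extra fields travel in the same document body. The sketch below shows what such an index request could look like. It is an assumption about the request shape, not a description of how `AttachmentImporter` is implemented. `buildIndexParams` is a hypothetical helper, and the resulting array follows the form accepted by `Elasticsearch\Client::index()` in elasticsearch-php.

```php
<?php
declare(strict_types=1);

/**
 * Build index parameters for a file plus extra fields.
 * Hypothetical helper for illustration; not part of this package.
 *
 * @param array<string, mixed> $fields extra fields stored next to the file
 */
function buildIndexParams(
    string $index,
    string $type,
    string $fileContents,
    array $fields = [],
    ?string $id = null
): array {
    $params = [
        'index' => $index,
        'type' => $type,             // mapping types still exist in Elasticsearch 6
        'pipeline' => 'attachment',  // the pipeline created during installation
        // array_merge() lets "data" win if $fields happens to contain the same key.
        'body' => array_merge($fields, ['data' => base64_encode($fileContents)]),
    ];

    if ($id !== null) {
        $params['id'] = $id;         // omit the ID to let Elasticsearch generate one
    }

    return $params;
}

// "%PDF-1.4" stands in for real file contents, e.g. file_get_contents($file).
$params = buildIndexParams('INDEX', 'TYPE', '%PDF-1.4', ['FIELD-NAME-ONE' => 'VALUE']);

echo json_encode($params, JSON_PRETTY_PRINT), PHP_EOL;
```

With the default processor settings, the extracted text ends up under `attachment.content` in the indexed document, next to the extra fields.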

You can also process the data before it is sent to Elasticsearch; you only need to implement `ProcessorInterface`.

I have implemented a processor that reduces PDF size by running Ghostscript on the command line.

*Requirements: PHP must allow the `exec` function, and the server needs `ghostscript`, `libgs-dev` and `imagemagick` installed (package names on Ubuntu Server).*

```
$client = (new \Eze\Elastic\Factory())->getClient('localhost:9200');
$resolver = new \Eze\Elastic\Importer\Reader\ReaderResolver([
    new \Eze\Elastic\Importer\Reader\UrlReader(),
    new \Eze\Elastic\Importer\Reader\FileReader()
]);
$processor = new \Eze\Elastic\Importer\Processor\GhostscriptProcessor();
$importer = new \Eze\Elastic\Importer\AttachmentImporter($client, $resolver, $processor);
// Alternatively, chain several processors with MultiProcessor:
/*
$manyProcessor = new \Eze\Elastic\Importer\Processor\MultiProcessor([
    $processor1,
    $processor2,
    $processor3,
]);

$importer = new \Eze\Elastic\Importer\AttachmentImporter($client, $resolver, $manyProcessor);
*/

$file = 'PATH_TO_PDF_FILE.pdf';

$index = new \Eze\Elastic\Model\Index('INDEX', 'TYPE', 'ID:OPTIONAL');
$document = new \Eze\Elastic\Model\Document();
$document->setFile($file)->setIndex($index);
$id = $importer->import($document);

```
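
To give an idea of what size reduction with Ghostscript on the command line can involve, here is a minimal sketch. It is not the code of `GhostscriptProcessor`; the helper name `buildGhostscriptCommand` and the choice of the `/ebook` preset are assumptions made for illustration. The flags themselves are standard options of the `gs` binary and its `pdfwrite` device.

```php
<?php
declare(strict_types=1);

/**
 * Build a Ghostscript command that rewrites a PDF with a compression preset.
 * Hypothetical helper for illustration; not part of this package.
 */
function buildGhostscriptCommand(string $input, string $output, string $preset = '/ebook'): string
{
    $args = [
        '-sDEVICE=pdfwrite',         // write a new PDF
        '-dCompatibilityLevel=1.4',
        '-dPDFSETTINGS=' . $preset,  // e.g. /screen, /ebook, /printer, /prepress
        '-dNOPAUSE',
        '-dQUIET',
        '-dBATCH',                   // exit when processing finishes
        '-sOutputFile=' . $output,
        $input,
    ];

    // Escape every argument so that file names with spaces or quotes are safe.
    return 'gs ' . implode(' ', array_map('escapeshellarg', $args));
}

$command = buildGhostscriptCommand('in.pdf', 'out.pdf');
echo $command, PHP_EOL;

// Running it requires the exec function to be enabled and gs on the PATH:
// exec($command, $outputLines, $exitCode);
```

Building the command separately from running it keeps the escaping in one place and makes the command easy to log or inspect before `exec` is called.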




