shibashish/pdf-reader
=====================

A comprehensive Laravel package for extracting text, HTML, images, and metadata from PDF files using Poppler utilities.

Version v1.0.2 · MIT license · PHP ^8.2

[Source](https://github.com/shibi1259/laravel-pdf-reader) · [Packagist](https://packagist.org/packages/shibashish/pdf-reader)

PDF Reader Package for Laravel
==============================

A comprehensive, production-ready Laravel package for extracting content from PDF files using Poppler utilities. It provides a secure, type-safe interface for reading PDFs, with extensive error handling and validation.

---

📋 Table of Contents
-------------------

- [Overview](#overview)
- [Features](#features)
- [System Requirements](#system-requirements)
- [Dependencies](#dependencies)
- [Installation](#installation)
- [Configuration](#configuration)
- [Usage Guide](#usage-guide)
- [Exception Handling](#exception-handling)
- [Testing](#testing)
- [Architecture](#architecture)
- [Troubleshooting](#troubleshooting)
- [License](#license)

---

Overview
--------

The **PDF Reader Package** wraps the powerful [Poppler](https://poppler.freedesktop.org/) command-line utilities in a clean, Laravel-friendly API. It handles PDF text extraction, HTML conversion, image extraction, and metadata retrieval with built-in validation, security, and error handling.

### Why This Package?

- **Secure**: Uses Laravel's `Process` facade instead of unsafe `shell_exec`
- **Validated**: Checks file existence, readability, and PDF format before processing
- **Type-Safe**: Full PHP 8.2+ type hints for better IDE support
- **Cross-Platform**: Works on Windows, macOS, and Linux
- **Well-Tested**: Comprehensive Pest test suite included
- **Production-Ready**: Proper exception handling and logging support

---

Features
--------

### Core Functionality

- 📄 **Text Extraction** - Extract plain text from PDFs with optional page ranges
- 🌐 **HTML Conversion** - Convert PDFs to HTML while preserving layout
- 🖼️ **Image Extraction** - Extract all embedded images from PDFs
- ℹ️ **Metadata Retrieval** - Get PDF properties (author, title, page count, etc.)

### Advanced Features

- 📑 **Page Range Support** - Extract specific pages (e.g., "1-5", "3-10")
- ✅ **Input Validation** - Automatic file existence and PDF format validation
- 🔒 **Secure Execution** - Uses Laravel Process facade for safe command execution
- 🎯 **Custom Exceptions** - Specific exceptions for different error scenarios
- 💾 **File Management** - Option to keep or auto-delete temporary files
- 🌍 **Cross-Platform** - Proper path handling for all operating systems

---

System Requirements
-------------------

### Required Software

- **PHP**: 8.2 or higher
- **Laravel**: 10.0 or higher
- **Poppler Utilities**: The four binaries listed below must be installed and reachable, either on the system PATH or through explicitly configured binary paths

### Poppler Binaries

The package requires the following Poppler command-line tools:

- `pdftotext` - Text extraction
- `pdftohtml` - HTML conversion
- `pdfinfo` - Metadata retrieval
- `pdfimages` - Image extraction
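
As a quick way to confirm that all four tools are reachable on Linux or macOS, a small POSIX shell check along these lines may help. This is only a sketch: the `missing_binaries` function name is made up for illustration and is not part of the package, and it checks the system PATH only, not any custom paths set in `.env`.

```shell
#!/bin/sh
# Hypothetical helper (not part of the package): print the name of every
# listed command that the shell cannot resolve on PATH.
missing_binaries() {
    for bin in "$@"; do
        # command -v succeeds only when the name resolves to a command
        command -v "$bin" >/dev/null 2>&1 || printf '%s\n' "$bin"
    done
}

# Check the four Poppler tools the package relies on.
missing=$(missing_binaries pdftotext pdftohtml pdfinfo pdfimages)
if [ -n "$missing" ]; then
    echo "Missing Poppler binaries:"
    echo "$missing"
else
    echo "All four Poppler binaries were found on PATH."
fi
```

If anything is reported as missing, the installation steps in the Dependencies section cover each platform.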

---

Dependencies
------------

### Installing Poppler Utilities

#### Ubuntu/Debian

```
sudo apt-get update
sudo apt-get install poppler-utils
```

Verify installation:

```
pdftotext -v
pdftohtml -v
pdfinfo -v
pdfimages -v
```

#### macOS

Using Homebrew:

```
brew install poppler
```

Verify installation:

```
which pdftotext
which pdftohtml
which pdfinfo
which pdfimages
```

#### Windows

1. Download Poppler for Windows from [GitHub Releases](https://github.com/oschwartz10612/poppler-windows/releases/)
2. Extract the archive to a permanent location (e.g., `C:\Program Files\poppler`)
3. Add the `bin` directory to your system PATH:
    - Right-click "This PC" → Properties → Advanced system settings
    - Environment Variables → System variables → Path → Edit
    - Add: `C:\Program Files\poppler\Library\bin`
4. Restart your terminal/IDE

Verify installation:

```
pdftotext -v
pdftohtml -v
pdfinfo -v
pdfimages -v
```

### Laravel Dependencies

This package uses the following Laravel features:

- `Illuminate\Support\Facades\Process` - For secure command execution
- `Illuminate\Support\ServiceProvider` - For package registration
- `Illuminate\Support\Facades\Facade` - For the PdfReader facade

All dependencies are included in Laravel 10+.

---

Installation
------------

### Step 1: Package Location

This package is located at:

```
packages/shibashish/pdf-reader
```

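In this local-package layout it is already registered in your application's root `composer.json` under `autoload-dev`. The package is also listed on Packagist as `shibashish/pdf-reader`.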

### Step 2: Publish Configuration

Publish the package configuration file to your Laravel application:

```
php artisan vendor:publish --tag=pdf-reader-config
```

This creates `config/pdf-reader.php` with default settings.

### Step 3: Configure Binary Paths (Optional)

If Poppler binaries are not in your system PATH, specify full paths in `.env`:

```
PDFTOTEXT_BINARY=/usr/bin/pdftotext
PDFTOHTML_BINARY=/usr/bin/pdftohtml
PDFINFO_BINARY=/usr/bin/pdfinfo
PDFIMAGES_BINARY=/usr/bin/pdfimages
```
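
Before relying on these entries, it can be worth checking that each configured path points at an executable file. The sketch below reads the four `*_BINARY` keys shown above from an env file and reports each one. The `check_env_binaries` function name is hypothetical, and the script is a standalone sanity check that assumes plain `KEY=value` lines (optionally double-quoted); it does not reflect how the package itself resolves the paths.

```shell
#!/bin/sh
# Hypothetical helper (not part of the package): report whether each
# PDF*_BINARY entry in the given env file points at an executable file.
check_env_binaries() {
    env_file=$1
    grep -E '^PDF(TOTEXT|TOHTML|INFO|IMAGES)_BINARY=' "$env_file" |
    while IFS='=' read -r key value; do
        # Strip optional surrounding double quotes from the value
        value=${value#\"}
        value=${value%\"}
        if [ -f "$value" ] && [ -x "$value" ]; then
            echo "ok      $key -> $value"
        else
            echo "missing $key -> $value"
        fi
    done
}

# Run against the project's .env when one exists in the current directory.
if [ -f .env ]; then
    check_env_binaries .env
fi
```

Run it from the application root. A `missing` line usually means a typo in the path or a Poppler installation in a different prefix.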

**Windows Example:**

```
PDFTOTEXT_BINARY="C:\Program Files\poppler\Library\bin\pdftotext.exe"
PDFTOHTML_BINARY="C:\Program Files\poppler\Library\bin\pdftohtml.exe"
PDFINFO_BINARY="C:\Program Files\poppler\Library\bin\pdfinfo.exe"
PDFIMAGES_BINARY="C:\Program Files\poppler\Library\bin\pdfimages.exe"
```

### Step 4: Create Storage Directories

The package auto-creates these directories when needed, but you can create them manually:

```
mkdir -p storage/app/public/pdf-reader/{texts,htmls,images}
```

---

Configuration
-------------

### Configuration File

The published `config/pdf-reader.php` file contains:

```
