ediazaro/receipt-scanner
========================

Use OpenAI to extract structured receipt and invoice data from Text, Html, Images and PDFs.

Version v4.0.1, MIT license, PHP ^8.1|^8.2

[Source](https://github.com/ediazaro/receipt-scanner) | [Packagist](https://packagist.org/packages/ediazaro/receipt-scanner) | [Docs](https://github.com/helgesverre/receipt-scanner)

[![](.github/header.png)](.github/header.png)

> *Need more flexibility?* Try the [Extractor](https://github.com/HelgeSverre/extractor) package instead, an AI-powered data extraction library for Laravel.

AI-Powered Receipt and Invoice Scanner for Laravel
==================================================

[![Latest Version on Packagist](https://camo.githubusercontent.com/b118fe03519e18ebf1167d66d348c6a807066d99567bee951744321976e25dae/68747470733a2f2f696d672e736869656c64732e696f2f7061636b61676973742f762f68656c67657376657272652f726563656970742d7363616e6e65722e7376673f7374796c653d666c61742d737175617265)](https://camo.githubusercontent.com/b118fe03519e18ebf1167d66d348c6a807066d99567bee951744321976e25dae/68747470733a2f2f696d672e736869656c64732e696f2f7061636b61676973742f762f68656c67657376657272652f726563656970742d7363616e6e65722e7376673f7374796c653d666c61742d737175617265)[![Total Downloads](https://camo.githubusercontent.com/49a134f9a42a667d1552aadba4a9ec80011208449707649b031e3f5ea4a09657/68747470733a2f2f696d672e736869656c64732e696f2f7061636b61676973742f64742f68656c67657376657272652f726563656970742d7363616e6e65722e7376673f7374796c653d666c61742d737175617265)](https://camo.githubusercontent.com/49a134f9a42a667d1552aadba4a9ec80011208449707649b031e3f5ea4a09657/68747470733a2f2f696d672e736869656c64732e696f2f7061636b61676973742f64742f68656c67657376657272652f726563656970742d7363616e6e65722e7376673f7374796c653d666c61742d737175617265)

Easily extract structured receipt data from images, PDFs, and emails within your Laravel application using OpenAI.

Features
--------

- Light wrapper around OpenAI Chat and Completion endpoints.
- Accepts text as input and returns structured receipt information.
- Includes a well-tuned prompt for parsing receipts.
- Supports various input formats including Plain Text, PDF, Images, Word documents, and Web content.
- Integrates with [Textract](https://aws.amazon.com/textract/) for OCR functionality.

Installation
------------

Install the package via composer:

```
composer require helgesverre/receipt-scanner
```

Publish the config file:

```
php artisan vendor:publish --tag="receipt-scanner-config"
```

All the configuration options are documented in the configuration file.

This package uses the [OpenAI Laravel Package](https://github.com/openai-php/laravel), so you also need to publish its config and add the `OPENAI_API_KEY` to your `.env` file:

```
php artisan vendor:publish --provider="OpenAI\Laravel\ServiceProvider"
```

```
OPENAI_API_KEY="your-key-here"
```

Usage
-----

### Extracting receipt data from Plain Text

Plain text scanning is useful when you already have the textual representation of a receipt or invoice.

The example is from a Paddle.com receipt email, where I copied all the text in the email, and removed all the empty lines.

Pass either the `TextContent` object or its plain text (which you can get by calling `toString()`) into the `ReceiptScanner::scan()` method.

```
use HelgeSverre\ReceiptScanner\Facades\ReceiptScanner;

ReceiptScanner::scan($textPlainText);
ReceiptScanner::scan($textPdf);
ReceiptScanner::scan($textImageOcr);
ReceiptScanner::scan($textPdfOcr);
ReceiptScanner::scan($textWord);
ReceiptScanner::scan($textWeb);
ReceiptScanner::scan($textHtml);
```

Receipt Data Model
------------------

The scanned receipt is parsed into a DTO made up of a main `Receipt` class, which contains the receipt metadata, a `Merchant` DTO representing the seller on the receipt or invoice, and an array of `LineItem` DTOs, one for each line item.

- `HelgeSverre\ReceiptScanner\Data\Receipt`
- `HelgeSverre\ReceiptScanner\Data\Merchant`
- `HelgeSverre\ReceiptScanner\Data\LineItem`

The DTO has a `toArray()` method, which will result in a structure like the one below. For flexibility, all fields are nullable.

```
[
    "orderRef" => "string",
    "date" => "date",
    "taxAmount" => "number",
    "totalAmount" => "number",
    "currency" => "string",
    "merchant" => [
        "name" => "string",
        "vatId" => "string",
        "address" => "string",
    ],
    "lineItems" => [
        [
            "text" => "string",
            "sku" => "string",
            "qty" => "number",
            "price" => "number",
        ],
    ],
];
```
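Because every field may be `null`, code that consumes this array should guard each access. The following is a minimal sketch, not part of the package: `lineItemsTotal()` is a hypothetical helper, and the sample array is made-up data in the shape shown above.

```php
<?php

/**
 * Hypothetical helper (not part of the package): sums qty * price over the
 * line items, skipping any item where either value is missing.
 */
function lineItemsTotal(array $receipt): float
{
    $total = 0.0;

    // "lineItems" itself may be null or absent, so fall back to an empty list.
    foreach ($receipt['lineItems'] ?? [] as $item) {
        $qty = $item['qty'] ?? null;
        $price = $item['price'] ?? null;

        if ($qty === null || $price === null) {
            continue; // nullable fields: skip incomplete items
        }

        $total += (float) $qty * (float) $price;
    }

    return $total;
}

// Made-up sample data in the documented shape.
$receipt = [
    'orderRef' => 'A-1001',
    'date' => '2023-11-13',
    'taxAmount' => 2.5,
    'totalAmount' => 12.5,
    'currency' => 'USD',
    'merchant' => ['name' => 'Example Store', 'vatId' => null, 'address' => null],
    'lineItems' => [
        ['text' => 'Coffee', 'sku' => null, 'qty' => 2, 'price' => 3.0],
        ['text' => 'Bagel', 'sku' => 'BG-1', 'qty' => 1, 'price' => 4.0],
        ['text' => 'Unknown', 'sku' => null, 'qty' => null, 'price' => 1.0],
    ],
];

echo lineItemsTotal($receipt), PHP_EOL;
```

Comparing this sum against `totalAmount` and `taxAmount` is one way to flag receipts the model may have misread.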

Returning an Array instead of a DTO
-----------------------------------

If you prefer to work with an array instead of the built-in DTO, you can specify `asArray: true` when calling `scan()`.

```
use HelgeSverre\ReceiptScanner\Facades\ReceiptScanner;

ReceiptScanner::scan(
    $textPlainText,
    asArray: true
);
```

Specifying the model
--------------------

To use a different model, you can specify the model name to use with the `model` named argument when calling the `scan()` method.

```
use HelgeSverre\ReceiptScanner\Facades\ReceiptScanner;
use HelgeSverre\ReceiptScanner\ModelNames;

// With the ModelNames class
ReceiptScanner::scan($content, model: ModelNames::GPT4_1106_PREVIEW);

// With a string
ReceiptScanner::scan($content, model: 'gpt-4-1106-preview');
```

All parameters and what they do
-------------------------------

**`$text` (TextContent|string)**

The input text from the receipt or invoice that needs to be parsed. It accepts either a `TextContent` object or a string.

**`$model` (string)**

This parameter specifies the OpenAI model used for the extraction process.

`HelgeSverre\ReceiptScanner\ModelNames` is a class containing constants for each model, provided for convenience. However, you can also directly use a string to specify the model if you prefer.

Different models have different speed/accuracy characteristics.

If you require high accuracy, use a GPT-4 model. If you need speed, use a GPT-3.5 model, and if you need even more speed, use the `gpt-3.5-turbo-instruct` model.

The default model is `ModelNames::TURBO_INSTRUCT`.
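The guidance above can be captured in a small helper. This is a sketch only: `pickModel()` and its `'accuracy'` and `'speed'` labels are hypothetical and not part of the package. The returned strings are model names listed in this README.

```php
<?php

/**
 * Hypothetical helper (not part of the package): maps a priority label to a
 * model name, following the speed/accuracy guidance in this README.
 */
function pickModel(string $priority): string
{
    return match ($priority) {
        'accuracy' => 'gpt-4',
        'speed' => 'gpt-3.5-turbo',
        // Anything else falls back to the documented default model.
        default => 'gpt-3.5-turbo-instruct',
    };
}

echo pickModel('accuracy'), PHP_EOL;
```

The returned string can then be passed as the `model` argument when calling `scan()`.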

| `ModelNames` Constant | Value |
|---|---|
| `ModelNames::TURBO` | `gpt-3.5-turbo` |
| `ModelNames::TURBO_INSTRUCT` | `gpt-3.5-turbo-instruct` |
| `ModelNames::TURBO_1106` | `gpt-3.5-turbo-1106` |
| `ModelNames::TURBO_16K` | `gpt-3.5-turbo-16k` |
| `ModelNames::TURBO_0613` | `gpt-3.5-turbo-0613` |
| `ModelNames::TURBO_16K_0613` | `gpt-3.5-turbo-16k-0613` |
| `ModelNames::TURBO_0301` | `gpt-3.5-turbo-0301` |
| `ModelNames::GPT4` | `gpt-4` |
| `ModelNames::GPT4_32K` | `gpt-4-32k` |
| `ModelNames::GPT4_32K_0613` | `gpt-4-32k-0613` |
| `ModelNames::GPT4_1106_PREVIEW` | `gpt-4-1106-preview` |
| `ModelNames::GPT4_0314` | `gpt-4-0314` |
| `ModelNames::GPT4_32K_0314` | `gpt-4-32k-0314` |

**`$maxTokens` (int)**

The maximum number of tokens that the model will process. The default value is `2000`. You may need to adjust it for very long text, but 2000 is usually sufficient.

**`$temperature` (float)**

Controls the randomness/creativity of the model's output.

A higher value (e.g., 0.8) makes the output more random, which is usually not what we want in this scenario. I usually go with 0.1 or 0.2; anything over 0.5 becomes useless. Defaults to `0.1`.

**`$template` (string)**

This parameter specifies the template used for the prompt.

The default template is `'receipt'`. You can create and use additional templates by adding new Blade files in the `resources/views/vendor/receipt-scanner/` directory and specifying the file name (without extension) as the `$template` value (e.g. `"minimal_invoice"`).

**`$asArray` (bool)**

If `true`, returns the response from the AI model as an array instead of as a DTO. This is useful if you need to modify the default DTO to have more or fewer fields, or want to convert the response into your own DTO. Defaults to `false`.

### Example Usage:

```
use HelgeSverre\ReceiptScanner\Facades\ReceiptScanner;
use HelgeSverre\ReceiptScanner\ModelNames;

$parsedReceipt = ReceiptScanner::scan(
    text: $textInput,
    model: ModelNames::TURBO_INSTRUCT,
    maxTokens: 500,
    temperature: 0.2,
    template: 'minimal_invoice',
    asArray: true,
);
```

### List of supported models

| Enum Value | Model name | Endpoint |
|---|---|---|
| TURBO\_INSTRUCT | gpt-3.5-turbo-instruct | Completion |
| TURBO\_16K | gpt-3.5-turbo-16k | Chat |
| TURBO | gpt-3.5-turbo | Chat |
| GPT4 | gpt-4 | Chat |
| GPT4\_32K | gpt-4-32k | Chat |

OCR Configuration with AWS Textract
-----------------------------------

To use AWS Textract for extracting text from large images and multi-page PDFs, the package needs to upload the file to S3 and pass the S3 object location along to the Textract service.

So you need to provide your AWS credentials for `config/receipt-scanner.php`, for example through the following `.env` values:

```
TEXTRACT_KEY="your-aws-access-key"
TEXTRACT_SECRET="your-aws-secret-key"
TEXTRACT_REGION="your-textract-region"

# Can be omitted
TEXTRACT_VERSION="2018-06-27"
```

You also need to configure a separate Textract disk where the files will be stored. Open your `config/filesystems.php` configuration file and add the following:

```
'textract' => [
    'driver' => 's3',
    'key' => env('TEXTRACT_KEY'),
    'secret' => env('TEXTRACT_SECRET'),
    'region' => env('TEXTRACT_REGION'),
    'bucket' => env('TEXTRACT_BUCKET'),
],
```

Ensure the `textract_disk` setting in `config/receipt-scanner.php` matches your disk name in the `filesystems.php` config. You can change it with the `.env` value `TEXTRACT_DISK`.

```
return [
    "textract_disk" => env("TEXTRACT_DISK")
];
```

`.env`

```
TEXTRACT_DISK="uploads"
```

**Note**

Textract is not available in all regions:

> Q: In which AWS regions is Amazon Textract available? Amazon Textract is currently available in the US East (Northern Virginia), US East (Ohio), US West (Oregon), US West (N. California), AWS GovCloud (US-West), AWS GovCloud (US-East), Canada (Central), EU (Ireland), EU (London), EU (Frankfurt), EU (Paris), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Seoul), and Asia Pacific (Mumbai) Regions.

See:

Publishing Prompts
------------------

You may publish the prompt file that is used under the hood by running this command:

```
php artisan vendor:publish --tag="receipt-scanner-prompts"
```

This package simply uses Blade files as prompts. The `{{ $context }}` variable will be replaced by the text you pass to `ReceiptScanner::scan("text here")`.
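To illustrate the idea of the `{{ $context }}` placeholder, here is a plain-PHP stand-in. It is an assumption-laden sketch, not how the package works internally: the package renders a Blade view, whereas the hypothetical `renderPrompt()` below only mimics the replacement with `strtr()`. Blade also HTML-escapes `{{ }}` output, which this stand-in does not.

```php
<?php

/**
 * Stand-in for Blade rendering (hypothetical, not part of the package):
 * replaces the literal {{ $context }} placeholder with the given text.
 */
function renderPrompt(string $template, string $context): string
{
    return strtr($template, ['{{ $context }}' => $context]);
}

// Nowdoc syntax keeps "$context" literal instead of interpolating it.
$template = <<<'PROMPT'
Extract the following fields from the text below, output as JSON

{{ $context }}

OUTPUT IN JSON
PROMPT;

echo renderPrompt($template, 'Total: 12.50 USD'), PHP_EOL;
```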

Adding prompts/templates
------------------------

By default, the package uses the `receipt.blade.php` file as its prompt template. You may add additional templates by creating a Blade file, such as `resources/views/vendor/receipt-scanner/minimal_invoice.blade.php`, and setting the `$template` parameter accordingly when calling `scan()`.

**Example prompt:**

```
Extract the following fields from the text below, output as JSON

date (as string in the Y-m-d format)
total_amount (as float, do not include currency symbol)
vendor_name (company name)

{{ $context }}

OUTPUT IN JSON
```

```
use HelgeSverre\ReceiptScanner\Facades\ReceiptScanner;
use HelgeSverre\ReceiptScanner\ModelNames;

$receipt = ReceiptScanner::scan(
    text: "Your invoice here",
    model: ModelNames::TURBO_INSTRUCT,
    template: 'minimal_invoice',
    asArray: true,
);
```
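With a custom template and `asArray: true`, you will typically want to map the result into your own DTO. The sketch below assumes the returned array uses the same keys the example prompt asks for (`date`, `total_amount`, `vendor_name`), which the model is not guaranteed to honor. `MinimalInvoice` is a hypothetical class, not part of the package, and the input array is made-up data.

```php
<?php

/**
 * Hypothetical DTO (not part of the package) for the three fields requested
 * by the example prompt above.
 */
final class MinimalInvoice
{
    public function __construct(
        public readonly ?DateTimeImmutable $date,
        public readonly ?float $totalAmount,
        public readonly ?string $vendorName,
    ) {
    }

    /** Builds the DTO from an array such as the one returned with asArray: true. */
    public static function fromArray(array $data): self
    {
        // createFromFormat() returns false on a malformed date; treat that as null.
        // The leading "!" resets the time portion to midnight.
        $date = isset($data['date'])
            ? DateTimeImmutable::createFromFormat('!Y-m-d', (string) $data['date'])
            : null;

        return new self(
            date: $date ?: null,
            totalAmount: isset($data['total_amount']) ? (float) $data['total_amount'] : null,
            vendorName: $data['vendor_name'] ?? null,
        );
    }
}

// Made-up array standing in for the result of scan(..., asArray: true).
$invoice = MinimalInvoice::fromArray([
    'date' => '2023-11-13',
    'total_amount' => 12.5,
    'vendor_name' => 'Example Store',
]);

echo $invoice->date?->format('Y-m-d'), ' ', $invoice->vendorName, PHP_EOL;
```

Keeping every property nullable mirrors the built-in DTO, so a partially parsed invoice still produces a usable object.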

License
-------

This package is licensed under the MIT License. For more details, refer to the [License File](LICENSE.md).

