survos/import-bundle
====================

import-bundle Bundle

Version 2.0.182 · MIT license · PHP ^8.4

[Source](https://github.com/survos/import-bundle) · [Packagist](https://packagist.org/packages/survos/import-bundle)

SurvosImportBundle
==================

Symfony bundle that provides tools for importing data.

SurvosImportBundle helps you get **raw CSV/JSON data into your database via Doctrine** with minimal fuss.

Typical problems this bundle solves:

- You have **CSV or JSON exports** (from an API, a vendor, a legacy system…) and you want them in your app’s database.
- You need a **real primary key**, correct **Doctrine field types** (int, float, bool, datetime, json, text…), and ideally some **basic statistics** to make good schema decisions.
- You want a **repeatable pipeline** that goes from:
    1. Raw file → cleaned, normalized **JSONL + profile**
    2. JSONL + profile → **Doctrine entity** with good defaults
    3. JSONL → **Doctrine entities persisted** efficiently (batches, progress, etc.)

SurvosImportBundle provides exactly that pipeline:

1. `import:convert` – convert raw CSV/JSON into JSONL + a profile with field statistics.
2. `code:entity` – generate a Doctrine entity from that profile (via SurvosCodeBundle).
3. `import:entities` – import JSONL records into your database using Doctrine.

You can also use it in a simpler “direct CSV → Entity → Import” mode for quick one-off jobs and demos.

---

Table of Contents
-----------------

1. [Installation](#installation)
2. [Quick Start (Direct CSV → Entity → Import)](#quick-start-direct-csv--entity--import)
3. [Concepts](#concepts)
    - [JSONL](#jsonl)
    - [Profile](#profile)
4. [The Pipeline](#the-pipeline)
    - [1. import:convert](#1-importconvert)
    - [2. code:entity](#2-codeentity)
    - [3. import:entities](#3-importentities)
5. [End-to-End Example](#end-to-end-example)
6. [Complete Demo App with EasyAdmin](#complete-demo-app-with-easyadmin)
7. [Castor Automation](#castor-automation)
8. [Events & Extensibility](#events--extensibility)
9. [Filesystem Indexing (`import:dir`)](#filesystem-indexing-importdir)
10. [Tips & Gotchas](#tips--gotchas)
11. [See Also](#see-also)

---

Installation
------------

```
composer require survos/import-bundle
composer require --dev survos/code-bundle
```

If Symfony Flex did not register the bundle automatically, add it yourself:

```
<?php
// config/bundles.php
return [
    // ... other bundles
    Survos\ImportBundle\SurvosImportBundle::class => ['all' => true],
];
```

---

Quick Start (Direct CSV → Entity → Import)
------------------------------------------

This is the minimal “I just want my CSV in Doctrine” flow.

If you have not installed the bundles yet, do that first:

```
composer req survos/import-bundle
composer req --dev survos/code-bundle
```

First, create an entity class by inspecting the first line (and/or a sample) of a CSV file:

```
bin/console code:entity Movie --file=data/movies.csv
```

The entity has property names that loosely match the CSV headers
(e.g. `"First Name"` becomes `$firstName` in the entity).

Then import the data:

```
bin/console import:entities Movie --file data/movies.csv --limit 500
```

That’s the “fast path” for simple, flat CSVs.

For more control and richer metadata, use the JSONL-based pipeline below.

---

Concepts
--------

### JSONL

The bundle normalizes input into **JSON Lines (JSONL)**:

- One JSON object **per line**
- Easy to stream in batches
- Unix-friendly
- Plays nicely with SurvosJsonlBundle and other ETL tools

Example (`movies.jsonl`):

```
{"id": 1, "title": "The Matrix", "year": 1999}
{"id": 2, "title": "Inception", "year": 2010}
```
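Because each record sits on its own line, ordinary line-oriented tools are enough for quick checks. The sketch below is only an illustration: it writes the two example records to a temporary file (a stand-in for a real `movies.jsonl`) and inspects it with `wc`, `head`, and `grep`. The `grep` step is a plain substring match, not a JSON-aware query.

```shell
# Create a small JSONL file holding the same two records as the example above.
tmp="$(mktemp)"
cat > "$tmp" <<'EOF'
{"id": 1, "title": "The Matrix", "year": 1999}
{"id": 2, "title": "Inception", "year": 2010}
EOF

# One record per line, so counting records is just counting lines.
count="$(wc -l < "$tmp" | tr -d ' ')"

# Peek at the first record without reading the rest of the file.
first="$(head -n 1 "$tmp")"

# Line-oriented filtering: count lines containing this exact text.
matches="$(grep -c '"year": 2010' "$tmp")"

echo "records:   $count"
echo "first:     $first"
echo "from 2010: $matches"

rm -f "$tmp"
```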

### Profile

Conversion also generates a **profile** (`*.profile.json`) containing:

- Field type inference
- Null count, distinct count
- String length stats
- Boolean-like detection
- Facet candidate detection
- Primary key candidates
- First/last samples
- Min/max distributions

The profile lets `code:entity` emit appropriate Doctrine field mappings (e.g. `Types::TEXT` when the maximum string length exceeds 255).
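To make the idea of per-field statistics concrete, here is a rough sketch that computes a null count, a distinct count, and a maximum length for one field using `awk`, then applies the 255-character threshold mentioned above. It is not the bundle's implementation and does not reproduce the `*.profile.json` format; the sample values and the output layout are made up for illustration.

```shell
# Hypothetical sample: one value per line for a single field (empty line = null).
values="$(printf '%s\n' 'The Matrix' 'Inception' '' 'Inception')"

stats="$(printf '%s\n' "$values" | awk '
    $0 == "" { nulls++; next }                       # empty values count as nulls
    {
        if (!($0 in seen)) { seen[$0] = 1; distinct++ }   # track distinct values
        if (length($0) > max) max = length($0)            # track longest value
    }
    END {
        kind = (max > 255) ? "text" : "string"       # threshold taken from the text above
        printf "nulls=%d distinct=%d max_length=%d type=%s\n", nulls, distinct, max, kind
    }')"

echo "$stats"
```

A real profile covers every field in one pass and records more (samples, boolean-like detection, primary key candidates), but the principle is the same: gather simple statistics first, then derive the column type from them.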

---

The Pipeline
------------

### 1. `import:convert`

**Goal:** Transform CSV/JSON/ZIP/GZ input into:

- A normalized `*.jsonl` file
- A detailed `*.profile.json` file

**Usage:**

```
bin/console import:convert data/movies.csv --dataset=movies
```

Features:

- Detects CSV / JSON / JSONL / ZIP / GZIP automatically
- Normalizes encoding
- Produces JSONL for streaming
- Produces a profile with complete field statistics
- Supports `--limit`, `--tags`, `--dataset`

---

### 2. `code:entity`

*(from SurvosCodeBundle, but part of this pipeline)*

**Goal:** Generate a Doctrine entity from a JSONL profile.

Example:

```
bin/console code:entity data/movies.profile.json App\\Entity\\Movie
```

What it infers:

- Primary key (or use `--pk`)
- Doctrine field types:
    - small strings → `string`
    - long strings (length > 255) → `Types::TEXT`
    - ints/floats
    - datetime/dates
    - json for nested structures
- Public properties with helpful PHPDoc derived from the profile
- `#[ORM\Entity(repositoryClass: ...)]`

Review and adjust the generated class, then create the schema or a migration from it.

---

### 3. `import:entities`

**Goal:** Insert the JSONL data into your database using Doctrine.

Example:

```
bin/console import:entities App\\Entity\\Movie data/movies.jsonl
```

Key features:

- Batch processing (`--batch=200`)
- PK assignment via `--pk`
- Reset/truncate via `--reset`
- Progress bar
- Works with any Doctrine entity

---

End-to-End Example
------------------

### Step 1 — Convert CSV → JSONL + profile

```
bin/console import:convert data/movies.csv --dataset=movies
```

Produces:

- `data/movies.jsonl`
- `data/movies.profile.json`

### Step 2 — Generate Doctrine entity

```
bin/console code:entity data/movies.profile.json App\\Entity\\Movie --pk=id
```

Creates something like:

```
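<?php

// Namespace and imports assume the default Symfony app layout.
namespace App\Entity;

use App\Repository\MovieRepository;
use Doctrine\ORM\Mapping as ORM;

#[ORM\Entity(repositoryClass: MovieRepository::class)]
class Movie
{
    // Primary key chosen with --pk=id
    #[ORM\Id]
    #[ORM\Column(type: 'integer')]
    public ?int $id = null;

    #[ORM\Column(length: 255, nullable: true)]
    public ?string $title = null;

    #[ORM\Column(type: 'integer', nullable: true)]
    public ?int $year = null;

    // ...
}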
```

### Step 3 — Import entities

```
bin/console import:entities App\\Entity\\Movie data/movies.jsonl --pk=id
```

Done — your DB is now populated.

---

Complete Demo App with EasyAdmin
--------------------------------

This is a complete “from scratch” demo using EasyAdmin to view the data.

### Prerequisites

- symfony CLI
- curl
- PHP 8.4 (the demo uses property hooks)
- gunzip (because the demo data is gzipped)
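A quick way to confirm these tools are available before starting is sketched below. The helper name `missing_tools` is hypothetical and not part of the bundle or the demo; it only checks that each command can be found on `PATH`, and the PHP version check runs only when `php` is installed.

```shell
# Print the names of any given commands that cannot be found on PATH.
missing_tools() {
    missing=""
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
    done
    printf '%s' "${missing# }"   # strip the leading space
}

result="$(missing_tools symfony curl php gunzip)"
if [ -z "$result" ]; then
    echo "All prerequisite tools found"
else
    echo "Missing: $result"
fi

# The demo needs PHP 8.4 or newer; only check the version if php exists.
if command -v php >/dev/null 2>&1 \
    && ! php -r 'exit(version_compare(PHP_VERSION, "8.4.0", ">=") ? 0 : 1);'; then
    echo "PHP 8.4 or newer is required"
fi
```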

### Commands

```
symfony new import-demo --webapp && cd import-demo
composer config extra.symfony.allow-contrib true
echo "DATABASE_URL=sqlite:///%kernel.project_dir%/var/data.db" > .env.local
symfony server:start -d

composer req --dev survos/code-bundle
composer req survos/import-bundle league/csv
composer req easycorp/easyadmin-bundle:4.x-dev

mkdir -p data
curl -L -o data/movies.csv.gz https://github.com/metarank/msrd/raw/master/dataset/movies.csv.gz
gunzip data/movies.csv.gz

# sanity check
head -n 2 data/movies.csv

# generate entity from CSV
bin/console code:entity Movie --file=data/movies.csv

# create schema
bin/console doctrine:schema:update --force

# import some data
bin/console import:entities Movie --file data/movies.csv --limit 500

# EasyAdmin dashboard + CRUD
bin/console make:admin:dashboard -n
bin/console make:admin:crud App\\Entity\\Movie -n
```

Clearing the cache as part of the commands above does not always take effect (the cause is not yet known), so clear the caches explicitly before opening the admin page:

```
bin/console cache:clear
bin/console cache:pool:clear cache.app
symfony open:local --path=/admin/movie
```

---

Castor Automation
-----------------

Instead of running the commands above by hand, you can run everything as a single Castor task. First install Castor:

```
curl "https://castor.jolicode.com/install" | bash
```

Then create a project, download the Castor file, and run the build task:

```
symfony new import-demo --webapp && cd import-demo

curl -L https://github.com/survos/import-bundle/raw/master/app/castor.php -o castor.php

castor build
```

This will scaffold the demo, run imports, and set up admin views in one go.

---

Events & Extensibility
----------------------

SurvosImportBundle emits events so you can **tweak records on the fly** during conversion/import.
The three main ImportBundle events are:

1. `ImportConvertStartedEvent`

    - Emitted when an import/convert run starts.
    - Carries dataset name, input path, limit, tags, etc.
    - Good place for initialization, logging, or dataset-specific setup.
2. `ImportConvertRowEvent`

    - Emitted for **every row** during conversion.
    - Lets you mutate, enrich, or even drop records before they are written to JSONL.
    - You can:
        - Normalize IDs
        - Slugify codes
        - Attach derived URLs
        - Store images to disk
        - Deduplicate by tracking `$event->index`/keys
3. `ImportConvertFinishedEvent`

    - Emitted when conversion finishes.
    - Good for summaries, flushing caches, or post-processing.

You can also listen to JsonlBundle’s events (e.g. `JsonlConvertStartedEvent`, `JsonlRecordEvent`) for lower-level control of JSONL conversion.

### Example: Enriching Records During Conversion

Here’s a simplified example based on a real service used in this bundle’s demos:

```
