Abandoned → [edyan/neuralyzer](/?search=edyan%2Fneuralyzer)

inetprocess/neuralyzer
======================

Library and CLI for Data anonymization

v4.1 · GPL-2.0-or-later · PHP >=7.2.5 · [2 issues](https://github.com/edyan/neuralyzer/issues) · [1 PR](https://github.com/edyan/neuralyzer/pulls)

[Source](https://github.com/edyan/neuralyzer) · [Packagist](https://packagist.org/packages/inetprocess/neuralyzer)

[![Scrutinizer Code Quality](https://camo.githubusercontent.com/873c1b4db405ee2905d3de7c1dafa45e42572ea5d46b8cc625b8fd350ec3ba62/68747470733a2f2f7363727574696e697a65722d63692e636f6d2f672f656479616e2f6e657572616c797a65722f6261646765732f7175616c6974792d73636f72652e706e673f623d6d6173746572)](https://scrutinizer-ci.com/g/edyan/neuralyzer/?branch=master)[![Code Coverage](https://camo.githubusercontent.com/4bf106c2e19b217405fa1dda227ebcd106824253e9114f328150e227c445ad9b/68747470733a2f2f7363727574696e697a65722d63692e636f6d2f672f656479616e2f6e657572616c797a65722f6261646765732f636f7665726167652e706e673f623d6d6173746572)](https://scrutinizer-ci.com/g/edyan/neuralyzer/?branch=master)[![Build Status](https://camo.githubusercontent.com/c2897597fb314ad87065d8cf21299565bb370543e55210cc832eecee8cfd096a/68747470733a2f2f7363727574696e697a65722d63692e636f6d2f672f656479616e2f6e657572616c797a65722f6261646765732f6275696c642e706e673f623d6d6173746572)](https://scrutinizer-ci.com/g/edyan/neuralyzer/build-status/master)[![Build Status](https://camo.githubusercontent.com/6f5107ea90c55c2c45b87cabd4fdc8a54cd90ae02cf320b12a232517a3c396ca/68747470733a2f2f7472617669732d63692e636f6d2f656479616e2f6e657572616c797a65722e7376673f6272616e63683d6d6173746572)](https://travis-ci.com/edyan/neuralyzer)

edyan/neuralyzer
================

Summary
-------

This project is a library and a command line tool that **anonymizes** a database by updating existing data or inserting generated fake data (update vs insert). It uses [Faker](https://github.com/fakerphp/faker) to generate data from rules defined in a configuration file.

Because it can work row by row or in batches, you can load tables with tens of millions of fake records.

It uses [Doctrine DBAL](https://github.com/doctrine/dbal) to abstract interactions with databases, so it should work with any database type that DBAL supports. So far it has been tested extensively with MySQL, PostgreSQL and SQL Server.

Neuralyzer used to have an option to clean tables by injecting a `DELETE FROM` with a `WHERE` criterion before launching the anonymization (the config parameters `delete` and `delete_where`). This is now handled by pre and post actions:

```
entities:
    books:
        cols:
            title: { method: sentence, params: [8], unique: true }
        action: update
        pre_actions:
            - db.query("DELETE FROM books")
        post_actions:
            - db.query("DELETE FROM books WHERE title LIKE '%war%'")
```

Installation as a library
-------------------------

```
composer require edyan/neuralyzer
```

Installation as an executable
-----------------------------

You can also download the executable directly (example with v4.0):

```
$ wget https://github.com/edyan/neuralyzer/raw/v4.0/neuralyzer.phar
$ sudo mv neuralyzer.phar /usr/local/bin/neuralyzer
$ sudo chmod +x /usr/local/bin/neuralyzer
$ neuralyzer
```

Usage
-----

The easiest way to get started is with the command line tool. After cloning the project and running `composer install`, try:

```
$ bin/neuralyzer
```

### Generate the configuration automatically

Neuralyzer is able to read a database and generate the configuration for you. The command `config:generate` accepts the following options:

```
Options:
    -D, --driver=DRIVER              Driver (check Doctrine documentation to have the list) [default: "pdo_mysql"]
    -H, --host=HOST                  Host [default: "127.0.0.1"]
    -d, --db=DB                      Database Name
    -u, --user=USER                  User Name [default: "www-data"]
    -p, --password=PASSWORD          Password (or it'll be prompted)
    -f, --file=FILE                  File [default: "neuralyzer.yml"]
        --protect                    Protect IDs and other fields
        --ignore-table=IGNORE-TABLE  Table to ignore. Can be repeated (multiple values allowed)
        --ignore-field=IGNORE-FIELD  Field to ignore. Regexp in the form "table.field". Can be repeated (multiple values allowed)

```

#### Example

```
bin/neuralyzer config:generate --db test_db -u root -p root --ignore-table config --ignore-field ".*\.id.*"
```
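
The `--ignore-field` pattern is matched against names of the form `table.field`. The sketch below approximates that matching with `grep -E` to show which fields the example pattern would skip. The helper `matches_ignore_pattern` and the sample field names are hypothetical, and Neuralyzer's own matching may differ in details such as anchoring (for this particular pattern, the leading and trailing `.*` make anchoring irrelevant).

```shell
# Hypothetical helper: succeeds when a "table.field" name matches an ignore pattern.
# Assumption: the pattern behaves like an extended regular expression (grep -E).
matches_ignore_pattern() {
    pattern=$1
    name=$2
    printf '%s\n' "$name" | grep -Eq -- "$pattern"
}

# Sample names (illustrative only) checked against the pattern from the example above.
for name in authors.id authors.first_name books.author_id books.id_publisher; do
    if matches_ignore_pattern '.*\.id.*' "$name"; then
        echo "ignored: $name"
    else
        echo "kept:    $name"
    fi
done
```

Under these assumptions `books.author_id` is kept: the pattern requires a literal dot immediately before `id`, so it only covers fields whose names start with `id`.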

That produces a file which looks like:

```
entities:
    authors:
        cols:
            first_name: { method: firstName, unique: false }
            last_name: { method: lastName, unique: false }
        action: update # Will update existing data, "insert" would create new data
        pre_actions: {  }
        post_actions: {  }

    books:
        cols:
            name: { method: sentence, params: [8] }
            date_modified: { method: date, params: ['Y-m-d H:i:s', now] }
        action: update
        pre_actions: {  }
        post_actions: {  }

guesser: Edyan\Neuralyzer\Guesser
guesser_version: '3.0'
language: en_US
```

Edit the generated file to adjust the configuration. For example, to change the language (see [Faker's doc](https://fakerphp.github.io/) for available languages):

```
# Be careful: some languages provide only a few methods.
# Example : https://github.com/FakerPHP/Faker/tree/v1.14.1/src/Faker/Provider/fr_FR
language: fr_FR
```

**INFO**: You can also use a delete on its own, without anonymizing the table. The following deletes everything in `books`:

```
entities:
    authors:
        cols:
            first_name: { method: firstName, unique: false }
            last_name: { method: lastName, unique: false }
        action: update
    books:
        pre_actions:
            - db.query("DELETE FROM books")
```

To delete everything and then insert 1000 new books:

```
guesser_version: '3.0'
entities:
    authors:
        cols:
            first_name: { method: firstName, unique: false }
            last_name: { method: lastName, unique: false }
        action: update
    books:
        cols:
            name: { method: sentence, params: [8] }
        action: insert
        pre_actions:
            - db.query("DELETE FROM books")
        limit: 1000
```

### Run the anonymizer

To run the anonymizer, use the `run` command, which accepts the following options:

```
Options:
    -D, --driver=DRIVER      Driver (check Doctrine documentation to have the list) [default: "pdo_mysql"]
    -H, --host=HOST          Host [default: "127.0.0.1"]
    -d, --db=DB              Database Name
    -u, --user=USER          User Name [default: "www-data"]
    -p, --password=PASSWORD  Password (or prompted)
    -c, --config=CONFIG      Configuration File [default: "neuralyzer.yml"]
    -t, --table=TABLE        Do a single table
        --pretend            Don't run the queries
    -s, --sql                Display the SQL

    -m, --mode=MODE          Set the mode : batch or queries [default: "batch"]

```

#### Example

```
bin/neuralyzer run --db test_db -u root -p root
```

That produces output like the following (the option list above suggests the `Queries:` section is shown when `--sql` is passed):

```
Anonymizing authors
 2/2 [============================] 100%

Queries:
UPDATE authors SET first_name = 'Don', last_name = 'Wisoky' WHERE id = '1'
UPDATE authors SET first_name = 'Sasha', last_name = 'Denesik' WHERE id = '2'

....
```

**WARNING**: On a huge table, `--sql` will produce a HUGE output. Use it for debugging purposes only.
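
Before touching real data, it may help to preview the queries for a single table. The sketch below combines the `--table`, `--pretend` and `--sql` options listed above. The wrapper function `dry_run_table` and the `NEURALYZER_BIN` variable are hypothetical conveniences, not part of Neuralyzer, and the exact output of a pretend run is not shown here.

```shell
# Path to the CLI; override it if neuralyzer is installed elsewhere (e.g. as a phar).
NEURALYZER_BIN="${NEURALYZER_BIN:-bin/neuralyzer}"

# Hypothetical wrapper: preview the SQL for one table without running the queries.
# $1: database name, $2: table name, remaining arguments: passed through (e.g. -u root).
dry_run_table() {
    db=$1
    table=$2
    shift 2
    "$NEURALYZER_BIN" run --db "$db" --table "$table" --pretend --sql "$@"
}

# Example call (commented out because it needs a reachable database):
# dry_run_table test_db authors -u root
```

Restricting the run to one table keeps the `--sql` output manageable, which matters given the warning above.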

Library
-------

The library is designed to be integrated into any tool, such as a CLI tool. It contains:

- A Configuration Reader and a Configuration Writer
- A Guesser
- A DB Anonymizer

### Guesser

The guesser is the central piece of the config generator. Based on the field name or field type, it guesses which Faker method to apply.

It is easy to extend because it is injected into the Writer.

### Configuration Writer

The writer generates a YAML file that contains all tables and fields from a database. Basic usage could look like this:

```
