
Awk CSV parser
==============


[![Latest stable version](https://camo.githubusercontent.com/da03d58c26c2f63d2578e045234f48ec2be260e825a60ac39393cf2478c7ca35/68747470733a2f2f706f7365722e707567782e6f72672f67656f6666726f792d61756272792f61776b2d6373762d7061727365722f762f737461626c652e706e67 "Latest stable version")](https://packagist.org/packages/geoffroy-aubry/awk-csv-parser)[![Build Status](https://camo.githubusercontent.com/1ecca54c9015e38c489873279463dc94e6f6f19b56b9744e19bbdf46e91914ab/68747470733a2f2f7365637572652e7472617669732d63692e6f72672f67656f6666726f792d61756272792f61776b2d6373762d7061727365722e706e673f6272616e63683d737461626c65)](http://travis-ci.org/geoffroy-aubry/awk-csv-parser)

AWK and Bash code to easily parse CSV files, with possibly embedded commas and quotes.

Table of Contents
-----------------


- [Features](#features)
    - [Known limitations](#known-limitations)
    - [Links](#links)
- [Requirements](#requirements)
- [Usage](#usage)
- [Examples](#examples)
- [Installation](#installation)
- [Copyrights & licensing](#copyrights--licensing)
- [Change log](#change-log)
- [Continuous integration](#continuous-integration)
- [Git branching model](#git-branching-model)

Features
--------


- Parse CSV files with only Bash and Awk.
- Allows CSV data to be processed with standard UNIX shell commands.
- Properly handle CSV data that contain field separators (commas by default) and field enclosures (double quotes by default) inside enclosed data fields.
- Process CSVs from stdin pipe as well as from multiple command line file arguments.
- Handle any character both for field separator and field enclosure.
- Can rewrite CSV records with a multi-character output field separator, CSV enclosure characters removed and escaped enclosures unescaped.
- Lines do not need to contain the same number of fields throughout the file.

### Known limitations


- Does not **yet** handle embedded newlines inside data fields.
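
Because of this limitation, it can help to check a file for multi-line fields before parsing it. The sketch below is not part of awk-csv-parser: `find_unbalanced_quote_lines` is a hypothetical helper that assumes the default `"` enclosure and flags lines holding an odd number of quotes. That is only a hint, since a record that is simply missing a quote is flagged as well.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with awk-csv-parser): print the numbers of
# input lines holding an odd number of double quotes. A valid single-line
# record always has an even count, so an odd count hints at a quoted field
# that continues on the next line (or at a malformed record).
find_unbalanced_quote_lines() {
    awk '{
        line = $0
        # gsub() returns the number of substitutions made, i.e. the quote count.
        if (gsub(/"/, "", line) % 2 == 1) print NR
    }' "$@"
}

# Usage: lines 2 and 3 together form one record split by an embedded newline.
printf '%s\n' 'a,b' '"multi' 'line",c' | find_unbalanced_quote_lines
```

The function reads the files given as arguments, or stdin when none are given.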

### Links


- [Wikipedia: Comma-separated values](http://en.wikipedia.org/wiki/Comma-separated_values)
- [RFC 4180: Common Format and MIME Type for Comma-Separated Values (CSV) Files](http://tools.ietf.org/html/rfc4180)

Other Awk implementations:

- [dbro/csvquote](https://github.com/dbro/csvquote)
- [AWK CSV Parser](http://lorance.freeshell.org/csv/)

Requirements
------------


- Bash v4 *(2009)* and above
- GNU [Awk](http://www.gnu.org/software/gawk/) 3.1+

Tested on Debian/Ubuntu Linux.

Usage
-----


Displayed by:

```
$ awk-csv-parser.sh --help
```

[![Help on command prompt](doc/screenshots/awk-csv-parser.png)](doc/screenshots/awk-csv-parser.png)

##### Text version


```
Description
    AWK and Bash code to easily parse CSV files, with possibly embedded commas and quotes.

Usage
    awk-csv-parser.sh [OPTION]… []…

Options
    -e , --enclosure=
        Set the CSV field enclosure. One character only, '"' (double quote) by default.

    -o , --output-separator=
        Set the output field separator. Multiple characters allowed, '|' (pipe) by default.

    -s , --separator=
        Set the CSV field separator. One character only, ',' (comma) by default.

    -h, --help
        Display this help.

        CSV file to parse.

Discussion
    – The last record in the file may or may not have an ending line break.
    – Each line may not contain the same number of fields throughout the file.
    – The last field in the record must not be followed by a field separator.
    – Fields containing field enclosures or field separators must be enclosed in field
      enclosure.
    – A field enclosure appearing inside a field must be escaped by preceding it with
      another field enclosure. Example: "aaa","b""bb","ccc"

Examples
    Parse a CSV and display records without field enclosure, fields pipe-separated:
        awk-csv-parser.sh --output-separator='|' resources/iso_3166-1.csv

    Remove CSV's header before parsing:
        tail -n+2 resources/iso_3166-1.csv | awk-csv-parser.sh

    Keep only first column of multiple files:
        awk-csv-parser.sh a.csv b.csv c.csv | cut -d'|' -f1

    Keep only first column, using multiple UTF-8 characters output separator:
        awk-csv-parser.sh -o '⇒⇒' resources/iso_3166-1.csv | awk -F '⇒⇒' '{print $1}'

    You can directly call the Awk script:
        awk -f csv-parser.awk -v separator=',' -v enclosure='"' --source '{
            csv_parse_record($0, separator, enclosure, csv)
            print csv[2] " ⇒ " csv[0]
        }' resources/iso_3166-1.csv

```
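
The enclosure and escaping rules from the Discussion above can be illustrated from the producing side. The following is a minimal sketch, not code from this project: `csv_quote_field` is a hypothetical helper that encloses a value only when it contains the separator or the enclosure, and escapes each enclosure by doubling it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of awk-csv-parser): encode one value as a CSV
# field. Arguments: value, separator (default ','), enclosure (default '"').
csv_quote_field() {
    local value=$1
    local sep=${2:-,}
    local enc=${3:-'"'}
    if [[ $value == *"$sep"* || $value == *"$enc"* ]]; then
        # Escape every enclosure by preceding it with another enclosure,
        # then wrap the whole field in enclosures.
        value=${value//"$enc"/"$enc$enc"}
        printf '%s\n' "$enc$value$enc"
    else
        printf '%s\n' "$value"
    fi
}

# Usage: build one record from three raw values.
printf '%s,%s,%s\n' \
    "$(csv_quote_field 'aaa')" \
    "$(csv_quote_field 'b"bb')" \
    "$(csv_quote_field 'a,')"
# prints: aaa,"b""bb","a,"
```

Enclosing every field, as in the `"aaa","b""bb","ccc"` example, is equally valid. This sketch only adds enclosures where the rules require them.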

Examples
--------


Excerpt from `resources/iso_3166-1.csv` ([full version](resources/iso_3166-1.csv)):

```
Country or Area Name,ISO ALPHA-2 Code,ISO ALPHA-3 Code,ISO Numeric Code
Brazil,BR,BRA,076
British Virgin Islands,VG,VGB,092
British Indian Ocean Territory,IO,IOT,086
Brunei Darussalam,BN,BRN,096
Burkina Faso,BF,BFA,854
"Hong Kong, Special Administrative Region of China",HK,HKG,344
"Macao, Special Administrative Region of China",MO,MAC,446
Christmas Island,CX,CXR,162
Cocos (Keeling) Islands,CC,CCK,166
```

##### 1. Parse a CSV and display records without field enclosure, output fields pipe-separated


```
$ awk-csv-parser.sh --output-separator='|' resources/iso_3166-1.csv | head -n10
# or:
$ cat resources/iso_3166-1.csv | awk-csv-parser.sh --output-separator='|' | head -n10
```

Result:

```
Country or Area Name|ISO ALPHA-2 Code|ISO ALPHA-3 Code|ISO Numeric Code|
Brazil|BR|BRA|076|
British Virgin Islands|VG|VGB|092|
British Indian Ocean Territory|IO|IOT|086|
Brunei Darussalam|BN|BRN|096|
Burkina Faso|BF|BFA|854|
Hong Kong, Special Administrative Region of China|HK|HKG|344|
Macao, Special Administrative Region of China|MO|MAC|446|
Christmas Island|CX|CXR|162|
Cocos (Keeling) Islands|CC|CCK|166|
```
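
Once records are pipe-separated, they can also be consumed directly in Bash. The sketch below is only an illustration: `parsed_sample` stands in for the parser output by replaying two lines from the result above, and `print_code_and_name` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Two lines copied from the result above. In practice they would come from:
#   awk-csv-parser.sh resources/iso_3166-1.csv
parsed_sample() {
    printf '%s\n' \
        'Brazil|BR|BRA|076|' \
        'Hong Kong, Special Administrative Region of China|HK|HKG|344|'
}

# Hypothetical helper: read pipe-separated records from stdin and print
# "<alpha-2 code>: <name>" for each record.
print_code_and_name() {
    local name alpha2 rest
    # IFS='|' splits on the output separator; 'rest' collects remaining fields.
    while IFS='|' read -r name alpha2 rest; do
        printf '%s: %s\n' "$alpha2" "$name"
    done
}

parsed_sample | print_code_and_name
# prints:
# BR: Brazil
# HK: Hong Kong, Special Administrative Region of China
```

This only works when no field contains the output separator itself. If the data may contain `|`, choose another separator with `--output-separator`.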

##### 2. Remove CSV header, keep only first column and grep fields containing separator


```
$ tail -n+2 resources/iso_3166-1.csv | awk-csv-parser.sh | cut -d'|' -f1 | grep ,
```

Result:

```
Hong Kong, Special Administrative Region of China
Macao, Special Administrative Region of China
Congo, Democratic Republic of the
Iran, Islamic Republic of
Korea, Democratic People's Republic of
Korea, Republic of
Micronesia, Federated States of
Taiwan, Republic of China
Tanzania, United Republic of

```

##### 3. You can directly call the Awk script


```
$ awk -f csv-parser.awk -v separator=',' -v enclosure='"' --source '{
    csv_parse_record($0, separator, enclosure, csv)
    print csv[2] " ⇒ " csv[0]
}' resources/iso_3166-1.csv | head -n10
```

Result:

```
ISO ALPHA-3 Code ⇒ Country or Area Name
BRA ⇒ Brazil
VGB ⇒ British Virgin Islands
IOT ⇒ British Indian Ocean Territory
BRN ⇒ Brunei Darussalam
BFA ⇒ Burkina Faso
HKG ⇒ Hong Kong, Special Administrative Region of China
MAC ⇒ Macao, Special Administrative Region of China
CXR ⇒ Christmas Island
CCK ⇒ Cocos (Keeling) Islands

```

##### 4. Technical example


Content of `tests/resources/ok.csv`:

```
,,
a, b,c , d ,e e
"","a","a,",",a",",,"
"a""b","""","c"""""
```

Test:

```
$ awk-csv-parser.sh tests/resources/ok.csv
```

Result:

```
|| |
a| b|c | d |e e|
|a|a,|,a|,,|
a"b|"|c""|

```

##### 5. Errors


Content of `tests/resources/invalid.csv`:

```
"
"a,
a"
"a"b
```

Test:

```
$ awk-csv-parser.sh tests/resources/invalid.csv
```

Result:

```
[CSV ERROR: 3] Missing closing quote after '' in following record: '"'
[CSV ERROR: 3] Missing closing quote after 'a,' in following record: '"a,'
[CSV ERROR: 1] Missing opening quote before 'a' in following record: 'a"'
[CSV ERROR: 2] Missing separator after 'a' in following record: '"a"b'

```
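
For larger files, a per-code summary of such messages can be easier to read than the raw list. The sketch below is an assumption-laden illustration: `summarize_csv_errors` is a hypothetical helper, it assumes error lines keep the `[CSV ERROR: <code>]` prefix shown above, and it reads them from stdin. This README does not say whether the parser writes errors to stdout or stderr, so a `2>&1` redirection may be needed when piping real output.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of awk-csv-parser): count error messages per
# error code. Lines without the "[CSV ERROR: N]" prefix are ignored.
summarize_csv_errors() {
    awk '
        # Match lines such as: [CSV ERROR: 3] Missing closing quote ...
        match($0, /^\[CSV ERROR: [0-9]+\]/) {
            # The code starts at character 13, right after "[CSV ERROR: ".
            code = substr($0, 13, RLENGTH - 13)
            count[code]++
        }
        END { for (code in count) print code, count[code] }
    ' | sort -n
}

# Usage with the four messages shown above.
summarize_csv_errors <<'EOF'
[CSV ERROR: 3] Missing closing quote after '' in following record: '"'
[CSV ERROR: 3] Missing closing quote after 'a,' in following record: '"a,'
[CSV ERROR: 1] Missing opening quote before 'a' in following record: 'a"'
[CSV ERROR: 2] Missing separator after 'a' in following record: '"a"b'
EOF
# prints "<code> <count>" pairs:
# 1 1
# 2 1
# 3 2
```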

Installation
------------


### Debian/Ubuntu


1. Move to the directory where you wish to store the source.
2. Clone the repository:

```
$ git clone https://github.com/geoffroy-aubry/awk-csv-parser.git
```

3. You should be on the `stable` branch. If not, switch your clone to that branch:

```
$ cd awk-csv-parser && git checkout stable
```

4. You can create a symlink to `awk-csv-parser.sh`:

```
$ sudo ln -s /path/to/src/awk-csv-parser.sh /usr/local/bin/awk-csv-parser
```

5. It's ready for use:

```
$ awk-csv-parser
```

### OS X


The Mac OS X versions of `readlink` and `sed` are BSD-based and differ slightly from the GNU versions, so you need to install the GNU utilities:

```
$ brew install coreutils gnu-sed [--with-default-names]
```

With the `--with-default-names` option, the GNU utilities replace those of OS X. Otherwise they are prefixed with a `g`, and you have to edit the scripts `src/awk-csv-parser.sh` and `tests/all-tests.sh` to replace `readlink` and `sed` with `greadlink` and `gsed` respectively.
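
In your own wrapper scripts, one way to avoid hard-coding either name is to pick the `g`-prefixed command when it exists. This is only a sketch and not how the project's scripts work: `first_available`, `READLINK` and `SED` are hypothetical names introduced for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first of the given command names that is
# found on PATH, or return 1 if none is available.
first_available() {
    local cmd
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
            return 0
        fi
    done
    return 1
}

# Prefer the g-prefixed GNU tools, fall back to the default names.
READLINK=$(first_available greadlink readlink)
SED=$(first_available gsed sed)
echo "readlink: $READLINK, sed: $SED"
```

Note that the fallback may still select the BSD versions when the GNU utilities are not installed at all.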

Then follow Debian/Ubuntu installation process.

Copyrights & licensing
----------------------


Licensed under the GNU Lesser General Public License v3 (LGPL version 3). See [LICENSE](LICENSE) file for details.

Change log
----------


See [CHANGELOG](CHANGELOG.md) file for details.

Continuous integration
----------------------


[![Build Status](https://camo.githubusercontent.com/1ecca54c9015e38c489873279463dc94e6f6f19b56b9744e19bbdf46e91914ab/68747470733a2f2f7365637572652e7472617669732d63692e6f72672f67656f6666726f792d61756272792f61776b2d6373762d7061727365722e706e673f6272616e63683d737461626c65)](http://travis-ci.org/geoffroy-aubry/awk-csv-parser)

Launch unit tests:

```
$ tests/all-tests.sh
```

Git branching model
-------------------


The git branching model used for development is the one described and assisted by the `twgit` tool.
