bakame/html-table
=================

convert html table into a PHP data structure


HTML Table
==========

[![Author](https://camo.githubusercontent.com/8227017ba73e8c0c93fa2c9c653cf702644a5af5d422e7f612b31d45efc180a1/687474703a2f2f696d672e736869656c64732e696f2f62616467652f617574686f722d406e79616d7370726f642d626c75652e7376673f7374796c653d666c61742d737175617265)](https://twitter.com/nyamsprod)[![Software License](https://camo.githubusercontent.com/55c0218c8f8009f06ad4ddae837ddd05301481fcf0dff8e0ed9dadda8780713e/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f6c6963656e73652d4d49542d627269676874677265656e2e7376673f7374796c653d666c61742d737175617265)](LICENSE)[![Build](https://github.com/bakame-php/html-table/workflows/build/badge.svg)](https://github.com/bakame-php/html-table/actions?query=workflow%3A%22build%22)[![Latest Version](https://camo.githubusercontent.com/010579d8c4246cb9e0c16e9ba74c129e4484cbee15b07ad0c50197db09b06230/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f72656c656173652f62616b616d652d7068702f68746d6c2d7461626c652e7376673f7374796c653d666c61742d737175617265)](https://github.com/bakame-php/html-table/releases)[![Total Downloads](https://camo.githubusercontent.com/fdad53b33e5a8d456b480248b80265a8b0650a009b850d6a9c954316bf939c99/68747470733a2f2f696d672e736869656c64732e696f2f7061636b61676973742f64742f62616b616d652f68746d6c2d7461626c652e7376673f7374796c653d666c61742d737175617265)](https://packagist.org/packages/bakame/html-table)[![Sponsor development of this project](https://camo.githubusercontent.com/2e662697b46a37233abdd7e45373397aab0bd5206336533151cdf42455d81048/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f73706f6e736f72253230746869732532307061636b6167652d2545322539442541342d6666363962342e7376673f7374796c653d666c61742d737175617265)](https://github.com/sponsors/nyamsprod)

`bakame/html-table` is a small PHP package that allows you to parse, import and manipulate tabular data represented as an HTML table. Once installed, you will be able to do the following:

```
use Bakame\TabularData\HtmlTable\Parser;

$table = Parser::new()
    ->tableHeader(['rank', 'move', 'team', 'player', 'won', 'drawn', 'lost', 'for', 'against', 'gd', 'points'])
    ->parseFile('https://www.bbc.com/sport/football/tables');

$table
    ->getTabularData()
    ->filter(fn (array $row) => (int) $row['points'] >= 10)
    ->sorted(fn (array $rowA, array $rowB) => (int) $rowB['for'] <=> (int) $rowA['for'])
    ->fetchPairs('team', 'for');

// returns
// [
//  "Brighton" => "15"
//  "Man City" => "14"
//  "Tottenham" => "13"
//  "Liverpool" => "12"
//  "West Ham" => "10"
//  "Arsenal" => "9"
// ]
```

System Requirements
-------------------

The **league/csv 9.25.0** library is required (since version 0.6.0).

Installation
------------

Use composer:

```
composer require bakame/html-table
```

Documentation
-------------

The `Parser` can convert a file (a PHP stream or a path with an optional context, as with `fopen`) or an HTML document into an object implementing `League\Csv\TabularData`. Once converted, you can use all the methods and features made available by the interface (see [ResultSet](https://csv.thephpleague.com/9.0/reader/resultset/) for more information).

**The `Parser` itself is immutable: whenever you change a configuration option, a new instance is returned.**

**The `Parser` constructor is private; to instantiate the object you must use the `new` method instead.**

```
use Bakame\HtmlTable\Parser;

$parser = Parser::new()
    ->ignoreTableHeader()
    ->ignoreXmlErrors()
    ->tableCaption('This is a beautiful table');
```

### parseHtml and parseFile

To extract and parse your table use either the `parseHtml` or `parseFile` methods. If parsing is not possible a `ParseError` exception will be thrown.

```
use Bakame\HtmlTable\Parser;

$parser = Parser::new();

$table = $parser->parseHtml('...');
$table = $parser->parseFile('path/to/html/file.html');
```

`parseHtml` parses an HTML page represented by:

- a `string`,
- a `Stringable` object,
- a `DOMDocument`,
- a `DOMElement`,
- or a `SimpleXMLElement`

whereas `parseFile` works with:

- a filepath,
- or a PHP readable stream.

Both methods return a `Table` instance which implements the `League\Csv\TabularDataReader` interface and also gives access to the table caption, if present, via the `getCaption` method.

```
use Bakame\HtmlTable\Parser;

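$html = '...'; // an HTML document containing the table to parse

$table = Parser::new()->parseHtml($html);
$tableData = $table->getTabularData();
$tableData->getHeader(); //returns ['Title','Singer', 'Country']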
$tableData->nth(2); //returns ["Title" => "Nzinzi", "Singer" => "Emeneya", "Country" => "DR Congo"]
json_encode($tableData->slice(0, 1));
//[{"Title":"Nakei Nairobi","Singer":"Mbilia Bel","Country":"DR Congo"}]
```

#### Default configuration

By default, a parser created with `Parser::new()` will:

- try to parse the first table found in the page
- expect the table header row to be the first `tr` found in the `thead` section of your table
- exclude the table `thead` section when extracting the table content.
- ignore XML errors.
- have no formatter attached.
- have no default caption to be used if none is present in the table.

Each of the following settings can be changed to improve the conversion against your business rules:

### tablePosition and tableXpathPosition

Selecting the table to parse in the HTML page can be done using two (2) methods: `Parser::tablePosition` and `Parser::tableXpathPosition`.

If you know the table's position in the page as an integer offset, or if you know its `id` attribute value, you should use `Parser::tablePosition`; otherwise favor `Parser::tableXpathPosition`, which expects an `xpath` expression. If the expression is valid and a list of tables is found, the first result will be returned.

```
use Bakame\HtmlTable\Parser;

$parser = Parser::new()->tablePosition('table-id'); // parses the table whose id attribute is 'table-id'
$parser = Parser::new()->tablePosition(3); // parses the 4th table of the page
$parser = Parser::new()->tableXpathPosition("//main/div/table");
// parses the first table that matches the xpath expression
```

**`Parser::tableXpathPosition` and `Parser::tablePosition` override each other. It is recommended to use one or the other but not both at the same time.**
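
To make the three kinds of selector concrete, here is a minimal sketch that uses only PHP's DOM extension rather than the `Parser`. The sample markup and variable names are made up for illustration, and the library's internal lookup may differ; the sketch only shows what "integer offset", "`id` attribute value" and "first xpath match" each resolve to.

```php
<?php

// Two tables in one page (hypothetical markup).
$html = <<<HTML
<div class="intro"><table id="summary"><tr><td>first</td></tr></table></div>
<div class="stats"><table id="table-id"><tr><td>second</td></tr></table></div>
HTML;

$document = new DOMDocument();
$document->loadHTML($html);
$xpath = new DOMXPath($document);

// Integer offset: tables are counted in document order, starting at 0.
$byOffset = $document->getElementsByTagName('table')->item(1);

// id attribute value: at most one table is expected to match.
$byId = $xpath->query('//table[@id="table-id"]')->item(0);

// XPath expression: several tables may match, only the first one is kept.
$byXpath = $xpath->query('//div/table')->item(0);

echo $byOffset->getAttribute('id'), PHP_EOL; // table-id
echo $byXpath->getAttribute('id'), PHP_EOL;  // summary
```

Here the offset and the `id` lookups point at the same table, while the xpath expression matches both tables and keeps the first.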

### tableCaption

You can optionally define a caption for your table if none is present or found during parsing.

```
use Bakame\HtmlTable\Parser;

$parser = Parser::new()->tableCaption('this is a generated caption');
$parser = Parser::new()->tableCaption(null);  // removes any default caption set
```

### tableHeader, tableHeaderPosition, ignoreTableHeader and resolveTableHeader

The following settings configure the `Parser` in relation to the table header. By default, the parser will try to parse the first `tr` tag found in the `thead` section of the table. But you can override this behaviour using one of these settings:

#### tableHeaderPosition

Tells the parser where to locate and resolve the table header.

```
use Bakame\HtmlTable\Parser;
use Bakame\HtmlTable\Section;

$parser = Parser::new()->tableHeaderPosition(Section::Thead, 3);
// header is the 4th row in the thead table section
```

The method uses the `Bakame\HtmlTable\Section` enum to designate which table section to use to resolve the header:

```
use Bakame\HtmlTable\Section;

enum Section
{
    case thead;
    case tbody;
    case tfoot;
    case tr;
}
```

If `Section::tr` is used, `tr` tags will be used independently of their section. The second argument is the table header `tr` offset; it defaults to `0` (i.e. the first row).
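
The difference between a section-relative offset and a `tr` offset can be sketched with plain DOM calls. This is an illustration under assumptions, not the library's code: `headerAt` is a hypothetical helper, sections are passed as plain strings instead of the `Section` enum, and the sample table is invented.

```php
<?php

$html = <<<HTML
<table>
  <thead>
    <tr><th>Title</th><th>Singer</th></tr>
    <tr><th>Titre</th><th>Chanteur</th></tr>
  </thead>
  <tbody>
    <tr><td>Nakei Nairobi</td><td>Mbilia Bel</td></tr>
  </tbody>
</table>
HTML;

$document = new DOMDocument();
$document->loadHTML($html);
$xpath = new DOMXPath($document);

/**
 * Hypothetical helper: returns the cell texts of the row found at $offset,
 * counting either inside one section or across every tr of the table.
 */
function headerAt(DOMXPath $xpath, string $section, int $offset): array
{
    $expression = 'tr' === $section ? '//table//tr' : "//table/{$section}/tr";
    $row = $xpath->query($expression)->item($offset);
    if (null === $row) {
        return []; // no row at that offset
    }

    $cells = [];
    foreach ($xpath->query('./th|./td', $row) as $cell) {
        $cells[] = trim($cell->textContent);
    }

    return $cells;
}

print_r(headerAt($xpath, 'thead', 1)); // second row of the thead section
print_r(headerAt($xpath, 'tr', 2));    // third tr of the table, whatever its section
```

With the `tr` mode, offset `2` lands on the `tbody` row because rows are counted across the whole table.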

#### ignoreTableHeader and resolveTableHeader

Instructs the parser whether or not to resolve the table header using the `tableHeaderPosition` configuration. If no resolution is done, no header will be included in the returned `Table` instance.

```
use Bakame\HtmlTable\Parser;

$parser = Parser::new()->ignoreTableHeader();  // no table header will be resolved
$parser = Parser::new()->resolveTableHeader(); // will attempt to resolve the table header
```

#### tableHeader

You can directly specify the header of your table; doing so overrides any other table header related configuration.

```
use Bakame\HtmlTable\Parser;
use Bakame\HtmlTable\Section;

$parser = Parser::new()->tableHeader(['rank', 'team', 'winner']);
```

**If you specify a non-empty array as the table header, it will take precedence over any other table header related options.**

**Because it is tabular data, each header cell (column name) MUST be unique, otherwise an exception will be thrown.**

You can skip source columns by leaving their offsets out of the header, and re-arrange the remaining columns by listing the offsets in the order you want.

```
use Bakame\HtmlTable\Parser;
use Bakame\HtmlTable\Section;

$parser = Parser::new()->tableHeader([3 => 'rank',  7 => 'winner', 5 => 'team']);
// only 3 columns will be extracted: the 4th, 6th and 8th columns
// they are re-arranged with 'rank' first and 'team' last
// if a column is missing, its value will be PHP `null`
```
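
The offset-based header above can be read as a mapping from source column offsets to record keys. The following plain PHP sketch mimics that mapping on a raw row; `mapRow` is a hypothetical helper written for this illustration and the sample rows are made up, so treat it as a mental model rather than the library's implementation.

```php
<?php

/**
 * Hypothetical helper: builds a record from a raw row using an
 * "offset => column name" header. Missing columns become null.
 */
function mapRow(array $header, array $row): array
{
    $record = [];
    foreach ($header as $offset => $name) {
        $record[$name] = $row[$offset] ?? null;
    }

    return $record;
}

$header = [3 => 'rank', 7 => 'winner', 5 => 'team'];

// A full source row with 8 columns (offsets 0 to 7).
$full = mapRow($header, ['a', 'b', 'c', '1', 'e', 'Brighton', 'g', 'yes']);
// ['rank' => '1', 'winner' => 'yes', 'team' => 'Brighton']

// A shorter row: offsets 5 and 7 do not exist.
$short = mapRow($header, ['a', 'b', 'c', '2']);
// ['rank' => '2', 'winner' => null, 'team' => null]

var_dump($full, $short);
```

The record keys follow the order in which the offsets are listed in the header, not the order of the source columns.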

### includeSection and excludeSection

Tells the parser which sections should be parsed, based on the `Section` enum.

```
use Bakame\HtmlTable\Parser;
use Bakame\HtmlTable\Section;

$parser = Parser::new()->includeSection(Section::Tbody);  // the tbody section is included during parsing
$parser = Parser::new()->excludeSection(Section::Tr, Section::Tfoot); // table direct tr children and tfoot are not included during parsing
```

**By default, the `thead` section is not parsed. If a `thead` row is selected to be the header, it will be parsed independently of this setting.**

**⚠️ Tip:** to be sure of which sections will be parsed, first remove all previous settings before applying your configuration, as shown below:

```
- Parser::new()->includeSection(Section::tbody);
+ Parser::new()->excludeSection(...Section::cases())->includeSection(Section::tbody);
```

The first call will still include the `tfoot` and the `tr` sections, whereas the second call removes any previous setting, guaranteeing that only the `tbody` section, if present, will be parsed.
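
One way to reason about these two calls is to picture the parser's configuration as a set of section names. The sketch below is only a mental model in plain PHP: the `$include` and `$exclude` closures are hypothetical, sections are plain strings, and the default state (`tbody`, `tfoot` and `tr` included, `thead` excluded) is inferred from this README rather than taken from the library's source.

```php
<?php

// Assumed default state: everything but thead is parsed.
$default = ['tbody' => true, 'tfoot' => true, 'tr' => true];

// Hypothetical set operations standing in for includeSection/excludeSection.
$include = fn (array $set, string ...$sections): array => $set + array_fill_keys($sections, true);
$exclude = fn (array $set, string ...$sections): array => array_diff_key($set, array_flip($sections));

// First call: tbody was already included, so nothing is removed.
$first = $include($default, 'tbody');

// Second call: exclude every section, then include tbody only.
$second = $include($exclude($default, 'thead', 'tbody', 'tfoot', 'tr'), 'tbody');

print_r(array_keys($first));  // tbody, tfoot, tr
print_r(array_keys($second)); // tbody
```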

### withFormatter and withoutFormatter

Adds or removes a record formatter, which is applied to the data extracted from the table before you can access it. If a header is defined, it is not affected by the formatter.

```
use Bakame\HtmlTable\Parser;

$parser = Parser::new()->withFormatter($formatter); // attaches a formatter to the parser
$parser = Parser::new()->withFormatter(null);       // removes the attached formatter if it exists
```

The formatter closure signature should be:

```
function (array $record): array;
```

If a header was defined or specified, the submitted record will have the header definition set; otherwise an array list is provided.

The following formatter will work on any table content as long as it is defined as a string.

```
// this formatter converts all the fields from your table to lowercase
$formatter = fn (array $record): array => array_map(strtolower(...), $record);
```

The following formatter will only work if the table has a header attached to it with a column named `count`.

```
$formatter = function (array $record): array {
   $record['count'] = (int) $record['count'];

   return $record;
};
// this formatter converts the data of the count column into integers
```
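
If you need both transformations, one option is to combine them into a single closure before attaching it. The sketch below applies such a combined formatter to plain arrays standing in for parsed records; the sample `$records` and the `$lowercase` and `$castCount` names are made up for illustration.

```php
<?php

// Sample records as they might look once a header with a 'count' column is set.
$records = [
    ['team' => 'BRIGHTON', 'count' => '15'],
    ['team' => 'Man City', 'count' => '14'],
];

$lowercase = fn (array $record): array => array_map(strtolower(...), $record);

$castCount = function (array $record): array {
    $record['count'] = (int) $record['count'];

    return $record;
};

// Lowercase first, while every field is still a string, then cast the count.
$formatter = fn (array $record): array => $castCount($lowercase($record));

$formatted = array_map($formatter, $records);

var_dump($formatted);
// [['team' => 'brighton', 'count' => 15], ['team' => 'man city', 'count' => 14]]
```

The order matters: running `strtolower` after the cast would turn the integer back into a string.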

### ignoreXmlErrors and failOnXmlErrors

Tells whether the parser should ignore or throw in case of malformed HTML content.

```
use Bakame\HtmlTable\Parser;

$parser = Parser::new()->ignoreXmlErrors();   // ignore the XML errors
$parser = Parser::new()->failOnXmlErrors(3); // throw on XML errors
```
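
For background, the two policies can be sketched with PHP's own libxml functions. This is not the library's implementation: `loadHtml` is a hypothetical helper, and which markup problems libxml actually reports depends on the installed libxml version.

```php
<?php

/**
 * Hypothetical helper: loads HTML and either ignores the libxml errors
 * collected during parsing or throws on the first one.
 */
function loadHtml(string $html, bool $failOnErrors): DOMDocument
{
    // Collect errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $document = new DOMDocument();
    $document->loadHTML($html);

    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the caller's setting

    if ($failOnErrors && [] !== $errors) {
        throw new RuntimeException(trim($errors[0]->message));
    }

    return $document;
}

// Lenient mode: the stray closing tag is tolerated and the table is still usable.
$document = loadHtml('<table><tr><td>cell</td></tr></bogus></table>', false);
echo $document->getElementsByTagName('td')->length, PHP_EOL; // 1
```

Passing `true` as the second argument would turn any reported error into an exception instead.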

Testing
-------

The library:

- has a [PHPUnit](https://phpunit.de) test suite
- has a coding style compliance test suite using [PHP CS Fixer](https://cs.sensiolabs.org/).
- has a code analysis compliance test suite using [PHPStan](https://github.com/phpstan/phpstan).

To run the tests, run the following command from the project folder.

```
composer test
```

Security
--------

If you discover any security related issues, please email  instead of using the issue tracker.

Credits
-------

- [Ignace Nyamagana Butera](https://github.com/nyamsprod)
- [All Contributors](https://github.com/bakame-php/html-table/contributors)

License
-------

The MIT License (MIT). Please see [License File](LICENSE) for more information.

