borsodigerii/php-xml-chunker
============================

A lightweight, fast, and optimized XML file splitter with built-in tag data validation, written with the XMLParser library. Its main goal is to split an XML file into multiple small chunks (hence the name) and save each chunk as a separate, smaller XML file.

Version 2.0.0 · MIT license · PHP >= 7.4.0

[Source](https://github.com/borsodigerii/php-xml-chunker) · [Packagist](https://packagist.org/packages/borsodigerii/php-xml-chunker)

Chunker - A lightweight, blazing fast XML splitter written in PHP
=================================================================


The main goal of this library is to create chunks with predefined sizes from a big XML file (or to 'split' it into multiple chunks, so to say).

The algorithm was written using PHP's XMLParser library, which can parse an XML file line by line (or tag by tag) without state control, instead of relying on string-to-string comparison or simple I/O operations. Because every tag passes through the parser, validation can be run on the chosen tags every time they are parsed.

With the correct charset specified, it can also handle special characters and pass them through validation.
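
To make the tag-by-tag idea more concrete, here is a minimal sketch of event-based parsing with PHP's XML Parser extension (`xml_parser_create` and its handler functions). It is not taken from the library's source; the sample XML, the `$tracked` list and the variable names are assumptions made for illustration. The parser is created with an explicit `UTF-8` charset so that accented characters arrive intact in the callbacks.

```php
<?php
// Illustrative sample data; "K\u{00E1}v\u{00E9}" is "Kávé" written with escapes
// so the snippet does not depend on the encoding of the source file.
$xml = "<?xml version='1.0' encoding='UTF-8'?>"
     . "<Shop>"
     . "<shopItem><name>K\u{00E1}v\u{00E9}</name><weight_kg>5</weight_kg></shopItem>"
     . "<shopItem><name>Tea</name><weight_kg>12</weight_kg></shopItem>"
     . "</Shop>";

$tracked = ['name', 'weight_kg']; // tags whose text we want to inspect
$buffer  = '';                    // text collected for the element being read
$events  = [];                    // [tag, text] pairs in document order

$parser = xml_parser_create('UTF-8');
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0); // keep tag names as written
xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, 'UTF-8');

xml_set_element_handler(
    $parser,
    // Opening tag: start collecting text from scratch.
    function ($parser, string $name, array $attributes) use (&$buffer): void {
        $buffer = '';
    },
    // Closing tag: the collected text belongs to this element.
    function ($parser, string $name) use (&$buffer, &$events, $tracked): void {
        if (in_array($name, $tracked, true)) {
            $events[] = [$name, $buffer];
        }
        $buffer = '';
    }
);
xml_set_character_data_handler(
    $parser,
    // Text may arrive in several pieces, so it is appended rather than assigned.
    function ($parser, string $data) use (&$buffer): void {
        $buffer .= $data;
    }
);

if (xml_parse($parser, $xml, true) !== 1) {
    throw new RuntimeException((string) xml_error_string(xml_get_error_code($parser)));
}

foreach ($events as [$tag, $text]) {
    echo $tag, ': ', $text, PHP_EOL;
}
```

The closing-tag callback is the natural place to run a validator, because the full text of the element is known at that point.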

Installing
----------

You can use this library by downloading the `src/Chunker.php` file and including it directly, or by **using Composer** as your package manager:

```
$ composer require borsodigerii/php-xml-chunker
```

Alternatively, you can add this library as a dependency in your `composer.json` file:

```
"require": {
    "borsodigerii/php-xml-chunker": "2.0.0"
}
```

Then you just have to run `composer update`.

This library requires PHP **>= 7.4.0**.
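
If the file is included directly instead of being installed through Composer, nothing enforces the version requirement for you. A small guard like the following sketch can fail early with a clear message; the function name `meetsMinimumPhp` is hypothetical and not part of the library.

```php
<?php
// Hypothetical helper: compares a PHP version string against the documented minimum.
function meetsMinimumPhp(string $current, string $minimum = '7.4.0'): bool
{
    return version_compare($current, $minimum, '>=');
}

// Stop before loading the library when the running interpreter is too old.
if (!meetsMinimumPhp(PHP_VERSION)) {
    throw new RuntimeException(
        'php-xml-chunker needs PHP >= 7.4.0, found ' . PHP_VERSION
    );
}
```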

Usage
-----

### Simple Chunking

The implementation is object-oriented, so an instance of `Chunker` has to be created before any file can be split.

The following example creates a simple Chunker instance without validation, with a **maximum of 100 main tags** per chunk and output file names of the form *"out-{CHUNK}.xml"*:

```
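// Load the class first, for example through Composer's autoloader:
// require __DIR__ . '/vendor/autoload.php';

$chunkSize = 100;            // maximum number of main tags per chunk
$outputFilePrefix = "out-";  // output files are named out-{CHUNK}.xml
$xmlfile = "bigFile.xml";    // the big XML file to split
$validationFunction = fn($data, $tag) => true; // accept everything (no validation)
$checkingTags = array();     // no tags are validated

$chunker = new Chunker\Chunker($xmlfile, $chunkSize, $outputFilePrefix, $validationFunction, $checkingTags);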
```

### Constructor variables

The following table contains the parameters that can be (and should be) passed to the constructor.

| Parameter | Type | Description | Default value | Is required |
| --- | --- | --- | --- | --- |
| $xmlfile | string | The big XML file to be chunked | empty string | Yes |
| $chunkSize | int | The maximum number of main tags in a chunk | 100 | No |
| $outputFilePrefix | string | The prefix that will be used as the filename for the output chunks. Pattern: **'{outputFilePrefix}{CHUNK-NUMBER}.xml'** | 'out-' | No |
| $validationFunction | callable | The validator function that is called every time a tag listed in $checkingTags is found. If the tag data passes validation, it is included in the chunks; otherwise it is left out. It has to receive **two parameters**: the first is the *data* inside the tag to be validated, and the second is the *tag* itself (both being strings). It has to **return a boolean**. | null | Yes |
| $checkingTags | array | An array of tags whose data has to be validated using the $validationFunction callable. If no validation is wanted, pass an empty array or omit the parameter, since it is not required. | empty array | No |

If any of the required parameters are empty or not specified, a Fatal error will be raised.
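
Since `$validationFunction` is typed as `callable`, any of PHP's usual callable forms should be acceptable. The sketch below shows three equivalent ways to write a validator that follows the documented contract (two strings in, a boolean out). The rule itself (accept non-empty data) and all names are made up for illustration.

```php
<?php
// 1. Named function, passed by its name as a string.
function acceptNonEmpty(string $data, string $tag): bool
{
    return trim($data) !== '';
}

// 2. Anonymous function (closure).
$closureValidator = function (string $data, string $tag): bool {
    return trim($data) !== '';
};

// 3. Arrow function (PHP >= 7.4), handy for one-line rules.
$arrowValidator = fn(string $data, string $tag): bool => trim($data) !== '';

$validators = ['acceptNonEmpty', $closureValidator, $arrowValidator];
$results = [];
foreach ($validators as $validator) {
    // is_callable() confirms PHP can invoke the value before it is handed over.
    if (!is_callable($validator)) {
        throw new InvalidArgumentException('Validator is not callable');
    }
    $results[] = $validator('12', 'weight_kg');
}
```

Whichever form is used, the value goes into the fourth constructor argument unchanged.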

### Launch the chunking!

Once you have created an instance of Chunker with all the parameters set, you can start the chunking process with the `Chunker::chunkXML` method. An example is shown below:

```
// ... the instance is created in $chunker
$chunker->chunkXML("item", "root");
```

This example creates XML chunks from the big file (if validation is enabled, only the main tags that pass validation are included), with at most `$chunkSize` *main tags* per chunk (here the main tag is **"item"**). In every output file the main tags are enclosed in a single *root tag* (here **"root"**), so each chunk file contains **one root tag** with up to `$chunkSize` **main tags inside** it.

The method returns the logging session converted to a string (see below for more information).
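
To estimate how many files a run will produce, the relationship between the number of included main tags and `$chunkSize` can be sketched as a ceiling division. The helper below is hypothetical and not part of the library. It also assumes that chunk numbering starts at 1, which the README does not state, so treat the generated names as an illustration of the `{outputFilePrefix}{CHUNK-NUMBER}.xml` pattern only.

```php
<?php
// Hypothetical helper: lists the file names a run would be expected to create.
// Assumption (not confirmed by the README): chunk numbers start at 1.
function expectedChunkFiles(int $mainTagCount, int $chunkSize, string $prefix = 'out-'): array
{
    if ($chunkSize < 1) {
        throw new InvalidArgumentException('chunkSize must be at least 1');
    }

    // Ceiling division: a partially filled last chunk still needs its own file.
    $chunks = intdiv($mainTagCount + $chunkSize - 1, $chunkSize);

    $files = [];
    for ($i = 1; $i <= $chunks; $i++) {
        $files[] = $prefix . $i . '.xml';
    }
    return $files;
}

// 250 main tags with 100 per chunk need three files (100 + 100 + 50).
$files = expectedChunkFiles(250, 100);
print_r($files);
```

When validation is enabled, `$mainTagCount` stands for the main tags that pass validation, not for all main tags in the input.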

Logging
-------

The class has a built-in logging feature. Every time `Chunker::chunkXML` is run, a new logging session is started. When the run finishes, the same method returns the logging session converted to a string:

```
// ...
$log = $chunker->chunkXML(....);
echo $log;

/*

[timestamp] Starting new chunking...
[timestamp] ..
[timestamp] ..
*/
```

This is helpful when something does not work as expected and has to be debugged step by step. **Capturing the return value is not necessary, so you can call the method as if it returned void.**
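
Because the log is a plain string, it can be kept for later inspection with ordinary file functions. The following sketch appends a log to a file; `appendLog` and the log path are hypothetical, and `$log` is a stand-in for the string that `chunkXML` would return.

```php
<?php
// Hypothetical helper: appends one logging session to a file and
// returns the number of bytes written.
function appendLog(string $log, string $path): int
{
    // Normalise the trailing newline so sessions never run together.
    $bytes = file_put_contents($path, rtrim($log) . PHP_EOL, FILE_APPEND | LOCK_EX);
    if ($bytes === false) {
        throw new RuntimeException("Could not write log to $path");
    }
    return $bytes;
}

// Stand-in for the value returned by chunkXML().
$log = "[timestamp] Starting new chunking...";
appendLog($log, sys_get_temp_dir() . '/chunker.log');
```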

Examples
--------

### Basic validation

Let's say you have an XML file (*"feed.xml"*) with a **Shop** root element and many **shopItem** elements inside it (10,000+):

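```
<Shop>
    <shopItem>
        <weight_kg>5</weight_kg>
    </shopItem>
    <shopItem>
        <weight_kg>12</weight_kg>
    </shopItem>
</Shop>
```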

You want to split it into files named *"feed-{chunk}.xml"*, each containing at most 1000 **shopItem**s. You also want to include only those **shopItem**s that have a *weight\_kg* tag whose value is greater than 10 (that is, more than 10 kg). A possible solution looks like this:

```
$chunkSize = 1000; // Max. 1000 shopItems per chunk
$xmlfile = "feed.xml"; // Input file
$outPrefix = "feed-"; // Prefix for chunk-files
$checkingTags = array("weight_kg"); // Tags to be validated (the validation function is called for each of them)

// validation function
function validation($data, $tag) {
    if($tag == "weight_kg"){
        if(!empty($data) && intval($data) > 10) return true;
    }
    return false;
}

// Tags to be counted with $chunkSize
$mainTag = "shopItem";

// Root tag/element, that is only present once in the xml file
$rootTag = "Shop";

// Creating the chunker instance, and running the Chunker
$chunker = new Chunker\Chunker($xmlfile, $chunkSize, $outPrefix, "validation", $checkingTags);
$chunker->chunkXML($mainTag, $rootTag);
```
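
Before running a validator against a 10,000-item feed, it can be worth trying it on a few hand-picked values. The sketch below repeats the rule from the example under a different, hypothetical name (`isHeavierThan10Kg`) so that it runs on its own, and feeds it some sample weights.

```php
<?php
// Same rule as the example validator, repeated so the snippet is self-contained.
function isHeavierThan10Kg($data, $tag)
{
    if ($tag == "weight_kg") {
        if (!empty($data) && intval($data) > 10) return true;
    }
    return false;
}

// Sample weights as they would appear between <weight_kg> tags.
$samples = ['5', '12', '10', '10.5', ''];

$kept = [];
foreach ($samples as $data) {
    if (isHeavierThan10Kg($data, 'weight_kg')) {
        $kept[] = $data;
    }
}

print_r($kept);
```

Note that `intval` truncates, so a fractional weight such as `10.5` is treated as 10 and rejected. If fractional weights matter in your feed, comparing `floatval($data)` instead may be a better fit.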



