
label305/docx-extractor
=======================

PHP library for extracting and replacing string data in .docx files.

Version 0.2.3, Apache-2.0 license, requires PHP ^8.0.

[Source](https://github.com/Label305/DocxExtractor) | [Packagist](https://packagist.org/packages/label305/docx-extractor)

Docx Extractor [![Build Status](https://camo.githubusercontent.com/2714ccbcf583b94a257e73f5bd14d7317b82ba1f06e6abfd587139587268e076/68747470733a2f2f7472617669732d63692e6f72672f4c6162656c3330352f446f6378457874726163746f722e737667)](https://travis-ci.org/Label305/DocxExtractor)
====================================================================================================================================================================================================================================================================================


PHP library for extracting and replacing string data in .docx files. Docx files are zip archives filled with XML documents and assets. Their format is described by [OOXML](http://nl.wikipedia.org/wiki/Office_Open_XML). This library only manipulates the `word/document.xml` file.
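To make the archive layout concrete, here is a minimal sketch that uses only PHP's own `ZipArchive` and `DOMDocument` classes, not this library. It builds a tiny stand-in archive containing just `word/document.xml` (a real `.docx` has more parts, and the sample text is made up), then reads the text runs back out.

```php
<?php
// WordprocessingML namespace used by word/document.xml.
$ns = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';

// Stand-in document part; a real file carries far more markup.
$xml = '<w:document xmlns:w="' . $ns . '">'
    . '<w:body><w:p><w:r><w:t>The quick brown fox</w:t></w:r></w:p></w:body>'
    . '</w:document>';

// Write the stand-in archive to a temporary file.
$path = tempnam(sys_get_temp_dir(), 'docx');
$zip = new ZipArchive();
$zip->open($path, ZipArchive::CREATE | ZipArchive::OVERWRITE);
$zip->addFromString('word/document.xml', $xml);
$zip->close();

// Read the main document part back out of the archive.
$zip = new ZipArchive();
$zip->open($path);
$documentXml = $zip->getFromName('word/document.xml');
$zip->close();
unlink($path);

// Collect the text of every <w:t> element.
$dom = new DOMDocument();
$dom->loadXML($documentXml);
$texts = [];
foreach ($dom->getElementsByTagNameNS($ns, 't') as $node) {
    $texts[] = $node->textContent;
}

echo implode("\n", $texts), PHP_EOL; // The quick brown fox
```

The library works at a higher level than this, but the sketch shows which file inside the archive it touches.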

Composer installation
---------------------


```
"require": {
    "label305/docx-extractor": "0.2.*"
}
```

Requirements
------------


- PHP 8.0 or later (`^8.0`)
- PHP ext-dom
- PHP ext-zip
- PHP ext-libxml
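One way to verify these requirements before installing is a short script like the sketch below. It relies only on PHP built-ins (`extension_loaded`, `version_compare`); the variable names and messages are illustrative.

```php
<?php
// Extensions listed in the requirements above.
$requiredExtensions = ['dom', 'zip', 'libxml'];

// Keep only the extensions that are not loaded.
$missing = array_values(array_filter(
    $requiredExtensions,
    static fn (string $ext): bool => !extension_loaded($ext)
));

$phpOk = version_compare(PHP_VERSION, '8.0.0', '>=');

if (!$phpOk) {
    echo 'PHP 8.0 or later is required, found ' . PHP_VERSION . PHP_EOL;
}
if ($missing !== []) {
    echo 'Missing extensions: ' . implode(', ', $missing) . PHP_EOL;
}
if ($phpOk && $missing === []) {
    echo 'Environment looks ready.' . PHP_EOL;
}
```

Composer performs a similar check at install time when a package declares its `ext-*` requirements, so this is mainly useful for diagnosing a server ahead of deployment.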

Basic usage
-----------


Import the basic classes.

```
use Label305\DocxExtractor\Basic\BasicExtractor;
use Label305\DocxExtractor\Basic\BasicInjector;
```

First, extract all paragraphs from an existing `docx` file using either the `BasicExtractor` or the `DecoratedTextExtractor`. Calling `extractStringsAndCreateMappingFile` creates a new file whose name you pass as the second argument. This new file contains references that tell the library where to inject the altered text later.

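```
$extractor = new BasicExtractor();

// Reads simple.docx, writes the mapping file simple-extracted.docx,
// and returns the extracted paragraphs as an array.
$mapping = $extractor->extractStringsAndCreateMappingFile(
    'simple.docx',
    'simple-extracted.docx'
);
```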

You can now inspect the extracted paragraphs in the resulting `$mapping` array and modify any entry whose content you want to change. Each array key maps to a symbol in `simple-extracted.docx`.

```
echo $mapping[0]; // The quick brown fox jumps over the lazy dog
```
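Because the mapping is a plain array keyed by position, bulk edits can be written as an ordinary loop. The sketch below uses a stand-in array instead of a real extraction result, and the `$replacements` lookup table is hypothetical; the point is only that the keys stay untouched while the values change.

```php
<?php
// Stand-in for the array returned by extractStringsAndCreateMappingFile().
$mapping = [
    0 => 'The quick brown fox jumps over the lazy dog',
    1 => 'Kind regards',
];

// Hypothetical lookup table; this could equally be a translation service.
$replacements = [
    'Kind regards' => 'Met vriendelijke groet',
];

foreach ($mapping as $key => $text) {
    // Keep the key as-is: it ties the text to its symbol in the mapping file.
    $mapping[$key] = $replacements[$text] ?? $text;
}

print_r($mapping);
```

Entries without a replacement are left as they were, so the array can be passed on unchanged in shape.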

After changing the content, save it to a new file. In this case that file is `simple-injected.docx`.

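```
$mapping[0] = "Several fabulous dixieland jazz groups played with quick tempo.";

$injector = new BasicInjector();

// Combines the edited mapping with the mapping file simple-extracted.docx
// and writes the result to simple-injected.docx.
$injector->injectMappingAndCreateNewFile(
    $mapping,
    'simple-extracted.docx',
    'simple-injected.docx'
);
```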

Advanced usage
--------------


The library is also equipped with a `DecoratedTextExtractor` and a `DecoratedTextInjector`, which let you manipulate basic paragraph styling such as bold, italic and underline. You can also use the `Paragraph` objects to distinguish logical groupings of text.

```
// Import DecoratedTextExtractor and DecoratedTextInjector from the package first.
$extractor = new DecoratedTextExtractor();
$mapping = $extractor->extractStringsAndCreateMappingFile(
    'simple.docx',
    'simple-extracted.docx'
);

$firstParagraph = $mapping[0]; // Paragraph object
$firstSentence = $firstParagraph[0]; // Sentence object

// Change the styling of the sentence.
$firstSentence->italic = true;
$firstSentence->bold = false;
$firstSentence->underline = true;
$firstSentence->br = 2; // Two line breaks before this sentence

// Read and replace the text of the sentence.
echo $firstSentence->text; // The quick brown fox jumps over the lazy dog
$firstSentence->text = "Several fabulous dixieland jazz groups played with quick tempo.";

$injector = new DecoratedTextInjector();
$injector->injectMappingAndCreateNewFile(
    $mapping,
    'simple-extracted.docx',
    'simple-injected.docx'
);
```
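The nesting of paragraphs and sentences can be walked with two loops. The sketch below does not use the library's own classes: it builds plain stand-in objects with the same property names as shown above (`text`, `bold`, `italic`, `underline`, `br`), and the `sentence()` helper is hypothetical. It flattens the structure to plain text, marking bold sentences with asterisks.

```php
<?php
// Hypothetical helper that builds a stand-in for a sentence object.
function sentence(string $text, bool $bold = false, int $br = 0): stdClass
{
    return (object) [
        'text' => $text,
        'bold' => $bold,
        'italic' => false,
        'underline' => false,
        'br' => $br,
    ];
}

// Stand-in mapping: a list of paragraphs, each a list of sentences.
$mapping = [
    [sentence('Report', true), sentence('Draft version', false, 1)],
    [sentence('The quick brown fox jumps over the lazy dog')],
];

$lines = [];
foreach ($mapping as $paragraph) {
    $line = '';
    foreach ($paragraph as $sentence) {
        // br counts the line breaks that precede the sentence.
        $line .= str_repeat("\n", $sentence->br);
        // Mark bold sentences so the styling survives in plain text.
        $line .= $sentence->bold ? '**' . $sentence->text . '**' : $sentence->text;
    }
    $lines[] = $line;
}

echo implode("\n\n", $lines), PHP_EOL;
```

With a real extraction result the same two loops should apply, provided the paragraph objects are iterable as the array-style access above suggests.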

License
-------


Copyright 2014 Label305 B.V.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

