pandoc-php/pandoc
=================

A native PHP 8.4 port of the Pandoc document converter.

v1.1.0 (4mo ago) · GPL-2.0-or-later · PHP >=8.4

Since Jan 7 · Pushed 4mo ago

[Source](https://github.com/snorky22/php-pandoc) · [Packagist](https://packagist.org/packages/pandoc-php/pandoc) · [Docs](https://github.com/pandoc-php/pandoc) · [RSS](/packages/pandoc-php-pandoc/feed)

Pandoc PHP
==========

A native PHP 8.4 port of the [Pandoc](https://pandoc.org/) document converter. This library converts documents between formats without requiring the system-level Pandoc binary. The current focus is conversion from Word (`.docx`), HTML (`.html`), Markdown (`.md`), and Jupyter Notebook (`.ipynb`) to LaTeX.

Features
--------

- **Native PHP 8.4 Implementation**: Uses modern PHP features like `readonly` classes, Enums, and property hooks.
- **AST-Centric Architecture**: Mirrors Pandoc's Abstract Syntax Tree (AST) for robust and accurate conversions.
- **Modular Reader System**: Uses a factory pattern and unified `ReaderInterface` for easy expansion to new formats.
- **Deep Docx Parsing**: Extracts paragraphs, headers, tables, lists, images/media, and advanced text styling (bold, italic, underline, strikeout, superscript/subscript, and colors).
- **LaTeX Generation**: Produces clean LaTeX code, available as both standalone documents and body fragments.
- **Media Support**: Automatically extracts images from documents and includes them in the AST's `MediaBag`. The web interface bundles these into a ZIP archive alongside the LaTeX source.
- **Improved Robustness**: Resilient Docx parsing that handles malformed XML, missing styles, and relationship collisions (e.g., images in headers/footers).
- **No External Dependencies**: Works purely in PHP 8.4+, making it easy to deploy in shared hosting or restricted environments.
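
The reader factory and shared interface described above can be pictured with a small self-contained sketch. Every name in it (`SketchFormat`, `SketchPara`, `SketchReader`, `SketchReaderFactory`) is hypothetical and chosen for illustration; the library's real classes and method signatures may differ.

```php
<?php
declare(strict_types=1);

// Hypothetical names throughout: these are NOT the library's classes.

/** Input formats known to this sketch, as a backed enum. */
enum SketchFormat: string
{
    case Markdown = 'md';
    case Html = 'html';
}

/** A tiny immutable AST node: a paragraph holding plain text. */
final readonly class SketchPara
{
    public function __construct(public string $text) {}
}

/** Every reader turns raw input into a list of AST nodes. */
interface SketchReader
{
    /** @return list<SketchPara> */
    public function read(string $input): array;
}

final class SketchMarkdownReader implements SketchReader
{
    public function read(string $input): array
    {
        // Split on blank lines; each chunk becomes one paragraph node.
        $chunks = preg_split('/\R{2,}/', trim($input)) ?: [];
        return array_map(fn (string $c) => new SketchPara(trim($c)), $chunks);
    }
}

final class SketchHtmlReader implements SketchReader
{
    public function read(string $input): array
    {
        // Crude on purpose: drop tags and keep the text as one paragraph.
        return [new SketchPara(trim(strip_tags($input)))];
    }
}

final class SketchReaderFactory
{
    /** Picks the reader for a format; a new format needs one new case. */
    public static function forFormat(SketchFormat $format): SketchReader
    {
        return match ($format) {
            SketchFormat::Markdown => new SketchMarkdownReader(),
            SketchFormat::Html => new SketchHtmlReader(),
        };
    }
}

$nodes = SketchReaderFactory::forFormat(SketchFormat::Markdown)->read("First.\n\nSecond.");
echo count($nodes), PHP_EOL;
```

Because callers depend only on the interface, supporting another format in a design like this means adding one reader class and one `match` arm.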

Installation
------------

Ensure you have PHP 8.4 or higher.

```
composer require pandoc-php/pandoc
```

Basic Usage
-----------

### Converting a Word Document to LaTeX

```
use Pandoc\Reader\DocxReader;
use Pandoc\Writer\LatexWriter;

$reader = new DocxReader();
$writer = new LatexWriter();

// 1. Read the Docx file into an AST
$doc = $reader->read('document.docx');

// 2. Convert AST to LaTeX string (standalone document)
$latex = $writer->write($doc, standalone: true);

file_put_contents('document.tex', $latex);
```

### Converting Markdown to LaTeX Fragment

```
use Pandoc\Reader\MarkdownReader;
use Pandoc\Writer\LatexWriter;

$reader = new MarkdownReader();
$writer = new LatexWriter();

$markdown = "# Hello World\nThis is a paragraph.";
$doc = $reader->read($markdown);

// Output just the body (no preamble)
$latexFragment = $writer->write($doc, standalone: false);
```
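
To make the difference between the two output modes concrete, the sketch below wraps a body fragment in a minimal preamble by hand. The helper `sketchWrapStandalone` is hypothetical and not part of the library; the preamble that `LatexWriter` actually emits for `standalone: true` is likely richer than this.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of the library: wraps a LaTeX body
 * fragment in a minimal preamble so that it compiles on its own.
 */
function sketchWrapStandalone(string $body, string $documentClass = 'article'): string
{
    return implode("\n", [
        sprintf('\documentclass{%s}', $documentClass),
        '\begin{document}',
        $body,
        '\end{document}',
        '',
    ]);
}

// A fragment carries only body content, so it can be embedded in a larger file.
$fragment = '\section{Hello World}' . "\n\n" . 'This is a paragraph.';

// A standalone document adds the preamble and the document environment.
$standalone = sketchWrapStandalone($fragment);
echo $standalone;
```

A fragment suits cases where the output is pulled into an existing document with `\input`, while a standalone document can be passed straight to a LaTeX compiler.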

### Converting HTML to LaTeX

```
use Pandoc\Reader\HtmlReader;
use Pandoc\Writer\LatexWriter;

$reader = new HtmlReader();
$writer = new LatexWriter();

$html = "<h1>Hello</h1><p>World</p>";
$doc = $reader->read($html);
$latex = $writer->write($doc);
```

### Converting Jupyter Notebooks to LaTeX

```
use Pandoc\Reader\IpynbReader;
use Pandoc\Writer\LatexWriter;

$reader = new IpynbReader();
$writer = new LatexWriter();

$json = file_get_contents('notebook.ipynb');
$doc = $reader->read($json);
$latex = $writer->write($doc);
```
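
For orientation, a notebook file is plain JSON whose top-level `cells` array holds entries with a `cell_type` and a `source` that may be either a string or a list of line strings (nbformat 4). The sketch below only inspects that structure; `sketchNotebookCells` is a hypothetical helper and says nothing about how `IpynbReader` is implemented.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of the library: lists the cells of an
 * nbformat 4 notebook, normalising "source" to a single string.
 *
 * @return list<array{type: string, source: string}>
 */
function sketchNotebookCells(string $json): array
{
    $notebook = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    $cells = [];
    foreach ($notebook['cells'] ?? [] as $cell) {
        $source = $cell['source'] ?? '';
        $cells[] = [
            'type' => (string) ($cell['cell_type'] ?? 'unknown'),
            // "source" is either one string or a list of line strings.
            'source' => is_array($source) ? implode('', $source) : (string) $source,
        ];
    }
    return $cells;
}

// A two-cell notebook built inline so the example needs no file on disk.
$json = json_encode([
    'nbformat' => 4,
    'cells' => [
        ['cell_type' => 'markdown', 'source' => ["# Title\n", 'Some text.']],
        ['cell_type' => 'code', 'source' => 'print(1 + 1)'],
    ],
], JSON_THROW_ON_ERROR);

$cells = sketchNotebookCells($json);
foreach ($cells as $cell) {
    echo $cell['type'], ': ', strlen($cell['source']), ' bytes', PHP_EOL;
}
```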

Web Interface
-------------

The project includes a simple web-based demonstration tool in the `web/` directory.

1. Point your web server to the `php-pandoc/web/` folder.
2. Open `index.html` in your browser.
3. Upload a `.docx`, `.html`, `.ipynb` or `.md` file.
4. Choose the output format (Standalone or Fragment).
5. Download the converted `.tex` file. If the document contains images, you will receive a `.zip` archive containing the LaTeX file and all media files in the same directory.
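
The bundling in the last step can be approximated with PHP's `ZipArchive` class, which requires the `zip` extension. This is a minimal sketch under stated assumptions: `sketchBundle`, the entry name `document.tex`, and the shape of the `$media` array are all hypothetical, and the web tool's own code may work differently.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not the web tool's actual code: writes a LaTeX
 * source and its media files into one flat ZIP archive.
 *
 * @param array<string, string> $media file name => raw bytes
 */
function sketchBundle(string $latex, array $media, string $zipPath): void
{
    $zip = new ZipArchive();
    if ($zip->open($zipPath, ZipArchive::CREATE | ZipArchive::OVERWRITE) !== true) {
        throw new RuntimeException("Cannot create archive: $zipPath");
    }
    $zip->addFromString('document.tex', $latex);
    foreach ($media as $name => $bytes) {
        // basename() keeps every entry in the archive root, next to the .tex file.
        $zip->addFromString(basename((string) $name), $bytes);
    }
    $zip->close();
}

$zipPath = sys_get_temp_dir() . '/sketch-bundle.zip';
sketchBundle('\includegraphics{image1.png}', ['media/image1.png' => 'fake-bytes'], $zipPath);
```

Keeping the media files in the same directory as the `.tex` file means `\includegraphics` paths resolve without extra configuration once the archive is unpacked.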

Supported Structures
--------------------

For a detailed list of Word document features handled by this port, see [SUPPORTED\_STRUCTURES.md](SUPPORTED_STRUCTURES.md). Highlights include:

- **Headings**: Heading 1-6 and Title mapping.
- **Text Styling**: Bold, Italic, Underline, Strikeout, Superscript, Subscript.
- **Colors**: Text color and background (highlight/shading).
- **Lists**: Bulleted and Ordered lists.
- **Images/Media**: Automatic extraction from Word documents, HTML, and Jupyter Notebooks.
- **Headers & Footers**: Extraction of content from Docx headers and footers.
- **Tables**: Multi-body tables with header row detection.
- **Horizontal Rules**: Detection of underscore sequences as rules.
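
As an illustration of the horizontal rule item, one way to detect an underscore-only paragraph is a single regular expression. The helper `sketchIsRule` is hypothetical, and its default minimum of three underscores is an assumption borrowed from Markdown's thematic break rule; the threshold the library uses is not stated here.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of the library: treats a paragraph made
 * only of underscores as a horizontal rule. The minimum length of three
 * is an assumption, not a documented threshold.
 */
function sketchIsRule(string $paragraph, int $minLength = 3): bool
{
    // Surrounding whitespace is ignored; anything besides "_" disqualifies.
    return preg_match('/^_{' . $minLength . ',}$/', trim($paragraph)) === 1;
}

foreach (['__________', '  ___  ', '__', 'snake_case'] as $sample) {
    echo var_export($sample, true), ' => ', var_export(sketchIsRule($sample), true), PHP_EOL;
}
```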

Development and Testing
-----------------------

The project uses PHPUnit for testing. To run the test suite:

```
./vendor/bin/phpunit
```

Tests cover:

- **AST Integrity**: Ensuring immutability and correct structure.
- **Reader/Writer Modularity**: Testing the `ReaderFactory` and interface consistency.
- **Writer Accuracy**: Verifying LaTeX output and character escaping.
- **Reader Reliability**: Testing against standardized Docx samples to ensure parity with Pandoc's behavior.
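
Character escaping is the kind of behavior these writer tests guard. The sketch below shows one common way to escape LaTeX's ten special characters in a single pass; `sketchEscapeLatex` is a hypothetical function written for this example, and the escaping rules in `LatexWriter` may differ.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of the library: escapes the characters
 * that LaTeX treats specially in running text.
 */
function sketchEscapeLatex(string $text): string
{
    // strtr() with an array scans the input once and never re-scans its
    // own replacements, so the braces in "\textbackslash{}" stay intact.
    return strtr($text, [
        '\\' => '\textbackslash{}',
        '{' => '\{',
        '}' => '\}',
        '$' => '\$',
        '&' => '\&',
        '#' => '\#',
        '_' => '\_',
        '%' => '\%',
        '^' => '\textasciicircum{}',
        '~' => '\textasciitilde{}',
    ]);
}

$escaped = sketchEscapeLatex('50% of $x_1 & {y}');
echo $escaped, PHP_EOL;
```

Chaining `str_replace` calls would be riskier here, because a later call could escape the braces introduced by an earlier replacement.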

Credits
-------

This project is a port of [Pandoc](https://github.com/jgm/pandoc), originally created by John MacFarlane.

License
-------

This project is licensed under the GPL v2 or later, mirroring the original Pandoc license.

### Health Score

37 (Low): better than 83% of packages

- **Maintenance**: 76. Regular maintenance activity.
- **Popularity**: 4. Limited adoption so far.
- **Community**: 6. Small or concentrated contributor base.
- **Maturity**: 54. Maturing project, gaining track record.
- **Bus Factor**: 1. Top contributor holds 100% of commits, a single point of failure.

How is this calculated?

- **Maintenance (25%)**: Last commit recency, latest release date, and issue-to-star ratio. Uses a 2-year decay window.
- **Popularity (30%)**: Total and monthly downloads, GitHub stars, and forks. Logarithmic scaling prevents top-heavy scores.
- **Community (15%)**: Contributors, dependents, forks, watchers, and maintainers. Measures real ecosystem engagement.
- **Maturity (30%)**: Project age, version count, PHP version support, and release stability.

### Release Activity

- **Cadence**: Every ~0 days
- **Total**: 2
- **Last Release**: 130d ago

### Community

- **Maintainers**: [snorky22](/maintainers/snorky22)
- **Top Contributors**: [snorky22](https://github.com/snorky22) (3 commits)
- **Tags**: html, converter, markdown, docx, ast, latex, pandoc, jupyter

### Code Quality

- **Tests**: PHPUnit

### Alternatives

[gotenberg/gotenberg-php

A PHP client for interacting with Gotenberg, a developer-friendly API for converting numerous document formats into PDF files, and more!

3685.2M19](/packages/gotenberg-gotenberg-php)[faisalman/simple-excel-php

Easily parse / convert / write between Microsoft Excel XML / CSV / TSV / HTML / JSON / etc formats

582599.4k1](/packages/faisalman-simple-excel-php)[mnvx/lowrapper

PHP wrapper over LibreOffice converter

129190.5k](/packages/mnvx-lowrapper)[aspose-cloud/aspose-words-cloud

Open, generate, edit, split, merge, compare and convert Word documents. Integrate Cloud API into your solutions to manipulate documents. Convert PDF to Word (DOC, DOCX, ODT, RTF and HTML) and in the opposite direction.

32157.4k](/packages/aspose-cloud-aspose-words-cloud)[spiritix/html-to-pdf

Convert HTML markup into beautiful PDF files using the famous wkhtmltopdf library

1932.5k](/packages/spiritix-html-to-pdf)[nilgems/laravel-textract

A Laravel package to extract text from files like DOC, XL, Image, Pdf and more. I've developed this package by inspiring "npm textract".

195.2k](/packages/nilgems-laravel-textract)
