
skuola/pdf-text-parser
======================

Library to parse XML resulting from pdftotext

v0.4.2 · MIT · PHP ^7.1

[Source](https://github.com/skuola/pdf-text-parser) · [Packagist](https://packagist.org/packages/skuola/pdf-text-parser)

PDF text parser
===============

[![Build Status](https://camo.githubusercontent.com/8971840efde9670d3cda60af7acab1dfaa3a8a9e69a32cf34f761ad6f6d568d3/68747470733a2f2f7472617669732d63692e6f72672f736b756f6c612f7064662d746578742d7061727365722e706e673f6272616e63683d6d6173746572)](https://travis-ci.org/skuola/pdf-text-parser)[![Code Climate](https://camo.githubusercontent.com/8cc20d407aae324325273b5d86f070eff8394dcb1aa75cc10b040a106762a550/68747470733a2f2f636f6465636c696d6174652e636f6d2f6769746875622f736b756f6c612f7064662d746578742d7061727365722f6261646765732f6770612e737667)](https://codeclimate.com/github/skuola/pdf-text-parser)[![SensioLabsInsight](https://camo.githubusercontent.com/1a49096e6bab62bddbef9c8bc8cff760b5edaa4fd3b3427efe39b9ba49862953/68747470733a2f2f696e73696768742e73656e73696f6c6162732e636f6d2f70726f6a656374732f35343039643230302d326137312d343438362d383234642d6638393037393133303865612f6d696e692e706e67)](https://insight.sensiolabs.com/projects/5409d200-2a71-4486-824d-f890791308ea)

This library parses the XML text files produced by [pdftotext](https://en.wikipedia.org/wiki/Pdftotext).

You can install it with Composer: `composer require skuola/pdf-text-parser`.

Suppose you've just converted a PDF file and got some text like the following:

```

    Lorem
    ipsum

```

The above text is the result of a command like `pdftotext -htmlmeta -bbox-layout yourfile.pdf -`.
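To make the input format more concrete, here is a minimal sketch of the kind of markup that `-bbox-layout` is expected to emit, read with PHP's bundled DOM extension. It does not use this library's API: `extractWords()` is a hypothetical helper, the sample markup is an assumption about poppler's output (nested `page`, `flow`, `block`, `line` and `word` elements carrying bounding-box attributes), and the coordinates are made up for illustration.

```php
<?php
// Illustrative sample only: coordinates are invented, and real pdftotext
// output also carries a DOCTYPE and a <head> section.
$xhtml = <<<'XML'
<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<doc>
  <page width="595.000000" height="842.000000">
    <flow>
      <block xMin="56.0" yMin="70.0" xMax="120.0" yMax="100.0">
        <line xMin="56.0" yMin="70.0" xMax="90.0" yMax="82.0">
          <word xMin="56.0" yMin="70.0" xMax="90.0" yMax="82.0">Lorem</word>
        </line>
        <line xMin="56.0" yMin="88.0" xMax="92.0" yMax="100.0">
          <word xMin="56.0" yMin="88.0" xMax="92.0" yMax="100.0">ipsum</word>
        </line>
      </block>
    </flow>
  </page>
</doc>
</body>
</html>
XML;

/**
 * Hypothetical helper (not part of skuola/pdf-text-parser):
 * collects every <word> element with its bounding box.
 */
function extractWords(string $xhtml): array
{
    $dom = new DOMDocument();
    $dom->loadXML($xhtml);

    $words = [];
    foreach ($dom->getElementsByTagName('word') as $node) {
        $words[] = [
            'text' => $node->textContent,
            'xMin' => (float) $node->getAttribute('xMin'),
            'yMin' => (float) $node->getAttribute('yMin'),
            'xMax' => (float) $node->getAttribute('xMax'),
            'yMax' => (float) $node->getAttribute('yMax'),
        ];
    }

    return $words;
}

$words = extractWords($xhtml);

// Print each word with the top-left corner of its bounding box.
foreach ($words as $w) {
    printf("%s (%.1f, %.1f)\n", $w['text'], $w['xMin'], $w['yMin']);
}
```

In practice the string would come from the command above (for example by capturing its standard output) rather than from an inline sample.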

You can use this library as follows:

```
