
mediamonks/crawler
==================

Crawl your own website with various clients for SEO and indexing purposes.


[![Build Status](https://camo.githubusercontent.com/bcdf7d17abbcba3d2c6e19ccde1617598518ccbb8f733ed1435dbd9436bf3447/68747470733a2f2f7472617669732d63692e6f72672f6d656469616d6f6e6b732f637261776c65722e7376673f6272616e63683d6d6173746572)](https://travis-ci.org/mediamonks/crawler)[![Scrutinizer Code Quality](https://camo.githubusercontent.com/893323a5c427e5d3db4898d0a5e77f3d29fec7a37db5995639b3ff0e7d7bb473/68747470733a2f2f7363727574696e697a65722d63692e636f6d2f672f6d656469616d6f6e6b732f637261776c65722f6261646765732f7175616c6974792d73636f72652e706e673f623d6d6173746572)](https://scrutinizer-ci.com/g/mediamonks/crawler/?branch=master)[![Code Coverage](https://camo.githubusercontent.com/7009d6ac60236265c96eb98c47f68c00f93d91b764380db3b5866f2368cd379a/68747470733a2f2f7363727574696e697a65722d63692e636f6d2f672f6d656469616d6f6e6b732f637261776c65722f6261646765732f636f7665726167652e706e673f623d6d6173746572)](https://scrutinizer-ci.com/g/mediamonks/crawler/?branch=master)[![Total Downloads](https://camo.githubusercontent.com/1a299d077edbafc172df0b3d3908f4ba676b0db5bd42cac7886738cd0f749af0/68747470733a2f2f706f7365722e707567782e6f72672f6d656469616d6f6e6b732f637261776c65722f646f776e6c6f616473)](https://packagist.org/packages/mediamonks/crawler)[![Latest Stable Version](https://camo.githubusercontent.com/e6572b2aba99844077a54cd099d1d0d60f5e9e3c2a69f3b116ae4d949a53c2d8/68747470733a2f2f706f7365722e707567782e6f72672f6d656469616d6f6e6b732f637261776c65722f762f737461626c65)](https://packagist.org/packages/mediamonks/crawler)[![Latest Unstable Version](https://camo.githubusercontent.com/6a937209e21b1c36ad2664c8cf9bf55c336b1524e715fbe53633fa7e8a8bc8bf/68747470733a2f2f706f7365722e707567782e6f72672f6d656469616d6f6e6b732f637261776c65722f762f756e737461626c65)](https://packagist.org/packages/mediamonks/crawler)[![SensioLabs 
Insight](https://camo.githubusercontent.com/69b52711b1a86917a79e315bb78eec61aa5997f002468260f6ea48ab6bc2dfa4/68747470733a2f2f696d672e736869656c64732e696f2f73656e73696f6c6162732f692f32666434303765652d333232382d343663312d396562622d3430373435373837643435342e737667)](https://insight.sensiolabs.com/projects/2fd407ee-3228-46c1-9ebb-40745787d454)[![License](https://camo.githubusercontent.com/2ab534de4aefc3bfd730c4061b8b28e318f285eac91c2d7bbb7139415bf4db04/68747470733a2f2f706f7365722e707567782e6f72672f6d656469616d6f6e6b732f637261776c65722f6c6963656e7365)](https://packagist.org/packages/mediamonks/crawler)

MediaMonks Crawler
==================


This tool lets you crawl a website and get a DOM object for every URL it finds. We use it with the Prerender.io client to crawl our own site pages, regardless of whether their content is generated server-side, client-side, or both. The resulting data can be used to build a full site search and/or improve SEO for single-page applications.
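To give a feel for what a per-page DOM object makes possible, here is a minimal sketch that pulls a title and the outgoing links from one HTML document, which is the kind of data a site search index needs. It uses PHP's built-in `DOMDocument` as a stand-in and assumes PHP 7+; it does not use this library's own classes, and the `$html` sample, `$title` and `$links` names are hypothetical.

```php
<?php

// Hypothetical HTML for a single crawled page (not produced by the library).
$html = '<html><head><title>About us</title></head>'
      . '<body><h1>About</h1><a href="/contact">Contact</a>'
      . '<a href="/work">Work</a></body></html>';

// Parse the markup into a DOM tree; suppress warnings for imperfect HTML.
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

// Page title, or an empty string when the page has none.
$titleNode = $dom->getElementsByTagName('title')->item(0);
$title = $titleNode !== null ? trim($titleNode->textContent) : '';

// Collect every link target so it could be queued for crawling or indexing.
$links = [];
foreach ($dom->getElementsByTagName('a') as $anchor) {
    $links[] = $anchor->getAttribute('href');
}

echo $title, PHP_EOL;
echo implode(', ', $links), PHP_EOL;
```

For the actual page and DOM classes exposed by the crawler, refer to the documentation in the /doc folder.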

Highlights
----------


- Ships with Prerender & Prerender.io clients; uses Goutte by default
- Supports any Symfony BrowserKit client
- Supports both whitelisting and blacklisting of URLs
- Supports URL normalization, which lets you prevent duplicates caused by minor URL differences
- Implements the [PSR-3 Logger Interface](http://www.php-fig.org/psr/psr-3/)
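
The whitelisting, blacklisting and normalization ideas listed above can be illustrated with a small plain-PHP sketch. This is not the library's API: `normalizeUrl()` and `shouldCrawl()` are hypothetical helpers, the regex patterns and URLs are made up, and PHP 7+ is assumed. It only shows why normalizing before filtering keeps near-identical URLs from being crawled twice.

```php
<?php

/**
 * Hypothetical helper: reduce a URL to a canonical form so that minor
 * differences (host case, fragment, trailing slash) do not create duplicates.
 */
function normalizeUrl(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return $url; // leave anything unparsable or relative untouched
    }

    $scheme = strtolower($parts['scheme'] ?? 'http');
    $host   = strtolower($parts['host']);
    $port   = isset($parts['port']) ? ':' . $parts['port'] : '';
    $path   = rtrim($parts['path'] ?? '', '/');
    if ($path === '') {
        $path = '/';
    }
    $query  = isset($parts['query']) ? '?' . $parts['query'] : '';

    // The fragment is dropped on purpose: it is never sent to the server.
    return $scheme . '://' . $host . $port . $path . $query;
}

/**
 * Hypothetical helper: a URL is accepted when no blacklist pattern matches
 * and either the whitelist is empty or at least one whitelist pattern matches.
 */
function shouldCrawl(string $url, array $whitelist, array $blacklist): bool
{
    foreach ($blacklist as $pattern) {
        if (preg_match($pattern, $url) === 1) {
            return false;
        }
    }
    if ($whitelist === []) {
        return true;
    }
    foreach ($whitelist as $pattern) {
        if (preg_match($pattern, $url) === 1) {
            return true;
        }
    }
    return false;
}

// Made-up URLs as they might be discovered on a page.
$found = [
    'https://Example.com/about/',
    'https://example.com/about#team',
    'https://example.com/admin/login',
];

$queue = [];
foreach ($found as $url) {
    $normalized = normalizeUrl($url);
    if (shouldCrawl($normalized, ['~^https://example\.com/~'], ['~/admin/~'])) {
        $queue[$normalized] = true; // array keys deduplicate
    }
}

print_r(array_keys($queue));
```

In this sketch the first two URLs collapse into a single queue entry and the admin URL is rejected by the blacklist. The crawler's real matcher and normalizer classes may behave differently; see the /doc folder for those.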

Documentation
-------------


Documentation and examples can be found in the [/doc](/doc) folder.

System Requirements
-------------------


You need **PHP >= 5.5.0** to use the library.

Install
-------


Install this package by using Composer.

```
$ composer require mediamonks/crawler
```

Security
--------


If you discover any security-related issues, please email the maintainers instead of using the issue tracker.

License
-------


The MIT License (MIT). Please see [License File](LICENSE) for more information.






