baqend/spider
=============

URL spider which crawls a page and all its subpages

Version 1.0.0 · MIT license · PHP >= 5.5.9

[Source](https://github.com/Baqend/PHP-Spider) · [Packagist](https://packagist.org/packages/baqend/spider)

PHP Spider
==========

- [Installation](#installation)
- [Usage](#usage)
- [Processors](#processors)
- [URL Handlers](#url-handlers)
- [Alternatives](#alternatives)

Installation
------------

Make sure you have [Composer](https://getcomposer.org/) installed. Then execute:

```
composer require baqend/spider
```

This package requires at least **PHP 5.5.9** and has **no package dependencies!**

Usage
-----

The entry point is the `Spider` class. For it to work, it requires the following services:

- **Queue:** Collects URLs to be processed. This package comes with a breadth-first and a depth-first implementation.
- **URL Handler:** Checks if a URL should be processed. If no URL handler is provided, every URL is processed. [More about URL handlers](#url-handlers)
- **Downloader:** Takes URLs and downloads them. To keep the package free of a dependency on an HTTP client library like [Guzzle](https://packagist.org/packages/guzzlehttp/guzzle), you have to implement this class yourself.
- **Processor:** Receives downloaded assets and performs operations on them. [More about Processors](#processors)
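
To make the division of labour between these four services concrete, here is a minimal, self-contained sketch of a crawl loop in plain PHP. It does not use this package's classes: the names `crawl`, `$download`, `$shouldProcess`, `$visit` and the in-memory `$site` map are hypothetical stand-ins chosen for illustration, and a real downloader would perform HTTP requests instead of an array lookup.

```php
<?php
// Illustrative stand-ins only: none of these names are part of baqend/spider.

// "Downloader": an in-memory site map replaces real HTTP requests.
// Each URL maps to the list of links found on that page.
$site = [
    '/'        => ['/a', '/b'],
    '/a'       => ['/a/1', '/'],
    '/b'       => ['/b/1', '/private'],
    '/a/1'     => [],
    '/b/1'     => [],
    '/private' => ['/secret'],
];
$download = function ($url) use ($site) {
    return isset($site[$url]) ? $site[$url] : [];
};

// "URL handler": decides whether a URL should be processed at all.
$shouldProcess = function ($url) {
    return strpos($url, '/private') !== 0;
};

function crawl($start, $breadthFirst, callable $shouldProcess, callable $download, callable $process)
{
    $queue = [$start];          // "Queue": URLs waiting to be processed
    $seen  = [$start => true];  // never enqueue the same URL twice

    while (count($queue) > 0) {
        // Breadth-first takes the oldest entry, depth-first the newest.
        $url = $breadthFirst ? array_shift($queue) : array_pop($queue);
        if (!$shouldProcess($url)) {
            continue;
        }
        $links = $download($url);
        $process($url, $links); // "Processor": works on the downloaded asset
        foreach ($links as $link) {
            if (!isset($seen[$link])) {
                $seen[$link] = true;
                $queue[] = $link;
            }
        }
    }
}

// Collect the visiting order for both strategies.
$visit = function ($breadthFirst) use ($shouldProcess, $download) {
    $order = [];
    crawl('/', $breadthFirst, $shouldProcess, $download, function ($url) use (&$order) {
        $order[] = $url;
    });
    return $order;
};

echo implode(' ', $visit(true)), "\n";  // / /a /b /a/1 /b/1
echo implode(' ', $visit(false)), "\n"; // / /b /b/1 /a /a/1
```

In this sketch the only difference between the two strategies is which end of the queue the next URL is taken from. The URL handler rejects `/private`, so that page is never downloaded and its link to `/secret` is never discovered.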

You initialize the spider in the following way:

```
