
mehrabx/web-crawler
===================

A web crawler package

Version v1.0 · PHP · [Source](https://github.com/mehrabx/php-crawler) · [Packagist](https://packagist.org/packages/mehrabx/web-crawler)

 [ ![PHP Web Crawler](./resources/imgs/logo.png?raw=true) ](https://mehrabx.github.io/web-crwaler/)

PHP Web Crawler
===============

This library is a PHP web crawler. It takes a collection of URLs, each paired with DOM selectors, crawls the pages, and runs customized analyzers on each one.

Installation
------------

Install this library using Composer:

```
composer require mehrabx/web-crawler
```

Usage
-----

The current version uses [XPath expressions](https://www.w3schools.com/xml/xpath_intro.asp) to select elements.

```
// Map each URL to the XPath expression(s) to evaluate on that page.
$urls = [
    'https://test.exp/?page=1' => ["//img[@class='type1']", "//a[@class='type1']"],
    'https://test.exp/?page=2' => ["//img[@class='type2']"],
    // A single expression may be given as a plain string.
    'https://test.exp/?page=3' => "//img[@class='type3']",
];

// Returns an array of results.
return \Crawler\Facades\CrawlFacade::make($urls)->start();
```
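
To make the selector syntax concrete, here is a minimal sketch of how an XPath expression like the ones above picks elements out of a page. It uses PHP's built-in DOM extension (`DOMDocument`, `DOMXPath`) on an inline HTML string; the markup and variable names are made up for illustration, and this is not necessarily how the library evaluates selectors internally.

```php
<?php
// Illustrative only: evaluates an XPath expression against an inline HTML
// string with PHP's DOM extension. Not the library's internal code.

$html = <<<HTML
<html><body>
  <img class="type1" src="/a.png">
  <img class="type2" src="/b.png">
  <a class="type1" href="/next">next</a>
</body></html>
HTML;

$dom = new DOMDocument();
// Suppress the warnings that non-well-formed, real-world HTML tends to trigger.
@$dom->loadHTML($html);

$xpath = new DOMXPath($dom);

// Same expression style as in the usage example above.
$srcs = [];
foreach ($xpath->query("//img[@class='type1']") as $node) {
    $srcs[] = $node->getAttribute('src');
}

print_r($srcs); // only the first <img> matches
```

Note that `@class='type1'` compares the whole attribute value, so an element with `class="type1 large"` would not match; `contains(@class, 'type1')` is the usual looser alternative.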

Options
-------

### sleep

To avoid being blocked by the target site, you can set a sleep time between crawling each URL:

```
$urls = [
    'https://test.exp/?page=1' => ["//img[@class='type1']", "//a[@class='type1']"],
    'https://test.exp/?page=2' => ["//img[@class='type2']"],
];

// Set a 10-second sleep time between URLs.
return \Crawler\Facades\CrawlFacade::make($urls)->sleep(10)->start();
```
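
The idea behind a sleep option can be sketched as a loop that pauses between consecutive requests. The helper below is hypothetical (`crawlWithDelay`, `$fetch`, and `$pause` are names invented here) and only illustrates the general technique; whether the library also pauses after the last URL is not documented, so that detail is an assumption.

```php
<?php
// Hypothetical helper, not part of mehrabx/web-crawler: calls $fetch for each
// URL and pauses $seconds between consecutive calls (no pause after the last).
function crawlWithDelay(array $urls, callable $fetch, int $seconds, ?callable $pause = null): array
{
    // Default to PHP's built-in sleep(); injectable so demos need not wait.
    $pause = $pause ?? 'sleep';
    $results = [];
    $last = count($urls) - 1;

    foreach (array_values($urls) as $i => $url) {
        $results[$url] = $fetch($url);
        if ($i < $last) {
            $pause($seconds);
        }
    }

    return $results;
}

// Demo with a fake fetcher and a recording pause, so nothing hits the network.
$pauses = [];
$results = crawlWithDelay(
    ['https://test.exp/?page=1', 'https://test.exp/?page=2'],
    fn (string $url): string => "body of $url",
    10,
    function (int $s) use (&$pauses): void {
        $pauses[] = $s;
    }
);
```

With two URLs there is exactly one pause, so a crawl of N pages with a 10-second delay takes at least 10 × (N − 1) seconds on top of the request time.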

### default select

You can set a default selector. URLs listed without their own selectors fall back to it:

```
$urls = [
    'https://test.exp/?page=1', // no selector given, so the default is used
    'https://test.exp/?page=2' => ["//img[@class='type2']"],
];

return \Crawler\Facades\CrawlFacade::make($urls)
    ->defaultSelect("//img[@class='type1']")
    ->start();
```
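
The `$urls` array mixes three shapes: a bare URL, a URL mapped to a list of expressions, and a URL mapped to a single string. As a rough sketch of how such input could be normalized, the hypothetical function below (`normalizeTargets` is a name invented here, not a library API) turns every entry into `URL => list of expressions`, applying the default where none was given. The library's actual handling may differ.

```php
<?php
// Hypothetical helper, not part of mehrabx/web-crawler.
function normalizeTargets(array $urls, string $defaultSelect): array
{
    $targets = [];
    foreach ($urls as $key => $value) {
        if (is_int($key)) {
            // Entry written without "=>": the value is the URL itself.
            $targets[$value] = [$defaultSelect];
        } else {
            // Accept a single expression or a list of expressions.
            $targets[$key] = (array) $value;
        }
    }

    return $targets;
}

$targets = normalizeTargets([
    'https://test.exp/?page=1',
    'https://test.exp/?page=2' => ["//img[@class='type2']"],
    'https://test.exp/?page=3' => "//img[@class='type3']",
], "//img[@class='type1']");

print_r($targets);
```

PHP assigns integer keys to array entries written without `=>`, which is what lets the sketch tell a bare URL apart from a URL with its own selectors.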

