
crwlr/crawler
=============

Web crawling and scraping library.

v3.5.8 (2mo ago) · 369 · 17.4k · ↑82.9% · 13 · [1 issue](https://github.com/crwlrsoft/crawler/issues) · [1 PR](https://github.com/crwlrsoft/crawler/pulls) · 2 · MIT · PHP · PHP ^8.1

Since Apr 18 · Pushed 2mo ago · 3 watchers

[Source](https://github.com/crwlrsoft/crawler) · [Packagist](https://packagist.org/packages/crwlr/crawler) · [Docs](https://www.crwlr.software/packages/crawler) · [GitHub Sponsors](https://github.com/sponsors/otsch) · [RSS](/packages/crwlr-crawler/feed)


[![crwlr.software logo](https://github.com/crwlrsoft/graphics/raw/eee6cf48ee491b538d11b9acd7ee71fbcdbe3a09/crwlr-logo.png)](https://www.crwlr.software)

Library for Rapid (Web) Crawler and Scraper Development
=======================================================


This library provides a lightweight framework and many ready-to-use **steps** that serve as building blocks for your own crawlers and scrapers.

To give you an overview, here's a list of things that it helps you with:

- [Crawler **Politeness**](https://www.crwlr.software/packages/crawler/the-crawler/politeness) 😇 (respecting robots.txt, throttling, ...)
- Load URLs using
    - [a **(PSR-18) HTTP client**](https://www.crwlr.software/packages/crawler/the-crawler/loaders) (Guzzle by default)
    - or a [**headless browser**](https://www.crwlr.software/packages/crawler/the-crawler/loaders#using-a-headless-browser) (Chrome) to get the page source after JavaScript execution
- [Get **absolute links** from HTML documents](https://www.crwlr.software/packages/crawler/included-steps/html#html-get-link) 🔗
- [Get **sitemaps** from robots.txt and get all URLs from those sitemaps](https://www.crwlr.software/packages/crawler/included-steps/sitemap)
- [**Crawl** (load) all pages of a website](https://www.crwlr.software/packages/crawler/included-steps/http#crawling) 🕷
- [Use **cookies** (or don't)](https://www.crwlr.software/packages/crawler/the-crawler/loaders#http-loader) 🍪
- [Use any **HTTP method** (GET, POST, ...) and send any headers or body](https://www.crwlr.software/packages/crawler/included-steps/http#http-requests)
- [Easily iterate over **paginated** list pages](https://www.crwlr.software/packages/crawler/included-steps/http#paginating) 🔁
- Extract data from:
    - [**HTML**](https://www.crwlr.software/packages/crawler/included-steps/html#extracting-data) and also [**XML**](https://www.crwlr.software/packages/crawler/included-steps/xml) (using CSS selectors or XPath queries)
    - [**JSON**](https://www.crwlr.software/packages/crawler/included-steps/json) (using dot notation)
    - [**CSV**](https://www.crwlr.software/packages/crawler/included-steps/csv) (map columns)
- [Extract **schema.org** structured data](https://www.crwlr.software/packages/crawler/included-steps/html#schema-org) in **JSON-LD** format from HTML documents
- [Keep memory usage low](https://www.crwlr.software/packages/crawler/crawling-procedure#memory-usage) by using PHP **Generators** 💪
- [**Cache** HTTP responses](https://www.crwlr.software/packages/crawler/response-cache) during development, so you don't have to load pages again and again after every code change
- [Get **logs**](https://www.crwlr.software/packages/crawler/the-crawler#loggers) about what your crawler is doing (accepts any PSR-3 LoggerInterface)
- And a lot more...
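
To illustrate the point about PHP Generators in the list above, here is a minimal sketch in plain PHP. It does not use this library's API: the functions `loadPages()` and `extractTitles()` and the example URLs are hypothetical stand-ins, and the "page" bodies are generated locally instead of being fetched over HTTP. It only shows the general idea that chained generators pass one item at a time from step to step, so the full result set never has to sit in memory at once.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical "loading" step (not part of crwlr/crawler).
 * Yields one fake page per URL instead of building a full array.
 */
function loadPages(iterable $urls): Generator
{
    foreach ($urls as $url) {
        // Stand-in for an HTTP request: the body is fabricated locally.
        yield ['url' => $url, 'body' => "<title>Page at {$url}</title>"];
    }
}

/**
 * Hypothetical "extraction" step (not part of crwlr/crawler).
 * Consumes pages lazily and yields one result per page with a <title>.
 */
function extractTitles(iterable $pages): Generator
{
    foreach ($pages as $page) {
        if (preg_match('#<title>(.*?)</title>#', $page['body'], $matches) === 1) {
            yield ['url' => $page['url'], 'title' => $matches[1]];
        }
    }
}

$urls = ['https://example.com/a', 'https://example.com/b'];

// Each page is loaded only when the loop asks for the next result.
foreach (extractTitles(loadPages($urls)) as $result) {
    echo $result['url'], ' => ', $result['title'], PHP_EOL;
}
```

Because both functions return generators, nothing is loaded until the `foreach` loop requests the next result, and each page can be discarded as soon as its result has been handled.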

Documentation
-------------


You can find the documentation at [crwlr.software](https://www.crwlr.software/packages/crawler/getting-started).

Contributing
------------


If you are considering contributing to this package, please read the [contribution guide (CONTRIBUTING.md)](CONTRIBUTING.md).

### Health Score

61 (Fair): better than 98% of packages

- Maintenance: 88. Actively maintained with recent releases
- Popularity: 46. Moderate usage in the ecosystem
- Community: 21. Small or concentrated contributor base
- Maturity: 73. Established project with proven stability
- Bus Factor: 1. Top contributor holds 93.1% of commits, a single point of failure

How is this calculated?

- **Maintenance (25%):** Last commit recency, latest release date, and issue-to-star ratio. Uses a 2-year decay window.
- **Popularity (30%):** Total and monthly downloads, GitHub stars, and forks. Logarithmic scaling prevents top-heavy scores.
- **Community (15%):** Contributors, dependents, forks, watchers, and maintainers. Measures real ecosystem engagement.
- **Maturity (30%):** Project age, version count, PHP version support, and release stability.

### Release Activity

- Cadence: every ~17 days (recently: every ~69 days)
- Total: 87
- Last Release: 62d ago

Major Versions

- v0.7.0 → v1.0.0 (2023-02-08)
- v1.10.0 → v2.0.0-beta (2024-08-08)
- v2.1.3 → v3.0.0 (2024-12-08)

### Community

Maintainers: [crwlr](/maintainers/crwlr)

Top Contributors

- [otsch](https://github.com/otsch) (416 commits)
- [szepeviktor](https://github.com/szepeviktor) (26 commits)
- github-actions[bot] (3 commits)
- [chr-hertel](https://github.com/chr-hertel) (2 commits)

Tags: crawler, crawling, hacktoberfest, php, scraper, scraping, scraping-websites, web-crawler, web-crawling, web-scraper, web-scraping, webcrawler, bot, crawl, scrape, crwlr

### Code Quality

- Tests: Pest
- Static Analysis: PHPStan
- Code Style: PHP CS Fixer
- Type Coverage: Yes


### Alternatives

- [laravel/framework](/packages/laravel-framework): The Laravel Framework. (34.8k · 543.8M · 20.1k)
- [craftcms/cms](/packages/craftcms-cms): Craft CMS (3.6k · 3.6M · 3.1k)
- [civicrm/civicrm-core](/packages/civicrm-civicrm-core): Open source constituent relationship management for non-profits, NGOs and advocacy organizations. (751 · 291.4k · 43)
- [blackfire/player](/packages/blackfire-player): A powerful web crawler and web scraper with Blackfire support (496 · 17.1k)
- [spatie/crawler](/packages/spatie-crawler): Crawl all internal links found on a website (2.8k · 18.5M · 67)
- [tempest/framework](/packages/tempest-framework): The PHP framework that gets out of your way. (2.2k · 34.4k · 15)

