ppajer/webscraper
=================

A straightforward web scraper written in PHP, with support for parallel processing and HTML5.

[Source](https://github.com/ppajer/WebScraper) | [Packagist](https://packagist.org/packages/ppajer/webscraper)

Installation
------------

To start using this package, add it to your `composer.json` file and run `composer install`, then include the autoloader that Composer generates (`vendor/autoload.php` by default) in your project. Alternatively, download the package and its dependencies and include them in your project directly.

### Dependencies

- [PHP DOM Extractor](https://github.com/ppajer/PHP-DOM-Extractor)
- [PHP Request](https://github.com/ppajer/PHP-Request)

Usage
-----

The scraper takes two inputs: an array of Request Options that defines the resources to fetch, and Extraction Rules that specify which data to extract from those resources. For details, see the documentation for [Request Options](https://github.com/ppajer/PHP-Request#multiple-requests---parallelrequest) and [Extraction Rules](https://github.com/ppajer/PHP-DOM-Extractor#defining-extraction-rules).

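```
// Load the autoloader generated by Composer (adjust the path to your setup).
require 'autoload.php';

// Path to a JSON file containing the Extraction Rules.
$rules = 'path/to/rules.json';

// Request Options: one entry per resource to fetch, keyed by a label of your choice.
$options = [
	'foo' => ['URL' => 'https://...']
];

$scraper = new WebScraper($rules);
$result = $scraper->start($options); // Run the requests and extract the data.
```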
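
When scraping several pages at once, the Request Options array can be built programmatically. The following is a minimal sketch, not part of this package: `buildRequestOptions()` is a hypothetical helper, the `example.com` URLs are placeholders, and it assumes each entry needs only the `URL` key shown in the example above. See the PHP Request docs for the full set of supported options.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of WebScraper): turns a label => URL map
 * into the Request Options shape used in the usage example.
 */
function buildRequestOptions(array $urlsByLabel): array
{
    $options = [];
    foreach ($urlsByLabel as $label => $url) {
        // Reject malformed URLs early instead of letting a request fail later.
        if (filter_var($url, FILTER_VALIDATE_URL) === false) {
            throw new InvalidArgumentException("Invalid URL for '{$label}': {$url}");
        }
        // Assumption: 'URL' is the only key required per entry.
        $options[$label] = ['URL' => $url];
    }
    return $options;
}

// Placeholder URLs for illustration only.
$options = buildRequestOptions([
    'home'  => 'https://example.com/',
    'about' => 'https://example.com/about',
]);
```

The resulting `$options` array can then be passed to `$scraper->start($options)` as in the usage example.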

