

wykleph/html-scraper
====================

An API for taking JSON sitemaps generated by the webscraper.io extension and emulating webscraper.io's scraping behavior.

v0.1.0 (10y ago), MIT, PHP

[Source](https://github.com/Wykleph/HtmlScraper) | [Packagist](https://packagist.org/packages/wykleph/html-scraper)

HTMLScraper
===========


An API for taking JSON sitemaps generated by the `webscraper.io` extension and emulating webscraper.io's scraping behavior in PHP.

This is great for creating scraping templates in no time at all.

*I have no affiliation with webscraper.io, so please refer to their documentation and their forums for anything you might need in regards to webscraper.io.*

Installation: `composer require wykleph/html-scraper`

**Note: Child selectors are not supported yet, but they're on the docket!**

To use, require this project with Composer, then download [the webscraper.io extension for Chrome](http://webscraper.io/). The extension is what we will use to generate the sitemap for crawling the HTML.

Once you have the `webscraper.io` extension installed, you will probably want to learn how to use it; the webscraper.io documentation is the place to start.

Once you have some selectors set up for your sitemap, click on `Sitemap (sitemap-name)` -> `Export Sitemap`. The JSON output is what we will use to instantiate a `SiteMap` object:

```
$SiteMap = new SiteMap($json);
```
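
Before handing the export to `SiteMap`, it can help to confirm that the string is valid JSON and to see which selector names it defines. The sketch below uses only PHP's built-in `json_decode` and does not touch this library. The sample sitemap is made up, and its field names (`_id`, `startUrl`, `selectors`, `id`, `type`, `selector`) are an assumption about the typical shape of a webscraper.io export, so compare them against your own export.

```php
<?php
// Hypothetical sitemap export; field names are assumed, not taken from this README.
$json = <<<'JSON'
{
  "_id": "example-site",
  "startUrl": ["https://example.com/profile"],
  "selectors": [
    {"id": "username", "type": "SelectorText", "parentSelectors": ["_root"], "selector": "h1.username", "multiple": false},
    {"id": "phone", "type": "SelectorText", "parentSelectors": ["_root"], "selector": "span.phone", "multiple": false}
  ]
}
JSON;

// Decode to an associative array; json_decode returns null on malformed input.
$sitemap = json_decode($json, true);
if (!is_array($sitemap)) {
    throw new InvalidArgumentException('Sitemap is not valid JSON: ' . json_last_error_msg());
}

// Collect the selector ids, i.e. the names given to each selector in the extension.
$selectorIds = array_column($sitemap['selectors'] ?? [], 'id');

echo implode(', ', $selectorIds), PHP_EOL; // prints "username, phone"
```

If the decoded ids match the names you expect, the same `$json` string can be passed to `new SiteMap($json)` as shown above.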

The next step is to instantiate a `HtmlScraper` object to consume the `SiteMap` and the HTML you would like to crawl:

```
$scraper = new HtmlScraper($SiteMap, $html);
$selections = $scraper->getSelections();
```

or:

```
// Parentheses around `new` are required to chain a method call on PHP versions before 8.4.
$selections = (new HtmlScraper($SiteMap, $html))->getSelections();
```

The `$selections` array now contains every selection that the sitemap produced from the given HTML.

The array should be keyed by the selector names that you set up with `webscraper.io`, so accessing your selections is as easy as grabbing something like `$selections['username-field-name']` or `$selections['phone']`.
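
As a minimal sketch of reading the result, the snippet below substitutes a hand-written array for the return value of `getSelections()`. The keys and values are hypothetical, and whether each value is a single string or a list of matches is an assumption here, so inspect the real output with `var_dump` before relying on its shape.

```php
<?php
// Stand-in for $scraper->getSelections(); keys and values are made up for illustration.
$selections = [
    'username' => 'jdoe',
    'phone'    => '555-0100',
];

// Read a selection by selector name, falling back to null when the key is absent.
$username = $selections['username'] ?? null;
$email    = $selections['email'] ?? null; // no selector named "email" exists in this example

// Walk every selection without knowing the selector names up front.
foreach ($selections as $name => $value) {
    echo $name, ': ', $value, PHP_EOL;
}
```

The null-coalescing fallback avoids an undefined-index notice when a selector name is misspelled or was never defined in the sitemap.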


### Community

Maintainer: [Wykleph](/maintainers/Wykleph)

Top contributor: [RattleyCooper](https://github.com/RattleyCooper) (15 commits)



