topshelfcraft/scraper
=====================

Easily fetch, parse, and rejigger HTML or XML from anywhere.

Scraper
=======

*Easily fetch, slice, dice, and output HTML (or XML) content from anywhere.*

**A [Top Shelf Craft](https://topshelfcraft.com) creation**
[Michael Rog](https://michaelrog.com), Proprietor

---

Installation
------------

1. From your project directory, use Composer to require the plugin package:

    ```
    composer require topshelfcraft/scraper
    ```
2. In the Control Panel, go to Settings → Plugins and click the “Install” button for Scraper.
3. There is no Step 3.

*Scraper is also available for installation via the Craft CMS Plugin Store.*

Usage
-----

The Scraper plugin exposes a full-featured crawler object to your Twig templates, allowing you to fetch, parse, and filter DOM elements from a remote source document.

### Instantiating a client

When invoking the plugin, you can choose whether to use SimpleHtmlDom or Symfony components to instantiate your crawler:

```
{% set crawler = craft.scraper.using('symfony').get('https://zombo.com') %}
```

```
{% set crawler = craft.scraper.using('simplehtmldom').get('https://zombo.com') %}
```

I generally recommend using the Symfony components; they are more powerful and more resilient to malformed markup. (The SimpleHtmlDom crawler is included to provide backwards compatibility with Craft 2 projects.)

### Using the Symfony client

When you opt for Symfony components, the `get` method instantiates a full [BrowserKit](https://symfony.com/components/BrowserKit) client, giving you access to all the [BrowserKit](https://symfony.com/components/BrowserKit) and [DomCrawler](https://symfony.com/doc/current/components/dom_crawler.html) methods.

You can iterate over the DOM elements from your source document like this:

```
{% for node in crawler.filter('h2 > a') %}
    {{ node.text() }}
{% endfor %}
```
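
To read a single element rather than loop, you can call DomCrawler methods directly on the filtered set. The following is a minimal sketch, assuming `crawler` is the Symfony DomCrawler object described above; the `links` variable name is only illustrative. The count check is there because DomCrawler's `text()` and `attr()` throw an exception when the node list is empty.

```
{# Assumption: crawler is a Symfony DomCrawler instance #}
{% set links = crawler.filter('h2 > a') %}
{% if links.count() > 0 %}
    {# first() narrows the set to its first node #}
    {{ links.first().text() }} ({{ links.first().attr('href') }})
{% endif %}
```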

### Using the SimpleHtmlDom client

When you opt for the SimpleHtmlDom crawler, the `get` method instantiates a [SimpleHtmlDom](https://simplehtmldom.sourceforge.io/) client, giving you access to all the [SimpleHtmlDom methods](https://simplehtmldom.sourceforge.io/manual.htm).

You can iterate over the DOM elements from your source document like this:

```
{% for node in crawler.find('h1') %}
    {{ node.innertext() }}
{% endfor %}
```
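
Attributes can be read in a similar way. This sketch assumes the bundled SimpleHtmlDom version provides the `getAttribute()` node method documented in the SimpleHtmlDom manual; check it against the version installed in your project.

```
{# Assumption: nodes returned by find() support getAttribute() #}
{% for link in crawler.find('a') %}
    {{ link.getAttribute('href') }}
{% endfor %}
```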

### This is great! I still have questions.

Ask a question on [StackExchange](https://craftcms.stackexchange.com/), and ping me with a URL via email or Discord.

### What are the system requirements?

Craft 4.2.1+

### I found a bug.

Please open a GitHub Issue, submit a PR to the `4.x.dev` branch, or just email me.

---

#### Contributors:

- Plugin development: [Michael Rog](http://michaelrog.com) / @michaelrog
- Includes the ["Simple HTML DOM"](http://simplehtmldom.sourceforge.net/) library, created by S. C. Chen
- Includes the Symfony [DomCrawler](https://symfony.com/doc/current/components/dom_crawler.html) via [Goutte](https://github.com/FriendsOfPHP/Goutte), created by [Fabien Potencier](http://fabien.potencier.org/) / @fabpot
- Icon: ["Upright vacuum cleaner"](https://thenounproject.com/creaticca/collection/vacuum-cleaners-outline-collection/?i=960548) by [Creaticca Creative Agency](https://thenounproject.com/creaticca/), via [The Noun Project](https://thenounproject.com/)
