hanson/crawler
==============

[Source](https://github.com/HanSon/crawler) | [Packagist](https://packagist.org/packages/hanson/crawler)

crawler
=======

An easy-to-use package for crawling a site's list pages and detail pages.

Installation
------------

```
composer require hanson/crawler
```

Usage
-----

This package requires [Goutte](https://github.com/FriendsOfPHP/Goutte). In both the list crawler and the detail crawler, you can get the DOM via `$this->crawler();`.

### Example

```
// Alternatively, omit the URL argument:
// $listCrawler = new ExampleListCrawler(storage_path('logs'));
$listCrawler = new ExampleListCrawler('http://example.com', storage_path('logs'));
$listCrawler->setDetailCrawler(new ExampleDetailCrawler());
$listCrawler->start();

```

#### ListCrawler

```
class ExampleListCrawler extends ListCrawler
{
    public $url = 'http://example.com';

    // Return the URL of the given list page
    public function getEachPageUrl($page)
    {
        return 'http://example.com/list?page=' . $page;
    }

    // Set the maximum number of pages to crawl
    public function setMaxPage()
    {
        // $num is a placeholder: replace it with the real page count
        $num = 1;
        $this->maxPage = $num;
    }
}

```
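To make the role of `getEachPageUrl()` more concrete, the sketch below collects the list-page URLs for pages 1 through a maximum page number. It is a standalone illustration, not the package's internal loop: the helper `buildPageUrls()`, the 1-based page numbering, and the `?page=` query format are all assumptions made for this example.

```php
<?php
// Hypothetical helper (not part of hanson/crawler): collects the URL of every
// list page by calling a page-to-URL callback for pages 1..$maxPage.
function buildPageUrls(callable $getEachPageUrl, int $maxPage): array
{
    $urls = [];
    for ($page = 1; $page <= $maxPage; $page++) {
        $urls[] = $getEachPageUrl($page);
    }
    return $urls;
}

// Same shape as ExampleListCrawler::getEachPageUrl(), written as a closure.
$getEachPageUrl = function (int $page): string {
    return 'http://example.com/list?page=' . $page;
};

$urls = buildPageUrls($getEachPageUrl, 3);
print_r($urls);
```

With a maximum of 3, the array holds one URL per page, in page order.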

#### DetailCrawler

```
class ExampleDetailCrawler extends DetailCrawler
{
    // Return true if $url points to a detail page
    public function isDetailUrl($url)
    {
        return preg_match('/example\.com\/id(\d+)/', $url) === 1;
    }

    // Process the detail page, e.g. print its title
    public function handle()
    {
        echo $this->crawler->filter('title')->text();
    }
}

```
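The URL check in `isDetailUrl()` can be tried on its own, without the crawler classes. The sketch below is a standalone function adapted from the example pattern, with the dot escaped and a closing quote on the regular expression. The sample URLs are made up for illustration.

```php
<?php
// Standalone version of the isDetailUrl() check (illustrative only).
function isDetailUrl(string $url): bool
{
    // preg_match() returns 1 on a match, 0 on no match, and false on error,
    // so comparing with 1 always yields a strict boolean.
    return preg_match('/example\.com\/id(\d+)/', $url) === 1;
}

// Hypothetical sample URLs: only the first has "id" followed by digits.
$samples = [
    'http://example.com/id42',
    'http://example.com/list?page=2',
    'http://example.com/id',
];

foreach ($samples as $url) {
    echo $url, ' => ', var_export(isDetailUrl($url), true), PHP_EOL;
}
```

Returning a strict boolean avoids the implicit `null` that a bare `if (...) return true;` produces when the URL does not match.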

License
-------

Crawler is open-sourced software licensed under the [MIT license](http://opensource.org/licenses/MIT).
