piedweb/crawler
===============

Web crawler to check a few SEO basics.

Version 0.1.890 (2mo ago), MIT license, PHP >=8.4

[Source](https://github.com/PiedWeb/crawler) | [Packagist](https://packagist.org/packages/piedweb/crawler) | [Docs](https://dev.piedweb.com)

[![Open Source Package](https://raw.githubusercontent.com/PiedWeb/piedweb-devoluix-theme/master/src/img/logo_title.png)](https://dev.piedweb.com)

CLI SEO Pocket Crawler
======================

[![Latest Version](https://camo.githubusercontent.com/10ac7440ddb02d308eaa643cd0d2f1a433a4fa4f8a4af401fe4a00812a549227/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f7461672f506965645765622f437261776c65722e7376673f7374796c653d666c6174266c6162656c3d72656c65617365)](https://github.com/PiedWeb/Crawler/tags)[![Software License](https://camo.githubusercontent.com/f251623e510f5909f16ae3f4e6e548dac11340b9fde1a99be26b015b39272c00/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f6c6963656e73652d4d49542d627269676874677265656e2e7376673f7374796c653d666c6174)](LICENSE)[![GitHub Tests Action Status](https://camo.githubusercontent.com/69db124527ccae76f0c5c7b94c9c909edf3e9617305f6b5817ba9968b93bccd7/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f776f726b666c6f772f7374617475732f506965645765622f437261776c65722f54657374733f6c6162656c3d7465737473)](https://github.com/PiedWeb/PiedWeb/actions)[![Quality Score](https://camo.githubusercontent.com/afefb9c71d9eb75d21eabf82f8606570641e7d80374b13c8d21a9f2745731000/68747470733a2f2f696d672e736869656c64732e696f2f7363727574696e697a65722f672f506965645765622f506965645765622e7376673f7374796c653d666c6174)](https://scrutinizer-ci.com/g/PiedWeb/PiedWeb)[![Code Coverage](https://camo.githubusercontent.com/ad1c5df148038d50b23b6d13436b75553ac9e6d133450cf73859c7bec532334b/68747470733a2f2f636f6465636f762e696f2f67682f506965645765622f506965645765622f6272616e63682f6d61696e2f67726170682f62616467652e737667)](https://codecov.io/gh/PiedWeb/PiedWeb/branch/main)[![Type Coverage](https://camo.githubusercontent.com/92a9cae37a1391395968437e74aca983df381125f0db54069dbe9d52f4a5d483/68747470733a2f2f73686570686572642e6465762f6769746875622f506965645765622f506965645765622f636f7665726167652e737667)](https://shepherd.dev/github/PiedWeb/PiedWeb)[![Total 
Downloads](https://camo.githubusercontent.com/46669998bf18112c868573971a66f3be41c64919a1ef69ee203cb5eaa5077f47/68747470733a2f2f696d672e736869656c64732e696f2f7061636b61676973742f64742f706965647765622f637261776c65722e7376673f7374796c653d666c6174)](https://packagist.org/packages/piedweb/crawler)

Use the collected data in your favorite spreadsheet software, or read it with your favorite programming language.

French documentation available:

Install
-------

Via [Packagist](https://packagist.org/packages/piedweb/crawler)

```
$ composer create-project piedweb/crawler
```

Usage
-----

### Crawler CLI

```
$ bin/console crawler:go $start
```

#### Arguments:

```
  start                            Define where the crawl starts. E.g.: https://piedweb.com
                                   You can specify an id from a previous crawl. Other options will then be ignored.
                                   You can use `last` to continue the last crawl (just stopped)

```

#### Options:

```
  -l, --limit=LIMIT                Define the crawl depth limit [default: 5]
  -i, --ignore=IGNORE              Virtual robots.txt to respect (can be a string or a URL).
  -u, --user-agent=USER-AGENT      Define the user-agent used during the crawl. [default: "SEO Pocket Crawler - PiedWeb.com/seo/crawler"]
  -w, --wait=WAIT                  Time to wait between two requests, in microseconds (0.1 s by default). [default: 100000]
  -c, --cache-method=CACHE-METHOD  In Microseconds, the time to wait between two request. Default : 100000 (0,1s). [default: 2]
  -r, --restart=RESTART            Permit to restart a previous crawl. Values 1 = fresh restart, 2 = restart from cache
  -h, --help                       Display this help message
  -q, --quiet                      Do not output any message
  -V, --version                    Display this application version
      --ansi                       Force ANSI output
      --no-ansi                    Disable ANSI output
  -n, --no-interaction             Do not ask any interactive question
  -v|vv|vvv, --verbose             Increase the verbosity of messages: 1 for normal output, 2 for more verbose output and 3 for debug

```
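To illustrate what a virtual robots.txt passed to `--ignore` expresses, here is a minimal Python sketch using the standard library's `urllib.robotparser`. It is not the crawler's own parser: the rules, the URLs and the `is_allowed` helper are hypothetical, and the crawler may interpret some directives differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, written in regular robots.txt syntax.
VIRTUAL_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(url: str,
               rules: str = VIRTUAL_ROBOTS_TXT,
               user_agent: str = "SEO Pocket Crawler") -> bool:
    """Return True if `url` may be fetched under the given robots.txt rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())  # parse from a string, no HTTP request
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://piedweb.com/blog",
                "https://piedweb.com/private/report"):
        print(url, "allowed" if is_allowed(url) else "ignored")
```

URLs under `/private/` are rejected by these rules, while every other path stays crawlable.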

### Extract All External Links in 1s from a previous crawl

```
$ bin/console crawler:external $id [--host]
```

```
    --id
        id from a previous crawl
        You can use `last` to show external links from the last crawl.

    --host -ho
        flag to output only the hosts

```

### Calculate Page Rank

Updates the `data.csv` generated by the previous crawl. You can then explore your website with the proof-of-concept `pagerank.html`, served over HTTP (for example with `npx http-server -c-1 --port 3000`).

```
$ bin/console crawler:pagerank $id
```

```
    --id
        id from a previous crawl
        You can use `last` to calculate page rank from the last crawl.

```
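As background on what a page rank computation over internal links involves, here is a minimal Python sketch of the classic power-iteration algorithm. It is not the implementation used by `crawler:pagerank`: the `pagerank` function, the damping factor of 0.85 and the small link graph are assumptions chosen for illustration, and the sketch ignores redirects, canonicals and nofollow.

```python
def pagerank(links: dict[str, list[str]],
             damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Basic PageRank by power iteration over an internal link graph."""
    # Every page that links out or is linked to is a node of the graph.
    pages = set(links) | {t for targets in links.values() for t in targets}
    if not pages:
        return {}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Each page starts with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly between its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical site: the home page is linked from both other pages.
    site = {"/": ["/a", "/b"], "/a": ["/"], "/b": ["/", "/a"]}
    for page, score in sorted(pagerank(site).items(), key=lambda item: -item[1]):
        print(f"{page}\t{score:.4f}")
```

The scores sum to 1, so each value can be read as the share of internal link equity that a page receives.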

Testing
-------

```
$ composer test
```

Todo
----

- Better Links Harvesting and Recording (record context (list, nav, sentence...))
- Transform the PoC (Page Rank Visualizer)
- Complex Page Rank Calculator (with 301, canonical, nofollow, etc.)

Contributing
------------

Please see [contributing](https://dev.piedweb.com/contributing)

Credits
-------

- [PiedWeb](https://piedweb.com) aka [Robind4](https://twitter.com/Robind4)
- [All Contributors](https://github.com/PiedWeb/crawler/graphs/contributors)

License
-------

The MIT License (MIT). Please see [License File](LICENSE) for more information.
