
piedweb/url-harvester
=====================

Harvest statistics and metadata from a URL or its source code (SEO-oriented).

0.0.31 (4y ago) · 120.9k12 · MIT · PHP · PHP ^7.3|^8.0

Since Jan 8 · Pushed 4y ago · 1 watchers

[Source](https://github.com/PiedWeb/UrlHarvester) · [Packagist](https://packagist.org/packages/piedweb/url-harvester) · [Docs](https://dev.piedweb.com) · [RSS](/packages/piedweb-url-harvester/feed)

[![Open Source Package](https://raw.githubusercontent.com/PiedWeb/piedweb-devoluix-theme/master/src/img/logo_title.png)](https://dev.piedweb.com)

Url Meta Data Harvester
=======================

[![Latest Version](https://camo.githubusercontent.com/2d43b58774ed665af845f2d0693f5f49ed90a43b4e2299648985b59239b8520d/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f7461672f506965645765622f55726c4861727665737465722e7376673f7374796c653d666c6174266c6162656c3d72656c65617365)](https://github.com/PiedWeb/UrlHarvester/tags)[![Software License](https://camo.githubusercontent.com/f251623e510f5909f16ae3f4e6e548dac11340b9fde1a99be26b015b39272c00/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f6c6963656e73652d4d49542d627269676874677265656e2e7376673f7374796c653d666c6174)](LICENSE)[![GitHub Tests Action Status](https://camo.githubusercontent.com/8a703efa856ba93bed96c4e26c8b7d18109e267bb91381ea344ce4c89011113e/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f776f726b666c6f772f7374617475732f506965645765622f55726c4861727665737465722f54657374733f6c6162656c3d7465737473)](https://github.com/PiedWeb/UrlHarvester/actions)[![Quality Score](https://camo.githubusercontent.com/8fe9755e0a70a3c3e9e18865f08b8954c7e8d1b983a27581446657e0d094fe7f/68747470733a2f2f696d672e736869656c64732e696f2f7363727574696e697a65722f672f506965645765622f55726c4861727665737465722e7376673f7374796c653d666c6174)](https://scrutinizer-ci.com/g/PiedWeb/UrlHarvester)[![Code Coverage](https://camo.githubusercontent.com/dbc24d550a33e196a54366c14bb2cb9d5d3ca44208eac8278d8798c942e963fc/68747470733a2f2f636f6465636f762e696f2f67682f506965645765622f55726c4861727665737465722f6272616e63682f6d61696e2f67726170682f62616467652e737667)](https://codecov.io/gh/PiedWeb/UrlHarvester/branch/main)[![Type Coverage](https://camo.githubusercontent.com/11cf378e49a78bcf693bf330b40dd0aefaac6357d8396ea2beaeeae60124fbaa/68747470733a2f2f73686570686572642e6465762f6769746875622f506965645765622f55726c4861727665737465722f636f7665726167652e737667)](https://shepherd.dev/github/PiedWeb/UrlHarvester)[![Total 
Downloads](https://camo.githubusercontent.com/74d6ddcc46a69dc463ef2415f9cb55cbc4835d0c76ae489d6f07edfad0a30f7c/68747470733a2f2f696d672e736869656c64732e696f2f7061636b61676973742f64742f706965647765622f75726c2d6861727665737465722e7376673f7374796c653d666c6174)](https://packagist.org/packages/piedweb/url-harvester)

Harvest statistics and metadata from a URL or its source code (SEO-oriented).

Implemented in [Seo Pocket Crawler](https://piedweb.com/seo/crawler) ([source on GitHub](https://github.com/PiedWeb/SeoPocketCrawler)).

Install
-------

Via [Packagist](https://packagist.org/packages/piedweb/url-harvester):

```
$ composer require piedweb/url-harvester
```

Usage
-----

Harvest methods:

```
use PiedWeb\UrlHarvester\Harvest;
use PiedWeb\UrlHarvester\Link;

$url = 'https://piedweb.com';

// Method overview: each line below is a separate call on the Harvest object,
// not one literal chain (most getters return a value, not the Harvest itself).
Harvest::fromUrl($url)
    ->getResponse()->getInfo('total_time') // load time
    ->getResponse()->getInfo('size_download')
    ->getResponse()->getStatusCode()
    ->getResponse()->getContentType()
    ->getRes...

    ->getTag('h1') // @return content of the first matching tag (may contain HTML)
    ->getUniqueTag('h1') // @return content of the first matching tag in UTF-8 (may contain HTML)
    ->getMeta('description') // @return string from the content attribute or NULL
    ->getCanonical() // @return string|NULL
    ->isCanonicalCorrect() // @return bool
    ->getRatioTxtCode() // @return int
    ->getTextAnalysis() // @return \PiedWeb\TextAnalyzer\Analysis
    ->getKws() // @return the 10 most used words
    ->getBreadCrumb()
    ->indexable($userAgent = 'googlebot') // @return int corresponding to a const from Indexable

    ->getLinks()
    ->getLinks(Link::LINK_SELF)
    ->getLinks(Link::LINK_INTERNAL)
    ->getLinks(Link::LINK_SUB)
    ->getLinks(Link::LINK_EXTERNAL)
    ->getLinkedRessources() // @return array of all elements that have a href or src attribute
    ->mayFollow() // checks headers and meta tags, @return bool

    ->getDomain()
    ->getBaseUrl()

    ->getRobotsTxt() // @return \Spatie\Robots\RobotsTxt or empty string
    ->setRobotsTxt($content) // @param string or RobotsTxt
```
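
The listing above is an overview rather than a script that runs as written. A minimal sketch of how a few of these calls might be combined in practice follows. It assumes the package was installed with Composer, that the URL is reachable, and that `fromUrl()` may return something other than a `Harvest` object when the request fails (an assumption, hence the `instanceof` guard). Only methods named in the listing are used.

```php
<?php
// Sketch only: requires `composer require piedweb/url-harvester` and network access.
require __DIR__ . '/vendor/autoload.php';

use PiedWeb\UrlHarvester\Harvest;
use PiedWeb\UrlHarvester\Link;

$url = 'https://piedweb.com';

$harvest = Harvest::fromUrl($url);

// Assumption: a failed request may not yield a Harvest object,
// so check the type before calling any getter.
if (!$harvest instanceof Harvest) {
    fwrite(STDERR, "Could not harvest {$url}\n");
    exit(1);
}

// Each getter is a separate call on the same Harvest object.
$status      = $harvest->getResponse()->getStatusCode();
$h1          = $harvest->getTag('h1');            // may contain HTML
$description = $harvest->getMeta('description');  // string or null
$canonical   = $harvest->getCanonical();          // string or null
$internal    = $harvest->getLinks(Link::LINK_INTERNAL);

printf("Status:         %s\n", $status);
printf("H1:             %s\n", $h1 ?? '(none)');
printf("Description:    %s\n", $description ?? '(none)');
printf("Canonical:      %s\n", $canonical ?? '(none)');
printf("Canonical OK:   %s\n", $harvest->isCanonicalCorrect() ? 'yes' : 'no');
printf("May follow:     %s\n", $harvest->mayFollow() ? 'yes' : 'no');

// Assumption: getLinks() returns a countable collection of links.
printf("Internal links: %d\n", is_countable($internal) ? count($internal) : 0);
```

Storing the `Harvest` object in a variable and calling each getter on it keeps every result separately available, which is usually what a crawler or audit script needs.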

Testing
-------

```
$ composer test
```

Contributing
------------

Please see the [contributing guide](https://dev.piedweb.com/contributing).

Credits
-------

- [Pied Web](https://piedweb.com)
- [All Contributors](https://github.com/PiedWeb/UrlHarvester/graphs/contributors)

License
-------

The MIT License (MIT). Please see [License File](LICENSE) for more information.

### Health Score

34 (Low), better than 77% of packages

- Maintenance: 20. Infrequent updates; may be unmaintained.
- Popularity: 23. Limited adoption so far.
- Community: 12. Small or concentrated contributor base.
- Maturity: 67. Established project with proven stability.
- Bus Factor: 1. Top contributor holds 100% of commits (single point of failure).

How is this calculated?

- **Maintenance (25%)**: Last commit recency, latest release date, and issue-to-star ratio. Uses a 2-year decay window.
- **Popularity (30%)**: Total and monthly downloads, GitHub stars, and forks. Logarithmic scaling prevents top-heavy scores.
- **Community (15%)**: Contributors, dependents, forks, watchers, and maintainers. Measures real ecosystem engagement.
- **Maturity (30%)**: Project age, version count, PHP version support, and release stability.
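
As a rough consistency check, and assuming the overall score is simply the weighted sum of the four component scores (the page does not state the exact formula), the numbers reported for this package work out as follows:

```latex
\[
0.25 \times 20 + 0.30 \times 23 + 0.15 \times 12 + 0.30 \times 67
= 5 + 6.9 + 1.8 + 20.1
= 33.8 \approx 34
\]
```

Under that assumption the result matches the reported health score of 34.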

### Release Activity

- Cadence: every ~34 days (recently: every ~73 days)
- Total: 31
- Last release: 1648d ago

PHP version history (3 changes):

- 0.0.1: PHP ~7.1
- 0.0.15: PHP ~7.3
- v0.0.27: PHP ^7.3|^8.0

### Community

- Maintainers: [Robin D.](/maintainers/Robin%20D.)
- Top contributors: [RobinDev](https://github.com/RobinDev) (22 commits)

Tags: url, crawl, seo, meta data, canonical, Pied Web, Url Meta Data Harvester

### Code Quality

- Tests: PHPUnit
- Static analysis: Psalm
- Code style: PHP CS Fixer
- Type coverage: Yes
