
chrisullyott/php-url-extractor
==============================

Extract URLs from HTML content.

v0.2.3 · MIT · PHP >=5.4.0

[Source](https://github.com/chrisullyott/php-url-extractor) · [Packagist](https://packagist.org/packages/chrisullyott/php-url-extractor)

php-url-extractor
=================


[![Latest Stable Version](https://camo.githubusercontent.com/ce6fa8851e9a5b5f8b7b762b41d9b363978a48fd581f148d8bba6cac7016998b/68747470733a2f2f706f7365722e707567782e6f72672f6368726973756c6c796f74742f7068702d75726c2d657874726163746f722f762f737461626c65)](https://packagist.org/packages/chrisullyott/php-url-extractor)[![Total Downloads](https://camo.githubusercontent.com/b2ea960e92a34c07b2c05577f1bb2b2c1e28eb5d641b89a26127187ac691a591/68747470733a2f2f706f7365722e707567782e6f72672f6368726973756c6c796f74742f7068702d75726c2d657874726163746f722f646f776e6c6f616473)](https://packagist.org/packages/chrisullyott/php-url-extractor)

Extract URLs from HTML content, applying optional filters.

Installation
------------


With [Composer](https://getcomposer.org/):

```
$ composer require chrisullyott/php-url-extractor
```

Usage
-----


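```
require 'vendor/autoload.php'; // Composer autoloader

// Load the HTML document to scan.
$html = file_get_contents('about-us.html');

$extractor = new UrlExtractor($html);
$extractor->setHomeUrl('https://www.site.com'); // keep only URLs local to this site
$extractor->setFilesOnly(true);                 // keep only URLs with a file extension

$urls = $extractor->getUrls();
print_r($urls);
```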

```
Array
(
    [0] => stdClass Object
        (
            [attribute] => href
            [value] => /_assets/img/icons/favicon-96.png
            [url] => https://www.site.com/_assets/img/icons/favicon-96.png
        )
    ...
)
```

Options
-------


### setAttributeFilter *(array)*


The `#getUrls` method builds a [DOMDocument](http://php.net/manual/en/class.domdocument.php) from the HTML and checks a set of element attributes, such as `src` and `href`, for URLs you might be interested in. Use `#setAttributeFilter` to replace the default set of attributes with your own.
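To make the attribute scan concrete, here is a minimal sketch of how attribute values can be collected with `DOMDocument` and `DOMXPath`. It is not the library's implementation; `collectAttributeValues()` is a hypothetical helper written only for illustration.

```php
<?php
// Illustrative sketch only: not the library's actual code.
// collectAttributeValues() is a hypothetical helper name.
function collectAttributeValues($html, array $attributes)
{
    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of emitting them,
    // since real-world HTML is rarely perfectly formed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $found = [];

    foreach ($attributes as $attribute) {
        // Select every element that carries this attribute.
        foreach ($xpath->query('//*[@' . $attribute . ']') as $element) {
            $found[] = [
                'attribute' => $attribute,
                'value'     => $element->getAttribute($attribute),
            ];
        }
    }

    return $found;
}

$html = '<html><body><a href="/about">About</a><img src="/logo.png"></body></html>';
$results = collectAttributeValues($html, ['href', 'src']);
```

Passing a different attribute list, for example `['data-src']`, narrows the scan in the same way that overriding the default attribute set is described above.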

### setHomeUrl *(string)*


Providing a home URL restricts results to URLs that are local to that domain. A relative URL that begins with a single slash `/` (not two slashes `//`) is also considered local. Setting a home URL also populates the `url` property, an absolute URL, on each object returned by `#getUrls`.
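The locality rule can be sketched as follows. This is an assumption about how such a check might look, not the library's code; `isRootRelative()`, `isLocalUrl()`, and `toAbsoluteUrl()` are hypothetical helpers, and the home URL is assumed to be a valid absolute URL.

```php
<?php
// Illustrative sketch only; all three helpers are hypothetical.

// True for "/path" but not for protocol-relative "//host/path".
function isRootRelative($value)
{
    return strpos($value, '/') === 0 && strpos($value, '//') !== 0;
}

function isLocalUrl($value, $homeUrl)
{
    if (isRootRelative($value)) {
        return true;
    }

    // Otherwise compare the URL's host with the home URL's host.
    $host = parse_url($value, PHP_URL_HOST);
    $homeHost = (string) parse_url($homeUrl, PHP_URL_HOST);

    return is_string($host) && strcasecmp($host, $homeHost) === 0;
}

function toAbsoluteUrl($value, $homeUrl)
{
    // Root-relative values are appended to the home URL;
    // anything else is returned unchanged.
    if (isRootRelative($value)) {
        return rtrim($homeUrl, '/') . $value;
    }

    return $value;
}

$home = 'https://www.site.com';
$absolute = toAbsoluteUrl('/_assets/img/icons/favicon-96.png', $home);
```

In this sketch a protocol-relative value such as `//cdn.example.net/lib.js` is treated as external unless its host equals the home host.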

### setAlternateDomains *(array)*


Used with `#setHomeUrl`. If set, the returned URLs also include those whose domain is found in the array. Entries may be plain strings, like `media.site.com`, and/or regular expressions, like `/.*\.site\.com/`.
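One way to match a host against a mixed list of strings and regular expressions is sketched below. `domainMatches()` is a hypothetical helper, and the rule that an entry wrapped in slashes is a regular expression is an assumption made for this example, not documented library behavior.

```php
<?php
// Illustrative sketch only; domainMatches() is a hypothetical helper.
function domainMatches($host, array $patterns)
{
    foreach ($patterns as $pattern) {
        // Assumption: entries wrapped in slashes are regular expressions.
        $isRegex = strlen($pattern) > 2
            && $pattern[0] === '/'
            && substr($pattern, -1) === '/';

        if ($isRegex) {
            if (preg_match($pattern, $host) === 1) {
                return true;
            }
        } elseif (strcasecmp($pattern, $host) === 0) {
            // Plain strings must equal the host (case-insensitively).
            return true;
        }
    }

    return false;
}

$alternates = ['media.site.com', '/.*\.site\.com/'];
```

Note that an unanchored pattern such as `/.*\.site\.com/` also matches hosts that merely contain `.site.com`; anchoring it as `/^.*\.site\.com$/` is stricter.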

### setFilesOnly *(boolean)*


When `true`, only URLs that have a file extension are returned.

### setIgnoredExtensions *(array)*


Used with `#setFilesOnly`. Excludes URLs whose file extension is found in the array.
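The two file-related options can be pictured with the sketch below, which reads the extension from the URL path only. `fileExtension()` and `keepUrl()` are hypothetical helpers, and the lowercase comparison is an assumption; the library may behave differently.

```php
<?php
// Illustrative sketch only; both helpers are hypothetical.
function fileExtension($url)
{
    // Look only at the path so the host, query string, and fragment are ignored.
    $path = parse_url($url, PHP_URL_PATH);

    if (!is_string($path)) {
        return '';
    }

    return strtolower(pathinfo($path, PATHINFO_EXTENSION));
}

function keepUrl($url, $filesOnly, array $ignoredExtensions)
{
    // Without the files-only filter, every URL is kept.
    if (!$filesOnly) {
        return true;
    }

    $extension = fileExtension($url);

    // Files-only mode drops URLs that have no extension at all.
    if ($extension === '') {
        return false;
    }

    // Ignored extensions are assumed to be listed in lowercase.
    return !in_array($extension, $ignoredExtensions, true);
}

$extension = fileExtension('https://www.site.com/_assets/img/icons/favicon-96.png');
```

Reading the extension from the path alone keeps a bare URL such as `https://www.site.com` from being mistaken for a `.com` file.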

Maintained by Chris ([@chrisullyott](https://github.com/chrisullyott)).

