dmoraschi/sitemap-common
========================

Sitemap generator and crawler library

Latest release: v1.1.0 · License: MIT · Requires PHP >=5.6

[Source](https://github.com/danielemoraschi/sitemap-common) · [Packagist](https://packagist.org/packages/dmoraschi/sitemap-common)

A PHP sitemap generator and crawler
===================================

[![Build Status](https://camo.githubusercontent.com/02de81a94940d8fe19ed022525a49104d17587859fb5a1b445bd32fe91afca97/68747470733a2f2f7472617669732d63692e6f72672f64616e69656c656d6f7261736368692f736974656d61702d636f6d6d6f6e2e706e673f6272616e63683d6d6173746572)](https://travis-ci.org/danielemoraschi/sitemap-common)[![Scrutinizer Quality Score](https://camo.githubusercontent.com/bf552439b90fec3893e4259464d7cdb004aeeb341f538205de17038ae882c3c7/68747470733a2f2f7363727574696e697a65722d63692e636f6d2f672f64616e69656c656d6f7261736368692f736974656d61702d636f6d6d6f6e2f6261646765732f7175616c6974792d73636f72652e706e673f623d6d6173746572)](https://scrutinizer-ci.com/g/danielemoraschi/sitemap-common/)

This package provides all the components needed to crawl a website and to build and write sitemap files.

An example console application that uses the library: [dmoraschi/sitemap-app](https://github.com/danielemoraschi/sitemap-app)

Installation
------------

Run the following command and provide the latest stable version (e.g. v1.0.0):

```
composer require dmoraschi/sitemap-common
```

or add the following to your `composer.json` file:

```
"dmoraschi/sitemap-common": "1.0.*"
```

`SiteMapGenerator`
------------------

**Basic usage**

```
$generator = new SiteMapGenerator(
    new FileWriter($outputFileName),
    new XmlTemplate()
);
```

Add a URL:

```
$generator->addUrl($url, $frequency, $priority);
```

Add a single `SiteMapUrl` object or an array of them:

```
$siteMapUrl = new SiteMapUrl(
    new Url($url), $frequency, $priority
);

$generator->addSiteMapUrl($siteMapUrl);

$generator->addSiteMapUrls([
    $siteMapUrl, $siteMapUrl2
]);
```

Set the URLs of the sitemap via `SiteMapUrlCollection`:

```
$siteMapUrl = new SiteMapUrl(
    new Url($url), $frequency, $priority
);

$collection = new SiteMapUrlCollection([
    $siteMapUrl, $siteMapUrl2
]);

$generator->setCollection($collection);
```

Generate the sitemap:

```
$generator->execute();
```
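The README does not show what `execute()` writes to the output file. As a rough illustration only, the sketch below builds a minimal sitemap in the sitemaps.org 0.9 format using PHP's built-in `XMLWriter`. It does not use this library: `buildSitemapXml` and the example URLs are hypothetical, and whether `XmlTemplate` produces exactly this layout is an assumption.

```php
<?php
// Illustrative only: this is NOT the library's XmlTemplate.
// It shows the general shape of a sitemaps.org 0.9 document.

/**
 * Build a sitemap XML string from a list of entries.
 * Each entry is an array with the keys 'loc', 'changefreq' and 'priority'.
 * (buildSitemapXml is a hypothetical helper written for this sketch.)
 */
function buildSitemapXml(array $entries)
{
    $writer = new XMLWriter();
    $writer->openMemory();
    $writer->setIndent(true);
    $writer->startDocument('1.0', 'UTF-8');

    $writer->startElement('urlset');
    $writer->writeAttribute('xmlns', 'http://www.sitemaps.org/schemas/sitemap/0.9');

    foreach ($entries as $entry) {
        $writer->startElement('url');
        // writeElement() escapes special characters such as "&" in the URL.
        $writer->writeElement('loc', $entry['loc']);
        $writer->writeElement('changefreq', $entry['changefreq']);
        $writer->writeElement('priority', number_format($entry['priority'], 1, '.', ''));
        $writer->endElement(); // closes <url>
    }

    $writer->endElement(); // closes <urlset>
    $writer->endDocument();

    return $writer->outputMemory();
}

echo buildSitemapXml([
    ['loc' => 'https://example.com/', 'changefreq' => 'daily', 'priority' => 1.0],
    ['loc' => 'https://example.com/about', 'changefreq' => 'monthly', 'priority' => 0.5],
]);
```

In the sitemaps.org protocol, `changefreq` is one of `always`, `hourly`, `daily`, `weekly`, `monthly`, `yearly` or `never`, and `priority` ranges from 0.0 to 1.0. Which values this library accepts for `$frequency` and `$priority` is not stated here, so check the source before relying on them.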

`Crawler`
---------

**Basic usage**

```
$crawler = new Crawler(
    new Url($baseUrl),
    new RegexBasedLinkParser(),
    new HttpClient()
);
```

You can tell the `Crawler` **not to visit** certain URLs by adding policies. Below are the default policies provided by the library:

```
$crawler->setPolicies([
    'host' => new SameHostPolicy($baseUrl),
    'url'  => new UniqueUrlPolicy(),
    'ext'  => new ValidExtensionPolicy(),
]);
// or
$crawler->setPolicy('host', new SameHostPolicy($baseUrl));
```

`SameHostPolicy`, `UniqueUrlPolicy` and `ValidExtensionPolicy` are provided with the library. You can define your own policies by implementing the `Policy` interface.
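The methods required by `Policy` are not listed in this README, so the following is only a sketch of what a custom policy might look like. `PolicySketch` is a hypothetical stand-in interface, and its `shouldVisit()` method, the string URL argument and the `PathPrefixExclusionPolicy` class are all assumptions; check the library source for the real contract before adapting it.

```php
<?php
// Hypothetical stand-in for the library's Policy interface.
// The real method name and signature may differ.
interface PolicySketch
{
    /** Return true if the crawler may visit $url, false to skip it. */
    public function shouldVisit($url);
}

// Example policy: skip any URL whose path starts with an excluded prefix.
class PathPrefixExclusionPolicy implements PolicySketch
{
    private $excludedPrefixes;

    public function __construct(array $excludedPrefixes)
    {
        $this->excludedPrefixes = $excludedPrefixes;
    }

    public function shouldVisit($url)
    {
        // parse_url() returns null when the URL has no path component.
        $path = parse_url($url, PHP_URL_PATH);
        if (!is_string($path) || $path === '') {
            $path = '/';
        }

        foreach ($this->excludedPrefixes as $prefix) {
            if (strpos($path, $prefix) === 0) {
                return false;
            }
        }

        return true;
    }
}

$policy = new PathPrefixExclusionPolicy(['/admin', '/cart']);
var_dump($policy->shouldVisit('https://example.com/blog/post-1')); // bool(true)
var_dump($policy->shouldVisit('https://example.com/admin/login')); // bool(false)
```

Note that a plain prefix match also excludes paths such as `/administrator`; a stricter policy could compare whole path segments instead.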

Calling `crawl` starts from the base URL passed to the constructor and crawls the web pages up to the depth passed as an argument. The method returns an array of all unique visited `Url` objects:

```
$urls = $crawler->crawl($deep);
```
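To make the idea of a crawl depth concrete, here is a small sketch of a depth-limited, breadth-first traversal over an in-memory link graph. It is not the library's `Crawler`: `crawlToDepth` and the `$links` map are hypothetical, no HTTP requests are made, and the convention that depth 0 means "only the start page" is an assumption that may not match how the library counts depth.

```php
<?php
// Illustration of depth-limited crawling over an in-memory link graph.
// NOT the library's implementation; names here are hypothetical.

/**
 * @param array  $links    Map of URL => list of URLs linked from that page.
 * @param string $start    URL to start from.
 * @param int    $maxDepth Number of link hops to follow from $start.
 * @return array Unique URLs in the order they were first seen.
 */
function crawlToDepth(array $links, $start, $maxDepth)
{
    $visited = [$start => true]; // set of URLs already seen
    $frontier = [$start];        // URLs discovered at the current depth

    for ($depth = 0; $depth < $maxDepth && $frontier; $depth++) {
        $next = [];
        foreach ($frontier as $url) {
            $outgoing = isset($links[$url]) ? $links[$url] : [];
            foreach ($outgoing as $target) {
                if (!isset($visited[$target])) {
                    $visited[$target] = true; // mark on discovery to avoid duplicates
                    $next[] = $target;
                }
            }
        }
        $frontier = $next;
    }

    return array_keys($visited);
}

$links = [
    '/'    => ['/a', '/b'],
    '/a'   => ['/a/1', '/'],
    '/b'   => ['/a'],
    '/a/1' => ['/a/1/x'],
];

print_r(crawlToDepth($links, '/', 1)); // '/', '/a', '/b'
print_r(crawlToDepth($links, '/', 2)); // the above plus '/a/1'
```

The visited set plays the role that a uniqueness policy plays during a real crawl: pages that link back to each other are only processed once.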

You can also instruct the `Crawler` to collect custom data while visiting the web pages by adding collectors to the main object:

```
$crawler->setCollectors([
    'images' => new ImageCollector()
]);
// or
$crawler->setCollector('images', new ImageCollector());
```

Then retrieve the collected data:

```
$crawler->crawl($deep);

$imageCollector = $crawler->getCollector('images');
$data = $imageCollector->getCollectedData();
```

`ImageCollector` is provided by the library. You can define your own collector by implementing the `Collector` interface.
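As with policies, the exact methods of `Collector` are not documented here, so this is a sketch under stated assumptions. `CollectorSketch` is a hypothetical stand-in interface: only `getCollectedData()` appears in the usage above, while the `collect($url, $html)` method and the `TitleCollector` class are invented for illustration.

```php
<?php
// Hypothetical stand-in for the library's Collector interface.
// Only getCollectedData() is shown in the README; collect() is assumed.
interface CollectorSketch
{
    /** Inspect one fetched page and store whatever is of interest. */
    public function collect($url, $html);

    /** Return everything gathered so far. */
    public function getCollectedData();
}

// Example collector: record the <title> of every visited page.
class TitleCollector implements CollectorSketch
{
    private $titles = [];

    public function collect($url, $html)
    {
        // Case-insensitive, dot matches newlines; pages without a title are skipped.
        if (preg_match('~<title[^>]*>(.*?)</title>~is', $html, $matches)) {
            $title = html_entity_decode($matches[1], ENT_QUOTES, 'UTF-8');
            $this->titles[$url] = trim($title);
        }
    }

    public function getCollectedData()
    {
        return $this->titles;
    }
}

$collector = new TitleCollector();
$collector->collect(
    'https://example.com/',
    '<html><head><title> Home &amp; News </title></head></html>'
);
$collector->collect(
    'https://example.com/empty',
    '<html><body>No title here</body></html>'
);

print_r($collector->getCollectedData());
```

Keying the data by URL keeps the result easy to join with the list of URLs returned by the crawl.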

