codeq/linkchecker
=================

Finds broken and misconfigured links in your Neos project

v4.0.0 · GPL-3.0-or-later · PHP ^8.1 · [Source](https://github.com/NEOSidekick/NEOSidekick.LinkChecker) · [Packagist](https://packagist.org/packages/codeq/linkchecker) · [Issues](https://github.com/NEOSidekick/NEOSidekick.LinkChecker/issues)

[![Latest Stable Version](https://camo.githubusercontent.com/70510cd0414e858c5f38f16f50f2181f442068404630ed48c9bee11964d57dfd/68747470733a2f2f706f7365722e707567782e6f72672f6e656f736964656b69636b2f6c696e6b636865636b65722f762f737461626c65)](https://packagist.org/packages/neosidekick/linkchecker)[![License](https://camo.githubusercontent.com/ede1cf5627e4c39f6b03039fe27cd2814bf4a82631dcf39509b26546af709c65/68747470733a2f2f706f7365722e707567782e6f72672f6e656f736964656b69636b2f6c696e6b636865636b65722f6c6963656e7365)](LICENSE)

NEOSidekick.LinkChecker
=======================

Keep your Neos website free of broken links with this easy-to-use link checker
------------------------------------------------------------------------------

NEOSidekick.LinkChecker makes sure all links in your Neos project work. It validates internal page and asset references, external links and phone numbers in node data, and also crawls all rendered pages to ensure that no hidden pages fall through the cracks!

[![Backend Module screenshot](screenshot.png)](screenshot.png)

The backend module lets you mark errors as fixed or as ignored. Edit buttons open the relevant page directly in Neos inline editing, so issues can be fixed quickly.

The link checker finds broken links in the following ways:

- The backend module validates all internal page links (`node://XXX`) and asset references (`asset://XXX`) in all node properties.
    - Additionally, it checks that phone numbers are in international format (`+99 999999999`).
- The command controller `./flow checklinks:crawl` crawls all domains and pages configured in the settings and checks the following:
    - Do all internal links (`node://XXX`) point to visible pages (not hidden, and not hidden via "visible before" or "visible after")?
    - Are all phone numbers in international format (`+99 999999999`)?
    - Do external links point to valid pages (HTTP status code 2xx)?
- The command controller `./flow checklinks:crawlnodes` validates only internal links, assets and phone numbers.
- The command controller `./flow checklinks:crawlexternallinks` crawls the website and validates external links.
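
The international phone number format mentioned in the list above can be illustrated with a small check. This is a minimal Python sketch, not the package's PHP implementation: the regular expression is an assumption that reads the `+99 999999999` pattern literally (a plus sign, a 1 to 3 digit country code, one space, then 4 to 14 digits), and the function name is hypothetical.

```python
import re

# Assumed reading of "+99 999999999": plus sign, 1-3 digit country code,
# a single space, then 4-14 digits. The package's real rule may differ.
INTERNATIONAL_PHONE = re.compile(r"\+[1-9]\d{0,2} \d{4,14}")


def is_international_phone(number: str) -> bool:
    """Return True if the number matches the assumed international format."""
    return INTERNATIONAL_PHONE.fullmatch(number.strip()) is not None


if __name__ == "__main__":
    for candidate in ["+43 6601234567", "0660 1234567", "+436601234567"]:
        print(candidate, is_international_phone(candidate))
```

Under this assumed rule, a national number such as `0660 1234567` and a number without the separating space are both rejected.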

Installation
------------

NEOSidekick.LinkChecker is available via Packagist. Run `composer require neosidekick/linkchecker` to install it. We use semantic versioning, so every breaking change increases the major version number.

### Upgrade from CodeQ.LinkChecker

This package replaces `codeq/linkchecker` and ships a Flow code migration for existing projects.

After changing the Composer dependency, run:

```
./flow flow:core:migrate Your.SitePackage --force
./flow doctrine:migrate
```

The code migration updates PHP namespaces, package keys, Fusion references, command identifiers and settings paths from `CodeQ.LinkChecker` to `NEOSidekick.LinkChecker`. The Doctrine migration renames the persisted result table from `codeq_linkchecker_domain_model_resultitem` to `neosidekick_linkchecker_domain_model_resultitem`.

Usage
-----

Configure the link checker in your settings, like this:

```
NEOSidekick:
  LinkChecker:
    # how many concurrent requests should the command controller perform
    # If set too high, you will DDoS your server
    concurrency: 10
```

Make sure the domains are also added in the "Sites Management" module!

Set up a cron job, e.g. daily, to execute `./flow checklinks:crawl`.

### Backend module crawl queue

The backend module starts crawls through `Flowpack.JobQueue.Common`. The package ships a Doctrine-backed queue named `NEOSidekick.LinkChecker.Crawl` and stores its messages in the table `neosidekick_linkchecker_jobqueue_crawl`. When the worker starts a backend-triggered crawl, it first removes all non-ignored previous findings and then runs the normal crawl command.

After installing the package, initialize the queue once:

```
./flow flowpack.jobqueue.common:queue:setup NEOSidekick.LinkChecker.Crawl
```

You can verify the queue configuration with:

```
./flow flowpack.jobqueue.common:queue:list
./flow flowpack.jobqueue.common:queue:describe NEOSidekick.LinkChecker.Crawl
```

Run a worker for the crawl queue:

```
./flow flowpack.jobqueue.common:job:work NEOSidekick.LinkChecker.Crawl --verbose
```

For production, run the worker under a process supervisor such as systemd, supervisord or your container platform. The worker should be restarted if it exits. A minimal systemd service looks like:

```
[Unit]
Description=NEOSidekick LinkChecker crawl worker
After=network.target

[Service]
Type=simple
WorkingDirectory=/var/www/html
ExecStart=/var/www/html/flow flowpack.jobqueue.common:job:work NEOSidekick.LinkChecker.Crawl --verbose
Restart=always
RestartSec=5
User=www-data

[Install]
WantedBy=multi-user.target
```

If you cannot run a permanent worker, run short-lived workers from cron:

```
* * * * * cd /var/www/html && ./flow flowpack.jobqueue.common:job:work NEOSidekick.LinkChecker.Crawl --exit-after 55
```

Inspect queued jobs and failed messages with:

```
./flow flowpack.jobqueue.common:job:list NEOSidekick.LinkChecker.Crawl --limit 10
./flow flowpack.jobqueue.common:queue:list
```

With the default Doctrine queue, operators can also inspect the queue table directly:

```
SELECT state, COUNT(*) FROM neosidekick_linkchecker_jobqueue_crawl GROUP BY state;
SELECT id, state, failures, scheduled FROM neosidekick_linkchecker_jobqueue_crawl ORDER BY id;
```

Successful jobs are removed from the table. Failed jobs remain with `state = 'failed'`.

Projects that already use another JobQueue backend can override only this queue. For example, to use Redis instead of Doctrine:

```
Flowpack:
  JobQueue:
    Common:
      queues:
        'NEOSidekick.LinkChecker.Crawl':
          className: 'Flowpack\JobQueue\Redis\Queue\RedisQueue'
          options:
            client:
              host: 127.0.0.1
              port: 6379
```

### Reducing false positives

Not every non-2xx response means a link is dead. To keep the report trustworthy, findings are classified into two states:

- **broken**: genuinely dead links (`404`, `410`, other `4xx`/`5xx`, missing `node://`/`asset://` targets). Only these trigger email notifications and lower the health score.
- **warning**: results that cannot be verified and should not be treated as errors: auth walls (`401`), bot blocks (`403`), rate limiting (`429`), Cloudflare bot challenges, hosts that are known to block crawlers, unfollowed redirects and invalid phone number formats.

The checker also follows redirects (so a `301`/`302` to a working page is no longer reported), retries only transient failures (timeouts, `429`, `502`–`504`) with exponential backoff while honoring `Retry-After`, and sends an honest, configurable `User-Agent`.
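
The retry behaviour described above can be sketched as a delay calculation. This is a minimal Python illustration, not the package's implementation: the function names, the one-second base delay and the 60-second cap are assumptions, and only the numeric (seconds) form of `Retry-After` is handled.

```python
from typing import Optional

# Status codes the text above lists as transient; timeouts would be handled separately.
TRANSIENT_STATUS_CODES = {429, 502, 503, 504}


def should_retry(status_code: int) -> bool:
    """Only transient failures are retried; a 404 is never retried."""
    return status_code in TRANSIENT_STATUS_CODES


def retry_delay(attempt: int, retry_after: Optional[str] = None,
                base_seconds: float = 1.0, max_seconds: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    A numeric Retry-After header wins over the computed backoff;
    otherwise the delay doubles with every attempt, up to max_seconds.
    """
    if retry_after is not None and retry_after.strip().isdigit():
        return min(float(retry_after.strip()), max_seconds)
    return min(base_seconds * (2 ** attempt), max_seconds)
```

With these assumed defaults the waits grow as 1, 2, 4, 8 seconds and so on, unless the server asks for a specific pause via `Retry-After`.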

All of this is configurable:

```
NEOSidekick:
  LinkChecker:
    # Regex rules (full patterns incl. delimiters) that suppress matching findings entirely.
    # Each entry is either a pattern string or {pattern: '/.../', statusCodes: [404]}.
    ignoreRules:
      - '#^https://intranet\.example\.com/#'

    classification:
      # Status codes that are reported as warnings instead of broken links.
      treatAsWarning: [401, 403, 429]
      # Treat Cloudflare bot challenges (cf-mitigated / cf-ray headers) as warnings.
      detectCloudflareChallenge: true
      # Hosts that routinely block crawlers; findings for these are downgraded to warnings.
      knownBlockerDomains:
        - 'linkedin.com'
        - 'x.com'

    clientOptions:
      # Follow redirects so a 301/302 to a working page is not reported as broken.
      allowRedirects: true
      maxRedirects: 5
      # Some servers block the default Guzzle user agent.
      userAgent: 'NEOSidekickLinkChecker/1.0 (+https://neosidekick.com/)'
```
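
To illustrate how the classification settings above could interact, here is a minimal Python sketch. It is not the package's PHP code: the function names are hypothetical, subdomain matching for `knownBlockerDomains` is an assumption, and a Cloudflare challenge is assumed to be recognisable by the presence of a `cf-mitigated` response header.

```python
from typing import Mapping
from urllib.parse import urlparse

# Values mirror the example configuration above.
TREAT_AS_WARNING = {401, 403, 429}
KNOWN_BLOCKER_DOMAINS = ("linkedin.com", "x.com")


def is_known_blocker(url: str) -> bool:
    """True if the URL's host is a listed domain or one of its subdomains (assumed)."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in KNOWN_BLOCKER_DOMAINS)


def classify(url: str, status_code: int, headers: Mapping[str, str]) -> str:
    """Return 'ok', 'warning' or 'broken' for one checked link."""
    if 200 <= status_code < 300:
        return "ok"
    if 300 <= status_code < 400:
        return "warning"  # unfollowed redirect
    lowered = {name.lower(): value for name, value in headers.items()}
    # Assumption: a Cloudflare bot challenge carries a cf-mitigated header.
    if "cf-mitigated" in lowered:
        return "warning"
    if status_code in TREAT_AS_WARNING or is_known_blocker(url):
        return "warning"
    return "broken"
```

Only results classified as `broken` would then count towards notifications and the health score, as described above.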

### Performance & scale

External link checks are the slow part of a crawl. Several measures keep crawls fast and polite:

- **HEAD-first**: external links only need their status, so they are checked with a cheap `HEAD` request (with an automatic `GET` fallback for servers that reject `HEAD`). Internal pages still use `GET` because their body is needed to discover links.
- **Byte cap**: external `GET` fallback requests carry a `Range` header and the body read is capped, so a link to a huge PDF or video is never fully downloaded.
- **Per-host rate limiting**: external hosts are limited to a few requests per second; the site's own host is governed by `concurrency`. Connections are kept alive and reused.
- **In-run deduplication**: each unique URL is checked once per crawl, even if it appears on many pages (e.g. navigation/footer links).
- **Between-run cache** (opt-in): external links confirmed healthy can be skipped on the next run until the cached result expires.
- **Incremental internal crawl** (opt-in, `./flow checklinks:crawl --only-changed`): only re-checks content nodes modified since the last run. Note this can miss links broken by changes on the *target* side, so it is best combined with periodic full crawls.
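
The HEAD-first and byte cap measures from the list above can be sketched as follows. This is a hedged Python illustration rather than the package's implementation: the `request` callable is a hypothetical stand-in for an HTTP client, and treating `405` and `501` as "HEAD rejected" is an assumption.

```python
from typing import Callable, Dict, Tuple

# A request function takes (method, url, headers) and returns an HTTP status code.
RequestFn = Callable[[str, str, Dict[str, str]], int]

# Assumption: these responses mean "HEAD is not supported here", so GET is tried.
HEAD_REJECTED = {405, 501}


def check_external(url: str, request: RequestFn, range_bytes: int = 65536) -> Tuple[str, int]:
    """Return (method used, status code) for one external link."""
    status = request("HEAD", url, {})
    if status not in HEAD_REJECTED:
        return "HEAD", status
    # Fall back to GET, asking only for the first bytes of the body.
    headers = {"Range": f"bytes=0-{range_bytes}"} if range_bytes > 0 else {}
    return "GET", request("GET", url, headers)
```

Passing the HTTP call in as a function keeps the sketch independent of any particular client library and makes the fallback easy to exercise with a fake.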

```
NEOSidekick:
  LinkChecker:
    performance:
      maximumResponseSize: 2097152     # max body bytes read per page
      headFirst: true
      externalRangeBytes: 65536        # Range: bytes=0-N for external GET fallbacks (0 = off)
      perHostRequestsPerSecond: 4      # 0 = no per-host limit
      betweenRunCache:
        enabled: false
        okLifetime: 604800             # seconds a healthy external link may be skipped
    incremental:
      enabled: false                   # or pass --only-changed per run
```
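
In-run deduplication and per-host rate limiting can be pictured as a simple scheduling step. The following Python sketch is an illustration under assumptions, not the package's code: the function name is hypothetical, and it only computes start offsets instead of sending requests. The `per_host_rps` parameter plays the role of `perHostRequestsPerSecond` from the configuration above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple
from urllib.parse import urlparse


def schedule_checks(urls: Iterable[str], per_host_rps: float = 4.0) -> List[Tuple[float, str]]:
    """Return (start offset in seconds, url) pairs for one crawl run.

    Each unique URL is scheduled once; requests to the same host are spaced
    1 / per_host_rps seconds apart. A rate of 0 disables the spacing.
    """
    interval = 1.0 / per_host_rps if per_host_rps > 0 else 0.0
    seen = set()
    next_slot: Dict[str, float] = defaultdict(float)
    plan: List[Tuple[float, str]] = []
    for url in urls:
        if url in seen:
            continue  # in-run deduplication: already scheduled in this crawl
        seen.add(url)
        host = urlparse(url).hostname or ""
        plan.append((next_slot[host], url))
        next_slot[host] += interval
    return plan
```

A footer link that appears on every page is therefore scheduled once, and two different URLs on the same external host are kept a quarter of a second apart at the default rate of 4.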

### Email reports

The link checker can also send an email if it finds broken links. To enable this, you need to configure the email service like this:

```
NEOSidekick:
  LinkChecker:
    notifications:
      enabled: true
      subject: 'Link checker results'
      minimumStatusCode: 300
      mail:
        sender:
          default:
            name: 'Link Checker'
            address: 'no-reply@example.com'
        recipient:
          default:
            name: 'John Doe'
            address: 'recipient@example.com'
        ccRecipient:
          default:
            name: 'Client'
            address: 'client@example.com'
```

Limitations and possible future features
----------------------------------------

- Support additional languages
- Update the link checks after a page is published via a job queue
- Check external links against malware or security advisory lists
- Find all occurrences of external links to internal pages
- Check against deny list (e.g. list of competitors)
- Check for broken links in other workspaces

License
-------

This package is licensed under the GNU General Public License. Please see the [License File](LICENSE) for more information.

Sponsors & Contribution
-----------------------

The development of this plugin was kindly sponsored by [Code Q](https://codeq.at/).

The package is based on the `Unikka/LinkChecker` package, which does a great job at finding all broken external links. This package extends the features a lot, offers a new UI and introduces new dependencies.

We will gladly accept contributions. Please send us pull requests.

Tests
-----

Run the unit tests from the project root inside DDEV:

```
ddev exec ./bin/phpunit --configuration UnitTests.xml DistributionPackages/NEOSidekick.LinkChecker/Tests/Unit
```

Alternatively, run the package-local PHPUnit configuration:

```
ddev exec ./bin/phpunit --configuration DistributionPackages/NEOSidekick.LinkChecker/Tests/UnitTests.xml
```
