mage2kishan/module-robots-seo
=============================

Panth Robots SEO — dedicated robots.txt, X-Robots-Tag, and LLM-bot (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, etc.) policy control for Magento 2. Extracted from Panth\_AdvancedSEO. Self-contained: emits a per-store robots.txt, adds X-Robots-Tag response headers, validates robots meta directives, and exposes admin CRUD for user-agent / path policies. Hyva and Luma compatible.

Version 1.0.6 · proprietary license · PHP ~8.1.0 || ~8.2.0 || ~8.3.0 || ~8.4.0

[Source](https://github.com/mage2sk/module-robots-seo) · [Packagist](https://packagist.org/packages/mage2kishan/module-robots-seo) · [Docs](https://kishansavaliya.com)


Panth Robots SEO — Dedicated robots.txt, X-Robots-Tag &amp; LLM Bot Policy for Magento 2 (Hyva + Luma)
======================================================================================================

[![Magento 2.4.4 - 2.4.8](https://camo.githubusercontent.com/079c832211eed4f9451ebe264e3865f825b0f9f31b041cbf03676c6e254535d4/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4d6167656e746f2d322e342e342532302d2d253230322e342e382d6f72616e67653f6c6f676f3d6d6167656e746f266c6f676f436f6c6f723d7768697465)](https://magento.com)[![PHP 8.1 - 8.4](https://camo.githubusercontent.com/56b3cce18841623e2cbed2ebf09b06be1be8807e99e6e054a89d304ab4790b8e/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f5048502d382e312532302d2d253230382e342d626c75653f6c6f676f3d706870266c6f676f436f6c6f723d7768697465)](https://php.net)[![Hyva Compatible](https://camo.githubusercontent.com/14365166e02048aff917dd0a015feecdae28499fbde05fa17abd4f7821ea1139/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f487976612d436f6d70617469626c652d3134623861363f6c6f676f3d616c70696e65646f746a73266c6f676f436f6c6f723d7768697465)](https://hyva.io)![Luma Compatible](https://camo.githubusercontent.com/3c1945ee121ef64870a6f3583c91ffdfb4d8ffc35809e7ce34ca549a357e1ded/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4c756d612d436f6d70617469626c652d6f72616e6765)[![Packagist](https://camo.githubusercontent.com/321e0bb7d1e1991b286db1e682b569c43cc27d33675717fa1f33ea67c4f124d6/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f5061636b61676973742d6d616765326b697368616e2532466d6f64756c652d2d726f626f74732d2d73656f2d6f72616e67653f6c6f676f3d7061636b6167697374266c6f676f436f6c6f723d7768697465)](https://packagist.org/packages/mage2kishan/module-robots-seo)[![Upwork Top Rated 
Plus](https://camo.githubusercontent.com/6f72584179420c41ed90432fd2579a4ed36199d4229e8181d20f353c1c4ee4eb/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f5570776f726b2d546f702532305261746564253230506c75732d3134613830303f6c6f676f3d7570776f726b266c6f676f436f6c6f723d7768697465)](https://www.upwork.com/freelancers/~016dd1767321100e21)[![Website](https://camo.githubusercontent.com/f1ae86d28e2b505aee60f240d3e5508e390b0a8dc7a9b7ecf1b450fad862053f/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f576562736974652d6b697368616e736176616c6979612e636f6d2d304439343838)](https://kishansavaliya.com)[![Get a Quote](https://camo.githubusercontent.com/0b6c02cc1ad00f11bf1b0164a9998734bd716473db36cc2a5c1517e3d3578d1b/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4765742532306125323051756f74652d46726565253230457374696d6174652d444332363236)](https://kishansavaliya.com/get-quote)

> **Complete robots and crawler-policy control for Magento 2.** One module takes over `/robots.txt` at the router layer, emits an `X-Robots-Tag` HTTP response header on every frontend HTML page, adds per-user-agent allow/disallow rows via an admin grid, and toggles fourteen modern LLM / AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Bytespider, CCBot, Applebot-Extended, Meta-ExternalAgent, Amazonbot, Cohere-AI, and more) with a single click. Every directive passes a CRLF-safe validator before it ever reaches the wire. Works identically on **Hyva** and **Luma**.

Magento's native robots handling is three things that no longer add up: a **static `robots.txt` file** on disk, a **single admin textarea** buried under *Content → Design → Configuration* that overwrites it, and **no `X-Robots-Tag` header control** whatsoever. There is also no UI for the new generation of AI crawlers — GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Bytespider — so stores either open their data to every model trainer by default or hand-edit the file on every deploy. **Panth Robots SEO** unifies all three layers (robots.txt body, robots meta, `X-Robots-Tag` header) into one coherent admin surface with a dedicated controller, a declarative schema-backed policy grid, and a directive validator that makes CRLF header injection structurally impossible.

---

Need Custom Magento 2 Development?
----------------------------------

 [ ![Get a Free Quote](https://camo.githubusercontent.com/eac8c45d21cff8b139ddc392325f3bd6c8266a6f3d7b23f15131c958f3d3c8d0/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f476574253230612532304672656525323051756f74652532302545322538362539322d5265706c7925323077697468696e2532303234253230686f7572732d4443323632363f7374796c653d666f722d7468652d6261646765) ](https://kishansavaliya.com/get-quote)

### Kishan Savaliya

**Top Rated Plus on Upwork**

[![Hire on Upwork](https://camo.githubusercontent.com/b69353d3c6e192f4d03cc36bb8883612004e32f54dd2dbcc1e700dd791acd875/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f486972652532306f6e2532305570776f726b2d546f702532305261746564253230506c75732d3134613830303f7374796c653d666f722d7468652d6261646765266c6f676f3d7570776f726b266c6f676f436f6c6f723d7768697465)](https://www.upwork.com/freelancers/~016dd1767321100e21)

### Panth Infotech Agency

[![Visit Agency](https://camo.githubusercontent.com/bbf04bdd2aff502082508568ec42ace3a7475c98756f596e2013056c89726ed6/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f56697369742532304167656e63792d50616e7468253230496e666f746563682d3134613830303f7374796c653d666f722d7468652d6261646765266c6f676f3d7570776f726b266c6f676f436f6c6f723d7768697465)](https://www.upwork.com/agencies/1881421506131960778/)

---

Table of Contents
-----------------

- [Preview](#preview)
- [Features](#features)
- [How It Works](#how-it-works)
- [Supported LLM Bots](#supported-llm-bots)
- [Compatibility](#compatibility)
- [Installation](#installation)
- [Configuration](#configuration)
- [Managing Robots Policies](#managing-robots-policies)
- [robots.txt Endpoint](#robotstxt-endpoint)
- [X-Robots-Tag Header](#x-robots-tag-header)
- [Security](#security)
- [Troubleshooting](#troubleshooting)
- [Support](#support)

---

Preview
-------

### Live walkthrough

End-to-end admin flow — enable the module, toggle a few LLM bots, add a policy row, preview the generated `robots.txt`, curl `/robots.txt` on both Hyva and Luma, and confirm the `X-Robots-Tag` header on a customer-account page. Click to play.

[![Panth Robots SEO demo](docs/images/demo.gif)](docs/images/demo.gif)

### Admin

**Global configuration**: toggle the module, pick the default meta robots directive, configure layered-nav and catalogsearch noindex, edit the noindex path list, and set `max-image-preview` / `max-snippet` / `Crawl-delay`.

[![Admin configuration](docs/images/admin-config.png)](docs/images/admin-config.png)

**Robots Policies grid** — one row per (user-agent, path, directive, store\_id) tuple. Filter by store, mass-enable / disable / delete, inline priority column so the evaluator knows which rule wins when two patterns overlap.

[![Admin grid](docs/images/admin-grid.png)](docs/images/admin-grid.png)

**Edit form — policy row** — pick a user-agent (`*` for the default block, or `GPTBot`, `ClaudeBot`, a custom UA, etc.), pick allow / disallow, enter a path, scope to a store view, set priority and active flag. The UA and path fields are validated against a whitelist regex before save.

[![Admin edit form](docs/images/admin-edit.png)](docs/images/admin-edit.png)

**robots.txt preview** — dedicated **Panth Infotech → Robots &amp; LLM Bots → robots.txt Preview** page renders the live body exactly as the frontend will serve it, with a store-switcher so you can verify each store view before rolling to production.

[![robots.txt preview](docs/images/robots-txt-preview.png)](docs/images/robots-txt-preview.png)

---

Features
--------

| Feature | Description |
| --- | --- |
| **Dynamic `/robots.txt` per store view** | Built on the fly from LLM-bot toggles (emitted as `User-agent: <UA>\nDisallow: /` blocks when disabled), a `User-agent: *` block with admin policy rows and `Crawl-delay`, then `Sitemap:` and `Host:` lines. No static file ever leaves disk. |
| **14 LLM / AI crawler toggles** | One-click allow/disallow for GPTBot (`GPTBot`), ChatGPT-User (`ChatGPT-User`), OAI-SearchBot (`OAI-SearchBot`), ClaudeBot (maps both `ClaudeBot` and `Claude-Web`), Anthropic-AI (`anthropic-ai`), Google-Extended (maps both `Google-Extended` and `GoogleOther`), PerplexityBot (`PerplexityBot`), Cohere-AI (`cohere-ai`), CCBot (`CCBot`), Bytespider (`Bytespider`), Amazonbot (`Amazonbot`), Applebot-Extended (`Applebot-Extended`), FacebookBot (`FacebookBot`), Meta-ExternalAgent (`meta-externalagent`). |
| **`X-Robots-Tag` response header** | Added to every frontend HTML response by `Plugin\Response\XRobotsTagPlugin` with `max-image-preview:<value>` and `max-snippet:<value>` appended to the chosen directive. Handled before `Response::sendResponse()` so the header is always present. |
| **Noindex path matcher** | `Service\NoindexPathMatcher` walks an admin-editable list of path patterns (`*` wildcards supported). Defaults cover `/customer/*`, `/checkout*`, `/wishlist*`, `/sales/*`, `/contact*`, `/catalogsearch/*`, `/multishipping/*`, `/newsletter/manage*`, `/review/customer/*`, `/captcha*`, `/sendfriend/*`, `/paypal/*`, `/downloadable/customer/*`, `/vault/*`, `/giftcard/customer/*`, `/rewards/*`, `/oauth/*`, `/connect/*`. |
| **Layered-nav / sort-filter noindex** | When a catalog listing has any `?p=`, `?dir=`, `?order=`, `?limit=`, or layered-nav attribute query parameter, the header flips to `noindex, follow` so filtered permutations don't dilute the canonical listing. |
| **Catalogsearch noindex** | `/catalogsearch/result/*` pages emit `noindex, follow` by default, because search results are inherently ephemeral and shouldn't be indexed. |
| **HTTP-status-aware override** | 404, 410, 500 and 503 responses hard-override the header to `noindex, nofollow` regardless of config, so error pages can never leak into the index. |
| **Non-HTML asset noindex** | Requests ending in `.pdf`, `.doc`, `.docx`, `.xls`, `.xlsx` emit `noindex, nofollow`, which stops support docs and spec sheets from displacing the canonical product page in the SERP. |
| **`robots.txt` custom-body override** | `robots_txt/override_enabled = 1` pastes `robots_txt/custom_body` verbatim into the response and skips the entire generation pipeline. CRLF is normalised to LF on write. |
| **Admin CRUD grid** | `panth_seo_robots_policy` table with a full UI-component grid: per-UA, per-path, per-store-view allow/disallow rows with priority and active flag. Dedicated **robots.txt Preview** admin page renders the live output. |
| **CRLF-injection-safe** | Every directive string passes `Service\DirectiveValidator` (printable-ASCII whitelist, rejects `\r`, `\n`, `\0`). Every path and UA is validated against a whitelist regex before the DB write. |

---

How It Works
------------

Seven cooperating pieces:

1. **`Controller\Robots\Index`** at route `seo_robots/robots/index` serves `GET /robots.txt` with the generated or override body, `Content-Type: text/plain; charset=utf-8`.
2. **`Setup\Patch\Data\InstallRobotsTxtRewrite`** writes the `url_rewrite` row that maps `/robots.txt` to the module controller at install time; **`RefreshRobotsTxtRewrite`** re-points an existing stale target\_path row left behind by Panth\_AdvancedSEO so upgrades are a no-op.
3. **`etc/frontend/di.xml`** disables the core `Magento\Framework\App\RouterList` entry for `robots` — Magento's built-in robots router no longer intercepts `/robots.txt` before the url\_rewrite layer, so our controller wins.
4. **`Plugin\Response\XRobotsTagPlugin`** is a `beforeSendResponse` plugin on `Magento\Framework\App\Response\Http`. It inspects the request path, status code, and rendered Content-Type, then sets `X-Robots-Tag` once per response.
5. **`Model\Robots\PolicyResolver`** aggregates `panth_robots_seo/llm_bots/*` toggles + rows from `panth_seo_robots_policy` + the configured `Crawl-delay` + `Sitemap:` references into the final robots.txt body for a given store.
6. **`Model\Robots\MetaResolver`** computes the per-entity robots meta string — used by the plugin and (when `Panth_AdvancedSEO` is present) by the shared `panth_seo_resolved.robots` cache column.
7. **`Service\NoindexPathMatcher`** + **`Service\DirectiveValidator`** — the first decides whether a given request path is "private"; the second is the single chokepoint every directive string passes through before it hits a response header or the robots.txt body.
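
The wildcard matching done by the path matcher in item 7 can be illustrated with a short sketch. This is not the module's PHP code: it is a Python approximation, and the function names, the anchored-match semantics, and the decision to ignore the query string are all assumptions made for illustration. Only the pattern syntax (`*` matches anything) and the sample patterns come from this README.

```python
import re

# A few of the seeded patterns listed under Features (subset, for illustration).
DEFAULT_PATTERNS = ["/customer/*", "/checkout*", "/wishlist*", "/catalogsearch/*"]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a path pattern where `*` matches any run of characters."""
    # Escape every literal fragment so characters like `.` stay literal.
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("^" + ".*".join(parts) + "$")


def is_noindex_path(path: str, patterns=DEFAULT_PATTERNS) -> bool:
    """Return True when the request path matches any configured pattern."""
    # Assumption: only the path is compared, so the query string is dropped.
    path = path.split("?", 1)[0]
    return any(pattern_to_regex(p).match(path) for p in patterns)
```

Under these assumptions `/customer/account/login` and `/checkout/cart` are treated as private, while a catalog URL such as `/men/tops.html` is not.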

---

Supported LLM Bots
------------------

Per-bot allow/disallow lives at **Stores → Configuration → Panth Infotech → Robots &amp; LLM Bots → LLM Bot Policy**. Turning a toggle to **No** emits `User-agent: <UA>\nDisallow: /` in the generated robots.txt; turning it to **Yes** omits the block entirely (equivalent to allow).

| Bot | UA string(s) | Default | Config path |
| --- | --- | --- | --- |
| **GPTBot** (OpenAI) | `GPTBot` | Yes | `panth_robots_seo/llm_bots/gptbot` |
| **ChatGPT-User** | `ChatGPT-User` | Yes | `panth_robots_seo/llm_bots/chatgpt_user` |
| **OAI-SearchBot** | `OAI-SearchBot` | Yes | `panth_robots_seo/llm_bots/oai_searchbot` |
| **ClaudeBot** (Anthropic) | `ClaudeBot`, `Claude-Web` | Yes | `panth_robots_seo/llm_bots/claudebot` |
| **Anthropic-AI** | `anthropic-ai` | Yes | `panth_robots_seo/llm_bots/anthropic_ai` |
| **Google-Extended** | `Google-Extended`, `GoogleOther` | Yes | `panth_robots_seo/llm_bots/google_extended` |
| **PerplexityBot** | `PerplexityBot` | Yes | `panth_robots_seo/llm_bots/perplexitybot` |
| **Cohere-AI** | `cohere-ai` | Yes | `panth_robots_seo/llm_bots/cohere_ai` |
| **CCBot** (Common Crawl) | `CCBot` | **No** | `panth_robots_seo/llm_bots/ccbot` |
| **Bytespider** (ByteDance) | `Bytespider` | **No** | `panth_robots_seo/llm_bots/bytespider` |
| **Amazonbot** | `Amazonbot` | Yes | `panth_robots_seo/llm_bots/amazonbot` |
| **Applebot-Extended** | `Applebot-Extended` | Yes | `panth_robots_seo/llm_bots/applebot_extended` |
| **FacebookBot** | `FacebookBot` | Yes | `panth_robots_seo/llm_bots/facebookbot` |
| **Meta-ExternalAgent** | `meta-externalagent` | Yes | `panth_robots_seo/llm_bots/meta_externalagent` |

### Always allowed (no dedicated toggle)

The following bots are not blocked by default and have no dedicated config key. If you need to block them, add a `Disallow: /` row to the Robots Policies grid with the UA as the user-agent:

- **YouBot** — You.com's search crawler
- **PetalBot** — Huawei / Petal Search crawler
- **Diffbot** — knowledge-graph crawler
- **AI2Bot** — Allen Institute research crawler
- **Omgilibot** — Webz.io crawler
- **Timpibot** — Timpi decentralised search crawler

---

Compatibility
-------------

| Requirement | Supported |
| --- | --- |
| Magento Open Source | 2.4.4, 2.4.5, 2.4.6, 2.4.7, 2.4.8 |
| Adobe Commerce | 2.4.4 to 2.4.8 |
| PHP | 8.1, 8.2, 8.3, 8.4 |
| Hyva Theme | 1.0+ (fully compatible) |
| Luma Theme | Native support |
| Panth Core | ^1.0 (installed automatically) |

---

Installation
------------

```
composer require mage2kishan/module-robots-seo
bin/magento module:enable Panth_Core Panth_RobotsSeo
bin/magento setup:upgrade
bin/magento setup:di:compile
bin/magento cache:flush
```

### Verify

```
bin/magento module:status Panth_RobotsSeo
# Module is enabled

curl -s -o /dev/null -w '%{http_code}\n' https://your-store.test/robots.txt
# 200

curl -sI https://your-store.test/customer/account/login | grep -i x-robots-tag
# X-Robots-Tag: noindex, nofollow, max-image-preview:large, max-snippet:-1
```

Visit **Admin → Panth Infotech → Robots &amp; LLM Bots → Robots Policies** to see the seeded policy grid.

---

Configuration
-------------

Navigate to **Stores → Configuration → Panth Infotech → Robots &amp; LLM Bots**.

### General

| Setting | Path | Default | What it controls |
| --- | --- | --- | --- |
| **Enable Module** | `panth_robots_seo/general/enabled` | Yes | Master switch. When No, the `X-Robots-Tag` plugin is a no-op and `/robots.txt` serves a stock `User-agent: *\nAllow: /`. |
| **Debug Logging** | `panth_robots_seo/general/debug` | No | When Yes, every header and meta decision is written to `var/log/panth_robots_seo.log`. |
| **Default Meta Robots** | `panth_robots_seo/general/default_directive` | `index,follow` | Baseline directive applied when no per-entity / per-path override fires. Allowed tokens: `index`, `noindex`, `follow`, `nofollow`, `noarchive`, `nosnippet`, `noimageindex`, `max-snippet`, `max-image-preview`, `max-video-preview`, `unavailable_after`, `none`, `all`. |
| **Noindex Layered-Nav Filtered Pages** | `panth_robots_seo/general/noindex_filtered` | Yes | Emit `noindex, follow` when a catalog listing has layered-nav or sort/limit/page query parameters. |
| **Noindex Search Result Pages** | `panth_robots_seo/general/noindex_search_results` | Yes | Emit `noindex, follow` on `/catalogsearch/result/*`. |
| **Noindex URL Paths** | `panth_robots_seo/general/noindex_paths` | (18-line seeded list, see Features) | One-path-per-line list of private patterns; `*` matches anything. Matched by `Service\NoindexPathMatcher`. |
| **max-image-preview Directive** | `panth_robots_seo/general/max_image_preview` | `large` | Appended to every `X-Robots-Tag`. `large` is recommended for Google Discover eligibility. |
| **max-snippet Directive** | `panth_robots_seo/general/max_snippet` | `-1` | `-1` = unlimited. A positive integer caps SERP snippet length. |
| **Crawl-delay (seconds)** | `panth_robots_seo/general/crawl_delay` | `0` | Emitted under `User-agent: *` in robots.txt. `0` omits the directive. |

### LLM Bot Policy

| Setting | Path | Default | What it controls |
| --- | --- | --- | --- |
| **Allow GPTBot (OpenAI)** | `panth_robots_seo/llm_bots/gptbot` | Yes | No = emits `User-agent: GPTBot\nDisallow: /`. |
| **Allow ClaudeBot (Anthropic)** | `panth_robots_seo/llm_bots/claudebot` | Yes | Covers both `ClaudeBot` and `Claude-Web`. |
| **Allow Google-Extended** | `panth_robots_seo/llm_bots/google_extended` | Yes | Covers both `Google-Extended` and `GoogleOther`. |
| **Allow CCBot (Common Crawl)** | `panth_robots_seo/llm_bots/ccbot` | **No** | CCBot feeds dataset-scale training pipelines; blocked by default. |
| **Allow PerplexityBot** | `panth_robots_seo/llm_bots/perplexitybot` | Yes | |
| **Allow Bytespider (ByteDance)** | `panth_robots_seo/llm_bots/bytespider` | **No** | Bytespider ignores partial disallows; blocked by default. |
| **Allow ChatGPT-User** | `panth_robots_seo/llm_bots/chatgpt_user` | Yes | |
| **Allow OAI-SearchBot** | `panth_robots_seo/llm_bots/oai_searchbot` | Yes | |
| **Allow Anthropic-AI** | `panth_robots_seo/llm_bots/anthropic_ai` | Yes | |
| **Allow Cohere-AI** | `panth_robots_seo/llm_bots/cohere_ai` | Yes | |
| **Allow Amazonbot** | `panth_robots_seo/llm_bots/amazonbot` | Yes | |
| **Allow Applebot-Extended** | `panth_robots_seo/llm_bots/applebot_extended` | Yes | |
| **Allow FacebookBot** | `panth_robots_seo/llm_bots/facebookbot` | Yes | |
| **Allow Meta-ExternalAgent** | `panth_robots_seo/llm_bots/meta_externalagent` | Yes | |

### robots.txt Override

| Setting | Path | Default | What it controls |
| --- | --- | --- | --- |
| **Use Custom robots.txt Body** | `panth_robots_seo/robots_txt/override_enabled` | No | When Yes, the custom body below REPLACES the generated output, and every LLM toggle and policy row is ignored. |
| **Custom robots.txt Body** | `panth_robots_seo/robots_txt/custom_body` | (empty) | Pasted verbatim into the response. CRLF is normalised to LF to prevent HTTP header smuggling. Leave empty to use the generated output. |

Every setting resolves at **store-view** scope, so each store can have a different LLM policy, noindex path list, or override body.

---

Managing Robots Policies
------------------------

Open **Admin → Panth Infotech → Robots &amp; LLM Bots → Robots Policies** to reach the grid (route `panth_robots_seo/policy/index`).

### Fields

| Field | Description |
| --- | --- |
| **User-agent** | The UA string to match: `*` for the default block, `GPTBot`, `ClaudeBot`, `Applebot-Extended`, a custom crawler, etc. Validated against `/^[A-Za-z0-9._\-+*\/ ]+$/` on save. |
| **Directive** | `allow` or `disallow`. Single source of truth consumed by `PolicyResolver`. |
| **Path** | The path fragment the directive applies to. Must start with `/`, no control bytes. `*` wildcards allowed. |
| **Store View** | `0` applies to all stores; a non-zero value scopes the row to one store view. Foreign-keyed to `store.store_id` with `ON DELETE CASCADE`. |
| **Priority** | Lower numbers are emitted first within the same user-agent block. |
| **Active** | Per-row enable/disable. Inactive rows are never rendered. |

### Mass actions

Select rows and choose **Enable**, **Disable** or **Delete** from the grid mass-action menu.

### robots.txt Preview

The **Panth Infotech → Robots &amp; LLM Bots → robots.txt Preview** sub-menu (route `panth_robots_seo/robots/index`) renders the live body for the currently selected store, exactly as the frontend controller would serve it — helpful for dry-running changes before they go public.

---

robots.txt Endpoint
-------------------

- **URL:** `GET /robots.txt`
- **Content-Type:** `text/plain; charset=utf-8`
- **Controller:** `Panth\RobotsSeo\Controller\Robots\Index` at route `seo_robots/robots/index`.

`/robots.txt` is served by our controller via a `url_rewrite` row installed by `Setup\Patch\Data\InstallRobotsTxtRewrite`. The core `Magento_Robots` router is disabled via `etc/frontend/di.xml` so it never intercepts the request ahead of the url\_rewrite layer.

If you are upgrading from `Panth_AdvancedSEO` where `/robots.txt` was already mapped to that module's controller, the `RefreshRobotsTxtRewrite` patch runs on the next `setup:upgrade` and rewrites the stale `target_path` to point at the new controller — zero manual DB surgery required.

### Generated body shape

```
User-agent: CCBot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: *
Crawl-delay: 0
Disallow: /customer/
Disallow: /checkout/
Allow: /

Sitemap: https://your-store.test/sitemap.xml
Host: your-store.test

```
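
As a rough illustration of how a body of this shape could be assembled, here is a minimal Python sketch. It is not the module's `PolicyResolver`: the `PolicyRow` class, the `build_robots_txt` function, and its parameters are hypothetical names, only a subset of the bot table is included, and only rows for the `*` user-agent are rendered to keep the example short. The rules it applies (a toggle set to No emits a `Disallow: /` block, store `0` applies to every store view, lower priority is emitted first, inactive rows are skipped, `Crawl-delay` of `0` is omitted) are the ones stated in this README.

```python
from dataclasses import dataclass


@dataclass
class PolicyRow:
    user_agent: str
    directive: str   # "allow" or "disallow"
    path: str
    store_id: int    # 0 = all store views
    priority: int
    active: bool = True


# Toggle key -> UA strings, from the Supported LLM Bots table (subset only).
BOT_USER_AGENTS = {
    "claudebot": ["ClaudeBot", "Claude-Web"],
    "ccbot": ["CCBot"],
    "bytespider": ["Bytespider"],
}


def build_robots_txt(toggles, rows, store_id, crawl_delay=0, sitemaps=(), host=None):
    lines = []
    # A toggle set to False ("No") emits a full-site Disallow block per UA string.
    # Assumption: a missing key counts as allowed in this sketch.
    for key, agents in BOT_USER_AGENTS.items():
        if not toggles.get(key, True):
            for ua in agents:
                lines += [f"User-agent: {ua}", "Disallow: /", ""]
    lines.append("User-agent: *")
    if crawl_delay > 0:  # 0 omits the directive
        lines.append(f"Crawl-delay: {crawl_delay}")
    # Keep active rows for the default block that apply to this store view.
    applicable = [
        r for r in rows
        if r.active and r.user_agent == "*" and r.store_id in (0, store_id)
    ]
    for row in sorted(applicable, key=lambda r: r.priority):
        lines.append(f"{row.directive.capitalize()}: {row.path}")
    lines.append("")
    lines += [f"Sitemap: {url}" for url in sitemaps]
    if host:
        lines.append(f"Host: {host}")
    return "\n".join(lines) + "\n"
```

Calling it with `{"ccbot": False, "bytespider": False}`, two disallow rows for `/customer/` and `/checkout/`, and a lower-priority `allow` row for `/` produces the CCBot and Bytespider blocks followed by the `User-agent: *` block, in the same order as the listing above.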

---

X-Robots-Tag Header
-------------------

`Plugin\Response\XRobotsTagPlugin` runs `beforeSendResponse` on `Magento\Framework\App\Response\Http` and applies the following order of precedence:

1. **Self-skip on `/robots.txt`** — never sets a header on the robots.txt response itself.
2. **Error-code override** — 404, 410, 500, 503 → hard `noindex, nofollow`, no further checks.
3. **Non-HTML asset override** — `.pdf`, `.doc`, `.docx`, `.xls`, `.xlsx` → hard `noindex, nofollow`.
4. **Catalogsearch noindex** — `/catalogsearch/result/*` when `noindex_search_results = Yes` → `noindex, follow`.
5. **Configured `noindex_paths` match** — `Service\NoindexPathMatcher` → `noindex, nofollow` (and the matching meta).
6. **Layered-nav / sort-filter** — when listing page has query parameters → `noindex, follow`.
7. **Default directive** — `panth_robots_seo/general/default_directive` (e.g. `index, follow`).

In every case, `, max-image-preview:<value>` and `, max-snippet:<value>` from general config are appended to the final string, which is then passed through `Service\DirectiveValidator` before being set on the response.
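
The precedence list can be read as a single if/elif chain. The following Python sketch mirrors that order under simplifying assumptions: `resolve_x_robots_tag` and all of its parameters are hypothetical, the noindex path check is reduced to a prefix test, and whether a URL is a catalog listing is passed in as a flag rather than detected. It is meant to show the ordering, not the plugin's actual code.

```python
from urllib.parse import parse_qs, urlsplit

ERROR_CODES = {404, 410, 500, 503}
ASSET_EXTENSIONS = (".pdf", ".doc", ".docx", ".xls", ".xlsx")


def resolve_x_robots_tag(url, status, *, noindex_prefixes=("/customer/", "/checkout"),
                         is_listing=False, default="index, follow",
                         noindex_search=True, noindex_filtered=True,
                         max_image_preview="large", max_snippet="-1"):
    """Return the header value, or None when no header should be set."""
    parts = urlsplit(url)
    path, query = parts.path, parse_qs(parts.query)
    if path == "/robots.txt":                      # 1. self-skip
        return None
    if status in ERROR_CODES:                      # 2. error-code override
        base = "noindex, nofollow"
    elif path.lower().endswith(ASSET_EXTENSIONS):  # 3. non-HTML asset override
        base = "noindex, nofollow"
    elif noindex_search and path.startswith("/catalogsearch/result/"):
        base = "noindex, follow"                   # 4. catalogsearch noindex
    elif path.startswith(tuple(noindex_prefixes)):
        base = "noindex, nofollow"                 # 5. configured noindex paths
    elif noindex_filtered and is_listing and query:
        base = "noindex, follow"                   # 6. layered-nav / sort-filter
    else:
        base = default                             # 7. default directive
    # The preview and snippet directives are appended in every case.
    return f"{base}, max-image-preview:{max_image_preview}, max-snippet:{max_snippet}"
```

With the defaults shown, a request for `/customer/account/login` with status 200 yields `noindex, nofollow, max-image-preview:large, max-snippet:-1`, matching the header in the Verify section.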

---

Security
--------

- **ACL + FormKey on every admin controller.** Every Adminhtml controller extends `Panth\RobotsSeo\Controller\Adminhtml\AbstractAction`, declares its own `ADMIN_RESOURCE` constant (`Panth_RobotsSeo::policies`, `Panth_RobotsSeo::preview`), and enforces ACL via `_isAllowed()`. No route is reachable without a valid admin session.
- **`HttpPostActionInterface` on mutating paths.** `Save`, `Delete`, `MassDelete`, `MassStatus` all implement `HttpPostActionInterface` so GET is rejected at the framework level. Form-key validation runs on every POST.
- **`DirectiveValidator` whitelist + control-byte rejection.** Every directive string written to `X-Robots-Tag` or the robots.txt body passes through `Service\DirectiveValidator::assertSafe()` — rejects any string containing `\r`, `\n`, `\0`, or bytes outside printable-ASCII. CRLF header injection is structurally impossible.
- **CRLF normalisation in custom body.** `robots_txt/custom_body` has `\r\n` → `\n` normalisation applied on render so a pasted Windows-style newline can't smuggle a second response header.
- **Per-store scope on every config value.** `enabled`, `noindex_paths`, every `llm_bots/*` toggle, and the custom body resolve at `ScopeInterface::SCOPE_STORES` — a store-specific value never leaks into another store.
- **UA + path validation on save.** Admin policy rows reject user-agents outside `/^[A-Za-z0-9._\-+*\/ ]+$/` and paths that do not start with `/` or contain control bytes, before the row is written.
- **XSS-safe admin preview.** The robots.txt Preview page renders the body through `escapeHtml()` and wraps it in `<pre>` tags, so a hostile custom body can never execute script on an admin browser.
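
The validation rules in the list above are small enough to sketch. The Python below is an approximation written for this README, not the module's PHP: the function names are hypothetical, and the user-agent regex is the one quoted above, ported without the PCRE delimiters. It uses `fullmatch` rather than `^...$` because in Python `$` also matches just before a trailing newline, which would let `"GPTBot\n"` through.

```python
import re

# Character class from the README's UA regex; fullmatch() supplies the anchoring.
UA_PATTERN = re.compile(r"[A-Za-z0-9._\-+*/ ]+")


def assert_safe_directive(value: str) -> str:
    """Reject control bytes and anything outside printable ASCII (0x20-0x7E)."""
    for ch in value:
        if not 0x20 <= ord(ch) <= 0x7E:
            raise ValueError(f"unsafe character {ch!r} in directive")
    return value


def is_valid_user_agent(ua: str) -> bool:
    """True when the whole string consists of whitelisted characters."""
    return UA_PATTERN.fullmatch(ua) is not None


def is_valid_path(path: str) -> bool:
    """Paths must start with `/` and contain no control bytes."""
    has_control = any(ord(c) < 0x20 or ord(c) == 0x7F for c in path)
    return path.startswith("/") and not has_control


def normalise_newlines(body: str) -> str:
    """Collapse Windows-style CRLF to LF, as described for the custom body."""
    return body.replace("\r\n", "\n")
```

A directive such as `noindex\r\nSet-Cookie: x=1` raises `ValueError` in this sketch, so it can never be written into a header value.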

---

Troubleshooting
---------------

### `/robots.txt` returns a 404 or a Luma 404 HTML page

You are likely sitting on a stale `url_rewrite` row left behind by `Panth_AdvancedSEO` whose `target_path` still points at the old controller. Run `bin/magento setup:upgrade` — the `RefreshRobotsTxtRewrite` patch fires idempotently and rewrites the row to the new target. Follow with `bin/magento cache:clean config full_page`.

### `X-Robots-Tag` not appearing on `/customer/*` pages

Upgrade to **≥ 1.0.2**. Earlier releases had a constructor-argument ordering bug that made the plugin skip the response when `Panth_AdvancedSEO` wasn't installed; 1.0.2 makes the dependency DI-nullable and the plugin always runs.

### LLM bot block missing from `robots.txt`

1. Check the toggle at the **right scope**: `panth_robots_seo/llm_bots/*` resolves at store-view scope, not website or default.
2. Flush config + FPC: `bin/magento cache:clean config full_page`. The `/robots.txt` body is built live per request but the config it reads from is cached.
3. Confirm `robots_txt/override_enabled` is **No** — when the override is on, every LLM toggle is ignored.

### I turned on `override_enabled` but nothing changes

1. `bin/magento cache:clean config full_page` — the override flag and custom body are both pulled from cached config.
2. Confirm `custom_body` was saved at the **store scope** you are viewing, not the default scope. Check with `SELECT scope, scope_id, value FROM core_config_data WHERE path = 'panth_robots_seo/robots_txt/custom_body';`.

### Meta robots tag not showing in page HTML

The module sets the `X-Robots-Tag` **HTTP response header**, not the `<meta name="robots">` element. A layout hook that injects the `<meta>` tag into the page `<head>` is only wired when `Panth_AdvancedSEO` is also installed (it owns the `page/main` block override). If you need both, install `mage2kishan/module-advanced-seo` alongside this module; they share the `panth_seo_robots_policy` table and do not collide.

---

Support
-------

- **Agency:** [Panth Infotech on Upwork](https://www.upwork.com/agencies/1881421506131960778/)
- **Direct:** [kishansavaliya.com](https://kishansavaliya.com) — [Get a free quote](https://kishansavaliya.com/get-quote)

