
jmajors/robotstxt
=================

A small package for parsing websites' robots.txt files

v1.7.1 (9y ago) · 323 · MIT · PHP >=5.4.0

Since Jan 30 · Pushed 9y ago · 1 watcher

[Source](https://github.com/jasonmajors/robotstxt) · [Packagist](https://packagist.org/packages/jmajors/robotstxt)


Robotstxt Parser
================


[![Build Status](https://camo.githubusercontent.com/bfe0125cf0897a6be3c992e331e098af6cd1e7d65af5fd03a90a2b55d52e581b/68747470733a2f2f7472617669732d63692e6f72672f6a61736f6e6d616a6f72732f726f626f74737478742e7376673f6272616e63683d6d6173746572)](https://travis-ci.org/jasonmajors/robotstxt)

This is a small package to make parsing robots.txt rules easier. The URL matching follows the rules outlined by Google in their [webmasters guide](https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt#url-matching-based-on-path-values).
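Google's matching rules linked above amount to prefix matching with two wildcards: `*` matches any sequence of characters, and a trailing `$` anchors a rule to the end of the path. As a minimal sketch of those rules in plain PHP (this is an illustration, not this package's internals; the function name `pathMatches` is hypothetical), a rule can be compiled into a regex:

```
<?php
// Sketch of Google's robots.txt path matching: '*' is a wildcard,
// a trailing '$' anchors the rule to the end of the path, and
// everything else is a case-sensitive prefix match.
function pathMatches(string $rule, string $path): bool
{
    // Escape regex metacharacters, then restore wildcard semantics.
    $pattern = preg_quote($rule, '#');
    $pattern = str_replace('\*', '.*', $pattern);

    // A trailing '$' anchors the rule; otherwise it is a prefix match.
    if (substr($pattern, -2) === '\$') {
        $pattern = substr($pattern, 0, -2) . '$';
    }
    return preg_match('#^' . $pattern . '#', $path) === 1;
}

// "Disallow: /fish*" covers /fish, /fishing, /fish/salmon, ...
var_dump(pathMatches('/fish*', '/fishing'));        // true
// "Disallow: /*.php$" covers only paths that end in .php.
var_dump(pathMatches('/*.php$', '/index.php'));     // true
var_dump(pathMatches('/*.php$', '/index.php?x=1')); // false
```

Note that matching is case-sensitive, so `/fish` does not cover `/Fish`.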

#### Quick example:


```
// basic usage
$robots  = new Robots\RobotsTxt();
$allowed = $robots->isAllowed("https://www.example.com/some/path"); // true
$allowed = $robots->isAllowed("https://www.another.com/example");   // false
```

Setup
-----


Install via [composer](https://getcomposer.org/):

```
$ composer require jmajors/robotstxt
```

Make sure Composer's autoloader is included in your project:

```
require __DIR__ . '/vendor/autoload.php';
```

That's it.

Usage
-----


This package provides a class mainly for checking whether a crawler is allowed to visit a particular URL. Use the `isAllowed(string $url)` method: it returns `true` if the URL's path is not covered by the robots.txt `Disallow` rules (i.e. you're free to crawl), and `false` if the path is disallowed (no crawling!). Here's an example:

```
