luyadev/luya-module-crawler
===========================

A full search page crawler to enable complex and customized searching abilities.

Version 3.7.2 · License: MIT · PHP >=7.1


[Source](https://github.com/luyadev/luya-module-crawler) · [Packagist](https://packagist.org/packages/luyadev/luya-module-crawler) · [Docs](https://luya.io)


 [![LUYA Logo](https://raw.githubusercontent.com/luyadev/luya/master/docs/logo/luya-logo-0.2x.png)](https://raw.githubusercontent.com/luyadev/luya/master/docs/logo/luya-logo-0.2x.png)

Crawler
=======


[![LUYA](https://camo.githubusercontent.com/c30b61934591d3a6fcb8718a93fd61bf840c0abd8a8d49aa0fdd4ab99567bdf4/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f506f776572656425323062792d4c5559412d627269676874677265656e2e737667)](https://luya.io)[![Latest Stable Version](https://camo.githubusercontent.com/042418954da8ec873f2b2a4104c2aba081466b4633f83ad66079ba8e1cde1165/68747470733a2f2f706f7365722e707567782e6f72672f6c7579616465762f6c7579612d6d6f64756c652d637261776c65722f762f737461626c65)](https://packagist.org/packages/luyadev/luya-module-crawler)[![Test Coverage](https://camo.githubusercontent.com/828bbc67d085c565d107bc0b0ef63d0fce3fcd85351ce8252550eb56475ffd58/68747470733a2f2f6170692e636f6465636c696d6174652e636f6d2f76312f6261646765732f66626637353262643865643538346465343237622f746573745f636f766572616765)](https://codeclimate.com/github/luyadev/luya-module-crawler/test_coverage)[![Total Downloads](https://camo.githubusercontent.com/36082e0417316fe6456c6ca1c129d4a00768018e86c7cfeeeb54a4a992468d71/68747470733a2f2f706f7365722e707567782e6f72672f6c7579616465762f6c7579612d6d6f64756c652d637261776c65722f646f776e6c6f616473)](https://packagist.org/packages/luyadev/luya-module-crawler)[![Tests](https://github.com/luyadev/luya-module-crawler/workflows/Tests/badge.svg)](https://github.com/luyadev/luya-module-crawler/workflows/Tests/badge.svg)

An easy-to-use full-website page crawler that provides search results on your page. The crawler module gathers information about all pages on the configured domain and stores the index in the database. From there you can create search queries to provide search results. There are also helper methods which provide intelligent search results by splitting the input into multiple search queries (used by default).

[![LUYA Crawler Search Stats](https://raw.githubusercontent.com/luyadev/luya-module-crawler/master/crawler-stats.png)](https://raw.githubusercontent.com/luyadev/luya-module-crawler/master/crawler-stats.png)

Installation
------------


Install the module via Composer:

```
composer require luyadev/luya-module-crawler:^3.0
```

After installing via Composer, add the module to the `modules` section of your configuration file:

```
'modules' => [
    //...
    'crawler' => [
        'class' => 'luya\crawler\frontend\Module',
        'baseUrl' => 'https://luya.io',
        /*
        'filterRegex' => [
            '#.html#i', // filter all links with `.html`
            '#/agenda#i', // filter all links which contain the word agenda with a leading slash
            '#date\=#i', // filter all links which contain `date=`, for example when an agenda generates infinite links
        ],
        'on beforeProcess' => function() {
            // optional add or filter data from the BuilderIndex, which will be processed to the Index afterwards
        },
        'on afterIndex' => function() {
            // optional add or filter data from the freshly built Index
        }
        */
    ],
    'crawleradmin' => 'luya\crawler\admin\Module',
]
```

> Where `baseUrl` is the domain whose pages you want to crawl.
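
To get a feeling for which links the `filterRegex` patterns from the commented-out configuration would exclude, the following minimal sketch runs them against a few sample URLs. It assumes that a link is skipped as soon as any pattern matches its URL. The helper `isFiltered()` and the sample URLs are hypothetical and not part of the module.

```php
<?php
// Illustration only: isFiltered() is a hypothetical helper, not the module's API.
// Assumption: a link is skipped as soon as one filterRegex pattern matches its URL.

// Patterns taken from the commented-out configuration example.
$filterRegex = [
    '#.html#i',   // links containing ".html" (the unescaped dot matches any character)
    '#/agenda#i', // links containing "/agenda"
    '#date\=#i',  // links containing "date="
];

function isFiltered(string $url, array $patterns): bool
{
    foreach ($patterns as $pattern) {
        // preg_match() returns 1 on a match, 0 on no match, false on error
        if (preg_match($pattern, $url) === 1) {
            return true;
        }
    }

    return false;
}

// Made-up sample URLs for demonstration.
$urls = [
    'https://luya.io/guide',
    'https://luya.io/old/page.html',
    'https://luya.io/agenda/2024',
    'https://luya.io/events?date=2024-01-01',
];

foreach ($urls as $url) {
    echo $url, ' => ', isFiltered($url, $filterRegex) ? 'filtered' : 'crawled', PHP_EOL;
}
```

With these samples, only the first URL would be crawled. The other three each match one of the patterns. Checking patterns this way before a full crawl can help catch a filter that is too broad or too narrow.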

After setting up the module in your config, run the migrations and the import command (to set up permissions):

```
./vendor/bin/luya migrate
./vendor/bin/luya import
```

Running the Crawler
-------------------


To run the crawler process, use the crawler command `crawl`. You should put this command in a cronjob to make sure your index is up-to-date:

> Make sure your page is in utf8 mode (``) and make sure to set the language `
