sleimanx2/grawler
=================

A guided html crawler with media meta extraction

Version 0.2.4, MIT license, requires PHP >=5.5.


Grawler
=======

[![Software License](https://camo.githubusercontent.com/55c0218c8f8009f06ad4ddae837ddd05301481fcf0dff8e0ed9dadda8780713e/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f6c6963656e73652d4d49542d627269676874677265656e2e7376673f7374796c653d666c61742d737175617265)](LICENSE.md) [![Build Status](https://camo.githubusercontent.com/aebe3cce0687b1ad156821b7e9dc06b1f93d4e370182bf96e12d553519b1a064/68747470733a2f2f7472617669732d63692e6f72672f736c65696d616e78322f677261776c65722e7376673f6272616e63683d6d6173746572)](https://travis-ci.org/sleimanx2/grawler)

Install
-------

Via Composer

```
$ composer require sleimanx2/grawler
```

Basic Usage
-----------

##### getting the page dom

```
require_once('vendor/autoload.php');

$client = new Bowtie\Grawler\Client();

$grawler = $client->download('http://example.com');
```
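The README does not say how `download()` reports a failed request. The following is a minimal sketch of a defensive wrapper, assuming a failure surfaces as an exception. `downloadOrNull` is a hypothetical helper and not part of Grawler.

```php
<?php
// Hypothetical helper, not part of Grawler: validates the URL, calls the
// given downloader, and returns null instead of letting an exception escape.
// Assumption: a failed download is signalled by throwing an exception.
function downloadOrNull(callable $download, $url)
{
    // Reject anything that is not a syntactically valid absolute URL.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return null;
    }

    try {
        return call_user_func($download, $url);
    } catch (\Exception $e) {
        return null;
    }
}

// Usage with the client created above:
// $grawler = downloadOrNull([$client, 'download'], 'http://example.com');
```

Passing the downloader as a callable keeps the helper independent of the client class, so it can be exercised with a stub.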

##### finding basic attributes

```
$grawler->title();
```

```
// provide a css path to find the attribute
$grawler->body($path = '.main-content');
```

```
// extracts meta keywords (array)
$grawler->keywords();
```

```
// extracts meta description
$grawler->description();
```
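The basic attributes can be gathered into one array for logging or storage. This is a sketch only: `pageSummary` is a hypothetical helper, and it assumes `title()` and `description()` return strings and that `keywords()` returns an array of strings, as the comment above suggests.

```php
<?php
// Hypothetical helper, not part of Grawler: collects the basic page
// attributes into a plain array. Works with any object exposing
// title(), description() and keywords().
function pageSummary($page)
{
    $keywords = $page->keywords();

    // Trim each keyword and drop empty entries; fall back to an empty
    // list if keywords() did not return an array.
    $keywords = is_array($keywords)
        ? array_values(array_filter(array_map('trim', $keywords), 'strlen'))
        : [];

    return [
        'title'       => $page->title(),
        'description' => $page->description(),
        'keywords'    => $keywords,
    ];
}

// Usage:
// $summary = pageSummary($grawler);
```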

##### finding media

```
$grawler->images('.content img');
```

```
$grawler->videos('iframe');
```

```
$grawler->audio('.audio iframe');
```

Resolving media attributes
--------------------------

In order to resolve media attributes, you need to [load the providers' configuration](#grawler-config).

#### videos

Current video resolvers: YouTube, Vimeo.

```
// resolve all videos at once
$videos = $grawler->videos('iframe')->resolve();
```

Then you can access the video attributes as follows:

```
foreach($videos as $video)
{
  $video->id; // the video provider id
  $video->title;
  $video->description;
  $video->url;
  $video->embedUrl;
  $video->images; // Collection of Image instances
  $video->author;
  $video->authorId;
  $video->duration;
  $video->provider; //video source
}
```

You can also resolve videos individually as follows:

```
$videos = $grawler->videos('iframe');

foreach($videos as $video)
{
  $video->resolve();
  $video->title;
  //...
}
```
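When a page embeds videos from several providers, it can help to group the resolved items. The sketch below relies only on the `provider`, `title` and `url` properties listed above. `groupByProvider` is a hypothetical helper, and it assumes `provider` holds a string (the README does not show its exact values).

```php
<?php
// Hypothetical helper, not part of Grawler: groups resolved media items by
// their `provider` property, keeping the title and url of each item.
// Assumption: `provider` is a string that can be used as an array key.
function groupByProvider($items)
{
    $grouped = [];

    foreach ($items as $item) {
        $grouped[$item->provider][] = [
            'title' => $item->title,
            'url'   => $item->url,
        ];
    }

    return $grouped;
}

// Usage:
// $byProvider = groupByProvider($grawler->videos('iframe')->resolve());
```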

#### audio

Current audio resolvers: SoundCloud.

```
// resolve all audio at once
$audio = $grawler->audio('.audio iframe')->resolve();
```

Then you can access the audio attributes as follows:

```
foreach($audio as $track)
{
  $track->id; // the audio provider id
  $track->title;
  $track->description;
  $track->url;
  $track->embedUrl;
  $track->images; // Collection of cover photo instances
  $track->author;
  $track->authorId;
  $track->duration;
  $track->provider; // audio source
}
```

You can also resolve audio individually as follows:

```
$audio = $grawler->audio('.audio iframe');

foreach($audio as $track)
{
  $track->resolve();
  $track->title;
  //...
}
```

Resolving page urls
-------------------

```
$links = $grawler->links('.main thumb a');

foreach($links as $link)
{
  print $link;
  //or
  print $link->uri;
  //or
  print $link->getUri();
}
```
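Pages often repeat the same link several times. Since `print $link` works on link objects, they can presumably be cast to strings, and the sketch below uses that to build a de-duplicated list of URIs. `uniqueUris` is a hypothetical helper, not part of Grawler.

```php
<?php
// Hypothetical helper, not part of Grawler: returns each distinct URI once,
// in first-seen order. Assumption: every link casts to its URI as a string.
function uniqueUris($links)
{
    $seen = [];

    foreach ($links as $link) {
        $uri = (string) $link;

        // Skip empty URIs; using the URI as a key removes duplicates.
        if ($uri !== '') {
            $seen[$uri] = true;
        }
    }

    return array_keys($seen);
}

// Usage:
// $uris = uniqueUris($grawler->links('.main thumb a'));
```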

Configuration
-------------

### Client Config

##### Set user agent

```
$client->agent('Googlebot/2.1')->download('http://example.com');
```

Recommended:

##### Set request auth

```
$client->auth('me', '**');
```

You can change the auth type as follows:

```
$client->auth('me', '**', $type = 'basic');
```

##### Set request method

```
$client->method('post');
```

### Grawler config

By default, Grawler tries to read the following environment variables:

```
GRAWLER_YOUTUBE_KEY

GRAWLER_VIMEO_KEY
GRAWLER_VIMEO_SECRET

GRAWLER_SOUNDCLOUD_KEY
GRAWLER_SOUNDCLOUD_SECRET

```

If you don't use environment variables, you can load the configuration as follows:

```
$config = [
  'youtubeKey' => '',

  'vimeoKey'    => '',
  'vimeoSecret' => '',

  'soundcloudKey'    => '',
  'soundcloudSecret' => '',
];

$grawler->loadConfig($config);
```
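If the credentials live somewhere other than the process environment (for example, a parsed `.env` file), the config array can be built from any name/value map. This is a sketch only: `grawlerConfigFromEnv` is a hypothetical helper, and the pairing of environment variable names to config keys is inferred from the two listings above rather than confirmed by the library.

```php
<?php
// Hypothetical helper, not part of Grawler: maps the documented environment
// variable names onto the documented config keys. Missing entries become ''.
// Assumption: each GRAWLER_* name corresponds to the similarly named key.
function grawlerConfigFromEnv(array $env)
{
    $map = [
        'GRAWLER_YOUTUBE_KEY'       => 'youtubeKey',
        'GRAWLER_VIMEO_KEY'         => 'vimeoKey',
        'GRAWLER_VIMEO_SECRET'      => 'vimeoSecret',
        'GRAWLER_SOUNDCLOUD_KEY'    => 'soundcloudKey',
        'GRAWLER_SOUNDCLOUD_SECRET' => 'soundcloudSecret',
    ];

    $config = [];
    foreach ($map as $envName => $configKey) {
        $config[$configKey] = isset($env[$envName]) ? (string) $env[$envName] : '';
    }

    return $config;
}

// Usage ($values is any array of name => value pairs you have loaded):
// $grawler->loadConfig(grawlerConfigFromEnv($values));
```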

Testing
-------

```
$ phpunit --testsuite unit
```

```
$ phpunit --testsuite integration
```

NB: you need to set your provider keys (YouTube, Vimeo, SoundCloud...) to run the integration tests.

Contributing
------------

Please see [CONTRIBUTING](CONTRIBUTING.md)

Security
--------

If you discover any security-related issues, please email the maintainer instead of using the issue tracker.

License
-------

The MIT License (MIT). Please see [License File](LICENSE.md) for more information.
