nizek/crawler
=============

Version 1.1.2 · [Source](https://github.com/mawebcoder/selenium) · [Packagist](https://packagist.org/packages/nizek/crawler)

Selenium PHP Crawler
====================

A PHP package to automate web crawling and element retrieval using Selenium WebDriver. This package allows you to connect to a Selenium server, navigate to web pages, and interact with elements by tags, classes, IDs, and CSS selectors. The package is set up for Chrome in headless mode, making it suitable for use in server environments.

Requirements
------------

- PHP 7.4 or newer
- Selenium Server
- ChromeDriver
- Chromium

Using Docker
------------

If you prefer Docker, a compatible Dockerfile is available [here](https://github.com/mawebcoder/dockerfiles/blob/master/Dockerfile). Run the following commands to build the image and start the Selenium server:

```
docker build -t selenium .
docker run -d --name selenium -p 4444:4444 selenium:latest

```

After running these commands, you can reach the Selenium server from your browser on port 4444 of the Docker host (the port published by `-p 4444:4444`).
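Before starting a crawl it can help to confirm that the server is actually up. The sketch below is not part of this package: it relies on the standard WebDriver `/status` endpoint, which reports readiness under `value.ready`. The function names and the 2-second timeout are illustrative assumptions, and older Selenium 3 servers may expose the endpoint under `/wd/hub/status` instead.

```php
/**
 * Decide whether a WebDriver "/status" response body reports a ready server.
 * The W3C WebDriver format puts the flag at value.ready.
 */
function isSeleniumReady(string $statusJson): bool
{
    $status = json_decode($statusJson, true);

    return is_array($status) && ($status['value']['ready'] ?? false) === true;
}

/**
 * Ask a Selenium server for its status (hypothetical helper, not a package API).
 * $baseUrl is an assumption such as "http://localhost:4444".
 */
function seleniumIsUp(string $baseUrl): bool
{
    // Short timeout so an unreachable server fails fast instead of hanging.
    $context = stream_context_create(['http' => ['timeout' => 2]]);

    // "@" hides the warning PHP raises when the connection is refused.
    $response = @file_get_contents(rtrim($baseUrl, '/') . '/status', false, $context);

    return $response !== false && isSeleniumReady($response);
}
```

Assuming the container runs on the same machine, `seleniumIsUp('http://localhost:4444')` would then tell you whether it is safe to call `Crawler::init()`.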

Usage
-----

### Initialization

To use the `Crawler`, instantiate it using the `init` static method, which provides a preconfigured WebDriver instance connected to Selenium.

```
use Nizek\Crawler\Crawler;

$crawler = Crawler::init();

```

### Setting the URL

To set the URL for the crawler to navigate to:

```
$crawler->setUrl('https://example.com');

```

### Methods

The following methods are provided for interacting with and retrieving elements from the webpage:

#### setUrl(string $url)

Sets the URL for the crawler to visit.

```
Parameters:
    $url: The URL to navigate to.
Returns:
    Returns the Crawler instance for method chaining.

```

Example:

```
$crawler->setUrl('https://example.com');

```
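Because `setUrl` returns the crawler itself, navigation and lookup can be written as one chained expression. The helper below is a minimal sketch: `firstHeading` is a hypothetical name, and the parameter is typed as `object` only so the snippet does not depend on the package being installed.

```php
/**
 * Fetch the text of the first <h1> on a page in a single chained call.
 *
 * $crawler is expected to be the object returned by Crawler::init().
 */
function firstHeading(object $crawler, string $url): string
{
    // setUrl() hands back the crawler, so the element lookup can follow directly.
    return $crawler->setUrl($url)->getElementByTagName('h1')->getText();
}
```

With the package installed, this would be called as `firstHeading(Crawler::init(), 'https://example.com')`.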

#### parseXMLUrls()

Parses all URLs from the XML content of the current page. Useful for sitemaps.

```
Returns:
    An array of the URLs found in the XML tags on the page.

```

Example:

```
$crawler->setUrl('https://example.com/sitemap.xml');
$urls = $crawler->parseXMLUrls();

foreach ($urls as $url) {
    echo $url . PHP_EOL;
}

```
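This README does not show how `parseXMLUrls` works internally. For orientation, the sitemap protocol stores each page address in a `<loc>` element, so a standalone approximation using PHP's DOM extension could look like the sketch below. `extractSitemapUrls` is a hypothetical name, and this is not the package's own code.

```php
/**
 * Collect the text of every <loc> element in a sitemap document.
 *
 * @return string[] URLs in document order; empty if the XML cannot be parsed.
 */
function extractSitemapUrls(string $xml): array
{
    if (trim($xml) === '') {
        return []; // loadXML() rejects empty input
    }

    $previous = libxml_use_internal_errors(true); // collect parse errors quietly
    $document = new DOMDocument();
    $loaded = $document->loadXML($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($loaded === false) {
        return [];
    }

    $urls = [];
    foreach ($document->getElementsByTagName('loc') as $node) {
        $urls[] = trim($node->textContent);
    }

    return $urls;
}
```

A function like this can be handy for checking a sitemap fetched by other means, or for comparing its result with what `parseXMLUrls` returns.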

#### getElementByTagName(string $tagName)

Finds the first element with the given tag name.

```
Parameters:
    $tagName: The name of the tag to search for.
Returns:
    A WebElement object representing the element.

```

Example:

```
$crawler->setUrl('https://example.com');
$element = $crawler->getElementByTagName('h1');
echo $element->getText();

```

#### getElementsByTagName(string $tagName)

Finds all elements with the given tag name.

```
Parameters:
    $tagName: The name of the tag to search for.
Returns:
    An array of WebElement objects representing the elements.

```

Example:

```
$crawler->setUrl('https://example.com');
$elements = $crawler->getElementsByTagName('a');

foreach ($elements as $element) {
    echo $element->getAttribute('href') . PHP_EOL;
}

```
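Raw `href` values collected this way usually include duplicates, fragments such as `#top`, and non-web schemes such as `mailto:`. The following sketch filters them down to unique absolute `http`/`https` links. `uniqueHttpLinks` is a hypothetical helper, and the `null` handling assumes that `getAttribute` may return `null` for anchors without an `href`.

```php
/**
 * Keep unique absolute http(s) links from a list of raw href values.
 *
 * @param array<int, string|null> $hrefs Values collected via getAttribute('href').
 * @return string[] Links in first-seen order.
 */
function uniqueHttpLinks(array $hrefs): array
{
    $links = [];
    foreach ($hrefs as $href) {
        if (!is_string($href)) {
            continue; // anchor without an href attribute
        }

        $scheme = parse_url($href, PHP_URL_SCHEME);
        if (is_string($scheme) && in_array(strtolower($scheme), ['http', 'https'], true)) {
            $links[$href] = true; // array keys de-duplicate the list
        }
    }

    return array_keys($links);
}
```

Relative links such as `/about` are dropped here; resolving them against the page URL would be a separate step.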

#### getElementBySelector(string $selector)

Finds the first element matching the given CSS selector.

```
Parameters:
    $selector: The CSS selector to search for.
Returns:
    A WebElement object representing the element.

```

Example:

```
$crawler->setUrl('https://example.com');
$element = $crawler->getElementBySelector('img.img-thumb[alt="Apple iphone 12"]');
echo $element->getAttribute('src');

```

This finds the first `img` element with the class `img-thumb` whose `alt` attribute equals `Apple iphone 12`, just like filtering page elements with a CSS selector.

#### getElementsBySelector(string $selector)

Finds all elements matching the given CSS selector.

```
Parameters:
    $selector: The CSS selector to search for.
Returns:
    An array of WebElement objects representing the elements.

```

Example:

```
$crawler->setUrl('https://example.com');
$elements = $crawler->getElementsBySelector('.list-item');

foreach ($elements as $element) {
    echo $element->getText() . PHP_EOL;
}

```
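In CSS selector syntax a bare word matches a tag name, a leading `.` matches a class, and a leading `#` matches an ID. The helpers below are a small sketch for building such selectors; all three function names are hypothetical, and they assume plain class and ID names without special characters.

```php
/** Build a CSS selector matching elements that carry the given class. */
function classSelector(string $className): string
{
    return '.' . $className;
}

/** Build a CSS selector matching the element with the given ID. */
function idSelector(string $id): string
{
    return '#' . $id;
}

/** Build a CSS selector matching a tag whose attribute equals the value exactly. */
function attributeSelector(string $tag, string $attribute, string $value): string
{
    // Escape backslashes first, then double quotes, so the value stays inside the quoted string.
    $escaped = str_replace(['\\', '"'], ['\\\\', '\\"'], $value);

    return sprintf('%s[%s="%s"]', $tag, $attribute, $escaped);
}
```

For example, `$crawler->getElementsBySelector(classSelector('list-item'))` would search for the selector `.list-item`.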

#### getElementByClassName(string $className)

Finds the first element with the given class name.

```
Parameters:
    $className: The name of the class to search for.
Returns:
    A WebElement object representing the element.

```

Example:

```
$crawler->setUrl('https://example.com');
$element = $crawler->getElementByClassName('header');
echo $element->getText();

```

#### getElementsByClassName(string $className)

Finds all elements with the given class name.

```
Parameters:
    $className: The name of the class to search for.
Returns:
    An array of WebElement objects representing the elements.

```

Example:

```
$crawler->setUrl('https://example.com');
$elements = $crawler->getElementsByClassName('list-item');

foreach ($elements as $element) {
    echo $element->getText() . PHP_EOL;
}

```

#### getElementById(string $id)

Finds the element with the given ID.

```
Parameters:
    $id: The ID of the element to search for.
Returns:
    A WebElement object representing the element.

```

Example:

```
$crawler->setUrl('https://example.com');
$element = $crawler->getElementById('main-content');
echo $element->getText();

```

#### getPageContent()

Retrieves the inner HTML content of the current page.

```
Returns:
    A string containing the inner HTML of the page.

```

Example:

```
$crawler->setUrl('https://example.com');
$content = $crawler->getPageContent();
echo $content;

```
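Once the HTML is available as a string, it can be processed without further round trips to Selenium. As a minimal sketch, the hypothetical helper `extractTextByTag` below uses PHP's DOM extension to pull the text of every element with a given tag name out of such a string. It works on full documents and on fragments.

```php
/**
 * Return the trimmed text of every element with the given tag name.
 *
 * @return string[] Texts in document order; empty if nothing matches.
 */
function extractTextByTag(string $html, string $tagName): array
{
    if (trim($html) === '') {
        return []; // loadHTML() rejects empty input
    }

    $previous = libxml_use_internal_errors(true); // real-world HTML is rarely valid
    $document = new DOMDocument();
    // Note: without a charset declaration, loadHTML() treats the input as ISO-8859-1.
    $document->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $texts = [];
    foreach ($document->getElementsByTagName($tagName) as $node) {
        $texts[] = trim($node->textContent);
    }

    return $texts;
}
```

For instance, `extractTextByTag($crawler->getPageContent(), 'h2')` would list the second-level headings of the current page.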
