Abandoned · Archived · Library · [Parsing & Serialization](/categories/parsing)

antheta/falcon
==============

A PHP website scraper & parser. Scrape one or more websites for email addresses or IP addresses / proxies.

Version 0.0.1 (released 2y ago) · MIT · PHP · CI passing · [1 PRs](https://github.com/Antheta/falcon-php/pulls)

[Source](https://github.com/Antheta/falcon-php) · [Packagist](https://packagist.org/packages/antheta/falcon)

[![](./assets/falcon.png)](https://antheta.com)

[![](https://github.com/Antheta/falcon-php/actions/workflows/run-tests.yml/badge.svg)](https://github.com/Antheta/falcon-php/actions)

Falcon is an open-source (MIT-licensed), high-performance PHP web scraper with built-in parsers and extensibility.

Please note that this library is not intended for gathering emails or any other personal data for spam.

Documentation
-------------

[Documentation](http://docs.antheta.com/)

Features
--------

- Many different built-in parsers.
- Near-endless extendability
    - Custom parser support.
    - Custom regex support.
    - Custom driver (scraper) support.

Installation
------------

### Composer

```
composer require antheta/falcon
```

Usage
-----

Running the scraper:

```
$falcon = Falcon::getInstance()->run("https://example.com/");
$result = $falcon->parse()->results(); // use all available parsers and get all results
```

The example above scrapes the URL and returns the results as an array.

### Use specific parsers

To get only specific resources from the results, pass the parser names to `parse()` and call the matching helper method:

```
$falcon = Falcon::getInstance()->run("https://example.com/");
// run only the email and ip parsers, then return just the emails
$emails = $falcon->parse(["email", "ip"])->emails();
```

Methods
-------

Helper methods for returning the results:

| Name |
| --- |
| results |
| emails |
| phonenumbers |
| ipaddresses |
| forms |
| links |
| images |
| stylesheets |
| scripts |
| fonts |

Custom regexes
--------------

To add your own regexes to parsers you can just use the `addRegexes` helper:

```
// this will attempt to parse emails with the given regex
$emails = Falcon::getInstance()
    ->addRegexes("email", ["/[\._a-zA-Z0-9-]+@[\._a-zA-Z0-9-ddd]+/i"])
    ->run("https://example.com/")
    ->parse()
    ->emails();

// you can extend this to other parsers as well and add as many regexes as needed
$results = Falcon::getInstance()
    // regexes for emails
    ->addRegexes("email", [
        "/[\._a-zA-Z0-9-]+@[\._a-zA-Z0-9-ddd]+/i",
        "/[\._a-zA-Z0-9-]+\(at\)[\._a-zA-Z0-9-]+/i",
    ])
    // regexes for phonenumbers
    ->addRegexes("phonenumber", [
        "/([\+]?[(]?[0-9]{3}[)]?[-\s\.]?[0-9]{3}[-\s\.]?[0-9]{4,12})/",
    ])
    ->run("https://example.com/")
    ->parse()
    ->results();
```
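Before registering a regex, it can help to check what it matches on a small sample. The sketch below does that with plain `preg_match_all()` and does not call Falcon at all; the sample text and the `$patterns`/`$results` variable names are made up for illustration, and the two patterns are the `(at)` email regex and the phone-number regex from the example above.

```php
<?php
// Stand-alone check of two patterns from the example above.
// Nothing here uses Falcon; the sample text is invented for illustration.
$patterns = [
    "email" => [
        "/[\._a-zA-Z0-9-]+\(at\)[\._a-zA-Z0-9-]+/i",
    ],
    "phonenumber" => [
        "/([\+]?[(]?[0-9]{3}[)]?[-\s\.]?[0-9]{3}[-\s\.]?[0-9]{4,12})/",
    ],
];

$text = 'Call 555-123-4567 or mail sales(at)example.org';

$results = [];
foreach ($patterns as $name => $regexes) {
    $results[$name] = [];
    foreach ($regexes as $regex) {
        // $found[0] holds the full-pattern matches
        if (preg_match_all($regex, $text, $found)) {
            $results[$name] = array_merge($results[$name], $found[0]);
        }
    }
    // drop duplicates found by more than one regex
    $results[$name] = array_values(array_unique($results[$name]));
}

print_r($results);
```

Whether Falcon de-duplicates or post-processes matches in the same way is not stated in this README, so treat the merging step as an assumption.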

Custom parsers
--------------

With custom parsers, you control what kind of data the parser returns:

```
$falcon = Falcon::getInstance();

$falcon->addParser("myCustomParser", fn ($payload) => MyParser($payload));

function MyParser($payload) {
  // your custom logic here
}

// or
$falcon->addParser("myCustomParser", function($payload) {
  // your custom logic here
});

// result from your parser
$falcon->parse("myCustomParser")->results()["myCustomParser"];
```
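As a rough illustration of what the "custom logic" might look like, here is a minimal sketch of a parser callback that extracts page titles. It assumes `$payload` is the raw HTML string of the scraped page, which this README does not confirm; the `$titleParser` name and the sample HTML are hypothetical. The closure is called directly so the sketch runs without Falcon.

```php
<?php
// Assumption: $payload is the raw HTML of the scraped page (not confirmed by the README).
$titleParser = function (string $payload): array {
    // capture the text between <title> and </title>, case-insensitive, across newlines
    preg_match_all('/<title[^>]*>(.*?)<\/title>/is', $payload, $found);

    return array_map('trim', $found[1]);
};

// Called directly here; with Falcon it would be registered as shown above,
// e.g. $falcon->addParser("title", $titleParser);
$titles = $titleParser('<html><head><title> Example Domain </title></head></html>');

print_r($titles);
```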

Custom drivers
--------------

Drivers scrape the sites and return the HTML to Falcon. They can also run completely custom logic and save the result to Falcon for later use. Start by creating your own driver class that implements the `DriverInterface` interface, and put the driver-specific logic inside that class.

```
$falcon = Falcon::getInstance();

$falcon->addDrivers([
  "myDriver" => MyDriver::class
]);
```
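The README does not list the methods that `DriverInterface` requires, so the sketch below uses a hypothetical stand-in interface (`SketchDriverInterface`, with an assumed `fetch()` method) purely to show the shape of a driver that hands HTML back to the caller. Check the real interface in the Falcon source before writing an actual driver; the class and method names here are illustrative only.

```php
<?php
// Hypothetical stand-in: Falcon's real DriverInterface may declare different methods.
interface SketchDriverInterface
{
    public function fetch(string $url): string;
}

// Returns canned HTML per URL instead of making network requests,
// which keeps the sketch runnable offline.
final class ArrayDriver implements SketchDriverInterface
{
    /** @var array<string, string> map of URL => HTML */
    private array $pages;

    public function __construct(array $pages)
    {
        $this->pages = $pages;
    }

    public function fetch(string $url): string
    {
        // unknown URLs yield an empty document
        return $this->pages[$url] ?? '';
    }
}

$driver = new ArrayDriver(['https://example.com/' => '<p>hello</p>']);
$html = $driver->fetch('https://example.com/');

echo $html, PHP_EOL;
```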

Scraping dynamic content
------------------------

You could migrate from hQuery to a headless JavaScript browser such as CasperJS or PhantomJS to load dynamic content. This way you can also scrape data that is loaded dynamically (after the initial page load).

Check out [Falcon Drivers](https://github.com/Antheta/falcon-drivers) for getting started.

License
-------

The MIT License (MIT). Please see [License File](LICENSE) for more information.

### Community

Maintainers: marcosraudkett

Top contributors: [elythi0n](https://github.com/elythi0n) (49 commits), dependabot[bot] (2 commits)
