fathkoc/php-web-scraper
=======================

A lightweight PHP web scraper with multiple features.

[Source](https://github.com/fathkoc/php-web-scraper) | [Packagist](https://packagist.org/packages/fathkoc/php-web-scraper)


**Web Scraper Library (PHP & Node.js)**
=======================================

This web scraper library scrapes data from websites, extracts specific HTML elements, and tracks page updates. It ships both a **PHP** and a **Node.js** implementation, with features such as pagination, metadata extraction, dynamic page scraping, and form submission.

**Features**
------------

- **HTML Element Scraping:** Select and extract specific elements (e.g., headings, paragraphs) using CSS selectors.
- **Pagination Support:** Scrape data across multiple pages of a paginated listing.
- **Metadata Extraction:** Extract meta tags such as title, description, and keywords from HTML pages.
- **Dynamic Content Scraping:** Use Puppeteer (Node.js) to scrape pages that load content dynamically using JavaScript.
- **Proxy and User-Agent Support:** Customize User-Agent headers and use proxies to avoid detection.
- **Track Page Updates:** Continuously monitor a webpage for changes at set intervals.
- **Form Submission:** Perform login actions and form submissions before scraping.
- **Export to JSON:** Save scraped data in JSON format for easy use.
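
Of the features above, update tracking is the least obvious to picture: it amounts to comparing a fingerprint of the page between polls. The sketch below shows one dependency-free way to do that in Node.js. `fingerprint` and `createChangeDetector` are hypothetical helper names, not part of this library's API, and hashing the raw HTML is only an assumption about how a change could be detected.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: reduce a page body to a short, comparable fingerprint.
function fingerprint(html) {
  return crypto.createHash('sha256').update(html, 'utf8').digest('hex');
}

// Hypothetical helper: remembers the last fingerprint and reports whether
// the newly fetched HTML differs from the one seen on the previous poll.
function createChangeDetector() {
  let lastHash = null;
  return function hasChanged(html) {
    const hash = fingerprint(html);
    // The very first poll has nothing to compare against, so it is not a change.
    const changed = lastHash !== null && hash !== lastHash;
    lastHash = hash;
    return changed;
  };
}

const hasChanged = createChangeDetector();
console.log(hasChanged('<h1>Hello</h1>'));   // false (first poll)
console.log(hasChanged('<h1>Hello</h1>'));   // false (same content)
console.log(hasChanged('<h1>Updated</h1>')); // true  (content differs)
```

In practice the detector would be called from a timer with freshly fetched HTML. Pages that embed timestamps or rotating tokens would report a change on every poll, so hashing only the scraped elements rather than the whole document may give more useful results.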

**Installation**
----------------

### PHP Version:

1. Add the following `composer.json` configuration:

```
{
    "name": "fathkoc/php-web-scraper",
    "require": {
        "php": ">=7.4",
        "guzzlehttp/guzzle": "^7.0",
        "symfony/dom-crawler": "^5.0",
        "symfony/css-selector": "^5.0"
    }
}
```

2. Install the dependencies it declares via Composer:

```
composer install
```

### Node.js Version:

1. Install dependencies via npm:

```
npm install axios cheerio puppeteer
```

2. Add the following `package.json` configuration:

```
{
  "name": "node-web-scraper",
  "version": "1.0.0",
  "main": "src/scraper.js",
  "dependencies": {
    "axios": "^0.21.1",
    "cheerio": "^1.0.0-rc.10",
    "puppeteer": "^10.0.0"
  }
}
```

**Usage**
---------

### PHP:

```
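<?php

// Composer autoloader (assumes the dependencies were installed with Composer).
require __DIR__ . '/vendor/autoload.php';

use WebScraper\Scraper;

// Fetch the page, then select every element matching the CSS selector.
$scraper = new Scraper();
$html = $scraper->fetchPageContent('https://example.com');
$data = $scraper->scrapeElement($html, 'h1');
print_r($data); // Output all <h1> elements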
```

### Node.js:

```
const Scraper = require('./src/scraper');

const scraper = new Scraper();
(async () => {
    const html = await scraper.fetchPageContent('https://example.com');
    const data = scraper.scrapeElement(html, 'h2');
    console.log(data); // Output all <h2> elements
})();
```

**License**
-----------

MIT License

