
artryazanov/laravel-wikipedia-games-db
======================================

A Laravel package to build a video game database by scraping English Wikipedia.

Version v0.1.0, Unlicense, PHP ^8.1.

[Source](https://github.com/artryazanov/laravel-wikipedia-games-db) | [Packagist](https://packagist.org/packages/artryazanov/laravel-wikipedia-games-db)

Laravel Wikipedia Games DB
==========================


A Laravel package to build a normalized database of video games by scraping Wikipedia. It uses a queue-driven architecture to traverse categories and parse game pages via the Wikipedia (MediaWiki) API and an HTML infobox parser.

By default, the package targets English Wikipedia and allows full configuration via environment variables.

Features
--------


- Queue-driven, resumable scraping workflow
- MediaWiki API client abstraction
- Robust infobox HTML parser (developers, publishers, genres, platforms, modes, series, engines, release date, cover image)
- Normalized schema with many-to-many relations
- All tables are prefixed (`wikipedia_*`); the core table is `wikipedia_games`
- Single consolidated migration for easier setup
- Configurable via `.env` (endpoint, user agent, throttling, queues)
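
To make the infobox-parsing feature concrete, here is a minimal sketch using PHP's bundled DOM extension. It is not the package's parser: the function name `parseInfobox` and the returned array shape are hypothetical, and it only collects label/value pairs from the first table whose class list contains `infobox`.

```php
<?php
declare(strict_types=1);

/**
 * Extract "label => values" pairs from the first infobox table in an HTML fragment.
 * Hypothetical helper for illustration only; not the package's actual parser.
 *
 * @return array<string, list<string>>
 */
function parseInfobox(string $html): array
{
    $doc = new DOMDocument();
    // Real-world markup often triggers libxml warnings; collect them silently.
    $previous = libxml_use_internal_errors(true);
    // The XML prolog is a common trick to make loadHTML treat the input as UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // First table with "infobox" among its classes; rows that have both a header and a data cell.
    $rows = $xpath->query(
        '(//table[contains(concat(" ", normalize-space(@class), " "), " infobox ")])[1]//tr[th and td]'
    );

    $fields = [];
    foreach ($rows as $row) {
        $label = trim($xpath->query('th', $row)->item(0)->textContent);
        $cell = $xpath->query('td', $row)->item(0);

        // Prefer linked values (one link per entity); fall back to the cell's plain text.
        $values = [];
        foreach ($xpath->query('.//a', $cell) as $link) {
            $values[] = trim($link->textContent);
        }
        if ($values === []) {
            $values[] = trim($cell->textContent);
        }
        $fields[$label] = $values;
    }

    return $fields;
}
```

Calling `parseInfobox()` on a rendered page fragment returns one entry per infobox row, which a caller could then map onto developers, publishers, genres, and so on. Real infoboxes contain footnotes, line breaks, and nested lists that this sketch does not handle.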

Requirements
------------


- PHP >= 8.1
- Laravel 10.x, 11.x, or 12.x
- Extensions: curl, dom, mbstring, gd (typical for Laravel + DOM parsing)

Installation
------------


If this package is included as a path repository in your monorepo (as in this project), ensure your root composer.json has a repository entry pointing to `packages/artryazanov/laravel-wikipedia-games-db`, then require it:

```
composer require artryazanov/laravel-wikipedia-games-db:dev-main
```

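If installing from a VCS repository or Packagist in another project, require it the same way and make sure Composer package auto-discovery picks up the service provider. If auto-discovery is disabled, register the provider manually in `config/app.php` (on Laravel 11 and later, providers are registered in `bootstrap/providers.php` instead):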

```
'providers' => [
    // ...
    Artryazanov\WikipediaGamesDb\WikipediaGamesDbServiceProvider::class,
],
```

Publish configuration and migrations
------------------------------------


```
php artisan vendor:publish --provider="Artryazanov\\WikipediaGamesDb\\WikipediaGamesDbServiceProvider" --tag=config
php artisan vendor:publish --provider="Artryazanov\\WikipediaGamesDb\\WikipediaGamesDbServiceProvider" --tag=migrations
```

Then migrate:

```
php artisan migrate
```

Configuration (.env)
--------------------


All settings can be overridden via environment variables.

- `WIKIPEDIA_GAMES_DB_API_ENDPOINT` (default: `https://en.wikipedia.org/w/api.php`)
- `WIKIPEDIA_GAMES_DB_USER_AGENT` (default example: `LaravelWikipediaGamesDb/1.0 (+https://example.com; contact@example.com)`)
- `WIKIPEDIA_GAMES_DB_ROOT_CATEGORY` (default: `Category:Video games`)
- `WIKIPEDIA_GAMES_DB_THROTTLE_MS` (default: `1000`)
- `WIKIPEDIA_GAMES_DB_QUEUE_CONNECTION` (default: `null` — uses Laravel default)
- `WIKIPEDIA_GAMES_DB_QUEUE_NAME` (default: `default`)

Example snippet for your `.env`:

```
WIKIPEDIA_GAMES_DB_API_ENDPOINT=https://en.wikipedia.org/w/api.php
WIKIPEDIA_GAMES_DB_USER_AGENT="YourApp/1.0 (+https://your-site; you@example.com)"
WIKIPEDIA_GAMES_DB_ROOT_CATEGORY="Category:Video games"
WIKIPEDIA_GAMES_DB_THROTTLE_MS=1000
WIKIPEDIA_GAMES_DB_QUEUE_CONNECTION=
WIKIPEDIA_GAMES_DB_QUEUE_NAME=default
```

Please set a meaningful User-Agent per MediaWiki API etiquette.
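
As a rough illustration of how these variables and their documented defaults fit together, the sketch below resolves them from a plain array. The function `resolveScraperConfig` is hypothetical, and the array keys other than `root_category` and `throttle_milliseconds` (which appear elsewhere in this README) are assumptions. It also assumes an empty value is treated as unset, which may not match how the package reads its environment.

```php
<?php
declare(strict_types=1);

/**
 * Resolve scraper settings from an env-style array, applying the documented defaults.
 * Hypothetical helper; key names other than root_category and
 * throttle_milliseconds are assumptions.
 *
 * @param array<string, string> $env
 * @return array<string, mixed>
 */
function resolveScraperConfig(array $env): array
{
    // Assumption: an empty string counts as "not set", like the blank
    // WIKIPEDIA_GAMES_DB_QUEUE_CONNECTION= line in the .env example.
    $get = static function (string $key, ?string $default) use ($env): ?string {
        $value = $env[$key] ?? '';
        return $value === '' ? $default : $value;
    };

    return [
        'api_endpoint' => $get('WIKIPEDIA_GAMES_DB_API_ENDPOINT', 'https://en.wikipedia.org/w/api.php'),
        'user_agent' => $get(
            'WIKIPEDIA_GAMES_DB_USER_AGENT',
            'LaravelWikipediaGamesDb/1.0 (+https://example.com; contact@example.com)'
        ),
        'root_category' => $get('WIKIPEDIA_GAMES_DB_ROOT_CATEGORY', 'Category:Video games'),
        'throttle_milliseconds' => (int) $get('WIKIPEDIA_GAMES_DB_THROTTLE_MS', '1000'),
        // null means "use Laravel's default queue connection".
        'queue_connection' => $get('WIKIPEDIA_GAMES_DB_QUEUE_CONNECTION', null),
        'queue_name' => $get('WIKIPEDIA_GAMES_DB_QUEUE_NAME', 'default'),
    ];
}
```

For example, `resolveScraperConfig(['WIKIPEDIA_GAMES_DB_THROTTLE_MS' => '2500'])` overrides only the throttle and leaves every other setting at its default.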

Database schema
---------------


This package ships migrations that create the following tables (with comments):

- `wikipedia_game_wikipages`: central storage for Wikipedia page meta reused by multiple entities. Columns: `title`, `wikipedia_url`, `description`, `wikitext`, timestamps.
- `wikipedia_games` (core games) — now has `wikipage_id` pointing to `wikipedia_game_wikipages`; still stores `clean_title`, `cover_image_url`, `release_date`, `release_year`.
- `wikipedia_game_genres` — has `wikipage_id`.
- `wikipedia_game_platforms` — has `wikipage_id` and keeps platform-specific fields like `cover_image_url`, `release_date`, `website_url`.
- `wikipedia_game_companies` — has `wikipage_id` and keeps `cover_image_url`, `founded`, `website_url`.
- `wikipedia_game_modes` — has `wikipage_id`.
- `wikipedia_game_series` — has `wikipage_id`.
- `wikipedia_game_engines` — has `wikipage_id` and keeps `cover_image_url`, `release_date`, `website_url`.
- `wikipedia_game_game_genre` (pivot)
- `wikipedia_game_game_platform` (pivot)
- `wikipedia_game_game_mode` (pivot)
- `wikipedia_game_game_series` (pivot)
- `wikipedia_game_game_engine` (pivot)
- `wikipedia_game_game_company` (pivot, with `role` column: developer|publisher)

The migrations check for existence prior to creation, making it safer for incremental adoption. A data migration backfills `wikipage_id` and moves `title`, `wikipedia_url`, `description`, `wikitext` into `wikipedia_game_wikipages`.
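
To show how the company pivot and its `role` column can be queried, the sketch below builds a stripped-down stand-in for three of the tables in an in-memory SQLite database. Only the table names and the `role` values come from the list above; the pivot column names (`game_id`, `company_id`), the `name` column, the helper `companiesForGame`, and the sample rows are all assumptions made for illustration.

```php
<?php
declare(strict_types=1);

// In-memory SQLite stand-in for three of the tables listed above.
// Assumed (not from the package): pivot columns game_id/company_id and a `name` column.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE wikipedia_games (id INTEGER PRIMARY KEY, clean_title TEXT, release_year INTEGER)');
$pdo->exec('CREATE TABLE wikipedia_game_companies (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec("CREATE TABLE wikipedia_game_game_company (
    game_id INTEGER NOT NULL,
    company_id INTEGER NOT NULL,
    role TEXT NOT NULL CHECK (role IN ('developer', 'publisher'))
)");

// Placeholder rows: one game, developed by Studio A, published by both companies.
$pdo->exec("INSERT INTO wikipedia_games (id, clean_title, release_year) VALUES (1, 'Example Game', 2007)");
$pdo->exec("INSERT INTO wikipedia_game_companies (id, name) VALUES (1, 'Studio A'), (2, 'Publisher B')");
$pdo->exec("INSERT INTO wikipedia_game_game_company (game_id, company_id, role) VALUES
    (1, 1, 'developer'), (1, 1, 'publisher'), (1, 2, 'publisher')");

/**
 * Names of the companies attached to a game in the given role, sorted by name.
 *
 * @return list<string>
 */
function companiesForGame(PDO $pdo, int $gameId, string $role): array
{
    $stmt = $pdo->prepare(
        'SELECT c.name
           FROM wikipedia_game_companies c
           JOIN wikipedia_game_game_company gc ON gc.company_id = c.id
          WHERE gc.game_id = :game AND gc.role = :role
          ORDER BY c.name'
    );
    $stmt->execute(['game' => $gameId, 'role' => $role]);

    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}
```

Because the role lives on the pivot row rather than on the company, the same company can be both developer and publisher of one game without duplicating the company record.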

Usage
-----


You can kick off discovery in multiple ways. The fastest, high-precision path is via template transclusions.

1. Run all discovery strategies in one go (templates + categories):

```
php artisan games:scan-all
```

2. Discover via Infobox template (recommended for precise bootstrap):

```
php artisan games:discover-by-template
```

This enumerates all pages that include `Template:Infobox video game` (main namespace) and enqueues parsing jobs. To also include series/franchises:

```
php artisan games:discover-by-template --series
```

3. Traverse categories (broad coverage; longer):

```
php artisan games:scrape-wikipedia --category="Category:Video games"
```

Or seed multiple high-value roots (platforms and genres):

```
php artisan games:scrape-wikipedia --seed-high-value
```

If `--category` is omitted, the command uses `game-scraper.root_category` from config (by default, English `Category:Video games`).

4. Run your queue worker so jobs are processed:

```
php artisan queue:work --queue="${WIKIPEDIA_GAMES_DB_QUEUE_NAME:-default}"
```
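
For orientation, the two discovery strategies above correspond to two standard MediaWiki API list modules: `embeddedin` (pages that transclude a template) and `categorymembers`. Whether the package issues exactly these requests is an assumption; the sketch below only shows how such query URLs can be built. The function names and the `API_ENDPOINT` constant are hypothetical.

```php
<?php
declare(strict_types=1);

// Default endpoint from the configuration section.
const API_ENDPOINT = 'https://en.wikipedia.org/w/api.php';

/** URL listing main-namespace pages that transclude the given template. */
function embeddedInUrl(string $template, ?string $continue = null): string
{
    $params = [
        'action' => 'query',
        'list' => 'embeddedin',
        'eititle' => $template,
        'einamespace' => 0,   // main (article) namespace only
        'eilimit' => 500,
        'format' => 'json',
    ];
    if ($continue !== null) {
        // Continuation token returned by the previous response.
        $params['eicontinue'] = $continue;
    }

    return API_ENDPOINT . '?' . http_build_query($params);
}

/** URL listing the pages and subcategories directly inside a category. */
function categoryMembersUrl(string $category, ?string $continue = null): string
{
    $params = [
        'action' => 'query',
        'list' => 'categorymembers',
        'cmtitle' => $category,
        'cmtype' => 'page|subcat',
        'cmlimit' => 500,
        'format' => 'json',
    ];
    if ($continue !== null) {
        $params['cmcontinue'] = $continue;
    }

    return API_ENDPOINT . '?' . http_build_query($params);
}
```

A traversal would request `embeddedInUrl('Template:Infobox video game')` or `categoryMembersUrl('Category:Video games')`, then repeat with the continuation token from each response until none is returned.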

Tips:

- Adjust `WIKIPEDIA_GAMES_DB_THROTTLE_MS` to respect API limits (start with 1000 ms).
- Set a meaningful `WIKIPEDIA_GAMES_DB_USER_AGENT`.
- Ensure your queue driver is configured (`QUEUE_CONNECTION` and, optionally, `WIKIPEDIA_GAMES_DB_QUEUE_CONNECTION`).
- Prefer running the template-based discovery first to quickly build a large, accurate dataset; use category traversal to expand coverage over time.

Scheduling (optional)
---------------------


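You can schedule periodic updates (e.g., weekly) in `app/Console/Kernel.php` (on Laravel 11 and later, which no longer ship a console kernel by default, define the schedule in `routes/console.php` instead):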

```
protected function schedule(\Illuminate\Console\Scheduling\Schedule $schedule): void
{
    $schedule->command('games:scrape-wikipedia')->weekly()->sundays()->at('03:00');
}
```

Troubleshooting
---------------


- "No jobs processed" — ensure a queue worker is running and the queue name/connection align with your config and env.
- 429 / throttling from API — increase `WIKIPEDIA_GAMES_DB_THROTTLE_MS` and verify your User-Agent.
- Migrations not found — run the vendor:publish step for migrations or rely on the package-loaded migrations.

Background jobs & conditional dispatch
--------------------------------------


This package processes pages via queued jobs. The main entry point parses a game page and conditionally enqueues per-taxonomy jobs for additional details.

- ProcessGamePageJob: Parses a game page, upserts a central `Wikipage`, persists game-specific fields, and dispatches taxonomy jobs for linked items found in the infobox (developers, publishers, platforms, engines, genres, modes, series).
- ProcessCompanyPageJob: Upserts `Wikipage` and persists company-specific fields (`cover_image_url`, `founded`, `website_url`).
- ProcessPlatformPageJob: Upserts `Wikipage` and persists platform-specific fields (`cover_image_url`, `release_date`, `website_url`).
- ProcessEnginePageJob: Upserts `Wikipage` and persists engine-specific fields (`cover_image_url`, `release_date`, `website_url`).
- ProcessGenrePageJob: Upserts `Wikipage` and links the genre.
- ProcessModePageJob: Upserts `Wikipage` and links the mode.
- ProcessSeriesPageJob: Upserts `Wikipage` and links the series.

Conditional dispatch

- ProcessGamePageJob will only enqueue a `Process*PageJob` when the corresponding record is missing or when its linked `wikipage.wikipedia_url` is empty.
- This minimizes redundant requests and focuses fetching on missing details.
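
The dispatch rule above can be sketched as a small predicate. This is an illustration under stated assumptions, not the package's code: `needsTaxonomyJob` is a hypothetical name, and plain arrays stand in for the Eloquent models.

```php
<?php
declare(strict_types=1);

/**
 * Decide whether a taxonomy page (company, platform, genre, ...) still needs fetching.
 *
 * $record is null when no row exists yet; otherwise it is an array that may carry
 * a nested 'wikipage' entry. This array shape is a stand-in for the real models.
 */
function needsTaxonomyJob(?array $record): bool
{
    if ($record === null) {
        return true; // the record is missing entirely
    }

    // A missing wikipage or a blank URL both count as "details not fetched yet".
    $url = $record['wikipage']['wikipedia_url'] ?? null;

    return $url === null || trim((string) $url) === '';
}
```

A caller would enqueue the matching job only when this returns `true`, so pages whose details are already stored are not requested again.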

Throttling and deduping

- All jobs inherit a throttle helper that respects `game-scraper.throttle_milliseconds` to avoid exceeding API limits.
- Jobs are idempotent: models are upserted and relations are synced, so reprocessing the same page is safe. Queue-level uniqueness is not enabled by default; if you need strict uniqueness, you can implement Laravel's `ShouldBeUnique` interface on specific jobs in your own fork of the package.
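
One simple way to honor a millisecond throttle is a fixed-interval limiter that sleeps until enough time has passed since the previous request. The class below is a minimal sketch of that idea; `RequestThrottle` and its methods are hypothetical and may differ from the helper the jobs actually inherit.

```php
<?php
declare(strict_types=1);

/**
 * Fixed-interval throttle: allows at most one request per $intervalMs.
 * Hypothetical illustration; not the package's throttle helper.
 */
final class RequestThrottle
{
    /** Timestamp (in milliseconds) of the last recorded request, if any. */
    private ?float $lastRequestAt = null;

    public function __construct(private int $intervalMs)
    {
    }

    /** Milliseconds still to wait if the next request wanted to start at $nowMs. */
    public function delayFor(float $nowMs): int
    {
        if ($this->lastRequestAt === null) {
            return 0; // first request goes out immediately
        }
        $elapsed = $nowMs - $this->lastRequestAt;

        return (int) max(0, ceil($this->intervalMs - $elapsed));
    }

    /** Record that a request started at $nowMs. */
    public function markRequest(float $nowMs): void
    {
        $this->lastRequestAt = $nowMs;
    }

    /** Sleep for the remaining delay, then record the request time. */
    public function wait(): void
    {
        $delay = $this->delayFor(microtime(true) * 1000);
        if ($delay > 0) {
            usleep($delay * 1000);
        }
        $this->markRequest(microtime(true) * 1000);
    }
}
```

Calling `wait()` before each API request keeps a single worker at or below one request per interval. Note that this state is per process, so several queue workers running in parallel would each throttle independently.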

Testing
-------


This repository includes a full test suite based on Orchestra Testbench with an in-memory SQLite database.

- Install dependencies: `composer install`
- Run tests (Windows): `.\vendor\bin\phpunit --configuration phpunit.xml`
- Run tests (Unix/macOS): `vendor/bin/phpunit --configuration phpunit.xml`

You can also run via Composer script: `composer test`.

If phpunit cannot be found, ensure Composer finished installing dependencies successfully.

License
-------


The Unlicense. This is free and unencumbered software released into the public domain. See the LICENSE file for details.

Credits
-------


- Vendor: Artryazanov
- Built with Laravel Queue, HTTP client, and Symfony DomCrawler.



