larra-press/blog-poster
=======================

Automated poster for your Laravel-based blog. Configure the scraper to fetch the data from the chosen third-party source, test it, set up a job, and enjoy.

Latest version: 1.1.2 · License: MIT · Requires PHP >=7.2

[Source](https://github.com/LarraPress/blog-poster) · [Packagist](https://packagist.org/packages/larra-press/blog-poster)

 [![](https://github.com/LarraPress/blog-poster/raw/master/public/assets/img/logo_trans.png)](https://github.com/LarraPress/blog-poster/blob/master/public/assets/img/logo_trans.png)
LarraPress BlogPoster
=====================

###  Autoscraping from third party sources, automatically posting to DB, downloading media files to your storage!


About Package
=============

This package was developed by [Alexey Khachatryan](https://github.com/alkhachatryan) for personal use. The author later decided to release it publicly and created the [LarraPress Project](https://github.com/LarraPress). The project aims to help developers build powerful blogs with the help of third-party packages. The purpose of this package is to scrape articles from third-party sources and post them on your blog. There is still a lot to add and fix, because the package is in alpha. Feel free to report bugs, ask questions, and open PRs.

So far this package has the following features:

- Scrape posts from third party resources
- Download selected media files
- Remove useless elements from scraped articles
- Work with lazy-loaded media files by replacing HTML tag attribute values
- Detect duplicates
- Scrape multi-value elements such as article tags
- Create thumbnails
- Test the job before publishing

Installation
============


```
composer require larra-press/blog-poster
```

Configuration
=============


### Publish package assets


```
php artisan vendor:publish --tag=larrapress-blog-poster
```

### Add routes


```
LarraPress\BlogPoster\Facades\BlogPoster::routes();
```

ATTENTION! These routes MUST be registered behind an **auth** middleware so that not everyone can edit your blog poster. For example:

```
Route::middleware('auth')->group(function () {
    LarraPress\BlogPoster\Facades\BlogPoster::routes();
});
```

### Run migrations, create required tables


```
php artisan migrate
```

### Create your scraping job

You can create a single scraping job class that handles all of your scraping jobs, or a separate job class for each one. To create a scraping job class that works for all of your scraping jobs:

```
php artisan make:scraping_job ScrapingJobName
```

Or you can create a separate job dedicated to CNN, or to any other source you want:

```
php artisan make:scraping_job ScrapingCNNSource
```

What matters is not what you call them, but how you use them.

Queues
======

Because scraping a website takes some time to finish, the package uses Laravel queues. If you don't want to use queues, you can override the parent ScrapingJob class, `\LarraPress\BlogPoster\Jobs\ScrapingJob`, and remove the queueable traits and interfaces.

Setting Up your first scraping job
==================================

ScrapingJob classes handle a ScrapingJobModel, which holds all of the configuration. To create your scraping job, go to the dashboard. The URL of the dashboard depends on how and where you registered its routes. If you are not sure where they are, run this command:

```
php artisan route:list # on UNIX machines you can filter by adding "| grep blog-poster" without quotation marks
```

1. Click on Add New Job button [![image](https://user-images.githubusercontent.com/22774727/131214205-35a46ff7-38d5-4ae3-b9a7-29c2b30d5021.png)](https://user-images.githubusercontent.com/22774727/131214205-35a46ff7-38d5-4ae3-b9a7-29c2b30d5021.png)
2. Fill Job Properties Form [![image](https://user-images.githubusercontent.com/22774727/131214265-85796f37-6028-45e9-98a6-685027dbd374.png)](https://user-images.githubusercontent.com/22774727/131214265-85796f37-6028-45e9-98a6-685027dbd374.png)

- **Name** - the name of the source; it is just a hint for you
- **Source** - the full URL of the web page that lists the articles/posts
- **Icon** - the icon of the source. You can enter an icon URL manually or click the PARSE button to fetch it
- **Identifier In List** - the selector of a single post in the list. It must be the selector of an anchor element
- **Category** - the category in which articles from this source will be posted
- **Daily Limit** - some sources post a lot of articles, so you can set a daily limit for this source
- **Is Draft** - the status of the scraping job. Useful when you are running tests or have decided to pause scraping from this source

3. Add New Attribute [![image](https://user-images.githubusercontent.com/22774727/131214380-d52ebab3-97b7-47db-9c6a-59473da85bcb.png)](https://user-images.githubusercontent.com/22774727/131214380-d52ebab3-97b7-47db-9c6a-59473da85bcb.png)

Each post/article has a title, a body, an image (with a thumbnail), tags, and so on. We call these elements Article Attributes. If you want to parse titles, bodies, and images, you need to create three Article Attributes.

In this box you can see 3 tabs:

**Attribute Main Configs** - the basic information about the attribute. It contains:

- As Thumbnail - enable this if the selector points to an image and you want to turn it into a thumbnail. Note that the original file will not be downloaded. To have both the full image and the thumbnail, you need to create two Article Attributes
- Is File - tells the Crawler that it must download the content the selector points to
- Is HTML - useful for article bodies, where the HTML may contain comments or other unwanted content
- Attribute Name - this name is processed by the Crawler and then passed to the ScrapingJob class, where you can work with it. It is the index of the attribute.
- Attribute Selector - the CSS selector of the attribute
- Attribute Type - There are three types so far: array, URL, and default. If you want to scrape an image or some other file, set the type to URL. This tells the Crawler that the value is a URL. Sometimes the URL is not complete, like this: `/path/to/image.jpg`. If you want to scrape article tags (there are many tags), use the array type. This tells the Crawler that the article contains many elements matching this selector and that all of them must be scraped
- Custom Tag Attribute - Modern blogs often use lazy loading, so the real URL of the media may not be in the `src` attribute but, say, in `srcset`. Set **srcset** here to get the URL from a different attribute.
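
To illustrate why the URL type matters for incomplete URLs, here is a minimal sketch of how a relative value such as `/path/to/image.jpg` could be turned into an absolute URL, using the job's Source URL as the base. The function `resolveScrapedUrl` is hypothetical and is not part of the package; the Crawler's actual resolution logic may differ.

```php
<?php

/**
 * Hypothetical helper (not part of the package): turn a scraped URL value
 * into an absolute URL, using the scraping job's Source URL as the base.
 */
function resolveScrapedUrl(string $sourceUrl, string $value): string
{
    // Already absolute (has a scheme such as https): nothing to do.
    if (parse_url($value, PHP_URL_SCHEME) !== null) {
        return $value;
    }

    $base = parse_url($sourceUrl);
    $origin = $base['scheme'] . '://' . $base['host']
        . (isset($base['port']) ? ':' . $base['port'] : '');

    // Protocol-relative URL, e.g. //cdn.example.com/image.jpg
    if (strpos($value, '//') === 0) {
        return $base['scheme'] . ':' . $value;
    }

    // Root-relative URL, e.g. /path/to/image.jpg
    if (strpos($value, '/') === 0) {
        return $origin . $value;
    }

    // Path-relative URL: resolve against the directory of the source page.
    $path = $base['path'] ?? '/';
    $directory = substr($path, 0, strrpos($path, '/') + 1);

    return $origin . $directory . $value;
}
```

With a Source of `https://example.com/news/latest`, the value `/path/to/image.jpg` resolves to `https://example.com/path/to/image.jpg`. The sketch does not normalize `../` segments or query strings.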

**Ignoring Elements**[![image](https://user-images.githubusercontent.com/22774727/131214667-6c97a419-ce8d-4da7-a673-204bc83d9683.png)](https://user-images.githubusercontent.com/22774727/131214667-6c97a419-ce8d-4da7-a673-204bc83d9683.png)

The original article body may contain elements that need to be removed, such as injected ads or referral links. Just create a new Ignoring Attribute and add the selector of the HTML tag you want to remove from the body or from anywhere else.
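
Conceptually, ignoring an element means deleting every matching node from the scraped HTML before it is stored. The sketch below shows the idea with PHP's built-in DOM extension. It is not the package's implementation: `removeIgnoredElements` is a hypothetical helper, and it takes an XPath expression instead of a CSS selector so that it stays dependency-free (the package lists Symfony CSS Selector among its dependencies, and that library can convert CSS selectors to XPath).

```php
<?php

/**
 * Hypothetical helper (not part of the package): remove every node matching
 * an XPath expression from an HTML fragment and return the cleaned HTML.
 */
function removeIgnoredElements(string $html, string $xpathExpression): string
{
    $document = new DOMDocument();

    // Wrap the fragment in a known root element and silence the warnings
    // libxml raises for HTML5 tags or sloppy markup.
    libxml_use_internal_errors(true);
    $document->loadHTML('<?xml encoding="UTF-8"><div id="scraped-root">' . $html . '</div>');
    libxml_clear_errors();

    $xpath = new DOMXPath($document);

    // Copy the matches into an array so that removing nodes cannot
    // disturb the iteration.
    foreach (iterator_to_array($xpath->query($xpathExpression)) as $node) {
        $node->parentNode->removeChild($node);
    }

    // Serialize only the children of the wrapper, not the wrapper itself.
    $root = $xpath->query('//div[@id="scraped-root"]')->item(0);
    $clean = '';
    foreach ($root->childNodes as $child) {
        $clean .= $document->saveHTML($child);
    }

    return $clean;
}

// Usage: drop every <div class="ad"> from a scraped body.
$body = '<p>Hello</p><div class="ad">Buy now</div><p>World</p>';
$clean = removeIgnoredElements(
    $body,
    '//div[contains(concat(" ", normalize-space(@class), " "), " ad ")]'
);
```

After the call, `$clean` still contains both paragraphs but no longer contains the ad block. Exact whitespace in the serialized output depends on libxml.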

**Replacing Elements**[![image](https://user-images.githubusercontent.com/22774727/131214721-cbe81d3b-6d21-4f8d-a8f9-656ee7310783.png)](https://user-images.githubusercontent.com/22774727/131214721-cbe81d3b-6d21-4f8d-a8f9-656ee7310783.png)

If the body of the article you want to scrape has lazy-loaded media, you can use this feature. Unlike the *Custom Tag Attribute* field on the Attribute Main Configs tab, this feature works inside a body or anywhere else. For example, if you want to scrape a single image and get its URL from a custom attribute, use *Custom Tag Attribute*. If you want to scrape an article body that **contains** lazy-loaded media, use Replacing Elements. The difference between the two is that Custom Tag Attribute works on a single element with a specific selector, while Replacing Elements works on the CHILD elements of the element with a specific selector.
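
The effect of Replacing Elements on child nodes can be pictured with the following sketch, which copies a lazy-loading attribute into `src` for every `<img>` inside a scraped body. It is an illustration only: `replaceLazyAttributes` is a hypothetical helper, and the `data-src` attribute name is an assumption about what a source site might use, not something the package prescribes.

```php
<?php

/**
 * Hypothetical helper (not part of the package): for every <img> inside an
 * HTML fragment, move the value of a lazy-loading attribute into another
 * attribute (by default from data-src into src).
 */
function replaceLazyAttributes(string $html, string $from = 'data-src', string $to = 'src'): string
{
    $document = new DOMDocument();

    // Wrap the fragment in a known root element and silence libxml warnings.
    libxml_use_internal_errors(true);
    $document->loadHTML('<?xml encoding="UTF-8"><div id="scraped-root">' . $html . '</div>');
    libxml_clear_errors();

    // Only attributes change here, so iterating the node list is safe.
    foreach ($document->getElementsByTagName('img') as $image) {
        if ($image->hasAttribute($from)) {
            $image->setAttribute($to, $image->getAttribute($from));
            $image->removeAttribute($from);
        }
    }

    // Serialize only the children of the wrapper, not the wrapper itself.
    $xpath = new DOMXPath($document);
    $root = $xpath->query('//div[@id="scraped-root"]')->item(0);
    $result = '';
    foreach ($root->childNodes as $child) {
        $result .= $document->saveHTML($child);
    }

    return $result;
}

// Usage: the placeholder image is replaced by the real media URL.
$body = '<p>Text</p><img src="placeholder.gif" data-src="https://example.com/real.jpg">';
$fixed = replaceLazyAttributes($body);
```

After the call, the `<img>` in `$fixed` points at `https://example.com/real.jpg` and no longer carries a `data-src` attribute.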

Run scraping job
================

After you create a scraping job class and a model with all of the configuration, you can start the scraping process. Just dispatch the ScrapingJob job and pass the newly created model to the job's constructor.

TODO
====


- Handle errors from Crawler and pass to the user while testing
- Handle all errors from Crawler and properly log
- Create queue management in dashboard to check the health and status of scraping queue
- Write tests
- Write full documentation

Security
========


If you discover any security related issues, please email instead of using the issue tracker.

Credits
=======


- [Larra Press](https://github.com/larrapress/)
- [All contributors](https://github.com/larrapress/blog-poster/graphs/contributors)

Used packages
=============


- [Theme by Creative Tim](https://www.creative-tim.com/)
- [Spatie Enum](https://github.com/spatie/enum)
- [Symfony CSS Selector](https://github.com/symfony/css-selector)
- [Symfony DOM Crawler](https://github.com/symfony/dom-crawler)

Versioning
==========

Version example: 1.0.0. The package version consists of three parts:

- Global update
- Feature
- Bugfix

