Abandoned → [layered/page-meta](/?search=layered%2Fpage-meta) | Library | [Utility & Helpers](/categories/utility)

layered/url-preview
===================

Get detailed info for any URL on the internet! Scraper for HTML, OpenGraph, Schema data

2.0.1 (5y ago) | 117173 | [2 issues](https://github.com/LayeredStudio/page-meta/issues) | MIT | PHP

Since Sep 4 | Pushed 5y ago

[Source](https://github.com/LayeredStudio/page-meta) | [Packagist](https://packagist.org/packages/layered/url-preview) | [RSS](/packages/layered-url-preview/feed)

Page Meta 🕵
===========

**Page Meta** is a PHP library that can retrieve detailed info for any URL on the internet! It uses data from HTML meta tags and [OpenGraph](http://ogp.me/), with a fallback to detailed HTML scraping.

### Highlights

- Works for any valid URL on the internet!
- Follows page redirects
- Uses all scraping methods available: HTML tags, OpenGraph, Schema data

### Potential use cases

- Display Info Cards for links in an article
- Rich preview for links in messaging apps
- Extract info from a user-submitted URL

[![layered-page-meta-link-card](https://user-images.githubusercontent.com/263021/100539808-35ad3300-3239-11eb-8f47-381153246e32.png)](https://user-images.githubusercontent.com/263021/100539808-35ad3300-3239-11eb-8f47-381153246e32.png)

How to use
----------

#### Installation

Add `layered/page-meta` as a dependency in your project's `composer.json` file:

```
$ composer require layered/page-meta
```

#### Usage

Create a `UrlPreview` instance, then call the `loadUrl($url)` method with your URL as the first argument. Retrieve the preview data with the `get($section)` or `getAll()` methods:

```
require 'vendor/autoload.php';

$preview = new Layered\PageMeta\UrlPreview([
	'HTTP_USER_AGENT'	=>	'Mozilla/5.0 (compatible; YourApp/1.0; +https://example.com)'
]);
$preview->loadUrl('https://www.instagram.com/p/BbRyo_Kjqt1/');

$allPageData = $preview->getAll();	// contains all scraped data
$siteInfo = $preview->get('site');	// get general info about the website

```

#### Behind the scenes

The library downloads the HTML source of the URL you provided, then uses specialized scrapers to extract pieces of information. The core scrapers live in `src/scrapers/` and extract general info for a page: title, author, description, page type, main image, etc. If you would like to extract a new field, see the [Extending the library](#extending-the-library) section.
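
To make the "OpenGraph first, plain HTML as fallback" idea concrete, here is a minimal sketch using PHP's built-in DOM extension. It is not the library's scraper code: the function name `extractPageInfo()` and the exact fallback order are assumptions made for illustration.

```php
<?php
// Illustrative sketch only, not the library's actual scrapers.
// Prefers OpenGraph meta tags and falls back to plain HTML tags.

function extractPageInfo(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so keep parser warnings quiet.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Returns the "content" attribute of the first matching <meta>, or null.
    $meta = function (string $query) use ($xpath): ?string {
        $nodes = $xpath->query($query);
        if ($nodes === false || $nodes->length === 0) {
            return null;
        }
        $value = trim($nodes->item(0)->getAttribute('content'));
        return $value !== '' ? $value : null;
    };

    $titleNodes = $xpath->query('//title');
    $htmlTitle = $titleNodes->length > 0 ? trim($titleNodes->item(0)->textContent) : null;

    return [
        'title'       => $meta('//meta[@property="og:title"]') ?? $htmlTitle,
        'description' => $meta('//meta[@property="og:description"]')
            ?? $meta('//meta[@name="description"]'),
        'image'       => $meta('//meta[@property="og:image"]'),
    ];
}

// Sample markup: an OpenGraph title, but only a plain meta description.
$html = '<html><head><title>Plain title</title>'
    . '<meta property="og:title" content="OpenGraph title">'
    . '<meta name="description" content="Plain description">'
    . '</head><body></body></html>';

$info = extractPageInfo($html);
```

With this sample markup the title comes from the OpenGraph tag, the description from the plain `<meta name="description">` fallback, and the image stays `null` because no tag provides one.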

The User Agent and extra headers can make a big difference when downloading HTML from a website. Some websites forbid scraping and hide their content when they detect a tool like this one. Make sure to read their developer docs and Terms of Service.

The default User Agent is blocked on sites like Twitter, Instagram, Facebook and others. A workaround is to use this one (thanks for the tip [PVGrad](https://github.com/LayeredStudio/page-meta/issues/2)):

`'HTTP_USER_AGENT'	=>	'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'`

#### Returned data

Returned data is an `Array` in the following format (shown here JSON-encoded):

```
{
	"site": {
		"secure":		true,
		"url":			"https:\/\/www.instagram.com",
		"icon":			"https:\/\/www.instagram.com\/static\/images\/ico\/favicon-192.png\/b407fa101800.png",
		"language":		"en",
		"responsive":	true,
		"name":			"Instagram"
	},
	"page": {
		"type":			"photo",
		"url":			"https:\/\/www.instagram.com\/p\/BbRyo_Kjqt1\/",
		"title":		"GitHub on Instagram",
		"description":	"There\u2019s still time to join the #GitHubGameOff and build a game inspired by throwbacks. Get started\u2026",
		"image":		{
			"url": "https:\/\/scontent-mad1-1.cdninstagram.com\/vp\/73b1790d77548031327e64ee83196706\/5B4AD567\/t51.2885-15\/e35\/23421974_1768724519826754_3855913942043852800_n.jpg"
		}
	},
	"author": {
		"name":			"GitHub",
		"handle":		"@github",
		"url":			"https:\/\/www.instagram.com\/github\/"
	},
	"app_links": {
		"ios": {
			"url": "nflx:\/\/www.netflix.com\/title\/80014749",
			"app_store_id": "363590051",
			"app_name": "Netflix",
			"store_url": "https:\/\/itunes.apple.com\/us\/app\/Netflix\/id363590051"
		},
		"android": {
			"url": "nflx:\/\/www.netflix.com\/title\/80014749",
			"package": "com.netflix.mediaclient",
			"app_name": "Netflix",
			"store_url": "https:\/\/play.google.com\/store\/apps\/details?id=com.netflix.mediaclient"
		}
	}
}

```

See [`UrlPreview::getAll()`](#getall-array) for info on each returned field.
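
As a rough sketch of how this array might be consumed, the snippet below builds a small HTML link card from data in the documented shape. `renderLinkCard()` is a hypothetical helper, not part of the library, and the sample array is hard-coded so the example runs without a network request. Every field is treated as optional, since a page may not expose all of them.

```php
<?php
// Hypothetical helper: renders a link card from data shaped like getAll().
function renderLinkCard(array $data): string
{
    // Fall back from page-level fields to site-level fields, then to defaults.
    $title = $data['page']['title'] ?? $data['site']['name'] ?? 'Untitled';
    $url = $data['page']['url'] ?? $data['site']['url'] ?? '#';
    $description = $data['page']['description'] ?? '';
    $image = $data['page']['image']['url'] ?? null;

    // Scraped values are untrusted input, so escape them before output.
    $e = fn (string $s): string => htmlspecialchars($s, ENT_QUOTES, 'UTF-8');

    $html = '<a class="link-card" href="' . $e($url) . '">';
    if ($image !== null) {
        $html .= '<img src="' . $e($image) . '" alt="">';
    }
    $html .= '<strong>' . $e($title) . '</strong>';
    if ($description !== '') {
        $html .= '<p>' . $e($description) . '</p>';
    }
    return $html . '</a>';
}

// Sample data in the documented format (normally the result of getAll()).
$data = [
    'site' => ['name' => 'Instagram', 'url' => 'https://www.instagram.com'],
    'page' => [
        'title' => 'GitHub on Instagram',
        'url'   => 'https://www.instagram.com/p/BbRyo_Kjqt1/',
    ],
];

$card = renderLinkCard($data);
```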

Public API
----------

The `UrlPreview` class provides the following public methods:

#### `__construct(array $headers): UrlPreview`

Create the `UrlPreview` instance. Pass any extra headers to send when requesting the page URL.

#### `loadUrl(string $url): UrlPreview`

Load and start the scrape process for any valid URL

#### `getAll(): array`

Get all data scraped from page

**Return:** `Array` with scraped data in following format:

- `site` - info about the website
    - `url` - main site URL
    - `name` - site name, ex: 'Instagram' or 'Medium'
    - `secure` - Boolean true|false, depending on whether the site is served over HTTPS
    - `responsive` - Boolean true|false. `True` if site has `viewport` meta tag present. Basic check for responsiveness
    - `icon` - site icon
    - `language` - ISO 639-1 language code, ex: `en`, `es`
- `page` - info about the page at current URL
    - `type` - page type, ex: `website`, `article`, `profile`, `video`, etc
    - `url` - canonical URL for the page
    - `title` - page title
    - `description` - page description
    - `image` - `Array` containing image info, if present:
        - `url` - image URL
        - `width` - image width
        - `height` - image height
    - `video` - `Array` containing video info, if found on page:
        - `url` - video URL
        - `width` - video width
        - `height` - video height
- `author` - info about the content author, ex:
    - `name` - Author's name on a blog, person's name on social network sites
    - `handle` - Social media site username
    - `url` - Author URL for more articles or Profile URL on social network sites
- `app_links` - `Array` containing apps linked to page, like:
    - `ios` - iOS app
        - `url` - link for in-app action, ex: 'nflx://www.netflix.com/title/80014749'
        - `app_store_id` - Apple AppStore app ID
        - `app_name` - name of the app
        - `store_url` - link to installable app
    - `android` - Android app
        - `url` - link for in-app action, ex: 'nflx://www.netflix.com/title/80014749'
        - `package` - Android PlayStore app ID
        - `app_name` - name of the app
        - `store_url` - link to installable app
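
One possible use of the `app_links` section is picking the best link for a given platform. The sketch below assumes a hypothetical helper `appLinkFor()` (not part of the library) that prefers the in-app `url`, then the `store_url`, and finally the page's canonical URL. The sample values are taken from the format shown earlier.

```php
<?php
// Hypothetical helper: chooses a link for a platform from getAll()-shaped data.
function appLinkFor(array $data, string $platform): ?string
{
    $app = $data['app_links'][$platform] ?? [];

    // Prefer the in-app deep link, then the store page, then the web page.
    return $app['url'] ?? $app['store_url'] ?? $data['page']['url'] ?? null;
}

// Sample data in the documented format.
$data = [
    'page' => ['url' => 'https://www.netflix.com/title/80014749'],
    'app_links' => [
        'ios' => [
            'url'       => 'nflx://www.netflix.com/title/80014749',
            'store_url' => 'https://itunes.apple.com/us/app/Netflix/id363590051',
        ],
        'android' => [
            'store_url' => 'https://play.google.com/store/apps/details?id=com.netflix.mediaclient',
        ],
    ],
];

$iosLink = appLinkFor($data, 'ios');         // in-app deep link
$androidLink = appLinkFor($data, 'android'); // store page, no deep link in this sample
$otherLink = appLinkFor($data, 'desktop');   // falls back to the page URL
```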

#### `get(string $section): array`

Get data for one scraped section: `site`, `page`, `author` or `app_links`.

**Return:** `Array` with section scraped data. See [`UrlPreview::getAll()`](#getall-array) for data format

#### `addListener(string $eventName, callable $listener, int $priority = 0): UrlPreview`

Attach an event listener to `UrlPreview` to hook into data processing or the scrape process. Arguments:

- `$eventName` - on which event to listen. Available:
    - `page.scrape` - fired when the scraping process starts
    - `data.filter` - fired when data is requested by the `get()` or `getAll()` methods
- `$listener` - a callable reference, which will get the `$event` parameter with available data
- `$priority` - order in which the callable should be executed
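
To illustrate how `$priority` can affect execution order, here is a self-contained sketch with a tiny stand-in dispatcher. `TinyDispatcher` is hypothetical and not part of the library. It follows the convention of Symfony's EventDispatcher, where listeners with a higher priority run first; that the library behaves the same way is an assumption, so check its source before relying on a specific order.

```php
<?php
// Hypothetical stand-in dispatcher, for illustration only.
final class TinyDispatcher
{
    /** @var array<string, array<int, callable[]>> listeners grouped by event, then priority */
    private array $listeners = [];

    public function addListener(string $eventName, callable $listener, int $priority = 0): self
    {
        $this->listeners[$eventName][$priority][] = $listener;
        return $this;
    }

    public function dispatch(string $eventName, array $data): array
    {
        $byPriority = $this->listeners[$eventName] ?? [];
        krsort($byPriority); // highest priority first (Symfony's convention)

        foreach ($byPriority as $listeners) {
            foreach ($listeners as $listener) {
                // Each listener receives the data and returns its updated version.
                $data = $listener($data);
            }
        }
        return $data;
    }
}

$dispatcher = new TinyDispatcher();
$dispatcher
    ->addListener('data.filter', function (array $data): array {
        $data['order'][] = 'default'; // registered first, priority 0
        return $data;
    })
    ->addListener('data.filter', function (array $data): array {
        $data['order'][] = 'early';   // registered second, priority 10
        return $data;
    }, 10);

$result = $dispatcher->dispatch('data.filter', ['order' => []]);
```

Here the listener registered with priority 10 runs before the default-priority one, even though it was added later.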

### Extending the library

If you need more scraped data for a URL, you can attach extra functionality to the **PageMeta** library. Example for returning the 'Terms and Conditions' link from pages:

```
use Symfony\Component\EventDispatcher\Event;

$previewer = new \Layered\PageMeta\UrlPreview;
$previewer->addListener('page.scrape', function(Event $event) {
	$currentScrapedData = $event->getData();	// check data from other scrapers
	$crawler = $event->getCrawler();			// instance of DomCrawler Symfony Component
	$termsLink = '';

	$crawler->filter('a[href*=terms]')->each(function($node) use(&$termsLink) {
		$termsLink = $node->attr('href');
	});

	// forwards the scraped data
	$event->addData('site', [
		'termsLink'	=>	$termsLink
	]);
});
$previewer->loadUrl('http://github.com');

```

More
----

Please report any issues [on GitHub](https://github.com/LayeredStudio/page-meta/issues).

[Any contributions are welcome](CONTRIBUTING.md)

###  Health Score

32 (Low). Better than 72% of packages.

- Maintenance: 16. Infrequent updates, may be unmaintained
- Popularity: 22. Limited adoption so far
- Community: 8. Small or concentrated contributor base
- Maturity: 68. Established project with proven stability
- Bus Factor: 1. Top contributor holds 100% of commits, a single point of failure

How is this calculated?

- **Maintenance (25%)**: Last commit recency, latest release date, and issue-to-star ratio. Uses a 2-year decay window.
- **Popularity (30%)**: Total and monthly downloads, GitHub stars, and forks. Logarithmic scaling prevents top-heavy scores.
- **Community (15%)**: Contributors, dependents, forks, watchers, and maintainers. Measures real ecosystem engagement.
- **Maturity (30%)**: Project age, version count, PHP version support, and release stability.

###  Release Activity

- Cadence: every ~168 days (recently: every ~238 days)
- Total releases: 8
- Last release: 1994d ago
- Major versions:
    - v0.2 → v1.0 (2018-03-17)
    - 1.1.1 → 2.0 (2020-09-27)

### Community

Maintainers

![](https://www.gravatar.com/avatar/04be6a13bd2e1d72af13c56198468f669f39660d5c2cc8a54f8af3767d2b88bd?d=identicon)[AndreiHere](/maintainers/AndreiHere)

---

Top Contributors

[![AndreiIgna](https://avatars.githubusercontent.com/u/263021?v=4)](https://github.com/AndreiIgna "AndreiIgna (11 commits)")

---

Tags

embed, link-preview, oembed, opengraph, schema, scraper, url-preview, link preview

###  Code Quality

Tests: PHPUnit
