adrianoferreira/document-distance
=================================

A simple library for calculating the distance between two documents through the cosine similarity algorithm.

Document Distance - Cosine Similarity
=====================================

[![Build Status](https://camo.githubusercontent.com/3b40907698ae78fb487af24d6b5fc3cca3c2062b99e1dd9671b6cc0cb471431f/68747470733a2f2f7472617669732d63692e6f72672f61647269616e6f7366657272656972612f646f63756d656e742d64697374616e63652e7376673f6272616e63683d6d6173746572)](https://travis-ci.org/adrianosferreira/document-distance)[![Build Status](https://camo.githubusercontent.com/a36e4d37be0566646da84c104532118edc6ce34b32c39ecaf806102abed693a4/68747470733a2f2f636f6465636f762e696f2f67682f61647269616e6f7366657272656972612f646f63756d656e742d64697374616e63652f6272616e63682f6d61737465722f67726170682f62616467652e737667)](https://codecov.io/gh/adrianosferreira/document-distance)[![Total Downloads](https://camo.githubusercontent.com/475c8fe590d4195efb844ba06648583da9a21faf34a0070495d7b9ce74c744af/68747470733a2f2f706f7365722e707567782e6f72672f61647269616e6f66657272656972612f646f63756d656e742d64697374616e63652f646f776e6c6f616473)](https://packagist.org/packages/adrianoferreira/document-distance)[![License](https://camo.githubusercontent.com/394552573cd3605ada2790d125834e579425f002791f8f2bdd2a6ec74b15872f/68747470733a2f2f706f7365722e707567782e6f72672f61647269616e6f66657272656972612f646f63756d656e742d64697374616e63652f6c6963656e7365)](https://packagist.org/packages/adrianoferreira/document-distance)

Document distance (or, inversely, document similarity) measures how much the content of two documents overlaps.

One of the most common ways to compute it is cosine similarity, a vector-based similarity measure. That is what this library implements.

The cosine distance of two documents is defined by the angle between their feature vectors, which in our case are word-frequency vectors. The word-frequency distribution of a document is a mapping from each word to the number of times it occurs.
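Written out, this is the standard cosine-similarity definition. The symbols are introduced here for illustration and do not come from the library: $A$ and $B$ are the word-frequency vectors of the two documents, and $a_w$, $b_w$ are the counts of word $w$ in each.

$$
\cos\theta = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}
           = \frac{\sum_{w} a_w b_w}{\sqrt{\sum_{w} a_w^{2}} \, \sqrt{\sum_{w} b_w^{2}}},
\qquad
\theta = \arccos\left(\frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}\right)
$$

Because word counts are never negative, $\cos\theta$ lies in $[0, 1]$ and $\theta$ lies in $[0, \pi/2]$: an angle of $0$ means the two word distributions point in the same direction, and $\pi/2$ means the documents share no words.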

[![Cosine Similarity](https://camo.githubusercontent.com/aa3651304382a95a171ba4a9610f583c273c45a68dae814aa80c7447e758fc14/68747470733a2f2f7777772e616e647265772e636d752e6564752f636f757273652f31352d3132312f6c6162732f48572d34253230446f63756d656e7425323044697374616e63652f706978312e626d70)](https://camo.githubusercontent.com/aa3651304382a95a171ba4a9610f583c273c45a68dae814aa80c7447e758fc14/68747470733a2f2f7777772e616e647265772e636d752e6564752f636f757273652f31352d3132312f6c6162732f48572d34253230446f63756d656e7425323044697374616e63652f706978312e626d70)
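To make the idea concrete, here is a minimal standalone sketch of the technique in plain PHP. It is not the library's source code: the function names `wordFrequencies` and `angleBetween` are hypothetical, and the tokenization (lowercase, ASCII letters and digits only) is an assumption made for brevity.

```php
<?php
declare(strict_types=1);

/**
 * Map each word in $text to the number of times it occurs.
 * Hypothetical helper for illustration; not part of the library's API.
 */
function wordFrequencies(string $text): array
{
    // Assumption: lowercase the text and treat any run of characters
    // other than ASCII letters and digits as a word separator.
    $words = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);

    return array_count_values($words);
}

/**
 * Angle in radians between the word-frequency vectors of two texts.
 * 0 means identical word distributions; pi/2 means no words in common.
 */
function angleBetween(string $textA, string $textB): float
{
    $a = wordFrequencies($textA);
    $b = wordFrequencies($textB);

    if ($a === [] || $b === []) {
        // The angle is undefined for a zero-length vector.
        throw new InvalidArgumentException('Both texts must contain at least one word.');
    }

    // Dot product: only words present in both documents contribute.
    $dot = 0;
    foreach ($a as $word => $count) {
        $dot += $count * ($b[$word] ?? 0);
    }

    // Euclidean norms of the two vectors.
    $square = static fn (int $n): int => $n * $n;
    $normA = sqrt(array_sum(array_map($square, $a)));
    $normB = sqrt(array_sum(array_map($square, $b)));

    // Clamp to guard against floating-point drift slightly above 1.
    $cosine = min(1.0, $dot / ($normA * $normB));

    return acos($cosine);
}
```

For example, `angleBetween('test 123 456', 'test 678 000')` compares two three-word texts that share only the word `test`, so the cosine is 1/3 and the angle is roughly 1.23 radians. How the library itself tokenizes text, or converts the angle into a percentage, is not described in this README.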

Installation
------------

It's recommended that you use Composer to install this library.

```
$ composer require adrianoferreira/document-distance:dev-master
```

Usage
-----

Calculating similarity percentage between two remote files:

```
echo ( new \AdrianoFerreira\DD\File( 'http://test.com/test.txt', 'http://test.com/test2.txt' ) )->getPercent();
```

Calculating arc size between two local files:

```
echo ( new \AdrianoFerreira\DD\File( __DIR__ . '/test.txt', __DIR__ . '/test2.txt' ) )->getArcSize();
```

Calculating similarity percentage between two arbitrary strings:

```
echo ( new \AdrianoFerreira\DD\Text( 'test 123 456', 'test 678 000' ) )->getPercent();
```

Calculating arc size between arbitrary strings:

```
echo ( new \AdrianoFerreira\DD\Text( 'test 123 456', 'test 678 000' ) )->getArcSize();
```

References
----------

This implementation is based on an MIT document:
