
kmvan/participle
================

This is a participle library

Version 10.0.0 (6y ago) · License: Apache-2.0 · Language: PHP · Requires PHP >=7.3

Since Jan 17 · Pushed 6y ago

[Source](https://github.com/kmvan/php-participle) · [Packagist](https://packagist.org/packages/kmvan/participle)


php-participle: a pure-PHP word segmenter
=========================================

[![](https://camo.githubusercontent.com/97a5caddad07a7eb9e8268ec19302fbf46d48313ca71e953d4b9ec863ec350f5/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f737570706f72742d3939362e6963752d7265642e737667)](https://github.com/996icu/996.ICU/blob/master/LICENSE)

QQ discussion group: 731475644

Requirements
------------

- PHP 7.3+
- The PHP igbinary extension

Installation
------------

```
composer require kmvan/participle
```

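Segmentation notes
------------------

- Three segmentation methods are provided:
    - `getWord`: longest-match-first segmentation. The fastest.
    - `getShortWord`: fine-grained segmentation. Slightly slower than `getWord`.
    - `getAutoWord`: automatic segmentation. Gives the best results.
- Custom dictionaries are supported, so you can add your own words. A dictionary can be stored as text (`json`) or as binary (`igb`); the binary format is smaller and loads faster.
- `dict.igb` contains 175,662 words. Contributions of new words to `dict.txt` are welcome; each line has the format `word \t idf \t part of speech`.
    - To obtain the idf, search for the word on Baidu and compute `Math.log(100000001 / resultCount)`. If you know a better method, suggestions are welcome.
    - The part of speech is the index of the tag in \[punctuation, noun, verb, adjective, distinguishing word, pronoun, numeral, measure word, adverb, preposition, conjunction, particle, modal particle, onomatopoeia, interjection\]; punctuation is 0.
- Comparison of the three segmentation results: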

```
// Load Composer's autoloader (path assumed relative to the project root).
require 'vendor/autoload.php';

$fc = new \Kmvan\Participle\Word\Query([
    'dictType' => 'igb',
    // 'dictType' => 'json',
]);
$arr = $fc->getWord('北京大学生喝进口红酒，在北京大学生活区喝进口红酒');
// 北京大学|生喝|进口|红酒|，|在|北京大学|生活区|喝|进口|红酒
// $arr is an array. Each element has the structure
// [word, position, part of speech, whether the word is in the dictionary].
// Only the words are listed here.

$arr = $fc->getShortWord('北京大学生喝进口红酒，在北京大学生活区喝进口红酒');
// 北京|大学|生喝|进口|红酒|，|在|北京|大学|生活|区喝|进口|红酒

$arr = $fc->getAutoWord('北京大学生喝进口红酒，在北京大学生活区喝进口红酒');
// 北京|大学生|喝|进口|红酒|，|在|北京大学|生活区|喝|进口|红酒

// For comparison:
// QQ segmentation: http://nlp.qq.com/semantic.cgi#page2
// Baidu segmentation: http://ai.baidu.com/tech/nlp/lexical
```
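The comments in the example above list only the words, while each returned element is described as `[word, position, part of speech, in-dictionary flag]`. A minimal sketch of how such a result could be reduced to the `|`-joined form follows. The sample array is hypothetical: its position and part-of-speech values are made up for illustration and do not come from the library, and `joinWords` is a helper name introduced here.

```php
<?php
// Hypothetical sample mimicking the documented element structure:
// [word, position, part of speech, in-dictionary flag].
// The position and part-of-speech values are invented for illustration.
$arr = [
    ['北京', 0, 1, true],
    ['大学生', 2, 1, true],
    ['喝', 5, 2, true],
];

// Keep only the word (index 0) of each element and join with "|".
function joinWords(array $segments): string
{
    return implode('|', array_column($segments, 0));
}

echo joinWords($arr), PHP_EOL;
```

The same helper should work on the return value of any of the three methods, provided the element layout matches the description above.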

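Segmentation speed
------------------

Measured on an Alibaba Cloud machine with an `Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz`:

- `getWord`: 1.4 million characters per second
- `getShortWord`: 1.38 million characters per second
- `getAutoWord`: 400,000 characters per second

The test text was a 5,000-character passage copied from Baidu Baike.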
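To reproduce a characters-per-second figure on your own hardware, a rough timing harness along these lines may help. It is only a sketch: `charsPerSecond` and the stand-in segmenter are names introduced here, not part of the library, and the author's actual benchmark method is not documented.

```php
<?php
// Measure characters processed per second for any segmenter callable.
function charsPerSecond(callable $segment, string $text, int $rounds = 20): float
{
    // Count UTF-8 characters without relying on the mbstring extension.
    $chars = preg_match_all('/./us', $text);
    $start = microtime(true);
    for ($i = 0; $i < $rounds; $i++) {
        $segment($text);
    }
    // Guard against a zero elapsed time on very fast runs.
    $elapsed = max(microtime(true) - $start, 1e-9);
    return $chars * $rounds / $elapsed;
}

// Stand-in segmenter that splits into single characters. Replace it with
// a closure that calls, for example, $fc->getWord($text) to time the library.
$standIn = function (string $text): array {
    return preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
};

// 10 characters repeated 500 times gives a 5,000-character test text.
printf("%.0f chars/s\n", charsPerSecond($standIn, str_repeat('北京大学生喝进口红酒', 500)));
```

Results depend heavily on the machine, the PHP version, and the text, so treat any single run as indicative only.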

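Building a dictionary
---------------------

- The dictionary supports any UTF-8 characters.
- Dictionary size does not affect segmentation speed.

There is only one method, provided by `\Kmvan\Participle\Word\Insert`: `add(word, partOfSpeech = null)`.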

```
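// Two dictionary formats are currently supported: igb and json.
// igb requires the igbinary extension; igb files are smaller and load faster.
$dict = new \Kmvan\Participle\Word\Insert([
    // 'dictType' => 'igb',
    'dictType' => 'json',
]);

// Add a word to the dictionary: add(word, part of speech).
// Any language works; the word may contain any UTF-8 characters.
$dict->add('中国','n');

// Save the dictionary
$dict->save();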
```
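When preparing entries for `dict.txt` by hand, the `word \t idf \t part of speech` format and the idf formula from the segmentation notes can be sketched as below. The helper names `idf` and `dictLine`, the English tag labels, the four-decimal precision, and the sample result count are all assumptions made for illustration; they are not taken from the library.

```php
<?php
// Part-of-speech tags in the order given in the segmentation notes.
// The array index is the value written to the dictionary (punctuation = 0).
const POS_TAGS = ['punctuation', 'noun', 'verb', 'adjective', 'distinguishing word',
    'pronoun', 'numeral', 'measure word', 'adverb', 'preposition', 'conjunction',
    'particle', 'modal particle', 'onomatopoeia', 'interjection'];

// idf = ln(100000001 / resultCount), the PHP equivalent of Math.log(...).
function idf(int $resultCount): float
{
    if ($resultCount < 1) {
        throw new InvalidArgumentException('resultCount must be positive');
    }
    return log(100000001 / $resultCount);
}

// Build one tab-separated line: word \t idf \t part-of-speech index.
function dictLine(string $word, int $resultCount, string $pos): string
{
    $index = array_search($pos, POS_TAGS, true);
    if ($index === false) {
        throw new InvalidArgumentException("unknown part of speech: $pos");
    }
    return sprintf("%s\t%.4f\t%d", $word, idf($resultCount), $index);
}

// The result count here is a made-up number, not a real search result.
echo dictLine('中国', 50000000, 'noun'), PHP_EOL;
```

Rarer words have fewer search results and therefore a higher idf, which is why the result count sits in the denominator.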

Demo
----

- See the `demos` directory for details.

Other software by the original author
-------------------------------------

[A minimalist, high-performance PHP framework that supports \[swoole | php-fpm\] environments](https://github.com/lizhichao/one)

### Health Score

25 (Low; better than 37% of packages)

- Maintenance: 20. Infrequent updates; may be unmaintained.
- Popularity: 9. Limited adoption so far.
- Community: 8. Small or concentrated contributor base.
- Maturity: 53. Maturing project, gaining track record.
- Bus factor: 1. The top contributor holds 74.6% of commits, a single point of failure.


### Release Activity

- Cadence: unknown
- Total releases: 1
- Last release: 2312d ago

### Community

- Maintainers: kmvan
- Top contributors: [lizhichao](https://github.com/lizhichao) (50 commits), [kmvan](https://github.com/kmvan) (17 commits)
- Tags: participle-library, php

