lizhichao/word
==============

A word segmentation library written in pure PHP.

v2.1 · Apache-2.0 · PHP >=5.6.0 · [2 issues](https://github.com/lizhichao/VicWord/issues)


[Source](https://github.com/lizhichao/VicWord) · [Packagist](https://packagist.org/packages/lizhichao/word)


VicWord: a pure-PHP word segmenter
==================================

[![](https://camo.githubusercontent.com/97a5caddad07a7eb9e8268ec19302fbf46d48313ca71e953d4b9ec863ec350f5/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f737570706f72742d3939362e6963752d7265642e737667)](https://github.com/996icu/996.ICU/blob/master/LICENSE)

QQ discussion group: 731475644

Installation
------------

```
composer require lizhichao/word
```

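Segmentation overview
---------------------

- Three segmentation methods are provided:
    - `getWord`: length-first (longest-match) segmentation. The fastest.
    - `getShortWord`: fine-grained segmentation. Only slightly slower than `getWord`.
    - `getAutoWord`: automatic segmentation. Gives the best results.
- The dictionary is customizable: you can add your own words. Two dictionary formats are supported, the text format `json` and the binary format `igb`. The binary dictionary is smaller and loads faster.
- `dict.igb` contains 175,662 words. Contributions of new words to `dict.txt` are welcome. The line format is `word \t idf \t part of speech`.
    - To obtain the idf, search for the word on Baidu and compute `Math.log(100000001/number of results)`. If you know a better method, please share it.
    - The part of speech is the index into \[punctuation, noun, verb, adjective, distinguishing word, pronoun, numeral, measure word, adverb, preposition, conjunction, particle, modal particle, onomatopoeia, interjection\]. Punctuation is 0.
- Comparison of the three segmentation results: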

```
$fc = new VicWord();
$arr = $fc->getWord('北京大学生喝进口红酒，在北京大学生活区喝进口红酒');
// 北京大学|生喝|进口|红酒|，|在|北京大学|生活区|喝|进口|红酒
// $arr is an array. Each element has the structure
// [word, position, part of speech, whether the word is in the dictionary].
// Only the words are listed here.

$arr = $fc->getShortWord('北京大学生喝进口红酒，在北京大学生活区喝进口红酒');
// 北京|大学|生喝|进口|红酒|，|在|北京|大学|生活|区喝|进口|红酒

$arr = $fc->getAutoWord('北京大学生喝进口红酒，在北京大学生活区喝进口红酒');
// 北京|大学生|喝|进口|红酒|，|在|北京大学|生活区|喝|进口|红酒

// For comparison:
// QQ's segmenter: http://nlp.qq.com/semantic.cgi#page2
// Baidu's segmenter: http://ai.baidu.com/tech/nlp/lexical
```
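
The comment in the example above describes each result element as `[word, position, part of speech, in-dictionary flag]`. The following is a minimal sketch of how such a result might be turned into the `|`-separated form shown in the comments. The sample array is hand-written for illustration (its position and part-of-speech values are assumptions, not real library output), and `joinWords` and `dictionaryWords` are hypothetical helpers, not part of VicWord.

```php
<?php
// Hypothetical sample shaped like the documented result:
// [word, position, part of speech, in-dictionary flag].
// The concrete values are illustrative, not real library output.
$arr = [
    ['北京大学', 0, 1, true],
    ['生喝', 4, 0, false],
    ['进口', 6, 1, true],
];

/** Join the word field (index 0) of each token with a separator. */
function joinWords(array $tokens, $sep = '|')
{
    return implode($sep, array_column($tokens, 0));
}

/** Keep only the words whose fourth field marks them as dictionary words. */
function dictionaryWords(array $tokens)
{
    $known = array_filter($tokens, function ($t) {
        return (bool) $t[3];
    });
    return array_column(array_values($known), 0);
}

echo joinWords($arr), PHP_EOL; // prints 北京大学|生喝|进口
```

Filtering on the fourth field is one way to spot words the dictionary does not yet know, which may be candidates for adding to the dictionary.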

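Segmentation speed
------------------

Test machine: Alibaba Cloud, `Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz`.

- `getWord`: 1.4 million characters per second
- `getShortWord`: 1.38 million characters per second
- `getAutoWord`: 400,000 characters per second

The test text was a 5,000-character passage copied from Baidu Baike.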
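
The README does not show how these figures were measured. As a rough sketch, characters-per-second throughput could be estimated as below. `charsPerSecond` is a hypothetical helper, and the stand-in segmenter only splits text into single characters so that the block runs without the library. To time VicWord itself, pass a closure that calls `getWord`, `getShortWord`, or `getAutoWord`.

```php
<?php
/**
 * Estimate how many characters per second a segmentation callable processes.
 * $segment is any callable taking a string; $rounds repeats the call to
 * smooth out timer noise.
 */
function charsPerSecond(callable $segment, $text, $rounds = 10)
{
    // Count UTF-8 characters (not bytes) using PCRE's /u mode.
    $chars = preg_match_all('/./us', $text);
    $start = microtime(true);
    for ($i = 0; $i < $rounds; $i++) {
        $segment($text);
    }
    $elapsed = microtime(true) - $start;
    // Guard against a zero interval on very fast runs.
    return $chars * $rounds / max($elapsed, 1e-9);
}

// Stand-in segmenter (assumption): splits the text into single characters.
$standIn = function ($t) {
    return preg_split('//u', $t, -1, PREG_SPLIT_NO_EMPTY);
};

$rate = charsPerSecond($standIn, str_repeat('北京大学生喝进口红酒', 500));
printf("%.0f chars/s\n", $rate);
```

Results depend heavily on the machine, PHP version, and dictionary format, so numbers from a sketch like this are only comparable within one environment.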

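Building a dictionary
---------------------

- The dictionary supports any UTF-8 character.
- Dictionary size does not affect segmentation speed.

Words are added with a single method: `VicDict->add(word, part of speech = null)`.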

```
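require __DIR__.'/Lib/VicDict.php';

// Two dictionary formats are currently supported: igb and json.
// igb requires the igbinary extension; igb files are smaller and load faster.
$path = ''; // path to the dictionary file
$dict = new VicDict($path);

// Add a word to the dictionary: add(word, part of speech).
// Any language works; the word can be any UTF-8 encoded characters.
$dict->add('中国','n');

// Save the dictionary
$dict->save();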
```
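
For contributions to `dict.txt`, the line format given in the overview (word, idf, part-of-speech index, tab-separated) and the idf formula `Math.log(100000001/number of results)` can be sketched as follows. `idf` and `dictLine` are hypothetical helpers, the English part-of-speech names are translations of the README's list, and rounding the idf to two decimals is an assumption, since the README does not specify a precision.

```php
<?php
// Part-of-speech names in the order the README lists them. The array index
// is what goes into dict.txt (punctuation is 0). Names are translated.
$posNames = ['punctuation', 'noun', 'verb', 'adjective', 'distinguishing word',
    'pronoun', 'numeral', 'measure word', 'adverb', 'preposition',
    'conjunction', 'particle', 'modal particle', 'onomatopoeia', 'interjection'];

/** idf as described in the README: natural log of 100000001 / result count. */
function idf($resultCount)
{
    return log(100000001 / $resultCount);
}

/** Build one dict.txt line: word \t idf \t part-of-speech index. */
function dictLine($word, $resultCount, $posName, array $posNames)
{
    $index = array_search($posName, $posNames, true);
    if ($index === false) {
        throw new InvalidArgumentException("Unknown part of speech: $posName");
    }
    // Rounding to two decimals is an assumption made for this sketch.
    return $word . "\t" . round(idf($resultCount), 2) . "\t" . $index;
}

// A word with 1,000,000 search results, tagged as a noun (index 1).
echo dictLine('中国', 1000000, 'noun', $posNames), PHP_EOL;
```

PHP's `log()` with one argument is the natural logarithm, matching JavaScript's `Math.log`, so rarer words (fewer search results) receive a higher idf.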

demo
----

[demo](http://blogs.vicsdf.com/my/fc)

Other software by the author
----------------------------

- [A minimalist, high-performance PHP framework supporting \[swoole | php-fpm\] environments](https://github.com/lizhichao/one)
- [ClickHouse TCP client](https://github.com/lizhichao/one-ck)

