crodas/text-rank: extracting keywords and summary text with the TextRank algorithm

README

Extract relevant keywords from a given text.

How to use it

To use the class, you must first instantiate a Config object.

<?php

require __DIR__ . "/vendor/autoload.php";

use \crodas\TextRank\Config;
use \crodas\TextRank\TextRank;

$config   = new Config;
$textrank = new TextRank($config);

$keywords = $textrank->getKeywords($some_long_text);

var_dump($keywords);

Results can be improved by supplying some information about the language: a stopword list and, optionally, a stemmer (the stem extension, installed with pecl install stem).

<?php

require __DIR__ . "/vendor/autoload.php";

use \crodas\TextRank\Config;
use \crodas\TextRank\TextRank;
use \crodas\TextRank\Stopword;

$config = new Config;
$config->addListener(new Stopword);

$textrank = new TextRank($config);
$keywords = $textrank->getKeywords($some_long_text);

var_dump($keywords);

With this listener registered, the library detects the language of the text and removes common words (those in the stopword list). If ext-stem is available, the results will be even better.
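To make the idea of stopword removal concrete, here is a minimal sketch in plain PHP. It is not the library's implementation: the removeStopwords() helper and the tiny stopword list are hypothetical, and the real Stopword listener also detects the language first.

```php
<?php

/**
 * Hypothetical helper, not part of crodas/TextRank: split a text into
 * lowercase tokens and drop every token found in the stopword list.
 */
function removeStopwords(string $text, array $stopwords): array
{
    // Split on anything that is not a letter or a digit.
    // strtolower() is enough here because the sample text is ASCII.
    $tokens = preg_split('/[^\p{L}\p{N}]+/u', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);

    // Flip the list so that membership checks are hash lookups.
    $lookup = array_flip($stopwords);

    $kept = [];
    foreach ($tokens as $token) {
        if (!isset($lookup[$token])) {
            $kept[] = $token;
        }
    }

    return $kept;
}

// Tiny illustrative list; a real stopword list is much longer.
$stopwords = ['the', 'of', 'a', 'is', 'and'];
$tokens    = removeStopwords('The rank of a word is the sum of its votes', $stopwords);

print_r($tokens);
```

Only the remaining tokens would then compete as keyword candidates, which is why filtering common words tends to improve the ranking.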

Summarize large texts

The library can also summarize long texts:

$config = new \crodas\TextRank\Config;
$config->addListener(new \crodas\TextRank\Stopword);
$analyzer = new \crodas\TextRank\Summary($config);
$summary = $analyzer->getSummary($text);

$summary contains at most 5% of the sentences in the text.

https://packagist.org/packages/crodas/text-rank

https://github.com/crodas/TextRank

Installation is the important part


composer require crodas/languagedetector

Add an autoload entry to composer.json. The src/lib/ path corresponds to the directory that holds the TextRank algorithm source.

The TextRank source has to be downloaded from GitHub: https://github.com/crodas/TextRank


{
    "require": {
        "crodas/languagedetector": "^0.1.1"
    },
    "autoload": {
        "classmap": ["src/lib/"]
    }
}

PHP session locking and concurrency | the void session_write_close(void) function

The manual describes it as follows:
void session_write_close ( void )

End the current session and store session data.

Session data is usually stored after your script terminated without the need to call session_write_close(), but as session data is locked to prevent concurrent writes only one script may operate on a session at any time. When using framesets together with sessions you will experience the frames loading one by one due to this locking. You can reduce the time needed to load all the frames by ending the session as soon as all changes to session variables are done.

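In other words, sessions are locked. To prevent concurrent writes to the session data, PHP's built-in file-based session storage takes a mutual-exclusion lock when session_start() is called.
When a script calls session_start(), it starts holding the lock.
When the script ends, the session lock is released automatically.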
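The locking behaviour described above can be imitated outside of the session machinery. The following is only a sketch, assuming the file-based handler relies on an flock()-style exclusive lock: a temporary file stands in for the session file, two file handles stand in for two concurrent requests, and LOCK_NB is added purely so the second attempt returns at once instead of blocking as a real request would.

```php
<?php

// Stand-in for a session file; the path is made up for this demo.
$path = tempnam(sys_get_temp_dir(), 'sess_');

// "Request A" opens the file and takes the exclusive lock,
// which is roughly what happens during session_start().
$requestA = fopen($path, 'c+');
$aHasLock = flock($requestA, LOCK_EX);

// "Request B" tries to lock the same file while A still holds it.
// LOCK_NB makes the call fail immediately instead of waiting.
$requestB = fopen($path, 'c+');
$bLockedWhileAHolds = flock($requestB, LOCK_EX | LOCK_NB);

// Releasing A's lock corresponds to session_write_close() or script end.
flock($requestA, LOCK_UN);
$bLockedAfterRelease = flock($requestB, LOCK_EX | LOCK_NB);

var_dump($aHasLock, $bLockedWhileAHolds, $bLockedAfterRelease);

// Clean up the demo file.
flock($requestB, LOCK_UN);
fclose($requestA);
fclose($requestB);
unlink($path);
```

Without LOCK_NB, the second flock() call would simply wait until the first handle released the lock, which is the stall a second request experiences.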

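If the same client sends several requests concurrently (for example, a page firing multiple AJAX requests at once) and the scripts take a long time to run, the requests block on the session file and performance suffers. For each request, PHP acquires an exclusive lock on the file in session_start() and releases it only when that request has finished, so simultaneous requests end up waiting for one another. The solution is as follows:
As soon as the session variables have been modified, call session_write_close() to save the session data and release the file lock.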

session_start();
$_SESSION['test'] = 'test';
session_write_close(); // save the session data and release the lock
// ......
// do something
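A slightly fuller sketch of the same pattern follows. It assumes PHP 7.0 or later, where session_start() accepts the read_and_close option for code that only needs to read the session. The session_save_path() call is there only so the example can also run from the command line; the key name 'test' is taken from the snippet above.

```php
<?php

// Keep session files in the temp directory so the sketch also runs from the CLI.
session_save_path(sys_get_temp_dir());

// 1. Open the session, write, and close straight away:
//    the lock is held only for these few statements.
session_start();
$_SESSION['test'] = 'test';
session_write_close();

// 2. Long-running work would go here. Other requests for the same
//    session are not blocked, because the lock is already released.

// 3. A later read does not need to hold the lock either: read_and_close
//    loads $_SESSION and releases the lock immediately.
session_start(['read_and_close' => true]);
$value  = $_SESSION['test'] ?? null;
$status = session_status();

var_dump($value, $status === PHP_SESSION_NONE);
```

Any change made to $_SESSION after the session has been closed is not persisted, so writes that happen late in a request need another short session_start() / session_write_close() pair.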