NLTK Essentials Study Notes (Part 12)

Introduction:

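Building your first NLP application:
Summarization: given an article, passage, or story, automatically generate a summary of its content. Summarization requires understanding not only sentence structure but the structure of the whole text, as well as its genre and main topic.
The following walks through building a personal version of Google News.
Sentences that contain more entities and nouns usually tend to be more important. The task is to compute an importance score using some uniform logic that can be normalized; to retrieve the top n sentences, we then choose a threshold on that score.
Because I could not find the news material used in the original book, I use a passage from a wiki introducing the character Saber instead: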

# Read the whole article into a single string; the file is closed automatically
with open('new.txt', 'r') as f:
    new_content = f.read()
print(new_content)

Output:

Saber's full name is Altria Pendragon, a character inspired by the legends of King Arthur. At her nativity, Uther decides to not publicly 
announce Altria's birth or gender, fearing his subjects will never accept a woman as a legitimate ruler. She is entrusted by Merlin to a loyal knight, Sir 
Ector, who raises her as a surrogate son. When Altria is fifteen, King Uther dies leaving no known eligible heir to the throne. Britain enters a period of 
turmoil following the growing threat of invasion by the Saxons. Merlin soon approaches Altria, explaining that the British people will recognize her as a 
destined ruler if she withdraws Caliburn, a ceremonial sword embedded in a large slab of stone. However, pulling this sword is symbolic of accepting the 
hardships of a monarch, and Altria will be responsible for preserving the welfare of her people. Without hesitation and despite her gender, she draws Caliburn 
and shoulders Britain's mantle of leadership.
Altria rules Britain from her stronghold in Camelot and earns the reputation of a just, yet distant king. Under the guidance of Merlin and with the aid of her 
Knights of the Round Table, she guides Britain into an era of prosperity and tranquillity. Caliburn is destroyed, but Altria soon acquires her holy sword, 
Excalibur, and Avalon, Excalibur's blessed sheath, from Vivian, the Lady of the Lake. While Avalon is in her possession, Altria never ages and is immortal in battle.
Despite her immense strength and fighting abilities, Altria is plagued by feelings of guilt and inferiority throughout her reign; she sacrifices her emotions for 
the good of Britain, yet many of her subjects and knights become critical of her lack of humanity and cold calculation. Excalibur's scabbard is stolen while she 
repels an assault along her country's borders; when Altria returns inland, she discovers Britain is being torn asunder by civil unrest. Despite her valiant efforts 
to placate the dissent, Altria is mortally wounded by a traitorous knight, a homunculus born of her blood named Mordred, during the Battle of Camlann. Her dying body 
is escorted to a holy isle by Morgan le Fay and Sir Bedivere. Altria orders a grieving Bedivere to dispose of Excalibur by throwing it back to Vivian; in her absence, 
she reflects on her personal failures, regretting her life as king. Before her last breath, she appeals to the world; in exchange for services as a Heroic Spirit, she 
asks to be given an opportunity to relive her life, where someone more suitable and effective would lead Britain in her stead.

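To analyze the text, first convert the article into a list of sentences. A sentence tokenizer splits the content into sentences, and each sentence is given a number so that it can be identified and ranked. Each sentence is then run through the word tokenizer, and finally through the POS tagger and the NER tagger.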

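import nltk

# Read the article (same file as above)
with open('new.txt', 'r') as f:
    new_content = f.read()

results = []
for sent_no, sentence in enumerate(nltk.sent_tokenize(new_content)):
    # Tokenize and POS-tag once, then reuse the result for NER chunking
    tokens = nltk.word_tokenize(sentence)
    no_of_tokens = len(tokens)
    tagged = nltk.pos_tag(tokens)
    # Count singular common nouns (NN) and singular proper nouns (NNP)
    no_of_nouns = len([word for word, pos in tagged if pos in ["NN", "NNP"]])
    # Named-entity chunks are subtrees with a label() method; plain tokens are tuples
    ners = nltk.ne_chunk(tagged)
    no_of_ners = len([chunk for chunk in ners if hasattr(chunk, 'label')])
    # Importance score: (entities + nouns) per token
    score = (no_of_ners + no_of_nouns) / float(no_of_tokens)
    results.append((sent_no, no_of_tokens, no_of_ners, no_of_nouns, score, sentence))

# Print the sentences from highest to lowest score
for sent in sorted(results, key=lambda x: x[4], reverse=True):
    print(sent[5])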

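The code above iterates over the list of sentences and computes a score for each one. The formula is just a fraction: the number of named entities plus the number of nouns in the numerator, and the total number of tokens in the denominator. Each result is stored as a tuple. Printed in descending order of score, the output is: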

Caliburn is destroyed, but Altria soon acquires her holy sword, 
Excalibur, and Avalon, Excalibur's blessed sheath, from Vivian, the Lady of the Lake.
Her dying body 
is escorted to a holy isle by Morgan le Fay and Sir Bedivere.
Saber's full name is Altria Pendragon, a character inspired by the legends of King Arthur.
Britain enters a period of 
turmoil following the growing threat of invasion by the Saxons.
Altria rules Britain from her stronghold in Camelot and earns the reputation of a just, yet distant king.
Without hesitation and despite her gender, she draws Caliburn 
and shoulders Britain's mantle of leadership.
Under the guidance of Merlin and with the aid of her 
Knights of the Round Table, she guides Britain into an era of prosperity and tranquillity.
While Avalon is in her possession, Altria never ages and is immortal in battle.
Excalibur's scabbard is stolen while she 
repels an assault along her country's borders; when Altria returns inland, she discovers Britain is being torn asunder by civil unrest.
Despite her valiant efforts 
to placate the dissent, Altria is mortally wounded by a traitorous knight, a homunculus born of her blood named Mordred, during the Battle of Camlann.
When Altria is fifteen, King Uther dies leaving no known eligible heir to the throne.
She is entrusted by Merlin to a loyal knight, Sir 
Ector, who raises her as a surrogate son.
Merlin soon approaches Altria, explaining that the British people will recognize her as a 
destined ruler if she withdraws Caliburn, a ceremonial sword embedded in a large slab of stone.
Altria orders a grieving Bedivere to dispose of Excalibur by throwing it back to Vivian; in her absence, 
she reflects on her personal failures, regretting her life as king.
At her nativity, Uther decides to not publicly 
announce Altria's birth or gender, fearing his subjects will never accept a woman as a legitimate ruler.
Before her last breath, she appeals to the world; in exchange for services as a Heroic Spirit, she 
asks to be given an opportunity to relive her life, where someone more suitable and effective would lead Britain in her stead.
Despite her immense strength and fighting abilities, Altria is plagued by feelings of guilt and inferiority throughout her reign; she sacrifices her emotions for 
the good of Britain, yet many of her subjects and knights become critical of her lack of humanity and cold calculation.
However, pulling this sword is symbolic of accepting the 
hardships of a monarch, and Altria will be responsible for preserving the welfare of her people.
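
As a small illustration of the scoring formula used in the loop, here is a sketch that isolates it in a function. The name `importance_score` and the guard for an empty sentence are my own additions, and the counts in the example call are made up rather than taken from the article.

```python
def importance_score(no_of_ners, no_of_nouns, no_of_tokens):
    """Return (entities + nouns) / tokens, or 0.0 for an empty sentence."""
    if no_of_tokens == 0:
        # Assumption: an empty sentence carries no information
        return 0.0
    return (no_of_ners + no_of_nouns) / float(no_of_tokens)


# Hypothetical counts: 2 entities and 3 nouns in a 10-token sentence
example_score = importance_score(2, 3, 10)
print(example_score)
```

Because the denominator is the token count, a short sentence packed with names can outrank a long sentence that mentions more entities in absolute terms, which matches the ordering seen in the output.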

With the sentences ranked and the no_of_nouns and no_of_ners values collected for every sentence, you can build more sophisticated rules on top of them.
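
One possible rule of this kind, sketched under my own assumptions: give named entities more weight than ordinary nouns, keep the n best sentences, and print them in their original order so that the summary reads naturally. The function names `weighted_score` and `summarize`, the weight of 2.0, and the toy data are all hypothetical; only the tuple layout (sent_no, no_of_tokens, no_of_ners, no_of_nouns, score, sentence) comes from the code above.

```python
def weighted_score(no_of_tokens, no_of_ners, no_of_nouns, ner_weight=2.0):
    """Score a sentence, counting each named entity ner_weight times."""
    if no_of_tokens == 0:
        return 0.0
    return (ner_weight * no_of_ners + no_of_nouns) / float(no_of_tokens)


def summarize(results, n=3, ner_weight=2.0):
    """Return the n best-scoring sentences, restored to document order."""
    rescored = [
        (weighted_score(tokens, ners, nouns, ner_weight), sent_no, sentence)
        for sent_no, tokens, ners, nouns, _score, sentence in results
    ]
    # Keep the n highest scores, then sort the survivors by sentence number
    top = sorted(rescored, key=lambda item: item[0], reverse=True)[:n]
    return [sentence for _, _, sentence in sorted(top, key=lambda item: item[1])]


# Toy data in the same tuple layout as `results`; the numbers are made up
toy_results = [
    (0, 10, 2, 3, 0.5, "first sentence"),
    (1, 8, 0, 2, 0.25, "second sentence"),
    (2, 5, 1, 1, 0.4, "third sentence"),
    (3, 4, 0, 0, 0.0, "fourth sentence"),
]
for line in summarize(toy_results, n=2):
    print(line)
```

Choosing a fixed n is the simplest option; filtering on a score threshold instead would let the summary length vary with the article.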