A Rule-based Chinese Base Chunk Parser


	A Rule-based Chinese Base Chunk Parser (基于规则的汉语基本块自动分析器)
	Zhou Qiang (周强)
	2010-07-15
Conference name	Chinese Computing Technologies and Related Linguistic Issues--Proceedings of the 7th International Conference on Chinese Computing ; The 7th International Conference on Chinese Computing ; Wuhan, Hubei, China ; CNKI ; Chinese Information Processing Society of China, Chinese and Oriental Languages Information Processing Society (Singapore), Center for Language and Information Research, Wuhan University
Keywords	Base Chunk ; Partial Parsing ; Rule-driven ; Disambiguation ; H087
Alternative title	A Rule-based Chinese Base Chunk Parser
Abstract	This paper proposes a rule-driven method for automatically parsing Chinese base chunks. Its main parsing resource is a hierarchical, multi-granularity base chunk rule base that is learned automatically from the interaction between a large-scale annotated corpus and a lexical association knowledge base, and that combines knowledge about a chunk's internal lexical associations with its external contextual constraints. Each rule carries a confidence score that estimates how reliably a particular multiword chunk can be built under this knowledge. These confidence scores drive the recognition of multiword base chunks in real Chinese sentences and resolve ambiguous structures at the same time. Preliminary experiments show that the current parser reaches an overall F-measure of about 90% on more than 95% of the open test corpus, while retaining about 5% of the text as complex ambiguous regions that are hard to decide with the current knowledge base. These regions are left for a subsequent, more complex parser, which can select suitable chunks among them using new knowledge and larger contexts. The results indicate good processing flexibility and effectiveness.
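Proceedings publisher	Publishing House of Electronics Industry (电子工业出版社)
Language	Chinese
Content type	Conference paper
Source URL	[http://hdl.handle.net/123456789/69770]
Collection	Tsinghua University
Recommended citation (GB/T 7714)	Zhou Qiang (周强). A Rule-based Chinese Base Chunk Parser[C]. In: Chinese Computing Technologies and Related Linguistic Issues--Proceedings of the 7th International Conference on Chinese Computing, The 7th International Conference on Chinese Computing, Wuhan, Hubei, China, CNKI, Chinese Information Processing Society of China, Chinese and Oriental Languages Information Processing Society (Singapore), Center for Language and Information Research, Wuhan University.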
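
The abstract describes chunk recognition driven by rule confidence scores, with hard-to-decide ambiguous regions kept for a later analysis stage. The Python sketch below illustrates one way such a selection step could work. It is not the paper's algorithm: the `Candidate` structure, the greedy strategy, the `margin` threshold, and all sample values are assumptions introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate chunk proposed by a matched rule (hypothetical structure)."""
    start: int         # index of the first word (inclusive)
    end: int           # index one past the last word (exclusive)
    label: str         # chunk tag, e.g. "np" or "vp" (illustrative tags)
    confidence: float  # confidence score of the rule that proposed the chunk


def overlaps(a, b):
    """True if the two word spans share at least one word."""
    return a.start < b.end and b.start < a.end


def select_chunks(candidates, margin=0.2):
    """Greedy, confidence-driven selection (an assumed strategy).

    Returns (accepted, ambiguous). Accepted chunks never overlap each other.
    Each entry of `ambiguous` is a group of overlapping candidates whose
    confidence scores are too close to decide, left for a later stage.
    """
    accepted, ambiguous, deferred = [], [], []
    for cand in sorted(candidates, key=lambda c: -c.confidence):
        blocked = accepted + deferred
        if any(overlaps(cand, b) for b in blocked):
            continue  # span already decided or already marked ambiguous
        # Rivals: still-open overlapping candidates with a close score.
        rivals = [
            c for c in candidates
            if c is not cand
            and overlaps(cand, c)
            and not any(overlaps(c, b) for b in blocked)
            and cand.confidence - c.confidence < margin
        ]
        if rivals:
            group = [cand] + rivals
            ambiguous.append(group)
            deferred.extend(group)
        else:
            accepted.append(cand)
    return accepted, ambiguous


# Invented sample: four candidates over a six-word sentence.
sample = [
    Candidate(0, 2, "np", 0.95),
    Candidate(1, 3, "vp", 0.40),
    Candidate(3, 5, "np", 0.70),
    Candidate(4, 6, "np", 0.65),
]
accepted, ambiguous = select_chunks(sample)
print("accepted:", accepted)
print("ambiguous:", ambiguous)
```

In this sample the first candidate clearly outscores its overlapping rival and is accepted, while the last two candidates overlap with nearly equal scores and are returned together as one ambiguous group rather than being forced into a decision.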
