Question details
Translate into English:
中文分词是中文信息处理的基础.在自然语言理解、语言文字研究、中文文本自动标引、信息检索、机器翻译等领域中,中文分词具有不可替代的作用.因此,中文分词的研究至关重要.
但是,中文分词的研究水平已经远落后于与它关联的相关技术,成为制约其它技术发展的瓶颈.中文分词的研究过程中遇到了以下问题:语言学方面的困难,新词的不断出现,歧义的判别,分词的标准不统一等;计算机方面的困难,没有合理的自然语言形式模型,没有有效方式对语义进行理解以及形式化等.这些问题将会制约着中文分词的发展.
本文在综合分析现有的中文分词研究成果,重点对基于图的中文分词进行研究,提出了基于S-EK图最短路径的中文分词.研究的主要内容如下:
1.对中文分词的主要的算法进行了研究,比较和分析了常用的三种分词算法:基于字符串匹配的分词算法,基于统计的分词算法和基于知识理解的分词算法,并对它们之间的优缺点进行了总结.最后文章还给出了中文分词的评测标准及其意义.
2.重点在有向图和中文分词结合方面进行了深入研究,对N-最短路径中文分词的算法中的有向图进行改进,提出了S-EK图,并采用N-元统计模型计算出一个词在一定的语境下的概率,并对该值做了平滑处理,把最后的结果作为S-EK图的边的权值.
3.基于S-EK图的优点提出了S-EK最短路径算法.该算法在与N-最短路径算法和Dijkstra算法进行对比,实验和理论推导均证明该算法有一定的优点和价值.
关键词:中文分词;信息处理;S-EK图;最短路径;统计模型
Best answer
Answer and explanation
Chinese word segmentation is the foundation of Chinese information processing. It plays an irreplaceable role in natural language understanding, linguistic research, automatic indexing of Chinese text, information retrieval, machine translation, and other fields. Research on Chinese word segmentation is therefore of great importance.
However, research on Chinese word segmentation has fallen far behind the technologies that depend on it and has become a bottleneck restricting their development. Research in this area faces the following problems: on the linguistic side, the continual emergence of new words, the resolution of ambiguity, and the lack of a unified segmentation standard; on the computational side, the absence of a reasonable formal model of natural language and of an effective way to understand and formalize semantics. These problems will constrain the development of Chinese word segmentation.
Building on a comprehensive analysis of existing research on Chinese word segmentation, this paper focuses on graph-based Chinese word segmentation and proposes a segmentation method based on the shortest path in an S-EK graph. The main contents of the research are as follows:
1. The main algorithms for Chinese word segmentation are studied. Three commonly used classes of segmentation algorithms are compared and analyzed: algorithms based on string matching, algorithms based on statistics, and algorithms based on knowledge and understanding. Their respective advantages and disadvantages are summarized. Finally, the paper presents the evaluation criteria for Chinese word segmentation and their significance.
2. The combination of directed graphs and Chinese word segmentation is studied in depth. The directed graph used in the N-shortest-path segmentation algorithm is improved, and the S-EK graph is proposed. An N-gram statistical model is used to compute the probability of a word in a given context; this value is smoothed, and the result is used as the weight of the corresponding edge in the S-EK graph.
3. Based on the advantages of the S-EK graph, the S-EK shortest-path algorithm is proposed. The algorithm is compared with the N-shortest-path algorithm and Dijkstra's algorithm; both experiments and theoretical derivation show that it has certain advantages and value.
Keywords: Chinese word segmentation; information processing; S-EK graph; shortest path; statistical model
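For readers unfamiliar with the idea behind points 2 and 3, here is a minimal sketch of segmentation as a shortest-path problem on a word graph. It is not the S-EK graph or the S-EK algorithm, whose construction the abstract does not describe. The assumptions are mine: a hypothetical toy dictionary `COUNTS`, a unigram model (an N-gram model with N = 1) with add-one smoothing, edge weights equal to the negative log probability, and plain dynamic programming over character positions in place of Dijkstra's algorithm.

```python
import math

# Hypothetical toy word counts; a real system would estimate these from a corpus.
COUNTS = {"研究": 5, "研究生": 2, "生命": 4, "命": 1, "的": 10, "起源": 3}
TOTAL = sum(COUNTS.values())
VOCAB = len(COUNTS)
MAX_WORD_LEN = max(len(w) for w in COUNTS)


def edge_weight(word):
    """Negative log of an add-one smoothed unigram probability (an assumption)."""
    # The extra +1 in the denominator reserves mass for unseen words.
    prob = (COUNTS.get(word, 0) + 1) / (TOTAL + VOCAB + 1)
    return -math.log(prob)


def segment(text):
    """Return the minimum-weight segmentation of text.

    Nodes are character positions 0..len(text). An edge i -> j exists when
    text[i:j] is a dictionary word, or a single character (fallback).
    """
    n = len(text)
    best = [math.inf] * (n + 1)  # best[j]: cheapest path weight from node 0 to j
    back = [0] * (n + 1)         # back[j]: start node of the last edge on that path
    best[0] = 0.0
    # Positions are already in topological order, so one left-to-right pass suffices.
    for i in range(n):
        for j in range(i + 1, min(n, i + MAX_WORD_LEN) + 1):
            word = text[i:j]
            if j - i > 1 and word not in COUNTS:
                continue  # multi-character edges must be dictionary words
            cost = best[i] + edge_weight(word)
            if cost < best[j]:
                best[j] = cost
                back[j] = i
    # Walk the back-pointers from the end node to recover the words.
    words = []
    j = n
    while j > 0:
        words.append(text[back[j]:j])
        j = back[j]
    return words[::-1]


print(segment("研究生命的起源"))
```

With these toy counts, the path through 研究 and 生命 has a lower total weight than the path through 研究生 and 命, so the ambiguous prefix is split after the second character. Because the weights are negative log probabilities, the shortest path is also the most probable segmentation under the model.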