[Exam] 104-2 Hsin-Hsi Chen, Natural Language Processing, Midterm Exam

Poster: madeformylov (睡觉治百病)   2016-07-04 19:05:03
Course name: Natural Language Processing
Course type: Departmental elective
Instructor: Hsin-Hsi Chen
College: College of Electrical Engineering and Computer Science
Department: Department of Computer Science and Information Engineering
Exam date (Y/M/D): 2016/04/21
Time limit (minutes): 180 mins
Questions:
01. Machine translation (MT) is one of the practical NLP applications. The
development of MT systems has a long history, but there is still room for
improvement. Please address two linguistic phenomena to explain why MT is
challenging. (10pts)
02. An NLP system can be implemented as a pipeline, including modules for
morphological processing, syntactic analysis, semantic interpretation,
and context analysis. Please use the following news story to describe
the concepts behind these modules. You are asked to mention one task in each
module. (10pts)
这场地震可能影响日相安倍晋三的施政计画。安倍十八日说,消费税调涨的
计画不会改变。
(This earthquake may affect Japanese Prime Minister Shinzo Abe's policy
agenda. Abe said on the 18th that the plan to raise the consumption tax
will not change.)
03. Ambiguity is inherent in natural language. Please describe why ambiguity
may happen in each of the following cases. (10pts)
(a) Prepositional phrase attachment.
(b) Noun-noun compound.
(c) Word: bass
04. Why is the extraction of multiword expressions critical for NLP
applications? Please propose a method to check whether an extracted multiword
expression meets the non-compositionality criterion. (10pts)
05. Mutual information and likelihood ratio are commonly used to find
collocations in a corpus. Please describe the ideas behind these two methods.
(10pts)
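(Not part of the exam; an illustrative sketch of the first measure. Pointwise mutual information scores a bigram by how much more often the two words co-occur than independence would predict. The toy corpus and all counts below are made up for illustration.)

```python
# Pointwise mutual information (PMI) for bigram collocation scoring on a
# toy corpus: PMI(w1, w2) = log2( P(w1, w2) / (P(w1) * P(w2)) ).
import math
from collections import Counter

corpus = ("new york is a big city . new york city has a new mayor . "
          "the city is big").split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
N = len(corpus)

def pmi(w1, w2):
    """High PMI suggests the pair is a collocation, not a chance pairing."""
    p_joint = bigrams[(w1, w2)] / (N - 1)          # bigram relative frequency
    p_indep = (unigrams[w1] / N) * (unigrams[w2] / N)
    return math.log2(p_joint / p_indep)

# "new york" co-occurs far more than chance, so its PMI is high.
print(round(pmi("new", "york"), 2))
```

A known weakness of PMI, which the likelihood-ratio test addresses, is that it overrates very rare pairs; the likelihood ratio instead compares how well "w2 is independent of w1" versus "w2 depends on w1" explains the observed counts.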
06. Emoticons are commonly used in social media. They can be regarded as a
special vocabulary in a language. Emoticon understanding is helpful for
understanding the utterances in an interaction. Please propose an "emoticon"
embedding approach to represent each emoticon as a vector, and find the
5 most relevant words for each emoticon. (10pts)
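(Not part of the exam; one possible sketch of the second half of the task. Assuming emoticons are trained as ordinary tokens in the same embedding space as words, e.g. by a skip-gram model over social-media text, the most relevant words are the nearest neighbors by cosine similarity. The vectors below are toy values, not trained embeddings.)

```python
# Rank words by cosine similarity to an emoticon's vector.
import math

embeddings = {          # hypothetical shared word/emoticon vector space
    ":)":    [0.9, 0.1, 0.0],
    "happy": [0.8, 0.2, 0.1],
    "sad":   [-0.7, 0.1, 0.2],
    "great": [0.7, 0.3, 0.0],
    "awful": [-0.8, 0.0, 0.1],
    "nice":  [0.6, 0.4, 0.1],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def most_relevant(emoticon, k=5):
    """The k words closest to the emoticon vector by cosine similarity."""
    words = [w for w in embeddings if w != emoticon]
    return sorted(words,
                  key=lambda w: cosine(embeddings[emoticon], embeddings[w]),
                  reverse=True)[:k]

print(most_relevant(":)"))
```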
07. To deal with unseen n-grams, smoothing techniques are adopted in the
conventional language modeling approach. They are applied to n-grams to
reallocate probability mass from observed n-grams to unobserved n-grams,
producing better estimates for unseen data. Please show a smoothing
technique for the conventional language model, and discuss why a neural
network language model (NNLM) can achieve better generalization for unseen
n-grams. (10pts)
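(Not part of the exam; a minimal sketch of one such technique, add-one (Laplace) smoothing for a bigram model, on a toy corpus. Adding one to every count and the vocabulary size V to every denominator gives unseen bigrams non-zero probability.)

```python
# Add-one (Laplace) smoothed bigram probabilities:
#   P(w2 | w1) = (c(w1, w2) + 1) / (c(w1) + V)
from collections import Counter

corpus = "the cat sat on the mat the cat ate".split()
V = len(set(corpus))                       # vocabulary size
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))

def p_laplace(w1, w2):
    return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + V)

print(p_laplace("the", "cat"))   # seen bigram
print(p_laplace("cat", "mat"))   # unseen bigram still gets mass > 0
```

An NNLM, by contrast, does not need this count surgery: it maps words to dense vectors, so an unseen n-gram composed of words similar to seen ones already receives a sensible probability.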
08. In HMM learning, we aim at inferring the best model parameters, given a
skeletal model and an observation sequence. The following two equations
are related to computing the state transition probabilities.
\hat{a}_{ij} = Σ_{t=1}^{T-1} ξ_t(i, j) / Σ_{t=1}^{T-1} Σ_{k=1}^{N} ξ_t(i, k)
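(Not part of the exam; a toy sketch of this Baum-Welch re-estimation step. Given xi[t][i][j], the probability of being in state i at time t and state j at t+1 as computed from the forward/backward passes, the transition probability a_ij is re-estimated as expected transitions i→j over expected transitions out of i. The xi values below are made-up numbers, not outputs of a real model.)

```python
# Re-estimate a_hat[i][j] = sum_t xi[t][i][j] / sum_t sum_k xi[t][i][k].

# xi for T = 3 observations (so t = 0, 1) over N = 2 states
xi = [
    [[0.5, 0.1], [0.2, 0.2]],   # t = 0
    [[0.4, 0.2], [0.1, 0.3]],   # t = 1
]
N = 2

def reestimate_a(xi, N):
    a_hat = [[0.0] * N for _ in range(N)]
    for i in range(N):
        # expected number of transitions out of state i
        denom = sum(xi[t][i][k] for t in range(len(xi)) for k in range(N))
        for j in range(N):
            # expected number of transitions from i to j
            a_hat[i][j] = sum(xi[t][i][j] for t in range(len(xi))) / denom
    return a_hat

a = reestimate_a(xi, N)
print(a)   # each row is a proper distribution (sums to 1)
```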
