Posted on December 17, 2023 (author: 开源软件下载平台)
A Study of English Noun Syntax Based on a Dependency Treebank
Signature of thesis author:
Signature of supervisor:
Thesis reviewers 1-5:
Chairperson of the examining committee:
Committee members 1-5:
Date of oral defence: 2011
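Zhejiang University Graduate Thesis Declaration of Originality

I declare that the thesis submitted here is the result of research I carried out under the guidance of my supervisor. Except where specifically marked and acknowledged in the text, the thesis contains no research results published or written by others, nor any material used to obtain a degree or certificate from Zhejiang University or any other educational institution. Every contribution made to this research by colleagues who worked with me has been clearly stated and acknowledged in the thesis.

Signature of thesis author:        Date of signature: June 7, 2011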
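Thesis Copyright Authorization

The author of this thesis fully understands that Zhejiang University has the right to retain copies and disks of this thesis and to submit them to the relevant national departments or institutions, and permits the thesis to be consulted and borrowed. I authorize Zhejiang University to include all or part of the thesis in relevant databases for retrieval and dissemination, and to preserve and compile the thesis by photocopying, reduced-size printing, scanning, or other means of reproduction.

(A confidential thesis is subject to this authorization once it has been declassified.)

Signature of thesis author:        Signature of supervisor:
Dates of signature: June 7, 2011 and June 8, 2011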
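摘要 (Chinese Abstract)

Syntactic annotation of raw corpora has become a mainstream trend in corpus linguistics. A treebank, that is, a corpus annotated with part-of-speech and syntactic information, serves both as a knowledge source for acquiring syntactic structures and as a tool for evaluating parsing results, and it has attracted strong interest and wide attention from scholars in theoretical and computational linguistics. The large amount of distributional information that a treebank holds on the syntactic functions of parts of speech also provides solid factual evidence for theoretical research on the grammatical functions of parts of speech.

Building on Probabilistic Valency Pattern theory, this study uses an English dependency treebank to analyze the dependency relations of English nouns quantitatively. It works through four research questions in turn to describe the collocations and syntactic functions of English nouns and to identify the similarities and differences between this study and other research on Chinese and English.

The corpus is drawn from New Concept English and the Oxford senior high school English textbooks. The dependency treebank consists of 6200 English sentences annotated with dependency grammar, recording for each word its part of speech, its governor, its dependents, and the corresponding dependency relations. Annotation combined automatic parsing by the Stanford parser with manual proofreading. All data were entered automatically into Excel spreadsheets, and Excel was then used to compute statistics on the syntactic functions of nouns in two cases: the noun as governor and the noun as dependent. Finally, the syntactic functions of nouns were classified as major, minor, or rare according to how frequently they occur.

The results show that the major syntactic functions of nouns largely confirm earlier theory, while placing much greater emphasis on the distribution and use of major and minor functions and adding some rare ones. The thesis closes with a summary and several suggested additions to the English dependency annotation scheme in current international use.

Keywords: corpus linguistics, syntactic annotation, dependency grammar, treebank, Probabilistic Valency Pattern, Stanford parser, English nouns, quantitative analysis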
Abstract

Syntactic annotation of raw corpora has become a dominant trend in corpus linguistics. As a source of syntactic structures and a basis for evaluating syntactic parsing, treebanks (syntactically annotated corpora) have been given great importance by scholars in both theoretical and computational linguistics. The large amount of distributional information they hold on the syntactic functions of parts of speech can contribute substantially to theoretical linguistic research.

On the basis of Probabilistic Valency Pattern (PVP) theory, this research aims to clarify the common collocations and syntactic functions of English nouns and to compare them with earlier research, through a quantitative analysis of an English dependency treebank and an interpretation of each of four research questions.

The raw material, more than 6200 English sentences, is extracted from the texts of New Concept English and Advance with English. The English dependency treebank contains part-of-speech tags and syntactic structure tags for the sentences, including the governor and dependents of each word and the corresponding dependency relations. The treebank was annotated in two phases: automatic parsing with the Stanford parser, followed by manual verification of the parser output. All data were generated automatically in Excel format, and the syntactic functions were analyzed quantitatively in Excel under two conditions, with English nouns acting as the governor and as the dependent. From this work, the typical and atypical syntactic functions of English nouns are determined according to their frequency.

Keywords: corpus linguistics, syntactic annotation, dependency grammar, treebank, PVP theory, Stanford parser, English nouns, quantitative analysis
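
The counting procedure described in the abstract (tallying dependency relations with the noun as governor and as dependent, then separating frequent from infrequent functions) can be sketched in a few lines. This is an illustrative sketch, not the thesis's Excel workflow: the `Token` record, the 10% cutoff `typical_share`, the Penn Treebank style tags, and the Stanford-style relation labels in the hand-annotated sample are all assumptions made for the example.

```python
from collections import Counter
from typing import NamedTuple


class Token(NamedTuple):
    """One annotated word (hypothetical record layout)."""
    index: int      # 1-based position in the sentence
    form: str
    pos: str        # part-of-speech tag, e.g. "NN", "NNP", "VBZ"
    head: int       # index of the governor; 0 marks the sentence root
    relation: str   # dependency relation linking this word to its governor


def is_noun(token: Token) -> bool:
    # Assumption: nouns carry tags beginning with "NN" (NN, NNS, NNP, NNPS).
    return token.pos.startswith("NN")


def count_noun_relations(sentences):
    """Tally relations where a noun is the governor and where it is the dependent."""
    as_governor, as_dependent = Counter(), Counter()
    for sentence in sentences:
        by_index = {tok.index: tok for tok in sentence}
        for tok in sentence:
            if is_noun(tok):
                # The noun depends on its governor through tok.relation
                # (a noun that heads the sentence is counted under "root").
                as_dependent[tok.relation] += 1
            governor = by_index.get(tok.head)  # None for the root (head == 0)
            if governor is not None and is_noun(governor):
                # A noun governs this word through tok.relation.
                as_governor[tok.relation] += 1
    return as_governor, as_dependent


def classify(counts, typical_share=0.10):
    """Split relations into typical and atypical by relative frequency.

    The cutoff is a hypothetical value chosen for illustration.
    """
    total = sum(counts.values())
    typical, atypical = [], []
    for relation, n in counts.most_common():
        (typical if n / total >= typical_share else atypical).append(relation)
    return typical, atypical


# "Tom eats an apple", annotated by hand for illustration only.
sample = [[
    Token(1, "Tom", "NNP", 2, "nsubj"),
    Token(2, "eats", "VBZ", 0, "root"),
    Token(3, "an", "DT", 4, "det"),
    Token(4, "apple", "NN", 2, "dobj"),
]]

as_governor, as_dependent = count_noun_relations(sample)
print("noun as governor:", dict(as_governor))
print("noun as dependent:", dict(as_dependent))
```

Keeping the two tallies separate mirrors the two conditions named in the abstract; on a full treebank, `classify` would be applied to each tally in turn.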
Thesis for Master Degree of Zhejiang University
Contents

摘要 (Chinese Abstract)
Abstract
Chapter One Introduction
1.1 Background of Research
1.2 Significance of Research
1.3 Purpose of Research
1.4 Organization of the Thesis
Chapter Two Literature Review
2.1 Dependency Grammar
2.1.1 Phylogeny of Dependency Grammar
2.1.2 Varieties of Dependency Grammar
2.1.2.1 Word Grammar
2.1.2.2 Functional Generative Description
2.1.2.3 Dependency Unification Grammar
2.1.2.4 Meaning-Text Theory
2.1.2.5 Constraint Dependency Grammar
2.2 Corpus Linguistics and Treebank
2.2.1 Corpus Linguistics
2.2.2 Treebank
2.2.3 Comparison of Dependency Structure and Phrase Structure
2.3 PVP Theory
2.3.1 Definition of Valency
2.3.2 PVP Theory and its Related Research
Chapter Three Research Methodology
3.1 Research Questions
3.2 Materials
3.3 Instruments
3.3.1 Parser
3.3.1.1 Types of Parser
3.3.1.2 Stanford Dependency Parser
3.3.2 Excel
3.4 Research Procedures
3.4.1 Constructing the Corpus
3.4.2 Automatic Part-of-Speech Tagging
3.4.3 Automatic Dependency Relations Tagging
3.4.4 Manual Dependency Relations Checking
3.4.5 Statistical Analysis
Chapter Four Results and Discussion
4.1 Dependency Relations of English Nouns as the Governor
4.1.1 Probability of Dependency Relations Governed by English Nouns
4.1.2 Grammatical Functions of English Nouns as the Governor
4.2 Dependency Relations of English Nouns as the Dependent
4.2.1 Probability of Dependency Relations Governing English Nouns
4.3 Typical Syntactic Functions of English Nouns
4.4 Comparison between English Nouns and Chinese Nouns
Chapter Five Conclusion
5.1 Conclusions
5.2 Limitations and Suggestions
References
Appendix
Appendix I
Appendix II
Appendix III
Publications
Acknowledgements
List of Figures

Figure 2.1 Graphical Representation of the Sentence: Tom eats an apple
Figure 2.2 Graphical Representation of the Sentence: My young brother likes this beautiful girl.
Figure 2.3 Graphical Representation of the Sentence: This beautiful girl fascinates my young brother.
Figure 2.4 Dependency Relationship of Part of Speech
Figure 2.5 Phrase Structure Tree for the Sentence "Economic news had little effect on financial markets"
Figure 2.6 Dependency Structure Tree for the Same Sentence
Figure 2.7 Diagram of PVP Theory
Figure 2.8 Representations of Dependency Relations of Chinese Verb and Noun in PVP
Figure 2.9 Finding of Chinese Nouns as the Governor in Gao Song's Study
Figure 3.1 Screen Shot of the Transcript
Figure 3.2 Auto-Generating Excel File of the Sentence: Mrs. Smith's kitchen is small
Figure 3.3 Process of Parsing a Source String in Computer Language
Figure 3.4 Grammatical Relation Hierarchy
Figure 3.5 Results of Stanford Dependency Parser in Excel Format
Figure 3.6 Statistical Result of Excel
Figure 4.1 PVP Graph of English Nouns as Governor
Figure 4.2 PVP Graph of English Nouns as Dependent
List of Tables

Table 4.1 Dependency Relations Distribution of Nouns as Governor
Table 4.2 Preposition Modifiers Distribution for English Nouns
Table 4.3 Dependency Relations Distribution of Nouns as Dependent
Table 4.4 Similarities and Differences of Previous Study and Present Study on English Nouns
Table 4.5 Similarities and Differences of Chinese and English Nouns as Governor
Table 4.6 Similarities and Differences of Chinese and English Nouns as Dependent
Chapter One Introduction

As the integration of time-honored linguistic research and cutting-edge computer science, natural language processing (NLP) has aroused great enthusiasm among Western and Chinese scholars of various scientific disciplines since the 1950s. Before the 1990s, most NLP systems were based on complex sets of hand-written rules, which met several difficulties with realistic large-scale corpora. A revolution towards corpus linguistics came after COLING 1990 (International Conference on Computational Linguistics) and TMI 1992 (Theoretical and Methodological Issues in Machine Translation). Since then, parsed corpora, commonly called "treebanks", have taken a leading role in empirical linguistics as well as in machine learning methods for natural language processing. In the linguistic domain in particular, the use of treebanks has gained much more importance in syntactic research in recent years (Abeillé, 2003; Hinrichs and …; Liu, 2009). Phrase structure grammar and dependency grammar (also known as valency grammar) are the two essential theoretical tools for syntactically annotating raw corpora.

Phrase structure grammar, the outcome of the dominant Chomskyan theories, has been widely applied in the NLP domain with high recognition rates. Nevertheless, the increasing interest in using dependency-based representation seems to be driven both by "the potential usefulness of bilexical relations in disambiguation and by the gains in efficiency that result from the more constrained parsing problem for these representations" (Nivre, 2005). For these reasons, dependency parsers are expected to be employed for NLP tasks ranging from machine translation to question answering. In the light of this trend, linguistic research based on dependency treebanks remains an open and actively pursued area.
1.1 Background of Research

Compared with phrase structure grammar, research and experiments on dependency grammar were carried out later in English syntactic parsing. Dependency grammar (grammaire de dépendance) and structural syntax were first formally introduced by the French linguist Tesnière. Later work built on this foundation with comprehensive investigation of dependency grammar and developed Meaning-Text Theory (MTT). Hudson (1984, 1990) put forward the notion of Word Grammar. Hellwig (1986, 2003) proposed the definition of Dependency Unification Grammar. Besides these well-known theories of dependency grammar, a great number of scholars have also devoted their efforts to refining and improving the theory (P. …, 1994; Harper and Helzerman, 1995; Järvinen and …; Duchier and Debusmann, 2001; Schröder, 2002; Creswell and Rambow, 2003; Dras, … 2004; Klein, 2004; Nivre, 2005; Eisner, 2005).

A dependency treebank is a text corpus in which every sentence has been annotated with dependency structures. Eisner (1996) first converted the Penn Treebank, which is known for its phrase structure annotation, into dependency representations. Building on the research of Magerman, Collins and …, Nivre (2006) perfected the task of converting the Penn Treebank into dependency structure. With the growing prevalence of dependency treebank construction, treebanks for a variety of languages have been developed in recent years, for instance, the Prague Arabic Dependency Treebank (PADT) and the Quranic Arabic