
A Study of English Noun Syntax Based on a Dependency Treebank

Author's signature:

Supervisor's signature:

Reviewer 1:
Reviewer 2:
Reviewer 3:
Reviewer 4:
Reviewer 5:

Chairperson of the defence committee: [illegible]
Committee member 1: [illegible]
Committee member 2:
Committee member 3:
Committee member 4:
Committee member 5:

Date of oral defence: [illegible]

Zhejiang University Graduate Thesis Declaration of Originality

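I declare that the thesis submitted is the result of research work I carried out under the guidance of my supervisor. Except where specifically marked and acknowledged in the text, the thesis contains no research results previously published or written by others, nor any material used to obtain a degree or certificate from Zhejiang University or any other educational institution. Any contribution made to this research by colleagues who worked with me has been clearly stated and acknowledged in the thesis.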

Signature of the thesis author: [illegible]    Date of signature: 7 June 2011

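Thesis Copyright Authorization

The author of this thesis fully understands that Zhejiang University has the right to retain copies and disks of this thesis and to submit them to the relevant national departments or institutions, and to allow the thesis to be consulted and borrowed. I authorize Zhejiang University to include all or part of the thesis in relevant databases for retrieval and dissemination, and to preserve and compile the thesis by photocopying, reduced-size printing, scanning or other means of reproduction.

A confidential thesis is subject to this authorization after it is declassified.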

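Signature of the thesis author: [illegible]    Date of signature: 7 June 2011

Supervisor's signature: [illegible]    Date of signature: 8 June 2011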

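Abstract (Chinese)

Syntactic annotation of raw corpora has become a mainstream trend in corpus linguistics. Treebanks, that is, corpora annotated with part-of-speech and syntactic information, serve both as a knowledge source for acquiring syntactic structure and as a tool for evaluating parsing results, and they have attracted strong interest and wide attention from theoretical and computational linguists. The large amount of distributional information on the syntactic functions of parts of speech contained in treebanks also provides solid factual evidence for theoretical research on the grammatical functions of parts of speech.

On the basis of Probabilistic Valency Pattern theory, this study uses an English dependency treebank to analyse the dependency relations of English nouns quantitatively. It works through four research questions in turn to describe the collocations and syntactic functions of English nouns, and it identifies the similarities and differences between this study and other findings on Chinese and English.

The corpus is drawn from New Concept English and Oxford senior high school English textbooks. The dependency treebank consists of dependency annotations of 6,200 English sentences, recording for each word its part of speech, governor, dependents, and the corresponding dependency relations. Annotation combined automatic parsing with the Stanford Parser and manual proofreading. All data were entered automatically into Excel spreadsheets. Excel was then used to analyse the syntactic functions of nouns statistically in two cases, nouns as governors and nouns as dependents. Finally, the syntactic functions of nouns were divided into major, minor, and rare according to their frequency of occurrence.

The results show that the major syntactic functions of nouns largely confirm earlier theory. They also bring out the distribution and use of major and minor syntactic functions much more strongly, and they add some rare syntactic functions. Alongside the summary, the study offers some supplementary suggestions for the English dependency annotation scheme currently in international use.

Keywords: corpus linguistics, syntactic annotation, dependency grammar, treebank, Probabilistic Valency Pattern, Stanford Parser, English nouns, quantitative analysis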

Abstract

Nowadays, syntactic annotation of raw corpora has become a dominant tendency in corpus linguistics. As the source for obtaining syntactic structure and the foundation for evaluating syntactic parsing, treebanks, that is, annotated corpora, have been accorded great importance by scholars in theoretical and computational linguistics. The large amount of distributional information on the syntactic functions of parts of speech can make an enormous contribution to theoretical linguistic research.

On the basis of Probabilistic Valency Pattern (PVP) theory, this research aims to clarify the common collocations and syntactic functions of English nouns and to compare them with previous research, via a quantitative analysis of an English dependency treebank and the respective interpretations of four research questions.

The raw materials, over 6,200 English sentences, are extracted from the texts of New Concept English and Advance with English. The English dependency treebank contains part-of-speech tags and syntactic structure tags for the sentences, including the governor and dependent of each word and the corresponding dependency relations. The annotation of the treebank was completed in two phases: automatic parsing by the Stanford Parser and manual verification of the results. All these data were generated automatically in Excel format, and the quantitative report of the syntactic functions was analysed in Excel under two conditions, where English nouns act as the governor and as the dependent respectively. After this series of work, the typical and atypical syntactic functions of English nouns were identified according to their frequency.

Keywords: corpus linguistics, syntactic annotation, dependency grammar, treebank, PVP theory, Stanford Parser, English nouns, quantitative analysis
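The frequency-based procedure summarized in the abstract can be sketched roughly as follows. This is a minimal Python illustration, not the thesis's actual Excel workflow: the row layout, the function names, and the 10% and 1% cut-offs are assumptions introduced here, since the abstract does not state the thresholds it used. The toy rows encode the sentence "Tom eats an apple" with Stanford-style relation labels and Penn Treebank part-of-speech tags, and they are made up for illustration rather than taken from the treebank.

```python
from collections import Counter

# Hypothetical cut-offs: the abstract does not say where the boundaries
# between major, minor and rare functions were drawn.
MAJOR_MIN = 0.10
MINOR_MIN = 0.01


def is_noun(pos):
    """Penn Treebank noun tags all start with 'NN' (NN, NNS, NNP, NNPS)."""
    return pos.startswith("NN")


def relation_distribution(rows, noun_role):
    """Relative frequency of each dependency relation involving a noun.

    rows: iterable of (dependent, dep_pos, relation, governor, gov_pos)
    noun_role: "governor" counts relations whose governor is a noun,
               "dependent" counts relations whose dependent is a noun.
    """
    if noun_role not in ("governor", "dependent"):
        raise ValueError("noun_role must be 'governor' or 'dependent'")
    counts = Counter()
    for _dep, dep_pos, relation, _gov, gov_pos in rows:
        pos = gov_pos if noun_role == "governor" else dep_pos
        if is_noun(pos):
            counts[relation] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {rel: n / total for rel, n in counts.items()}


def classify(distribution):
    """Label each relation as major, minor or rare by relative frequency."""
    labels = {}
    for rel, share in distribution.items():
        if share >= MAJOR_MIN:
            labels[rel] = "major"
        elif share >= MINOR_MIN:
            labels[rel] = "minor"
        else:
            labels[rel] = "rare"
    return labels


# Toy annotation of "Tom eats an apple" (illustrative only, not thesis data).
SAMPLE_ROWS = [
    ("Tom", "NNP", "nsubj", "eats", "VBZ"),
    ("an", "DT", "det", "apple", "NN"),
    ("apple", "NN", "dobj", "eats", "VBZ"),
]

if __name__ == "__main__":
    for role in ("governor", "dependent"):
        dist = relation_distribution(SAMPLE_ROWS, role)
        print(role, dist, classify(dist))
```

Keeping the governor and dependent counts in separate passes mirrors the two conditions described in the abstract. With a real treebank the rows would come from the exported spreadsheet rather than a hard-coded list.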

Thesis for Master's Degree of Zhejiang University

Contents

Abstract (Chinese)
Abstract
Chapter One Introduction
1.1 Background of Research
1.2 Significance of Research
1.3 Purpose of Research
1.4 Organization of the Thesis
Chapter Two Literature Review
2.1 Dependency Grammar
2.1.1 Phylogeny of Dependency Grammar
2.1.2 Varieties of Dependency Grammar
2.1.2.1 Word Grammar
2.1.2.2 Functional Generative Description
2.1.2.3 Dependency Unification Grammar
2.1.2.4 Meaning-Text Theory
2.1.2.5 Constraint Dependency Grammar
2.2 Corpus Linguistics and Treebank
2.2.1 Corpus Linguistics
2.2.2 Treebank
2.2.3 Comparison of Dependency Structure and Phrase Structure
2.3 PVP Theory
2.3.1 Definition of Valency
2.3.2 PVP Theory and its Related Research
Chapter Three Research Methodology
3.1 Research Questions
3.2 Materials
3.3 Instruments
3.3.1 Parser
3.3.1.1 Types of Parser
3.3.1.2 Stanford Dependency Parser
3.3.2 Excel
3.4 Research Procedures
3.4.1 Constructing the Corpus
3.4.2 Automatic Part-of-Speech Tagging
3.4.3 Automatic Dependency Relations Tagging
3.4.4 Manual Dependency Relations Checking
3.4.5 Statistical Analysis
Chapter Four Results and Discussion
4.1 Dependency Relations of English Nouns as the Governor
4.1.1 Probability of Dependency Relations Governed by English Nouns
4.1.2 Grammatical Functions of English Nouns as the Governor
4.2 Dependency Relations of English Nouns as the Dependent
4.2.1 Probability of Dependency Relations Governing English Nouns
4.3 Typical Syntactic Functions of English Nouns
4.4 Comparison between English Nouns and Chinese Nouns
Chapter Five Conclusion
5.1 Conclusions
5.2 Limitations and Suggestions
References
Appendix
Appendix I
Appendix II
Appendix III
Publications
Acknowledgements


List of Figures

Figure 2.1 Graphical Representation of the Sentence: Tom eats an apple
Figure 2.2 Graphical Representation of the Sentence: My young brother likes this beautiful girl.
Figure 2.3 Graphical Representation of the Sentence: This beautiful girl fascinates my young brother.
Figure 2.4 Dependency Relationship of Part of Speech
Figure 2.5 Phrase Structure Tree for the Sentence "Economic news had little effect on financial markets"
Figure 2.6 Dependency Structure Tree for the Same Sentence
Figure 2.7 Diagram of PVP Theory
Figure 2.8 Representations of Dependency Relations of Chinese Verb and Noun in PVP
Figure 2.9 Finding of Chinese Nouns as the Governor in Gao Song's Study
Figure 3.1 Screen Shot of the Transcript
Figure 3.2 Auto-Generating Excel File of the Sentence: Mrs. Smith's kitchen is small
Figure 3.3 Process of Parsing a Source String (Computer Language)
Figure 3.4 Grammatical Relation Hierarchy
Figure 3.5 Results of Stanford Dependency Parser in Excel Format
Figure 3.6 Statistical Result of Excel
Figure 4.1 PVP Graph of English Nouns as Governor
Figure 4.2 PVP Graph of English Nouns as Dependent

List of Tables

Table 4.1 Dependency Relations Distribution of Nouns as Governor
Table 4.2 Preposition Modifiers Distribution for English Nouns
Table 4.3 Dependency Relations Distribution of Nouns as Dependent
Table 4.4 Similarities and Differences of Previous Study and Present Study on English Nouns
Table 4.5 Similarities and Differences of Chinese and English Nouns as Governor
Table 4.6 Similarities and Differences of Chinese and English Nouns as Dependent

Chapter One Introduction

As the integration of time-honored linguistic research and cutting-edge computer science, natural language processing (NLP) has aroused great enthusiasm among Western and Chinese scholars of various scientific disciplines since the 1950s. Before the 1990s, most NLP systems were based on complex sets of hand-written rules, which met several difficulties on large-scale realistic corpora. A revolution towards corpus linguistics came about after COLING 1990 (International Conference on Computational Linguistics) and the 1992 TMI (Theoretical and Methodological Issues in Machine Translation). Since then, parsed corpora, commonly called "treebanks", have taken a leading role in empirical linguistics as well as in machine learning methods in natural language processing. In the linguistic domain in particular, the use of treebanks has been given much more importance in syntactic research in recent years (Abeillé, 2003; Hinrichs and Liu, 2009). Phrase structure grammar and dependency grammar (also known as valency grammar) are the two essential theoretical tools for syntactically annotating raw corpora.

Phrase structure grammar, the outcome of the dominant Chomskyan theories, has been widely applied in the NLP domain with a high rate of recognition. Even so, the increasing interest in using dependency-based representations seems to be driven both by "the potential usefulness of bilexical relations in disambiguation and by the gains in efficiency that result from the more constrained parsing problem" for these representations (Nivre, 2005). As a consequence, dependency parsers are expected to be employed for NLP tasks ranging from machine translation to question answering. In the light of this trend, linguistic research based on dependency treebanks remains a promising gap to fill.

1.1 Background of Research

Compared with phrase structure grammar, research and experiments on dependency grammar in English syntactic parsing were carried out later. Dependency grammar (grammaire de dépendance) and structural syntax were first formally introduced by the French linguist [...]. [...] carried out a comprehensive investigation of dependency grammar and developed Meaning-Text Theory (MTT). Hudson (1984, 1990) put forward the notion of Word Grammar. Hellwig (1986, 2003) proposed the definition of Dependency Unification Grammar. Besides these well-known theories of dependency grammar, a great number of scholars have also devoted their efforts to refining and improving the theory (P. [...], 1994; Harper and Helzerman, 1995; Järvinen and [...]; Duchier and Debusmann, 2001; Schröder, 2002; Creswell and Rambow, 2003; Dras, 2004; Klein, 2004; Nivre, 2005; Eisner, 2005).

A dependency treebank is a text corpus in which every sentence has been annotated with dependency structures. Eisner (1996) originally converted the Penn Treebank, which was noted for its phrase structure annotation, into dependency representations. Building on the research of Magerman, Collins and [...], Nivre (2006) perfected the task of converting the Penn Treebank into dependency structure. With the prevalence of dependency treebank building, treebanks in a variety of languages have been developed in recent years, for instance the Prague Arabic Dependency Treebank (PADT) and Quranic Arabic

