Eberhardt School of Business Faculty Presentations

Discourse Structure Identification for Knowledge Extraction

Jamie Guidry, Louisiana State University
Leili Javadpour, Louisiana State University
Gerald M. Knapp, Louisiana State University
J. Guidry

ORCiD

Leili Javadpour: 0000-0003-4004-1950

Document Type

Conference Presentation

Conference Title

Industrial and Systems Engineering Research Conference (ISERC)

Location

San Juan, Puerto Rico

Conference Dates

May 18-22, 2013

Date of Presentation

5-1-2013

First Page

214

Last Page

223

Abstract

Identifying a document's discourse structure (what each part contributes to the ideas presented, such as hypothesis, support, comparison, and results) is a key precursor to improving knowledge extraction from technical documents. So far, only a few efforts have been made to automate discourse structure identification, and with limited success. The current state-of-the-art discourse parser, SPADE, is limited to parsing discourse within a single sentence. HILDA extends SPADE's parsing abilities to document-level structure, but with a significant decrease in performance. Both are based on Rhetorical Structure Theory (RST), a widely accepted approach to analyzing discourse coherence, which holds that coherent text can be organized into a hierarchy of interrelated clauses. This paper documents the first part of a study that aims to achieve RST-based document-level discourse parsing without sacrificing performance. It addresses the first two steps of discourse parsing: structuring and nuclearity labeling. An algorithm was developed for classifying relation existence and nuclearity that improved upon previous methods.
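
The hierarchical organization that RST describes, and the two labels the abstract mentions (whether spans are related, and which span is the nucleus), can be pictured with a small data structure. The following is only an illustrative sketch, not the algorithm developed in the paper: the class name `RSTNode`, the helper functions, and the example sentence are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RSTNode:
    """A node in a hypothetical RST-style discourse tree (illustrative only)."""
    text: Optional[str] = None      # set only on leaves (elementary units)
    nuclearity: str = "nucleus"     # role relative to the parent: "nucleus" or "satellite"
    children: List["RSTNode"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


def leaves(node: RSTNode) -> List[str]:
    """Return the text of all leaf units in left-to-right order."""
    if node.is_leaf():
        return [node.text]
    out: List[str] = []
    for child in node.children:
        out.extend(leaves(child))
    return out


def nucleus_units(node: RSTNode) -> List[str]:
    """Follow only nucleus children to collect the most central units."""
    if node.is_leaf():
        return [node.text]
    out: List[str] = []
    for child in node.children:
        if child.nuclearity == "nucleus":
            out.extend(nucleus_units(child))
    return out


# Structuring decides that the two clauses form one span;
# nuclearity labeling marks which clause carries the main point.
tree = RSTNode(children=[
    RSTNode(text="The parser was retrained", nuclearity="nucleus"),
    RSTNode(text="because accuracy had dropped", nuclearity="satellite"),
])
```

Here the internal node records the outcome of structuring (the two clauses are joined), and the `nuclearity` field on each child records the outcome of nuclearity labeling. How those decisions are actually made is what the paper's classifier addresses and is not shown here.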

Recommended Citation

Guidry, J., Javadpour, L., Knapp, G. M., & Guidry, J. (2013). Discourse Structure Identification for Knowledge Extraction. Paper presented at Industrial and Systems Engineering Research Conference (ISERC) in San Juan, Puerto Rico.
https://scholarlycommons.pacific.edu/esob-facpres/388

