IMPROVING COHERENCE IN MULTI-DOCUMENT SUMMARIZATION THROUGH PROPER ORDERING OF SENTENCES
http://hdl.handle.net/2261/25847
Name / File | License | Action |
---|---|---|
Bollegala.pdf (541.4 kB) | | |
Item type | Thesis or Dissertation(1) | |||||
---|---|---|---|---|---|---|
Publication date | 2011-08-08 | |||||
Title | ||||||
Title | IMPROVING COHERENCE IN MULTI-DOCUMENT SUMMARIZATION THROUGH PROPER ORDERING OF SENTENCES | |||||
Language | ||||||
Language | eng | |||||
Keyword | ||||||
Subject scheme | Other | |||||
Subject | multi-document summarization | |||||
Keyword | ||||||
Subject scheme | Other | |||||
Subject | sentence ordering | |||||
Keyword | ||||||
Subject scheme | Other | |||||
Subject | text coherence | |||||
Keyword | ||||||
Subject scheme | Other | |||||
Subject | machine learning | |||||
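Resource type | ||||||
Resource type identifier | http://purl.org/coar/resource_type/c_46ec | |||||
Resource type | thesis | |||||
Alternative title | ||||||
Alternative title | 複数文書自動要約における要約文の並び順による一貫性向上に関する研究 (A Study on Improving Coherence in Multi-Document Automatic Summarization through the Ordering of Summary Sentences) | |||||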
Author | Bollegala, Danushka | |||||
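Author alternative name | ||||||
Identifier scheme | WEKO | |||||
Identifier | 5600 | |||||
Full name | ボッレーガラ, ダヌシカ | |||||
Author affiliation | ||||||
Value | Department of Information and Communication Engineering, Graduate School of Information Science and Technology | |||||
Abstract | ||||||
Description type | Abstract | |||||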
Description | The problem of extracting salient information to include in a summary has been researched extensively in the field of automatic text summarization. However, the coherent arrangement of the extracted information has received little attention. In particular, in extractive multi-document text summarization, sentences that convey important information are selected from a set of documents, and there is no guarantee that this set of extracted sentences will form a coherent summary by itself. The order in which information is presented is an important factor that affects the coherence of a summary. This thesis focuses on the problem of automatically generating a coherent summary from a given set of documents by ordering the extracted sentences. I propose two different approaches to this problem: a pair-wise sentence comparison approach and a bottom-up text structuring approach. The pair-wise sentence comparison approach first compares all possible pairs of sentences and decides a partial ordering between the two sentences in each pair. It then creates a total ordering that optimizes a certain function. In the bottom-up text structuring approach, I define four criteria for sentence ordering: chronology, topical-closeness, precedence and succedence. I then use support vector machines to integrate these four criteria and compute the strength of association between two sentences. For training, I use a set of manually ordered summaries. A hierarchical text clustering algorithm is used to produce a total ordering of sentences. I begin by ordering the pair of sentences that has the highest strength of association. I then repeatedly order the two text segments with the maximum association strength until a single segment containing all sentences in order is formed. I compare the sentence orderings produced by the proposed algorithm against manually ordered summaries using various rank correlation measures. Moreover, I perform a subjective grading of the generated summaries. Both automatic evaluation and subjective grading suggest that the proposed sentence ordering algorithms significantly outperform all existing sentence ordering methods for multi-document summarization. Finally, I investigate the problem of automatically evaluating a sentence ordering for its coherence and propose Average Continuity as an automatic evaluation measure for this task. The proposed automatic evaluation measure reports a high correlation with human ratings. | |||||
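Bibliographic information | Issue date 2007-02-02 | |||||
Nippon Decimal Classification | ||||||
Subject scheme | NDC | |||||
Subject | 007 | |||||
Degree | ||||||
Value | master | |||||
Graduate school / Department | ||||||
Value | Graduate School of Information Science and Technology, Department of Information and Communication Engineering | |||||
Date of degree conferral | ||||||
Date of degree conferral | 2007-03 | |||||
Degree number | ||||||
Value | 修第号 |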
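
The bottom-up text structuring approach described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the function names (`bottom_up_order`, `kendall_tau`, `toy_strength`) are hypothetical, and the SVM-learned association strength is replaced by a toy chronology score over assumed timestamps. The sketch only shows the greedy merging loop (repeatedly joining the two segments with the highest directed association strength) and one common rank correlation measure, Kendall's tau, for comparing an ordering against a reference.

```python
from itertools import permutations
from typing import Callable, List, Sequence, Tuple

Segment = Tuple[str, ...]


def bottom_up_order(
    sentences: Sequence[str],
    strength: Callable[[Segment, Segment], float],
) -> List[str]:
    """Greedily merge the two segments with the highest directed strength.

    strength(a, b) scores placing segment a immediately before segment b.
    In the thesis this score comes from a trained SVM; here it is supplied
    by the caller (an assumption made for illustration).
    """
    segments: List[Segment] = [(s,) for s in sentences]
    if not segments:
        return []
    while len(segments) > 1:
        # Pick the ordered pair (i, j) with maximum association strength.
        i, j = max(
            permutations(range(len(segments)), 2),
            key=lambda p: strength(segments[p[0]], segments[p[1]]),
        )
        merged = segments[i] + segments[j]
        segments = [s for k, s in enumerate(segments) if k not in (i, j)]
        segments.append(merged)
    return list(segments[0])


def kendall_tau(reference: Sequence[str], candidate: Sequence[str]) -> float:
    """Kendall's tau between two orderings of the same distinct items."""
    n = len(reference)
    if n < 2:
        return 1.0
    pos = {s: i for i, s in enumerate(candidate)}
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            if pos[reference[i]] < pos[reference[j]]:
                concordant += 1
            else:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Toy data: hypothetical timestamps standing in for the chronology criterion.
TOY_TIMES = {"A": 1, "B": 2, "C": 3, "D": 4}


def toy_strength(a: Segment, b: Segment) -> float:
    """Higher when the first sentence of b closely follows the last of a."""
    gap = TOY_TIMES[b[0]] - TOY_TIMES[a[-1]]
    return 1.0 / gap if gap > 0 else 0.0


ordered = bottom_up_order(["C", "A", "D", "B"], toy_strength)
tau = kendall_tau(["A", "B", "C", "D"], ordered)
```

With a real system, `toy_strength` would be swapped for a learned scorer that combines the four criteria named in the abstract; the merging loop itself would stay the same.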