Generation of fundamental frequency contours for Thai speech synthesis using tone nucleus model

Krityakien, Oraphan

http://hdl.handle.net/2261/54201

Name / File	License	Action
48116420.pdf (3.4 MB)

Item type

Thesis or Dissertation

Release date

2013-05-07

Title

Generation of fundamental frequency contours for Thai speech synthesis using tone nucleus model

Language

eng

Resource type

Resource

http://purl.org/coar/resource_type/c_46ec

Type

thesis

Other title

声調核モデルによるタイ語音声合成の基本周波数パターン生成

Author

Krityakien, Oraphan

Author alternative name

Identifier scheme

WEKO

Identifier

10552

Full name

クリットヤーキヤン, オラパン

Author affiliation

東京大学大学院情報理工学系研究科電子情報学専攻

Author affiliation

Department of Information and Communication Engineering, Graduate School of Information Science and Technology, The University of Tokyo

Abstract

Description type

Abstract

Description

In this information age, speech is one of the emerging interfaces between humans and machines. Applications with this interface let users access information while continuing their primary tasks. Both speech recognition and speech synthesis have been introduced and embedded in such applications. However, users prefer synthetic speech that is intelligible and natural, regardless of how many other capabilities the application provides. Speech synthesis for tonal languages is much more challenging than for non-tonal languages, because both intonation and tones must be taken into account. Fundamental frequency (F0) is one of the acoustic features related to intonation and tones. Existing F0 models for Thai are expensive when generating F0 from their parameters and perform poorly when little data is available to build the model. Given the many advantages of the tone nucleus model, which originated for Mandarin, we have pioneered adapting this model to Thai to meet the classic but still intrinsic requirements of speech synthesis in continuous speech. Tone nuclei are analytically defined for all five distinctive Thai tones according to their underlying targets. The full process of F0 contour generation is presented, from tone nucleus extraction and parameter extraction through parameter prediction to F0 contour generation for continuous speech. The model is again shown, through objective and subjective tests, to be adaptable to a language other than Mandarin. The tests confirmed the efficiency and adaptability of the model. Compared to the F0 contours generated by predictors trained on contours over whole syllables without extracting the tone nuclei, the model generated F0 contours in continuous utterances with less distortion and greater tone intelligibility and naturalness.
The proposed methodology for parameter prediction and F0 contour generation improved the quality of the synthetic speech by significantly reducing distortion and increasing tone intelligibility and naturalness.

Bibliographic information

Issue date 2013-03-25

Degree name

Master (Information Science and Technology)

Degree

Value

master

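Graduate school / Department

Graduate School of Information Science and Technology, Department of Information and Communication Engineering

Date degree granted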

2013-03-25


Versions

Ver.1

2021-03-02 07:49:08.581048
