{"_buckets": {"deposit": "dd0655e9fc6646e1bb55b6a97a6958fb"}, "_deposit": {"id": "4069", "owners": [], "pid": {"revision_id": 0, "type": "depid", "value": "4069"}, "status": "published"}, "_oai": {"id": "oai:repository.dl.itc.utokyo.ac.jp:00004069"}, "item_7_alternative_title_1": {"attribute_name": "\u305d\u306e\u4ed6\u306e\u30bf\u30a4\u30c8\u30eb", "attribute_value_mlt": [{"subitem_alternative_title": "Tcomplexity\u306e\u89e3\u6790\u3068\u5fdc\u7528"}]}, "item_7_biblio_info_7": {"attribute_name": "\u66f8\u8a8c\u60c5\u5831", "attribute_value_mlt": [{"bibliographicIssueDates": {"bibliographicIssueDate": "20100324", "bibliographicIssueDateType": "Issued"}, "bibliographic_titles": [{}]}]}, "item_7_date_granted_25": {"attribute_name": "\u5b66\u4f4d\u6388\u4e0e\u5e74\u6708\u65e5", "attribute_value_mlt": [{"subitem_dategranted": "20100324"}]}, "item_7_degree_grantor_23": {"attribute_name": "\u5b66\u4f4d\u6388\u4e0e\u6a5f\u95a2", "attribute_value_mlt": [{"subitem_degreegrantor": [{"subitem_degreegrantor_name": "University of Tokyo (\u6771\u4eac\u5927\u5b66)"}]}]}, "item_7_degree_name_20": {"attribute_name": "\u5b66\u4f4d\u540d", "attribute_value_mlt": [{"subitem_degreename": "\u535a\u58eb(\u79d1\u5b66)"}]}, "item_7_description_5": {"attribute_name": "\u6284\u9332", "attribute_value_mlt": [{"subitem_description": "Tcodes are variablelength selfsynchronizing codes introduced by Titchener in 1984. Tcode codewords are constructed recursively from a finite alphabet using an algorithm called Taugmentation, resulting in excellent selfsynchronization properties. An algorithm called Tdecomposition parses a given sequence into a series of Tprefixes, and finds a Tcode set in which the sequence is encoded to a longest codeword. There are similarities and differences between Tdecomposition and the conventional LZ78 incremental parsing. 
The LZ78 incremental parsing algorithm parses a given sequence into consecutive distinct subsequences (words) sequentially in such a way that each word consists of the longest matching word parsed previously and a literal symbol. Then, the LZcomplexity is defined as the number of words. By contrast, Tdecomposition parses a given sequence into a series of Tprefixes, each of which consists of the recursive concatenation of the longest matching Tprefix parsed previously and a literal symbol, and it has to access the whole sequence every time it determines a Tprefix. Alike to the LZcomplexity, the Tcomplexity of a sequence is defined as the number of Tprefixes, however, the Tcomplexity of a particular sequence in general tends to be smaller than the LZcomplexity. In the first part of the thesis, we deal with our contributions to the theory of Tcodes. In order to realize a sequential determination of Tprefixes, we devise a new Tdecomposition algorithm using forward parsing. Both the Tcomplexity profile obtained from the forward Tdecomposition and the LZcomplexity profile can be derived in a unified way using a differential equation method. The method focuses on the increase of the average codeword length of a code tree. The obtained formulas are confirmed to coincide with those of previous studies. The magnitude of the Tcomplexity of a given sequence s in general indicates the degree of randomness. However, there exist interesting sequences that have much larger Tcomplexities than any random sequences. We investigate the maximum Tcomplexity sequences and the maximum LZcomplexity sequences using various techniques including those of the test suite released by the National Institute of Standards and Technology (NIST) of the U.S. government, and find that the maximum Tcomplexity sequences are less random than the maximum LZcomplexity sequences. In the second part of the thesis, we present our achievements in terms of application. 
We consider two applications  data compression and randomness testing. First, we propose a new data compression scheme based on Tcodes using a dictionary method such that all phrases added to a dictionary have a recursive structure similar to Tcodes. Our data compression scheme can compress any of the files in the Calgary Corpus more efficiently than previous schemes based on Tcodes and the UNIX compress, a variant of LZ78 (LZW). Next, we introduce a randomness test based on the Tcomplexity. Recently, the LempelZiv (LZ) randomness test based on the LZcomplexity was officially excluded from the NIST test suite. This is because the distribution of Pvalues for random sequences of length 106, the most common length used, is strictly discrete in the case of the LZcomplexity. Our test solves this problem because the Tcomplexity features an almost ideal uniform continuous distribution of Pvalues for random sequences of length 106. The proposed test outperforms the NIST LZ test, a modified LZ test proposed by Doganaksoy and G\u00f6loglu, and all other tests included in the NIST test suite, in terms of the detection of undesirable pseudorandom sequences generated by a multiplicative congruential generator (MCG) and nonrandom byte sequences Y = Y0, Y1, Y2, \u00b7\u00b7\u00b7, where Y3i and Y(3i+1) are random, but Y(3i+2) is given by Y3i + Y(3i+1) mod 28.", "subitem_description_type": "Abstract"}]}, "item_7_dissertation_number_26": {"attribute_name": "\u5b66\u4f4d\u6388\u4e0e\u756a\u53f7", "attribute_value_mlt": [{"subitem_dissertationnumber": "\u7532\u7b2c26141\u53f7"}]}, "item_7_full_name_3": {"attribute_name": "\u8457\u8005\u5225\u540d", "attribute_value_mlt": [{"nameIdentifiers": [{"nameIdentifier": "9382", "nameIdentifierScheme": "WEKO"}], "names": [{"name": "\u6ff1\u91ce, \u5065\u4e8c"}]}]}, "item_7_identifier_registration": {"attribute_name": "ID\u767b\u9332", "attribute_value_mlt": [{"subitem_identifier_reg_text": "10.15083/00004060", "subitem_identifier_reg_type": 
"JaLC"}]}, "item_7_select_21": {"attribute_name": "\u5b66\u4f4d", "attribute_value_mlt": [{"subitem_select_item": "doctoral"}]}, "item_7_subject_13": {"attribute_name": "\u65e5\u672c\u5341\u9032\u5206\u985e\u6cd5", "attribute_value_mlt": [{"subitem_subject": "007", "subitem_subject_scheme": "NDC"}]}, "item_7_text_22": {"attribute_name": "\u5b66\u4f4d\u5206\u91ce", "attribute_value_mlt": [{"subitem_text_value": "Science (Kagaku)(\u79d1\u5b66)"}]}, "item_7_text_24": {"attribute_name": "\u7814\u7a76\u79d1\u30fb\u5c02\u653b", "attribute_value_mlt": [{"subitem_text_value": "Department of Complexity Science and Engineering, Graduate School of Frontier Sciences (\u65b0\u9818\u57df\u5275\u6210\u79d1\u5b66\u7814\u7a76\u79d1\u8907\u96d1\u7406\u5de5\u5b66\u5c02\u653b)"}]}, "item_7_text_27": {"attribute_name": "\u5b66\u4f4d\u8a18\u756a\u53f7", "attribute_value_mlt": [{"subitem_text_value": "\u535a\u5275\u57df\u7b2c558\u53f7"}]}, "item_7_text_36": {"attribute_name": "\u8cc7\u6e90\u30bf\u30a4\u30d7", "attribute_value_mlt": [{"subitem_text_value": "Thesis"}]}, "item_7_text_4": {"attribute_name": "\u8457\u8005\u6240\u5c5e", "attribute_value_mlt": [{"subitem_text_value": "\u6771\u4eac\u5927\u5b66\u5927\u5b66\u9662\u65b0\u9818\u57df\u5275\u6210\u79d1\u5b66\u7814\u7a76\u79d1\u57fa\u76e4\u79d1\u5b66\u7814\u7a76\u7cfb\u8907\u96d1\u7406\u5de5\u5b66\u5c02\u653b"}, {"subitem_text_value": "Department of Complexity Science and Engineering, Graduate School of Frontier Sciences, The University of Tokyo"}]}, "item_creator": {"attribute_name": "\u8457\u8005", "attribute_type": "creator", "attribute_value_mlt": [{"creatorNames": [{"creatorName": "Hamano, Kenji"}], "nameIdentifiers": [{"nameIdentifier": "9381", "nameIdentifierScheme": "WEKO"}]}]}, "item_files": {"attribute_name": "\u30d5\u30a1\u30a4\u30eb\u60c5\u5831", "attribute_type": "file", "attribute_value_mlt": [{"accessrole": "open_date", "date": [{"dateType": "Available", "dateValue": "20170601"}], "displaytype": "detail", 
"download_preview_message": "", "file_order": 0, "filename": "K02429.pdf", "filesize": [{"value": "3.0 MB"}], "format": "application/pdf", "future_date_message": "", "is_thumbnail": false, "licensetype": "license_free", "mimetype": "application/pdf", "size": 3000000.0, "url": {"label": "\u672c\u6587(fulltext)", "url": "https://repository.dl.itc.utokyo.ac.jp/record/4069/files/K02429.pdf"}, "version_id": "6d4d2343d0ac4a18a252019c18d0b4d2"}]}, "item_keyword": {"attribute_name": "\u30ad\u30fc\u30ef\u30fc\u30c9", "attribute_value_mlt": [{"subitem_subject": "Tcode", "subitem_subject_scheme": "Other"}, {"subitem_subject": "Tcomplexity", "subitem_subject_scheme": "Other"}, {"subitem_subject": "Tdecomposition", "subitem_subject_scheme": "Other"}, {"subitem_subject": "LempelZiv complexity", "subitem_subject_scheme": "Other"}, {"subitem_subject": "NIST SP 80022", "subitem_subject_scheme": "Other"}, {"subitem_subject": "data compression", "subitem_subject_scheme": "Other"}, {"subitem_subject": "complexity profile", "subitem_subject_scheme": "Other"}]}, "item_language": {"attribute_name": "\u8a00\u8a9e", "attribute_value_mlt": [{"subitem_language": "eng"}]}, "item_resource_type": {"attribute_name": "\u8cc7\u6e90\u30bf\u30a4\u30d7", "attribute_value_mlt": [{"resourcetype": "thesis", "resourceuri": "http://purl.org/coar/resource_type/c_46ec"}]}, "item_title": "Analysis and Applications of the Tcomplexity", "item_titles": {"attribute_name": "\u30bf\u30a4\u30c8\u30eb", "attribute_value_mlt": [{"subitem_title": "Analysis and Applications of the Tcomplexity"}]}, "item_type_id": "7", "owner": "1", "path": ["9/233/280", "110/328/329"], "permalink_uri": "https://doi.org/10.15083/00004060", "pubdate": {"attribute_name": "\u516c\u958b\u65e5", "attribute_value": "20121101"}, "publish_date": "20121101", "publish_status": "0", "recid": "4069", "relation": {}, "relation_version_is_last": true, "title": ["Analysis and Applications of the Tcomplexity"], "weko_shared_id": null}
Analysis and Applications of the T-complexity
https://doi.org/10.15083/00004060
Name / File  License  Action  

Full text (3.0 MB)

Item type  Thesis or Dissertation (1)  

Publication date  2012-11-01  
Title  
Title  Analysis and Applications of the T-complexity  
Language  
Language  eng  
Keywords  
Subject  T-code  
Subject Scheme  Other  
Keywords  
Subject  T-complexity  
Subject Scheme  Other  
Keywords  
Subject  T-decomposition  
Subject Scheme  Other  
Keywords  
Subject  Lempel-Ziv complexity  
Subject Scheme  Other  
Keywords  
Subject  NIST SP 800-22  
Subject Scheme  Other  
Keywords  
Subject  data compression  
Subject Scheme  Other  
Keywords  
Subject  complexity profile  
Subject Scheme  Other  
Resource type  
Resource  http://purl.org/coar/resource_type/c_46ec  
Type  thesis  
ID registration  
ID registration  10.15083/00004060  
ID registration type  JaLC  
Alternative title  
Alternative title  T-complexityの解析と応用 (Analysis and Applications of the T-complexity)  
Author 
Hamano, Kenji

Author's alternative name  
Full name  濱野, 健二 (Hamano, Kenji)  
Author affiliation  
Department of Complexity Science and Engineering, Graduate School of Frontier Sciences, The University of Tokyo  
Abstract  
Description type  Abstract  
Description  T-codes are variable-length self-synchronizing codes introduced by Titchener in 1984. T-code codewords are constructed recursively from a finite alphabet using an algorithm called T-augmentation, resulting in excellent self-synchronization properties. An algorithm called T-decomposition parses a given sequence into a series of T-prefixes and finds a T-code set in which the sequence is encoded as a longest codeword. There are similarities and differences between T-decomposition and the conventional LZ78 incremental parsing. The LZ78 incremental parsing algorithm parses a given sequence sequentially into consecutive distinct subsequences (words) in such a way that each word consists of the longest matching word parsed previously plus a literal symbol. The LZ-complexity is then defined as the number of words. By contrast, T-decomposition parses a given sequence into a series of T-prefixes, each of which consists of the recursive concatenation of the longest matching T-prefix parsed previously and a literal symbol, and it has to access the whole sequence every time it determines a T-prefix. Like the LZ-complexity, the T-complexity of a sequence is defined as the number of T-prefixes; however, the T-complexity of a particular sequence generally tends to be smaller than its LZ-complexity. 
In the first part of the thesis, we present our contributions to the theory of T-codes. To realize a sequential determination of T-prefixes, we devise a new T-decomposition algorithm using forward parsing. Both the T-complexity profile obtained from the forward T-decomposition and the LZ-complexity profile can be derived in a unified way using a differential-equation method. The method focuses on the increase of the average codeword length of a code tree. The obtained formulas are confirmed to coincide with those of previous studies. The magnitude of the T-complexity of a given sequence s generally indicates its degree of randomness. However, there exist interesting sequences whose T-complexities are much larger than those of any random sequence. We investigate the maximum T-complexity sequences and the maximum LZ-complexity sequences using various techniques, including those of the test suite released by the National Institute of Standards and Technology (NIST) of the U.S. government, and find that the maximum T-complexity sequences are less random than the maximum LZ-complexity sequences. 
In the second part of the thesis, we present our achievements in terms of applications. We consider two applications: data compression and randomness testing. First, we propose a new data compression scheme based on T-codes using a dictionary method such that all phrases added to the dictionary have a recursive structure similar to that of T-codes. Our data compression scheme can compress every file in the Calgary Corpus more efficiently than previous schemes based on T-codes and than UNIX compress, a variant of LZ78 (LZW). Next, we introduce a randomness test based on the T-complexity. Recently, the Lempel-Ziv (LZ) randomness test based on the LZ-complexity was officially excluded from the NIST test suite, because the distribution of P-values for random sequences of length 10^6, the most commonly used length, is strictly discrete in the case of the LZ-complexity. Our test solves this problem because the T-complexity features an almost ideal uniform continuous distribution of P-values for random sequences of length 10^6. The proposed test outperforms the NIST LZ test, a modified LZ test proposed by Doğanaksoy and Göloğlu, and all other tests included in the NIST test suite in terms of detecting undesirable pseudorandom sequences generated by a multiplicative congruential generator (MCG) as well as non-random byte sequences Y = Y_0, Y_1, Y_2, ..., where Y_{3i} and Y_{3i+1} are random but Y_{3i+2} is given by Y_{3i} + Y_{3i+1} mod 2^8.  
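The LZ78 incremental parsing that the abstract contrasts with T-decomposition can be illustrated concretely. The sketch below (an illustration, not the author's implementation) parses a sequence into distinct words, each being the longest previously seen word extended by one literal symbol, and returns the LZ-complexity, i.e. the number of words:

```python
def lz_complexity(seq):
    """LZ78 incremental parsing: count the words of `seq`.

    Each word is the longest previously parsed word plus one literal
    symbol; a trailing partial word is counted as a word as well
    (a common convention, assumed here).
    """
    words = {""}  # the empty word lets the first symbol start a word
    word = ""
    count = 0
    for symbol in seq:
        word += symbol
        if word not in words:  # longest match ended: register a new word
            words.add(word)
            count += 1
            word = ""
    if word:                   # unfinished word at the end of the sequence
        count += 1
    return count

# "aaabbabaabaaabab" parses as a|aa|b|ba|baa|baaa|bab -> 7 words
print(lz_complexity("aaabbabaabaaabab"))  # -> 7
```

T-decomposition differs in that each T-prefix is the *recursive concatenation* of the longest matching previous T-prefix with a literal symbol, which is why the classical algorithm must rescan the whole sequence per T-prefix and why the thesis's forward-parsing variant matters.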
Bibliographic information  Issue date 2010-03-24  
Nippon Decimal Classification (NDC)  
Subject  007  
Subject Scheme  NDC  
Degree name  
Degree name  Doctor of Science (博士(科学))  
Degree  
Value  doctoral  
Degree field  
Science (Kagaku) (科学)  
Degree-granting institution  
Degree-granting institution name  
Degree-granting institution name  University of Tokyo (東京大学)  
Graduate school / Department  
Department of Complexity Science and Engineering, Graduate School of Frontier Sciences (新領域創成科学研究科複雑理工学専攻)  
Date degree granted  
Date degree granted  2010-03-24  
Degree conferral number  
Degree conferral number  甲第26141号  
Diploma number  
博創域第558号 