Multi-Task Neural Networks for Speech Recognition

Egorova, Ekaterina

but.committee	doc. Dr. Ing. Jan Černocký (chair); prof. Ing. Tomáš Vojnar, Ph.D. (vice-chair); prof. Ing. Adam Herout, Ph.D. (member); Doc. Ing. Branislav Sobota, Ph.D. (member); Ing. Josef Strnadel, Ph.D. (member); Ing. Michal Španěl, Ph.D. (member)	cs
but.defence	The student first presented the results achieved in her thesis. The committee then reviewed the supervisor's evaluation and the reviewer's report. The student subsequently answered the reviewer's questions and further questions from those present. Based on the reviewer's report, the supervisor's evaluation, the presentation, and the student's answers to the questions, the committee decided to award the thesis the grade "B". Questions at the defence: Is there a secondary task that is likely to be helpful for many languages? What would happen if the individual classification tasks had weights? For Vietnamese, all the secondary tasks are helpful, but don't combine well. Do you have an idea what might have gone wrong?	cs
but.jazyk	English
but.program	Information Technology	cs
but.result	the thesis was successfully defended	cs
dc.contributor.advisor	Karafiát, Martin	en
dc.contributor.author	Egorova, Ekaterina	en
dc.contributor.referee	Veselý, Karel	en
dc.date.accessioned	2019-05-17T07:16:51Z
dc.date.available	2019-05-17T07:16:51Z
dc.date.created	2014	cs
dc.description.abstract	The first part of this Master's thesis provides a theoretical analysis of the principles of neural networks, including their possible use in speech recognition. The thesis continues with a description of multi-task neural networks and related experiments. The practical part of the thesis comprised modifications to software for training neural networks that enabled multi-task training. The prepared environment is also described, including several dedicated scripts. The experiments presented in this thesis verify the use of articulatory characteristics of speech for multi-task training. The experiments were carried out on two speech databases that differ in quality and size and represent different languages: English and Vietnamese. Articulatory characteristics were also combined with other secondary tasks, such as context, in order to verify their complementarity. A comparison is made across neural networks of different sizes in order to describe the relationship between network size and the effectiveness of multi-task training. The conclusion of the experiments is that multi-task training with articulatory characteristics as secondary tasks leads to better training of neural networks and can result in more accurate phoneme recognition. Finally, multi-task neural networks are tested in a speech recognition system as a feature extractor.	en
dc.description.abstract	The first part of this Master's thesis is a theoretical investigation into the principles and usage of neural networks, including their suitability for speech recognition tasks. It then summarizes the operating principles of multi-task neural networks and some recent experiments with them. The practical part of the thesis reports changes made to a neural network training tool so that it supports multi-task training. The preparation of the experimental setup is then described, including a number of scripts written specifically for this purpose. The experiments presented in the thesis explore the idea of using articulatory characteristics of phonemes as secondary tasks for multi-task training. The experiments are conducted on two datasets that differ in quality and size and represent different languages: English and Vietnamese. Articulatory characteristics are occasionally combined with other secondary tasks, such as context, to see how well they function together. Networks of different sizes are compared to see how size affects the effectiveness of multi-task training. These experiments show that multi-task training with articulatory characteristics as secondary tasks can enhance training and yield better phoneme accuracy as a result. Finally, a multi-task network is embedded in a speech recognition system as a feature extractor.	cs
dc.description.mark	B	cs
dc.identifier.citation	EGOROVA, E. Multi-Task Neural Networks for Speech Recognition [online]. Brno: Vysoké učení technické v Brně. Fakulta informačních technologií. 2014.	cs
dc.identifier.other	79742	cs
dc.identifier.uri	http://hdl.handle.net/11012/53377
dc.language.iso	en	cs
dc.publisher	Vysoké učení technické v Brně. Fakulta informačních technologií	cs
dc.rights	Standard license agreement - unrestricted access to the full text	cs
dc.subject	Rozpoznávání řeči	en
dc.subject	neuronové sítě	en
dc.subject	hluboké neuronové sítě	en
dc.subject	viceúkolové neuronové sítě	en
dc.subject	Speech recognition	cs
dc.subject	neural networks	cs
dc.subject	deep neural networks	cs
dc.subject	multi-task neural networks	cs
dc.title	Multi-Task Neural Networks for Speech Recognition	en
dc.title.alternative	Multi-Task Neural Networks for Speech Recognition	cs
dc.type	Text	cs
dc.type.driver	masterThesis	en
dc.type.evskp	master's thesis	cs
dcterms.dateAccepted	2014-06-20	cs
dcterms.modified	2020-05-10-16:11:33	cs
eprints.affiliatedInstitution.faculty	Fakulta informačních technologií	cs
sync.item.dbid	79742	en
sync.item.dbtype	ZP	en
sync.item.insts	2021.11.12 13:12:34	en
sync.item.modts	2021.11.12 12:43:58	en
thesis.discipline	Computer Graphics and Multimedia	cs
thesis.grantor	Vysoké učení technické v Brně. Fakulta informačních technologií. Ústav počítačové grafiky a multimédií	cs
thesis.level	Engineer (Master's level)	cs
thesis.name	Ing.	cs

Files

Original bundle

Name: final-thesis.pdf
Size: 1.67 MB
Format: Adobe Portable Document Format
Description: final-thesis.pdf

Name: review_79742.html
Size: 1.45 KB
Format: Hypertext Markup Language
Description: review_79742.html

Collections

2014