Natural Language Processing

In this age of globalization, language processing is becoming increasingly important for overcoming language barriers. Globalization is essential for Myanmar, but localization is equally important and cannot be left behind. The importance of localization is especially clear in searching and sorting, where the Myanmar language differs from other languages.

The Natural Language Processing Research Team was started in 2010 under the guidance of the Ministry of Science and Technology. The initial objective of the team is to develop a “Myanmar to English Translation System”. The project consists of eight components to be developed:

  • To build a Bilingual Lexicon and Myanmar WordNet
  • To create a Spelling Checking System
  • To create a Name Identification System to identify the names of persons, organizations, and places
  • To create a POS Tagging System
  • To create a Noun Phrase Identification System
  • To create a Verb Phrase Identification System
  • To create a Word Sense Disambiguation System
  • To create a Reordering System for Myanmar to English Translation

Myanmar Language Lexico-Conceptual Knowledge Resources (Online Resource Link)

The research on the development of Myanmar language lexico-conceptual knowledge resources aims to investigate the following components for Myanmar NLP applications.
(1) Myanmar WordNet: It has been developed using the existing English WordNet lexical database and Myanmar-English Machine Readable Dictionaries (MRDs) with a semi-automatic approach.
(2) Myanmar-English Bilingual WordNet-like Lexicon: It has been developed using the existing English WordNet lexical database and Myanmar-English MRDs with a semi-automatic approach.
(3) Myanmar3 Tokenization System: This system contains a normalization tokenizer and an orthographic tokenizer, both implemented using a Finite State Automata (FSA) approach.
(4) Myanmar3 Segmentation System: This system has been developed using a graph-based pattern merging approach.
(5) Myanmar3 Part of Speech Tagging System: This system is implemented using a context-free grammar (CFG).
(6) Myanmar3 Parsing System: This system is developed using a CFG as the Myanmar sentence structure rules. Myanmar name rules are generated so that named entity recognition (NER) works together with the parser to generate the parse tree.
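
As an illustration of the CFG-based parsing described in item (6), the following Python sketch runs the CYK algorithm over a sequence of POS tags and returns a parse tree. The grammar rules, the tag names, and the function name cyk_parse are hypothetical and chosen only for this example. They are not the rules or the parsing algorithm of the Myanmar3 Parsing System, whose internals are not described here.

```python
# Hypothetical toy grammar, NOT the Myanmar3 rule set.
# Binary rules: (left child, right child) -> parent label.
# The verb-final (subject, object, verb) order is used for illustration.
BINARY_RULES = {
    ("NP", "VP"): "S",    # sentence = subject noun phrase + verb phrase
    ("NP", "V"): "VP",    # verb phrase = object noun phrase + verb
    ("ADJ", "N"): "NP",   # noun phrase = adjective + noun
}

# Unary rules: a single POS tag promoted to a phrase label.
UNARY_RULES = {"N": "NP"}


def cyk_parse(tags, start="S"):
    """Return a parse tree (nested tuples) for a POS tag sequence, or None."""
    n = len(tags)
    chart = {}  # (i, j) -> {label: tree} for the span tags[i:j]

    # Fill in spans of width 1 from the input tags.
    for i, tag in enumerate(tags):
        cell = {tag: tag}
        if tag in UNARY_RULES:
            cell[UNARY_RULES[tag]] = (UNARY_RULES[tag], tag)
        chart[i, i + 1] = cell

    # Combine adjacent spans bottom-up using the binary rules.
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            cell = {}
            for k in range(i + 1, j):
                for left, left_tree in chart[i, k].items():
                    for right, right_tree in chart[k, j].items():
                        parent = BINARY_RULES.get((left, right))
                        if parent and parent not in cell:
                            cell[parent] = (parent, left_tree, right_tree)
            chart[i, j] = cell

    return chart[0, n].get(start) if n else None


tree = cyk_parse(["N", "N", "V"])
```

For the tag sequence N N V, cyk_parse returns a nested tuple rooted at S. A sequence that the toy grammar does not cover, such as V N, returns None.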

Ongoing Works

Another objective of the NLP team is to implement a Speech-to-Speech system that reduces the language barrier to sharing and acquiring knowledge in every field, such as social affairs, economics, and health.

(1) Speech-to-Text: The main objective of this work is to accept speech input (Myanmar) and to produce text output (Myanmar).

(2) Text-to-Speech: The main objective of this work is to accept text input (Myanmar) and to produce speech output (English).

(3) Speech-to-Speech: The main objective of this work is to combine the outputs of the Speech-to-Text phase and the Text-to-Speech phase.
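
The way the Speech-to-Speech phase chains the two earlier phases can be sketched in Python as a simple function composition. The Audio type, the function names, and the stub stages below are all hypothetical stand-ins. Real recognition and synthesis components would replace the stubs, and nothing here reflects the team's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Audio:
    """Hypothetical container for a speech signal and its language code."""
    language: str
    samples: bytes


def make_speech_to_speech(
    speech_to_text: Callable[[Audio], str],
    text_to_speech: Callable[[str], Audio],
) -> Callable[[Audio], Audio]:
    """Chain a Speech-to-Text stage and a Text-to-Speech stage."""
    def pipeline(audio: Audio) -> Audio:
        text = speech_to_text(audio)   # phase (1): speech in, text out
        return text_to_speech(text)    # phase (2): text in, speech out
    return pipeline


# Stub stages for illustration only: they just move bytes around.
def stub_speech_to_text(audio: Audio) -> str:
    return audio.samples.decode("utf-8")


def stub_text_to_speech(text: str) -> Audio:
    return Audio(language="en", samples=text.encode("utf-8"))


speech_to_speech = make_speech_to_speech(stub_speech_to_text, stub_text_to_speech)
result = speech_to_speech(Audio(language="my", samples=b"hello"))
```

Keeping each phase behind a plain callable means either stage can be developed and tested separately before the two are joined.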


Publications of Related Research

  • “Myanmar Word Segmentation Using Statistical Approach”, Proceedings of the 2010 International Conference on Advanced Computer Technology and Engineering (ICACTE 2010), China, 20-22 August 2010.
  • “A Hybrid Approach for Part of Speech Tagging of Burmese Texts”, International Conference on Computer and Management (CAMAN 2011), ISBN 978-1-4244-9282-4, Wuhan, China, May 2011.
  • “Lexicon Based Word Segmentation and Part of Speech Tagging for Written Myanmar Text”, International Journal of Computational Linguistics and Natural Language Processing (IJCLNLP), Volume II, Issue VI, ISSN 2279-0756, India, June 2013.
  • “Name Entity Transliteration in Myanmar Text”, International Journal of Computational Linguistics and Natural Language Processing (IJCLNLP), India, May 2013.
  • “Machine Transliteration by Different Methods”, The International Symposium on Society, Tourism, Education and Politics, Singapore, September 13-15, 2013.
  • “Proposed Myanmar Word Tokenizer Based on LIPIDIPIKAR Treatise”, International Conference on Computer Engineering and Technology (ICCET), Chengdu, Sichuan, April 16-18, 2010.
  • “Words to Phrase Reordering Machine Translation System in Myanmar-English”, 3rd International Conference on Computer Research and Development (ICCRD 2011), Shanghai, China, March 11-13, 2011 (IEEE).
  • “Sentence Level Reordering System for Myanmar-English Machine Translation System”, International Journal of Computational Linguistics and Natural Language Processing (IJCLNLP), Vol. 2, Issue 5, ISSN 2279-0756, May 2013.
  • “Syntactic Reordering Approach for Myanmar to English Machine Translation System”, The International Symposium on Society, Tourism, Education and Politics, Singapore, September 13-15, 2013.
  • “Myanmar to English Verb Translation Disambiguation Approach based on Naïve Bayesian Classifier”, Proceedings of the 10th International Conference on Computer Research and Development (ICCRD 2011), Shanghai, China, March 11-13, 2011.
  • “Supervised Word Sense Disambiguation for Myanmar using Joint Entropy”, International Association of Academicians and Researchers (INAAR) 2013, First Hotel, Bangkok, March 5-6, 2013.