Friday, October 7, 2022

Reversing an Odia Unicode String

Odia is a language with a lot of combining characters. Therefore a simple string reversal like str.split('').reverse().join('') will not work. With this naive approach, reversing a string like ଜଳାର୍ଣ୍ଣବ ends up as ବଣ୍ଣ୍ରାଳଜ, whereas the expected reversal is ବର୍ଣ୍ଣଳାଜ.
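
As a rough illustration of the failure described above, the snippet below runs the naive reversal on the same word. The variable names are only for this sketch and are not part of the final solution.

```javascript
// Illustration only: why reversing individual characters scrambles Odia text.
const sample = "ଜଳାର୍ଣ୍ଣବ";

// Odia letters sit in the Basic Multilingual Plane, so split("") yields one
// element per code point: 6 consonants, 1 matra and 2 viramas.
const units = sample.split("");
console.log(units.length); // 9

// After reversal each combining sign comes before the consonant it belonged
// to, so it attaches to the wrong neighbour when the text is rendered.
const naive = units.slice().reverse().join("");
console.log(naive); // ବଣ୍ଣ୍ରାଳଜ
```

The matra and the viramas are ordinary elements of the array, so nothing keeps them next to their base consonants during the reversal.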

Even Java's BreakIterator with an Odia locale failed for me. I also tried packages like esrever, without success. Finally, I tried a basic decomposition of the string into Unicode Odia syllables. A string like ଜଳାର୍ଣ୍ଣବ is composed of the characters [ 'ଜ', 'ଳ', 'ା', 'ର', '୍', 'ଣ', '୍', 'ଣ', 'ବ' ], so keeping each matra (vowel sign) and each fala (conjunct joined by the virama '୍') attached to its consonant did the job. Here is the code in JavaScript:

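function reverseOdia(str)
{
   // Work on code points so every index refers to a whole character.
   const chars = [...str] ;
   // Vowel signs (matras) plus anusvara and chandrabindu.
   const maatraas = "ାିୀୁୂୃେୈୋୌଂଁ" ;
   const virama = "୍" ;
   let reverseStr = "" ;
   let syllable = "" ;
   for (let index = 0; index < chars.length; index++) {
      let isEnd = true ;
      syllable += chars[index] ;
      if (chars[index + 1] === virama) {
         // A virama joins this consonant to the next one, so keep collecting.
         syllable += chars[++index] ;
         isEnd = false ;
      }
      else {
         // Attach every sign that follows, e.g. a matra and then an anusvara.
         while (index + 1 < chars.length && maatraas.includes(chars[index + 1])) {
            syllable += chars[++index] ;
         }
      }
      if (isEnd)
      {
         // The syllable is complete: put it in front of what was reversed so far.
         reverseStr = syllable + reverseStr ;
         syllable = "" ;
      }
   }
   // Flush a syllable left open by a trailing virama.
   return syllable + reverseStr ;
}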

const str = "ଜଳାର୍ଣ୍ଣବ" ; // other test strings: କଟକ, ଅସତ୍‌କର୍ମ, ନୃପସ୍ଥାୟକ
console.log(reverseOdia(str)) ; // prints ବର୍ଣ୍ଣଳାଜ
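
The same syllable grouping can also be exposed as an array, which makes the decomposition easier to inspect and to reuse. The sketch below is a variation on the function above, not a replacement for it: splitOdiaSyllables and reverseBySyllable are hypothetical names introduced here, and the sign list is the one from the post, so characters outside it (visarga or nukta, for example) are assumed not to occur.

```javascript
// Hypothetical helpers for illustration; the names are not from the post.
const SIGNS = "ାିୀୁୂୃେୈୋୌଂଁ"; // matra list used in the post
const VIRAMA = "୍";              // ODIA SIGN VIRAMA (U+0B4D)

// Split text into syllables: a consonant cluster joined by viramas,
// followed by any signs from SIGNS.
function splitOdiaSyllables(text) {
  const chars = [...text];
  const syllables = [];
  let current = "";
  for (let i = 0; i < chars.length; i++) {
    current += chars[i];
    if (chars[i + 1] === VIRAMA) {
      // The conjunct continues with the next consonant.
      current += chars[++i];
      continue;
    }
    // Attach every following sign to the current syllable.
    while (i + 1 < chars.length && SIGNS.includes(chars[i + 1])) {
      current += chars[++i];
    }
    syllables.push(current);
    current = "";
  }
  // The text ended on a virama: keep the unfinished syllable.
  if (current !== "") syllables.push(current);
  return syllables;
}

// Reverse the order of the syllables, leaving each syllable intact.
const reverseBySyllable = (text) => splitOdiaSyllables(text).reverse().join("");

const word = "ଜଳାର୍ଣ୍ଣବ";
console.log(splitOdiaSyllables(word).length);      // 4
const reversed = reverseBySyllable(word);
console.log(reversed);                             // ବର୍ଣ୍ଣଳାଜ
console.log(reverseBySyllable(reversed) === word); // true
```

Reversing the result a second time gives back the original word here, which is a cheap sanity check to run on other test strings as well.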