
Asian Journal of Information Technology

A Novel Approach to Sort Unicode Bengali Text Using Ancillary Maps
Shah Md. Emrul Islam and Muhammad Masroor Ali

Abstract: This study presents a novel approach for sorting Bengali text represented in Unicode. Unicode provides a unique number for the internal representation of every character of almost every language of the world, irrespective of platform and software. The emergence of the Unicode Standard and the availability of tools supporting it are among the most significant recent global software technology trends. Our concern is that the character order in Unicode for Bengali differs from the sorting order suggested by the governing authority. The presence of modifiers in Bengali also makes the problem different from conventional lexicographical sorting. The objective of this study is to apply the suggested sort order to Bengali text represented in Unicode. The method is open to future modification, so that rearranging the sort order does not require changing the algorithm. It is based on an ancillary table that specifies the desired sorting order by mapping each Unicode character to a dictionary significance value (weight). Using relative positional dual indexes for the characters in the input text, the weights are aggregated into a single value for each Unicode-enabled word. A simple sort of these values then puts the text in the desired order.
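
The general idea of sorting through a weight table can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the table `WEIGHTS`, the bound `MAX_LEN`, and the functions `word_value` and `sort_words` are hypothetical names introduced here, the weight order is illustrative rather than the authority's published order, and a plain positional encoding stands in for the relative positional dual indexes, which the abstract does not specify. Modifier handling is also left out.

```python
# Hypothetical ancillary map: Unicode character -> dictionary weight
# (a lower weight sorts first). The order is illustrative only.
WEIGHTS = {
    "\u0985": 1,  # BENGALI LETTER A
    "\u0986": 2,  # BENGALI LETTER AA
    "\u0982": 3,  # BENGALI SIGN ANUSVARA (code point precedes the vowels)
    "\u0995": 4,  # BENGALI LETTER KA
    "\u0996": 5,  # BENGALI LETTER KHA
}

BASE = max(WEIGHTS.values()) + 1  # weight 0 is reserved for padding
MAX_LEN = 8                       # assumed upper bound on word length


def word_value(word):
    """Aggregate the per-character weights of a word into one integer."""
    if len(word) > MAX_LEN:
        raise ValueError("word longer than MAX_LEN")
    value = 0
    for i in range(MAX_LEN):
        # Pad short words with weight 0 so a prefix sorts before its extensions.
        weight = WEIGHTS[word[i]] if i < len(word) else 0
        value = value * BASE + weight
    return value


def sort_words(words):
    """Sort words by their aggregated weight values."""
    return sorted(words, key=word_value)
```

Because the order lives entirely in `WEIGHTS`, changing the sort order in this sketch means editing the table and nothing else, which mirrors the design goal stated in the abstract.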

How to cite this article
Shah Md. Emrul Islam and Muhammad Masroor Ali, 2005. A Novel Approach to Sort Unicode Bengali Text Using Ancillary Maps. Asian Journal of Information Technology, 4: 569-573.

© Medwell Journals. All Rights Reserved