Asian Journal of Information Technology

Year: 2005
Volume: 4
Issue: 10
Page No. 890 - 894

An Approach to Sort Unicode Bengali Text Using Ancillary Maps

Authors: Shah Md. Emrul Islam and Muhammad Masroor Ali

Abstract: This paper presents an approach for sorting Bengali text represented in Unicode. Unicode provides a unique number for the internal representation of every character of almost every language of the world, irrespective of platform and software. The emergence of the Unicode Standard and the availability of tools supporting it are among the most significant recent global software technology trends. Our concern is that the character order in Unicode for Bengali differs from the sorting order suggested by the governing authority. The presence of modifiers in Bengali also makes the problem different from conventional lexicographical sorting. The objective of this study is to apply the suggested sort order to Bengali text represented in Unicode. The method is open to future modification, so that rearranging the sort order does not require the algorithm to be changed. It is based on an ancillary table that specifies the desired sorting order by mapping each Unicode character to a dictionary significance value (weight). Using relative positional dual indexes for the characters in the input text, the weights are aggregated into a single value for each Unicode-encoded word. A simple sort of these values then places the text in the desired order.
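The general idea of a weight table can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not spell out the relative positional dual indexes, so the sketch substitutes a plain positional (base-N) encoding to aggregate the weights. The names `WEIGHTS`, `WIDTH`, `word_key` and `sort_words`, the fixed maximum word length, and the weights themselves are assumptions chosen only for illustration; they are not the order published by the governing authority.

```python
# Hypothetical ancillary map: Unicode character -> dictionary weight.
# The weights are illustrative only. They place U+09DF directly after
# U+09AF, although its code point is higher than U+09B0 and U+09B9.
WEIGHTS = {
    "\u0995": 1,  # BENGALI LETTER KA
    "\u09AF": 2,  # BENGALI LETTER YA
    "\u09DF": 3,  # BENGALI LETTER YYA
    "\u09B0": 4,  # BENGALI LETTER RA
    "\u09B9": 5,  # BENGALI LETTER HA
}

BASE = max(WEIGHTS.values()) + 1  # weight 0 is reserved for padding
WIDTH = 8                         # assumed maximum word length


def word_key(word, width=WIDTH):
    """Aggregate the per-character weights of a word into one integer."""
    if len(word) > width:
        raise ValueError("word longer than the assumed maximum width")
    key = 0
    for i in range(width):
        # Pad short words with weight 0 so a prefix sorts before
        # any longer word that starts with it.
        weight = WEIGHTS[word[i]] if i < len(word) else 0
        key = key * BASE + weight
    return key


def sort_words(words):
    """Sort words by their aggregated weight instead of by code point."""
    return sorted(words, key=word_key)


if __name__ == "__main__":
    sample = ["\u09DF", "\u09B9", "\u09B0", "\u09AF", "\u0995"]
    print(sort_words(sample))  # order given by the weight table
    print(sorted(sample))      # order given by raw code points
```

Because the ordering lives entirely in `WEIGHTS`, changing the sort order means editing the table and leaving the functions untouched, which mirrors the modifiability goal stated in the abstract. A character missing from the table raises `KeyError` in this sketch; a fuller table would cover the whole Bengali block, including the modifiers.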

How to cite this article:

Shah Md. Emrul Islam and Muhammad Masroor Ali, 2005. An Approach to Sort Unicode Bengali Text Using Ancillary Maps. Asian Journal of Information Technology, 4: 890-894.
