Class IndicNormalizer


  • public class IndicNormalizer
    extends java.lang.Object
    Normalizes the Unicode representation of text in Indian languages.

    Follows the guidelines in Unicode 5.2, chapter 6, "South Asian Scripts I", and the graphical decompositions from http://ldc.upenn.edu/myl/IndianScriptsUnicode.html

    • Constructor Summary

      Constructors 
      Constructor Description
      IndicNormalizer()  
    • Method Summary

      Modifier and Type Method Description
      int normalize(char[] text, int len)
      Normalizes the input text and returns the new length.
      • Methods inherited from class java.lang.Object

        equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • IndicNormalizer

        public IndicNormalizer()
    • Method Detail

      • normalize

        public int normalize(char[] text,
                             int len)
        Normalizes the input text and returns the new length. The returned length is always less than or equal to len.
        Parameters:
        text - input text
        len - number of valid characters in text
        Returns:
        normalized length
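
        The calling convention described above (a char[] buffer plus a valid length going in, a possibly shorter length coming out) can be sketched as follows. This is a minimal, self-contained illustration and not the library's implementation: the class NormalizeContractSketch, its normalize stand-in, and the helper normalizeToString are hypothetical names, and the single rule it applies (folding U+0905 followed by U+093E into U+0906) is a toy rule chosen for illustration. It also assumes that the buffer is rewritten in place, which the returned-length contract suggests but this page does not state outright.

```java
// Hypothetical stand-in that mirrors the signature
// int normalize(char[] text, int len); it is NOT the real IndicNormalizer.
final class NormalizeContractSketch {

    /**
     * Toy normalizer: rewrites the first len chars of text in place and
     * returns the new valid length, which is never greater than len.
     * Assumed illustrative rule: U+0905 followed by U+093E becomes U+0906.
     */
    static int normalize(char[] text, int len) {
        int out = 0; // write position; never runs ahead of the read position
        for (int i = 0; i < len; i++) {
            boolean pair = text[i] == '\u0905'
                    && i + 1 < len
                    && text[i + 1] == '\u093E';
            if (pair) {
                text[out++] = '\u0906';
                i++; // skip the vowel sign that was just consumed
            } else {
                text[out++] = text[i];
            }
        }
        return out;
    }

    /** Typical caller pattern: only the first newLen chars are valid afterwards. */
    static String normalizeToString(String input) {
        char[] buffer = input.toCharArray();
        int newLen = normalize(buffer, buffer.length);
        return new String(buffer, 0, newLen);
    }

    public static void main(String[] args) {
        String input = "\u0905\u093E\u092E"; // three chars before normalization
        String result = normalizeToString(input);
        System.out.println(input.length() + " -> " + result.length()); // prints "3 -> 2"
    }
}
```

        With the real class, the same pattern would presumably apply: call normalize on a buffer, then treat only the first returned-length characters as valid and ignore whatever remains past that point.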