Class NGramGenerator


  • public class NGramGenerator
    extends java.lang.Object
    Generates n-grams from a list of tokens or a char array, joined by an optional separator, and returns the grams as a list of strings.
    • Constructor Summary

      Constructors 
      Constructor Description
      NGramGenerator()  
    • Method Summary

      Modifier and Type Method Description
      static java.util.List<java.lang.String> generate(char[] input, int n, java.lang.String separator)
      Generates n-grams from a char[] input.
      static java.util.List<java.lang.String> generate(java.util.List<java.lang.String> input, int n, java.lang.String separator)
      Creates n-grams whose tokens are joined by the separator parameter value.
      • Methods inherited from class java.lang.Object

        equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • NGramGenerator

        public NGramGenerator()
    • Method Detail

      • generate

        public static java.util.List<java.lang.String> generate(java.util.List<java.lang.String> input,
                                                                int n,
                                                                java.lang.String separator)
        Creates n-grams whose tokens are joined by the separator parameter value. For example, the tokens a, b, c, d with n = 3 and separator = "-" yield a-b-c and b-c-d.
        Parameters:
        input - the input tokens the output ngrams will be derived from
        n - the number of tokens in each sliding window
        separator - the value placed between the tokens in each gram. Pass an empty string if no separator is desired
        Returns:
        the generated n-grams as a list of strings
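        The sliding-window behavior described for the token-list overload can be sketched as follows. This is a minimal, hypothetical re-implementation for illustration only, not the library's source: the class name TokenNGramSketch is invented, and the handling of a non-positive n or an input shorter than n is an assumption, since the documentation does not specify it.

```java
import java.util.ArrayList;
import java.util.List;

class TokenNGramSketch {

    // Hypothetical stand-in for the documented generate(List<String>, int, String).
    static List<String> generate(List<String> input, int n, String separator) {
        if (n <= 0) {
            // Assumption: the documentation does not say how a non-positive n is handled.
            throw new IllegalArgumentException("n must be positive");
        }
        List<String> grams = new ArrayList<>();
        // Slide a window of n tokens across the input, advancing one token at a time.
        // Assumption: an input shorter than n yields an empty list.
        for (int start = 0; start + n <= input.size(); start++) {
            grams.add(String.join(separator, input.subList(start, start + n)));
        }
        return grams;
    }

    public static void main(String[] args) {
        // Mirrors the documented example: a,b,c,d with n = 3 and separator "-".
        System.out.println(generate(List.of("a", "b", "c", "d"), 3, "-")); // prints [a-b-c, b-c-d]
    }
}
```

        Passing an empty string as the separator concatenates the tokens of each gram directly, which matches the parameter description above.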
      • generate

        public static java.util.List<java.lang.String> generate(char[] input,
                                                                int n,
                                                                java.lang.String separator)
        Generates n-grams from a char[] input.
        Parameters:
        input - the array of chars to convert to n-grams
        n - the number of chars in each output gram
        separator - the value placed between the chars in each gram. Pass an empty string if no separator is desired
        Returns:
        the generated n-grams as a list of strings
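        The character-level overload can be sketched in the same way. This is a hypothetical illustration under stated assumptions, not the library's implementation: the class name CharNGramSketch is invented, and rejecting a non-positive n and returning an empty list for inputs shorter than n are assumptions the documentation does not confirm.

```java
import java.util.ArrayList;
import java.util.List;

class CharNGramSketch {

    // Hypothetical stand-in for the documented generate(char[], int, String).
    static List<String> generate(char[] input, int n, String separator) {
        if (n <= 0) {
            // Assumption: the documentation does not say how a non-positive n is handled.
            throw new IllegalArgumentException("n must be positive");
        }
        List<String> grams = new ArrayList<>();
        // Each gram covers n consecutive chars; the window advances one char at a time.
        for (int start = 0; start + n <= input.length; start++) {
            StringBuilder gram = new StringBuilder();
            for (int i = start; i < start + n; i++) {
                if (i > start) {
                    gram.append(separator); // separator goes between chars, never at the ends
                }
                gram.append(input[i]);
            }
            grams.add(gram.toString());
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(generate("hello".toCharArray(), 2, ""));  // prints [he, el, ll, lo]
        System.out.println(generate("abc".toCharArray(), 2, "_"));   // prints [a_b, b_c]
    }
}
```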