Class LevenshteinAutomata


  • public class LevenshteinAutomata
    extends java.lang.Object
    Class to construct DFAs that match a word within some edit distance.

    Implements the algorithm described in: Schulz and Mihov: Fast String Correction with Levenshtein Automata

    • Constructor Summary

      Constructors 
      Constructor Description
      LevenshteinAutomata​(int[] word, int alphaMax, boolean withTranspositions)
      Expert: specify a custom maximum possible symbol (alphaMax); default is Character.MAX_CODE_POINT.
      LevenshteinAutomata​(java.lang.String input, boolean withTranspositions)
      Create a new LevenshteinAutomata for some input String.
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      Automaton toAutomaton​(int n)
      Compute a DFA that accepts all strings within an edit distance of n.
      • Methods inherited from class java.lang.Object

        equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • MAXIMUM_SUPPORTED_DISTANCE

        public static final int MAXIMUM_SUPPORTED_DISTANCE
        See Also:
        Constant Field Values
    • Constructor Detail

      • LevenshteinAutomata

        public LevenshteinAutomata​(java.lang.String input,
                                   boolean withTranspositions)
        Create a new LevenshteinAutomata for some input String. Optionally count transpositions as a primitive edit.
      • LevenshteinAutomata

        public LevenshteinAutomata​(int[] word,
                                   int alphaMax,
                                   boolean withTranspositions)
        Expert: specify a custom maximum possible symbol (alphaMax); default is Character.MAX_CODE_POINT.
    • Method Detail

      • toAutomaton

        public Automaton toAutomaton​(int n)
        Compute a DFA that accepts all strings within an edit distance of n.

        All automata have the following properties:

        • They are deterministic (DFA).
        • There are no transitions to dead states.
        • They are not minimal (some transitions could be combined).