Class FuzzyTermsEnum

  • All Implemented Interfaces:
    BytesRefIterator

    public class FuzzyTermsEnum
    extends TermsEnum
    Subclass of TermsEnum for enumerating all terms that are similar to the specified filter term.

    Term enumerations are always ordered by getComparator(). Each term in the enumeration is greater than all that precede it.

    • Constructor Detail

      • FuzzyTermsEnum

        public FuzzyTermsEnum(Terms terms,
                              AttributeSource atts,
                              Term term,
                              float minSimilarity,
                              int prefixLength,
                              boolean transpositions)
                       throws java.io.IOException
        Creates an enumeration of all terms from the specified reader that share a prefix of length prefixLength with term and have a fuzzy similarity > minSimilarity.

        After calling the constructor the enumeration is already pointing to the first valid term if such a term exists.

        Parameters:
        terms - Delivers terms.
        atts - AttributeSource created by the rewrite method of MultiTermQuery that contains information about competitive boosts during rewrite. It is also used to cache DFAs between segment transitions.
        term - Pattern term.
        minSimilarity - Minimum required similarity for terms from the reader. Pass an integer value representing edit distance. Passing a fraction is deprecated.
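        prefixLength - Length of required common prefix. Default value is 0.
        transpositions - Whether transposing two adjacent characters counts as a single edit. If false, the classic Levenshtein distance is used.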
        Throws:
        java.io.IOException - if there is a low-level IO error
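        The acceptance rule described above (shared prefix plus a bounded edit distance) can be sketched with a brute-force filter. This is only an illustration under stated assumptions: FuzzyMatchSketch, editDistance and accept are hypothetical names, not Lucene classes or methods; the sketch compares Java chars rather than Unicode code points; and it treats minSimilarity as an integer maximum edit count. It makes no claim about how FuzzyTermsEnum computes matches internally.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not a Lucene class: a brute-force version of the
// acceptance test described for the constructor.
class FuzzyMatchSketch {

    // Edit distance between a and b. When transpositions is true, swapping two
    // adjacent characters counts as one edit (optimal string alignment);
    // otherwise this is the classic Levenshtein distance.
    static int editDistance(String a, String b, boolean transpositions) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
                if (transpositions && i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
                }
            }
        }
        return d[a.length()][b.length()];
    }

    // Keeps the candidates that share the first prefixLength characters with
    // the pattern and are within maxEdits of it. Input order is preserved.
    static List<String> accept(List<String> sortedTerms, String pattern,
                               int maxEdits, int prefixLength, boolean transpositions) {
        List<String> out = new ArrayList<>();
        String prefix = pattern.substring(0, Math.min(prefixLength, pattern.length()));
        for (String t : sortedTerms) {
            if (t.startsWith(prefix) && editDistance(pattern, t, transpositions) <= maxEdits) {
                out.add(t);
            }
        }
        return out;
    }
}
```

        With the pattern "lucene", one allowed edit and a prefix length of 3, the candidate "luxene" is rejected by the prefix check even though it is one edit away, and "lucnee" is accepted only when transpositions is true.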
    • Method Detail

      • next

        public BytesRef next()
                      throws java.io.IOException
        Description copied from interface: BytesRefIterator
        Increments the iteration to the next BytesRef in the iterator. Returns the resulting BytesRef or null if the end of the iterator is reached. The returned BytesRef may be re-used across calls to next. After this method returns null, do not call it again: the results are undefined.
        Returns:
        the next BytesRef in the iterator or null if the end of the iterator is reached.
        Throws:
        java.io.IOException - If there is a low-level I/O error.
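        Because the returned BytesRef may be re-used across calls, a caller that keeps terms beyond the current iteration step needs to copy them. The following is a minimal sketch of that contract using stand-in types: ReusingIteratorSketch and its nested Ref class are hypothetical and only mimic the behavior described above. With the real API, the copy step would typically be BytesRef.deepCopyOf(ref).

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an iterator with the same contract as next():
// it returns null at the end and reuses one buffer object between calls.
class ReusingIteratorSketch {

    // Simplified stand-in for BytesRef: a byte buffer plus a valid length.
    static final class Ref {
        byte[] bytes = new byte[0];
        int length;

        String utf8() {
            return new String(bytes, 0, length, StandardCharsets.UTF_8);
        }
    }

    private final String[] terms;
    private final Ref shared = new Ref(); // the single, reused instance
    private int pos = 0;

    ReusingIteratorSketch(String... terms) {
        this.terms = terms;
    }

    // Overwrites and returns the shared Ref, or returns null when exhausted.
    Ref next() {
        if (pos == terms.length) return null;
        byte[] src = terms[pos++].getBytes(StandardCharsets.UTF_8);
        if (shared.bytes.length < src.length) shared.bytes = new byte[src.length];
        System.arraycopy(src, 0, shared.bytes, 0, src.length);
        shared.length = src.length;
        return shared;
    }

    // Canonical loop: stop at the first null and copy each value before
    // calling next() again.
    static List<String> drain(ReusingIteratorSketch it) {
        List<String> out = new ArrayList<>();
        for (Ref r = it.next(); r != null; r = it.next()) {
            out.add(r.utf8()); // utf8() creates an independent copy
        }
        return out;
    }
}
```

        Holding on to the Ref objects themselves would leave every stored reference pointing at the last term, which is the pitfall the copy avoids.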
      • docFreq

        public int docFreq()
                    throws java.io.IOException
        Description copied from class: TermsEnum
        Returns the number of documents containing the current term. Do not call this when the enum is unpositioned (for example, after a seek has returned TermsEnum.SeekStatus.END).
        Specified by:
        docFreq in class TermsEnum
        Throws:
        java.io.IOException
      • totalTermFreq

        public long totalTermFreq()
                           throws java.io.IOException
        Description copied from class: TermsEnum
        Returns the total number of occurrences of this term across all documents (the sum of the freq() for each doc that has this term). This will be -1 if the codec doesn't support this measure. Note that, like other term measures, this measure does not take deleted documents into account.
        Specified by:
        totalTermFreq in class TermsEnum
        Throws:
        java.io.IOException
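        The difference between docFreq and totalTermFreq can be shown on a toy postings list. This is a sketch over hypothetical in-memory data (TermStatsSketch and its map of document id to within-document frequency are illustrative assumptions, not Lucene structures); it ignores the -1 "unsupported" case and deleted documents.

```java
import java.util.Map;

// Hypothetical stand-in: postings for one term, as a map from document id
// to the number of times the term occurs in that document.
class TermStatsSketch {

    // Number of documents that contain the term at least once.
    static int docFreq(Map<Integer, Integer> postings) {
        return postings.size();
    }

    // Total number of occurrences across all documents, that is, the sum of
    // the per-document frequencies.
    static long totalTermFreq(Map<Integer, Integer> postings) {
        long sum = 0;
        for (int freq : postings.values()) {
            sum += freq;
        }
        return sum;
    }
}
```

        For a term that occurs 3 times in document 0, once in document 4 and twice in document 7, docFreq is 3 and totalTermFreq is 6.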
      • docs

        public DocsEnum docs(Bits liveDocs,
                             DocsEnum reuse,
                             int flags)
                      throws java.io.IOException
        Description copied from class: TermsEnum
        Get DocsEnum for the current term, with control over whether freqs are required. Do not call this when the enum is unpositioned. This method will not return null.
        Specified by:
        docs in class TermsEnum
        Parameters:
        liveDocs - unset bits are documents that should not be returned
        reuse - pass a prior DocsEnum for possible reuse
        flags - specifies which optional per-document values you require; see DocsEnum.FLAG_FREQS
        Throws:
        java.io.IOException
        See Also:
        TermsEnum.docs(Bits, DocsEnum, int)
      • docsAndPositions

        public DocsAndPositionsEnum docsAndPositions(Bits liveDocs,
                                                     DocsAndPositionsEnum reuse,
                                                     int flags)
                                              throws java.io.IOException
        Description copied from class: TermsEnum
        Get DocsAndPositionsEnum for the current term, with control over whether offsets and payloads are required. Some codecs may be able to optimize their implementation when offsets and/or payloads are not required. Do not call this when the enum is unpositioned. This will return null if positions were not indexed.
        Specified by:
        docsAndPositions in class TermsEnum
        Parameters:
        liveDocs - unset bits are documents that should not be returned
        reuse - pass a prior DocsAndPositionsEnum for possible reuse
        flags - specifies which optional per-position values you require; see DocsAndPositionsEnum.FLAG_OFFSETS and DocsAndPositionsEnum.FLAG_PAYLOADS.
        Throws:
        java.io.IOException
      • seekExact

        public void seekExact(BytesRef term,
                              TermState state)
                       throws java.io.IOException
        Description copied from class: TermsEnum
        Expert: Seeks a specific position by TermState previously obtained from TermsEnum.termState(). Callers should maintain the TermState to use this method. Low-level implementations may position the TermsEnum without re-seeking the term dictionary.

        Seeking by TermState should be used only if the state was obtained from the same TermsEnum instance.

        NOTE: Using this method with an incompatible TermState might leave this TermsEnum in an undefined state. On a segment level, TermState instances are compatible only iff the source and the target TermsEnum operate on the same field. If operating on segment level, TermState instances must not be used across segments.

        NOTE: A seek by TermState might not restore the AttributeSource's state. AttributeSource states must be maintained separately if this method is used.

        Overrides:
        seekExact in class TermsEnum
        Parameters:
        term - the term the TermState corresponds to
        state - the TermState
        Throws:
        java.io.IOException
      • getComparator

        public java.util.Comparator<BytesRef> getComparator()
        Description copied from interface: BytesRefIterator
        Return the BytesRef Comparator used to sort terms provided by the iterator. This may return null if there are no items or the iterator is not sorted. Callers may invoke this method many times, so it's best to cache a single instance and reuse it.
      • ord

        public long ord()
                 throws java.io.IOException
        Description copied from class: TermsEnum
        Returns ordinal position for current term. This is an optional method (the codec may throw UnsupportedOperationException). Do not call this when the enum is unpositioned.
        Specified by:
        ord in class TermsEnum
        Throws:
        java.io.IOException
      • seekExact

        public boolean seekExact(BytesRef text)
                          throws java.io.IOException
        Description copied from class: TermsEnum
        Attempts to seek to the exact term, returning true if the term is found. If this returns false, the enum is unpositioned. For some codecs, seekExact may be substantially faster than TermsEnum.seekCeil(org.apache.lucene.util.BytesRef).
        Overrides:
        seekExact in class TermsEnum
        Throws:
        java.io.IOException
      • seekCeil

        public TermsEnum.SeekStatus seekCeil(BytesRef text)
                                      throws java.io.IOException
        Description copied from class: TermsEnum
        Seeks to the specified term, if it exists, or to the next (ceiling) term. Returns SeekStatus to indicate whether exact term was found, a different term was found, or EOF was hit. The target term may be before or after the current term. If this returns SeekStatus.END, the enum is unpositioned.
        Specified by:
        seekCeil in class TermsEnum
        Throws:
        java.io.IOException
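        The three seekCeil outcomes, and the unpositioned state left by a failed seekExact, can be modeled over a sorted set. This is a sketch with stand-in types: SeekCeilSketch and its Status enum are hypothetical and only mirror the names FOUND, NOT_FOUND and END. It orders terms with Java's String comparison rather than the enum's comparator, and it throws on unpositioned access purely to make misuse visible, which the real API does not promise.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical stand-in modeling seek semantics over a sorted term set.
class SeekCeilSketch {

    enum Status { FOUND, NOT_FOUND, END }

    private final TreeSet<String> terms;
    private String current; // null means the enum is unpositioned

    SeekCeilSketch(String... terms) {
        this.terms = new TreeSet<>(Arrays.asList(terms));
    }

    // Positions on the target if present, otherwise on the smallest term
    // greater than it; returns END (unpositioned) if no such term exists.
    Status seekCeil(String target) {
        current = terms.ceiling(target);
        if (current == null) return Status.END;
        return current.equals(target) ? Status.FOUND : Status.NOT_FOUND;
    }

    // Positions on the target only if it exists; otherwise unpositioned.
    boolean seekExact(String target) {
        current = terms.contains(target) ? target : null;
        return current != null;
    }

    // Current term. Throws here so that unpositioned access is easy to spot.
    String term() {
        if (current == null) throw new IllegalStateException("enum is unpositioned");
        return current;
    }
}
```

        Over the terms "apple", "banana" and "cherry", seeking to "blue" yields NOT_FOUND and positions the sketch on "cherry", while seeking to "zebra" yields END and leaves it unpositioned.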
      • seekExact

        public void seekExact(long ord)
                       throws java.io.IOException
        Description copied from class: TermsEnum
        Seeks to the specified term by ordinal (position) as previously returned by TermsEnum.ord(). The target ord may be before or after the current ord, and must be within bounds.
        Specified by:
        seekExact in class TermsEnum
        Throws:
        java.io.IOException
      • term

        public BytesRef term()
                      throws java.io.IOException
        Description copied from class: TermsEnum
        Returns current term. Do not call this when the enum is unpositioned.
        Specified by:
        term in class TermsEnum
        Throws:
        java.io.IOException
      • getMinSimilarity

        public float getMinSimilarity()
      • getScaleFactor

        public float getScaleFactor()