Class PassageScorer

    • Constructor Summary

      Constructors 
      Constructor Description
      PassageScorer()
      Creates PassageScorer with these default values: k1 = 1.2, b = 0.75, pivot = 87.
      PassageScorer​(float k1, float b, float pivot)
      Creates PassageScorer with the specified scoring parameters.
    • Method Summary

      Modifier and Type Method Description
      float norm​(int passageStart)
      Normalize a passage according to its position in the document.
      float tf​(int freq, int passageLen)
      Computes term weight, given the frequency within the passage and the passage's length.
      float weight​(int contentLength, int totalTermFreq)
      Computes term importance, given its in-document statistics.
      • Methods inherited from class java.lang.Object

        equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • PassageScorer

        public PassageScorer()
        Creates PassageScorer with these default values:
        • k1 = 1.2
        • b = 0.75
        • pivot = 87
      • PassageScorer

        public PassageScorer​(float k1,
                             float b,
                             float pivot)
        Creates PassageScorer with the specified scoring parameters.
        Parameters:
        k1 - Controls non-linear term frequency normalization (saturation).
        b - Controls to what degree passage length normalizes tf values.
        pivot - Pivot value for length normalization (some rough idea of average sentence length in characters).
    • Method Detail

      • weight

        public float weight​(int contentLength,
                            int totalTermFreq)
        Computes term importance, given its in-document statistics.
        Parameters:
        contentLength - length of the document in characters
        totalTermFreq - number of times the term occurs in the document
        Returns:
        term importance
      • tf

        public float tf​(int freq,
                        int passageLen)
        Computes term weight, given the frequency within the passage and the passage's length.
        Parameters:
        freq - number of occurrences of the term within this passage
        passageLen - length of the passage in characters
        Returns:
        term weight
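
The documentation above does not give the formula behind tf, only that k1 controls saturation and b controls how strongly passage length normalizes the value. As an illustration of how such parameters usually interact, here is a minimal BM25-style sketch. The class name `TfSketch` and the exact formula are assumptions made for this example and may differ from what PassageScorer actually computes.

```java
// Hypothetical illustration; not the PassageScorer source.
final class TfSketch {
    // Defaults taken from the constructor documentation.
    static final float K1 = 1.2f;
    static final float B = 0.75f;
    static final float PIVOT = 87f;

    // Assumed BM25-style saturation: grows with freq, but never reaches 1.
    // Passages longer than the pivot are penalized in proportion to b.
    static float tf(int freq, int passageLen) {
        float lengthNorm = K1 * ((1 - B) + B * (passageLen / PIVOT));
        return freq / (freq + lengthNorm);
    }

    public static void main(String[] args) {
        // Same frequency, increasing passage length: the weight drops.
        for (int len : new int[] {40, 87, 200}) {
            System.out.printf("freq=2 len=%d tf=%.4f%n", len, tf(2, len));
        }
    }
}
```

Under this assumed formula, raising k1 delays saturation so that repeated occurrences keep adding weight, and setting b to 0 disables length normalization entirely.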
      • norm

        public float norm​(int passageStart)
        Normalize a passage according to its position in the document.

        Typically passages towards the beginning of the document are more useful for summarizing the contents.

        The default implementation is 1 + 1/log(pivot + passageStart)

        Parameters:
        passageStart - start offset of the passage
        Returns:
        a boost value multiplied into the passage's score.
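
The documented default, 1 + 1/log(pivot + passageStart), can be tried out in isolation. The sketch below is a standalone re-statement of that formula, not the library code: the class name `PassageNormSketch` is hypothetical, and it assumes that "log" means the natural logarithm (`Math.log`).

```java
// Hypothetical standalone sketch of the documented norm formula.
final class PassageNormSketch {
    // Default pivot from the constructor documentation.
    static final float DEFAULT_PIVOT = 87f;

    // 1 + 1/log(pivot + passageStart), assuming natural log.
    static float norm(float pivot, int passageStart) {
        return 1 + 1 / (float) Math.log(pivot + passageStart);
    }

    public static void main(String[] args) {
        // The boost shrinks toward 1 as the passage starts later in the document.
        for (int start : new int[] {0, 100, 1000, 10000}) {
            System.out.printf("passageStart=%d norm=%.4f%n",
                    start, norm(DEFAULT_PIVOT, start));
        }
    }
}
```

Because the boost decays logarithmically, passages near the start of the document get only a modest advantage, and the difference between two late passages is small.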