Class LanguageHandler

  • All Implemented Interfaces:
    org.xml.sax.ContentHandler, org.xml.sax.DTDHandler, org.xml.sax.EntityResolver, org.xml.sax.ErrorHandler

    public class LanguageHandler
    extends WriteOutContentHandler
    SAX content handler that feeds all character content it receives to a language detector, so the detector reflects everything handled so far.
    Since:
    Apache Tika 0.10
    • Constructor Detail

      • LanguageHandler

        public LanguageHandler()
                        throws java.io.IOException
        Throws:
        java.io.IOException - if the models of the default language detector cannot be loaded
    • Method Detail

      • getDetector

        public LanguageDetector getDetector()
        Returns the language detector used by this content handler. The returned detector is live: it is updated whenever this content handler receives new SAX character events.
        Returns:
        the language detector used by this content handler
      • getLanguage

        public LanguageResult getLanguage()
        Returns the language detected from the text handled so far.
        Returns:
        the LanguageResult for the text handled so far
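
The following is a minimal, JDK-only sketch of the pattern described above, not Tika's implementation. It shows a SAX handler that forwards character content to a detector, and a detector reference obtained early that still reflects later events. The names `ToyDetector`, `ToyLanguageHandler`, and `detectLanguage`, and the stopword-counting rule, are hypothetical stand-ins for `LanguageDetector`, `LanguageHandler`, and real language detection.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class LanguageHandlerSketch {

    /** Hypothetical stand-in for a language detector; the rule below is a toy. */
    static class ToyDetector {
        private static final Set<String> EN =
                new HashSet<>(Arrays.asList("the", "and", "of", "is"));
        private static final Set<String> DE =
                new HashSet<>(Arrays.asList("der", "die", "und", "ist"));

        private final StringBuilder text = new StringBuilder();

        /** Accumulates text; called once per SAX characters event. */
        void addText(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        /** Counts stopword hits in all text seen so far (assumed toy rule). */
        String detect() {
            int en = 0;
            int de = 0;
            String lower = text.toString().toLowerCase(Locale.ROOT);
            for (String word : lower.split("[^\\p{L}]+")) {
                if (EN.contains(word)) en++;
                if (DE.contains(word)) de++;
            }
            if (en == de) return "unknown";
            return en > de ? "en" : "de";
        }
    }

    /** Hypothetical handler mirroring the getDetector/getLanguage shape. */
    static class ToyLanguageHandler extends DefaultHandler {
        private final ToyDetector detector = new ToyDetector();

        @Override
        public void characters(char[] ch, int start, int length) {
            // Every chunk of character content updates the detector.
            detector.addText(ch, start, length);
        }

        ToyDetector getDetector() {
            return detector;
        }

        String getLanguage() {
            return detector.detect();
        }
    }

    /** Parses an XML string with the JDK SAX parser and returns the toy result. */
    static String detectLanguage(String xml) throws Exception {
        ToyLanguageHandler handler = new ToyLanguageHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getLanguage();
    }

    public static void main(String[] args) throws Exception {
        ToyLanguageHandler handler = new ToyLanguageHandler();
        // The detector is fetched before any events arrive...
        ToyDetector detector = handler.getDetector();
        char[] chunk = "the end of the story".toCharArray();
        handler.characters(chunk, 0, chunk.length);
        // ...and still sees the text delivered afterwards.
        System.out.println(detector.detect());
        System.out.println(detectLanguage("<p>der Hund und die Katze</p>"));
    }
}
```

With the real class, the same shape would apply: pass a `LanguageHandler` to whatever produces the SAX events, then call `getLanguage()` once enough text has been handled.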