Interface HtmlParser


  • public interface HtmlParser
    A service that parses HTML and either sends SAX events to a content handler or builds a DOM Document from the HTML.
    • Method Summary

      Modifier and Type | Method | Description
      void | parse(java.io.InputStream inputStream, java.lang.String encoding, org.xml.sax.ContentHandler contentHandler) | Parse HTML and send SAX events.
      org.w3c.dom.Document | parse(java.lang.String systemId, java.io.InputStream inputStream, java.lang.String encoding) | Parse HTML and return a DOM Document.
    • Method Detail

      • parse

        void parse(java.io.InputStream inputStream,
                   java.lang.String encoding,
                   org.xml.sax.ContentHandler contentHandler)
            throws org.xml.sax.SAXException
        Parse HTML and send SAX events.
        Parameters:
        inputStream - The input stream to read the HTML from.
        encoding - Character encoding of the input stream, or null to use the default encoding.
        contentHandler - Content handler that receives the SAX events. The content handler may also implement org.xml.sax.ext.LexicalHandler.
        Throws:
        org.xml.sax.SAXException - Exception thrown when parsing fails.
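
        The SAX variant can be exercised with a small handler that records element names. This is only a sketch: SaxHtmlParser is a local mirror of the documented signature, and STAND_IN is a hypothetical substitute built on the JDK XML parser that accepts well-formed markup only. Neither is part of this API; in a real deployment the HtmlParser service would be obtained from the runtime instead.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class SaxParseExample {

    /** Local mirror of the documented SAX method (assumption: lets the sketch compile on its own). */
    interface SaxHtmlParser {
        void parse(InputStream inputStream, String encoding, ContentHandler contentHandler)
                throws SAXException;
    }

    /** Hypothetical stand-in: uses the JDK XML parser, so it only accepts well-formed markup. */
    static final SaxHtmlParser STAND_IN = (in, encoding, handler) -> {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            InputSource source = new InputSource(in);
            if (encoding != null) {
                source.setEncoding(encoding); // null means "let the parser decide"
            }
            reader.parse(source);
        } catch (IOException | ParserConfigurationException e) {
            throw new SAXException(e);
        }
    };

    /** Content handler that records the name of every start tag it sees. */
    static class ElementCollector extends DefaultHandler {
        final List<String> names = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            names.add(qName);
        }
    }

    /** Parses the markup and returns element names in document order. */
    static List<String> elementNames(SaxHtmlParser parser, String html) {
        ElementCollector collector = new ElementCollector();
        InputStream in = new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
        try {
            parser.parse(in, "UTF-8", collector);
        } catch (SAXException e) {
            // Rethrown unchecked to keep the sketch short; callers may prefer to propagate it.
            throw new IllegalStateException("HTML parsing failed", e);
        }
        return collector.names;
    }

    public static void main(String[] args) {
        // prints [html, body, p]
        System.out.println(elementNames(STAND_IN, "<html><body><p>Hi</p></body></html>"));
    }
}
```

        Passing a handler that also implements org.xml.sax.ext.LexicalHandler would follow the same pattern; the stand-in above does not forward lexical events.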
      • parse

        org.w3c.dom.Document parse(java.lang.String systemId,
                                   java.io.InputStream inputStream,
                                   java.lang.String encoding)
                            throws java.io.IOException
        Parse HTML and return a DOM Document.
        Parameters:
        systemId - The system identifier of the document.
        inputStream - The input stream to read the HTML from.
        encoding - Character encoding of the input stream, or null to use the default encoding.
        Returns:
        A DOM Document built from the parsed HTML, or null.
        Throws:
        java.io.IOException - Exception thrown when parsing fails.
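
        Because the DOM variant may return null as well as throw IOException, callers need to handle both outcomes. The following sketch shows one way to do so. DomHtmlParser is a local mirror of the documented signature and STAND_IN is a hypothetical substitute based on the JDK DocumentBuilder (well-formed markup only); both are assumptions made so that the example runs on its own, and the system identifier used in main is likewise invented.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class DomParseExample {

    /** Local mirror of the documented DOM method (assumption: lets the sketch compile on its own). */
    interface DomHtmlParser {
        Document parse(String systemId, InputStream inputStream, String encoding) throws IOException;
    }

    /** Hypothetical stand-in: uses the JDK DocumentBuilder, so it only accepts well-formed markup. */
    static final DomHtmlParser STAND_IN = (systemId, in, encoding) -> {
        try {
            InputSource source = new InputSource(in);
            source.setSystemId(systemId);
            if (encoding != null) {
                source.setEncoding(encoding); // null means "let the parser decide"
            }
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(source);
        } catch (SAXException | ParserConfigurationException e) {
            throw new IOException(e);
        }
    };

    /** Returns the text of the first title element, or "" if there is no document or no title. */
    static String titleOf(DomHtmlParser parser, String systemId, String html) {
        try (InputStream in = new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8))) {
            Document doc = parser.parse(systemId, in, "UTF-8");
            if (doc == null) {
                return ""; // the documented contract allows a null result
            }
            NodeList titles = doc.getElementsByTagName("title");
            return titles.getLength() == 0 ? "" : titles.item(0).getTextContent();
        } catch (IOException e) {
            // Rethrown unchecked to keep the sketch short; callers may prefer to propagate it.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String html = "<html><head><title>Hello</title></head><body/></html>";
        // prints Hello
        System.out.println(titleOf(STAND_IN, "file:///example.html", html));
    }
}
```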