Class POITextExtractor

    • Method Summary

      Modifier and Type Method Description
      void close()
      Frees the extractor's resources once it is no longer needed.
      abstract java.lang.Object getDocument()
      Returns the processed document.
      abstract POITextExtractor getMetadataTextExtractor()
      Returns another text extractor, which is able to output the textual content of the document metadata / properties, such as author and title.
      abstract java.lang.String getText()
      Retrieves all the text from the document.
      void setFilesystem(java.io.Closeable fs)
      Used to ensure file handle cleanup.
      • Methods inherited from class java.lang.Object

        equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • POITextExtractor

        public POITextExtractor()
    • Method Detail

      • getText

        public abstract java.lang.String getText()
        Retrieves all the text from the document. How cells, paragraphs, etc. are separated in the text is implementation-specific; see the Javadoc of the specific project for details.
        Returns:
        All the text from the document
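
        To illustrate why the separators are implementation-specific, here is a minimal sketch. It does not use the real class: GetTextSketchExtractor, ParagraphSketch and RowSketch are hypothetical stand-ins that mirror only the documented getText() signature, and the newline and tab separators are assumptions chosen for the example.

```java
import java.util.List;

// Hypothetical stand-in mirroring only the documented getText() signature.
// This is NOT the real POITextExtractor class.
abstract class GetTextSketchExtractor {
    public abstract String getText();
}

// Assumed implementation: paragraphs are separated by newlines.
class ParagraphSketch extends GetTextSketchExtractor {
    private final List<String> paragraphs;

    ParagraphSketch(List<String> paragraphs) {
        this.paragraphs = paragraphs;
    }

    @Override
    public String getText() {
        return String.join("\n", paragraphs);
    }
}

// Assumed implementation: cells of a row are separated by tabs.
class RowSketch extends GetTextSketchExtractor {
    private final List<String> cells;

    RowSketch(List<String> cells) {
        this.cells = cells;
    }

    @Override
    public String getText() {
        return String.join("\t", cells);
    }
}

class GetTextDemo {
    public static void main(String[] args) {
        GetTextSketchExtractor doc = new ParagraphSketch(List.of("Title", "Body"));
        GetTextSketchExtractor sheet = new RowSketch(List.of("A1", "B1"));
        // Same method, different separators: callers should not parse the
        // result without checking the concrete extractor's documentation.
        System.out.println(doc.getText());
        System.out.println(sheet.getText());
    }
}
```

        Callers that depend on a particular layout would need to consult the documentation of the concrete extractor they obtain.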
      • getMetadataTextExtractor

        public abstract POITextExtractor getMetadataTextExtractor()
        Returns another text extractor, which is able to output the textual content of the document metadata / properties, such as author and title.
        Returns:
        a text extractor for the document metadata
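
        The relationship between a document extractor and its metadata extractor can be sketched as follows. MetaSketchExtractor and ReportSketch are hypothetical stand-ins, not the real class; the "Author = ..." output format and the exception thrown when asking a metadata extractor for its own metadata are assumptions made for this example only.

```java
// Hypothetical stand-in mirroring two documented signatures.
// This is NOT the real POITextExtractor class.
abstract class MetaSketchExtractor {
    public abstract String getText();

    public abstract MetaSketchExtractor getMetadataTextExtractor();
}

// Assumed document type with a body and two properties.
class ReportSketch extends MetaSketchExtractor {
    private final String body;
    private final String author;
    private final String title;

    ReportSketch(String body, String author, String title) {
        this.body = body;
        this.author = author;
        this.title = title;
    }

    @Override
    public String getText() {
        return body;
    }

    @Override
    public MetaSketchExtractor getMetadataTextExtractor() {
        // The property layout below is an assumption of this sketch.
        final String props = "Author = " + author + "\nTitle = " + title;
        return new MetaSketchExtractor() {
            @Override
            public String getText() {
                return props;
            }

            @Override
            public MetaSketchExtractor getMetadataTextExtractor() {
                // Choice made for this sketch: metadata has no further metadata.
                throw new UnsupportedOperationException("already a metadata extractor");
            }
        };
    }
}

class MetadataDemo {
    public static void main(String[] args) {
        MetaSketchExtractor report = new ReportSketch("Quarterly numbers", "A. Writer", "Q3 Report");
        System.out.println(report.getText());                            // document body
        System.out.println(report.getMetadataTextExtractor().getText()); // properties
    }
}
```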
      • setFilesystem

        public void setFilesystem(java.io.Closeable fs)
        Used to ensure file handle cleanup.
        Parameters:
        fs - filesystem to close
      • close

        public void close()
                   throws java.io.IOException
        Frees the extractor's resources once it is no longer needed. This may include closing open file handles and freeing memory. The extractor cannot be used after close has been called.
        Specified by:
        close in interface java.lang.AutoCloseable
        Specified by:
        close in interface java.io.Closeable
        Throws:
        java.io.IOException
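
        Because close is specified by java.lang.AutoCloseable, an extractor can be managed with try-with-resources. The sketch below shows one plausible way setFilesystem and close could cooperate. ClosingSketchExtractor is a hypothetical stand-in, and the assumption that close() closes the registered Closeable is inferred from the "file handle cleanup" description rather than confirmed by this page.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in mirroring the documented setFilesystem/close signatures.
// This is NOT the real POITextExtractor class.
abstract class ClosingSketchExtractor implements Closeable {
    private Closeable fsToClose; // assumed field name

    public abstract String getText();

    public void setFilesystem(Closeable fs) {
        this.fsToClose = fs;
    }

    @Override
    public void close() throws IOException {
        // Assumption: closing the extractor also closes the registered filesystem.
        if (fsToClose != null) {
            fsToClose.close();
        }
    }
}

class CloseDemo {
    // Returns true if the registered resource was closed with the extractor.
    static boolean run() {
        boolean[] closed = {false};
        Closeable fakeFs = () -> { closed[0] = true; };

        try (ClosingSketchExtractor extractor = new ClosingSketchExtractor() {
            @Override
            public String getText() {
                return "hello";
            }
        }) {
            extractor.setFilesystem(fakeFs);
            extractor.getText(); // use the extractor only inside the try block
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return closed[0];
    }

    public static void main(String[] args) {
        System.out.println("filesystem closed: " + run());
    }
}
```

        Scoping the extractor to the try block also makes it harder to use it after close has been called.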
      • getDocument

        public abstract java.lang.Object getDocument()
        Returns:
        the processed document