Package org.apache.pdfbox.pdfparser
Class PDFStreamParser
- java.lang.Object
  - org.apache.pdfbox.pdfparser.BaseParser
    - org.apache.pdfbox.pdfparser.PDFStreamParser
public class PDFStreamParser extends BaseParser
This will parse a PDF byte stream and extract its operators and operands.
- Version:
- $Revision$
- Author:
- Ben Litchfield
-
-
Field Summary
-
Fields inherited from class org.apache.pdfbox.pdfparser.BaseParser
DEF, document, ENDOBJ, ENDSTREAM, forceParsing, pdfSource, PROP_PUSHBACK_SIZE
-
-
Constructor Summary
Constructors
PDFStreamParser(java.io.InputStream stream, RandomAccess raf)
Constructor that takes a stream to parse.
PDFStreamParser(java.io.InputStream stream, RandomAccess raf, boolean forceParsing)
Constructor that takes a stream to parse.
PDFStreamParser(COSStream stream)
Constructor.
PDFStreamParser(COSStream stream, boolean forceParsing)
Constructor.
PDFStreamParser(PDStream stream)
Constructor.
-
Method Summary
Methods
void clearResources()
Release all used resources.
void close()
This will close the underlying pdfSource object.
java.util.Iterator<java.lang.Object> getTokenIterator()
This will get an iterator which can be used to parse the stream one token after the other.
java.util.List<java.lang.Object> getTokens()
This will get the tokens that were parsed from the stream.
void parse()
This will parse the tokens in the stream.
protected java.lang.String readOperator()
This will read an operator from the stream.
-
Methods inherited from class org.apache.pdfbox.pdfparser.BaseParser
isClosing, isClosing, isEndOfName, isEOL, isEOL, isWhitespace, isWhitespace, parseBoolean, parseCOSArray, parseCOSDictionary, parseCOSName, parseCOSStream, parseCOSString, parseCOSString, parseDirObject, readExpectedString, readGenerationNumber, readInt, readLine, readLong, readObjectNumber, readString, readString, readStringNumber, readUntilEndStream, setDocument, skipSpaces
-
Constructor Detail
-
PDFStreamParser
public PDFStreamParser(java.io.InputStream stream, RandomAccess raf, boolean forceParsing) throws java.io.IOException
Constructor that takes a stream to parse.
- Parameters:
stream - The stream to read data from.
raf - The random access file.
forceParsing - flag to skip malformed or otherwise unparseable input where possible
- Throws:
java.io.IOException - If there is an error reading from the stream.
- Since:
- Apache PDFBox 1.3.0
-
PDFStreamParser
public PDFStreamParser(java.io.InputStream stream, RandomAccess raf) throws java.io.IOException
Constructor that takes a stream to parse.
- Parameters:
stream - The stream to read data from.
raf - The random access file.
- Throws:
java.io.IOException - If there is an error reading from the stream.
-
PDFStreamParser
public PDFStreamParser(PDStream stream) throws java.io.IOException
Constructor.
- Parameters:
stream - The stream to parse.
- Throws:
java.io.IOException - If there is an error initializing the stream.
-
PDFStreamParser
public PDFStreamParser(COSStream stream, boolean forceParsing) throws java.io.IOException
Constructor.
- Parameters:
stream - The stream to parse.
forceParsing - flag to skip malformed or otherwise unparseable input where possible
- Throws:
java.io.IOException - If there is an error initializing the stream.
- Since:
- Apache PDFBox 1.3.0
-
PDFStreamParser
public PDFStreamParser(COSStream stream) throws java.io.IOException
Constructor.
- Parameters:
stream - The stream to parse.
- Throws:
java.io.IOException - If there is an error initializing the stream.
-
-
Method Detail
-
parse
public void parse() throws java.io.IOException
This will parse the tokens in the stream. This will close the stream when it is finished parsing.
- Throws:
java.io.IOException - If there is an error while parsing the stream.
-
getTokens
public java.util.List<java.lang.Object> getTokens()
This will get the tokens that were parsed from the stream.
- Returns:
- All of the tokens in the stream.
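
The returned list is untyped (java.util.List<java.lang.Object>), so callers usually have to sort the tokens themselves. PDF content streams use postfix notation, where operands come first and the operator follows, so a common next step is to group each operator with the operands that precede it. The following is a minimal, self-contained sketch of that grouping. It does not call PDFBox: the Operator and Instruction classes are hypothetical stand-ins, and in real use the list would come from getTokens() and operator tokens would be whatever type the parser emits (assumed here to be a dedicated operator class).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TokenGrouping {

    /** Hypothetical stand-in for the parser's operator token type. */
    static final class Operator {
        final String name;

        Operator(String name) {
            this.name = name;
        }
    }

    /** One operator together with the operands that preceded it. */
    static final class Instruction {
        final String operator;
        final List<Object> operands;

        Instruction(String operator, List<Object> operands) {
            this.operator = operator;
            this.operands = operands;
        }

        @Override
        public String toString() {
            return operator + operands;
        }
    }

    /** Walks the token list once, closing a group each time an operator appears. */
    static List<Instruction> group(List<Object> tokens) {
        List<Instruction> result = new ArrayList<Instruction>();
        List<Object> pending = new ArrayList<Object>();
        for (Object token : tokens) {
            if (token instanceof Operator) {
                result.add(new Instruction(((Operator) token).name, pending));
                pending = new ArrayList<Object>();
            } else {
                pending.add(token);
            }
        }
        // Operands left over without a closing operator are dropped in this sketch.
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for a parsed "/F1 12 Tf 72 700 Td" sequence.
        List<Object> tokens = Arrays.<Object>asList(
                "F1", 12, new Operator("Tf"), 72, 700, new Operator("Td"));
        System.out.println(group(tokens)); // prints [Tf[F1, 12], Td[72, 700]]
    }
}
```

With the real parser, the instanceof check would test for the parser's own operator class instead of the stand-in.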
-
close
public void close() throws java.io.IOException
This will close the underlying pdfSource object.
- Throws:
java.io.IOException - If there is an error releasing resources.
-
getTokenIterator
public java.util.Iterator<java.lang.Object> getTokenIterator()
This will get an iterator which can be used to parse the stream one token after the other.
- Returns:
- an iterator to get one token after the other
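
Because tokens are handed out one at a time, a caller can stop as soon as it has found what it needs and leave the rest of the stream unread. The sketch below shows that consumption pattern on a plain java.util.Iterator<Object>. It is self-contained and does not call PDFBox: the tokensUntil helper is hypothetical, and plain strings and integers stand in for the objects the real iterator would return.

```java
import java.util.Arrays;
import java.util.Iterator;

public class TokenIteration {

    /**
     * Pulls tokens until stopToken is seen.
     * Returns the number of tokens consumed (including the stop token),
     * or -1 if the iterator ran out first.
     */
    static int tokensUntil(Iterator<Object> tokens, Object stopToken) {
        int consumed = 0;
        while (tokens.hasNext()) {
            Object token = tokens.next();
            consumed++;
            if (stopToken.equals(token)) {
                return consumed;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Stand-in for an iterator obtained from getTokenIterator().
        Iterator<Object> it =
                Arrays.<Object>asList("BT", "F1", 12, "Tf", "ET").iterator();
        System.out.println(tokensUntil(it, "Tf")); // prints 4
        System.out.println(it.next());             // prints ET (not yet consumed)
    }
}
```

The iterator is left positioned just after the stop token, so a later loop can resume from there.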
-
readOperator
protected java.lang.String readOperator() throws java.io.IOException
This will read an operator from the stream.
- Returns:
- The operator that was read from the stream.
- Throws:
java.io.IOException - If there is an error reading from the stream.
-
clearResources
public void clearResources()
Release all used resources.
- Overrides:
clearResources in class BaseParser
-
-