Class COSDocument

  • All Implemented Interfaces:
    java.io.Closeable, java.lang.AutoCloseable, COSObjectable

    public class COSDocument
    extends COSBase
    implements java.io.Closeable
    This is the in-memory representation of the PDF document. Call close() on this object when you are done using it so that its storage is released.
    Author:
    Ben Litchfield
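    Because the class implements java.io.Closeable, one way to meet the close() requirement is try-with-resources. The following is a minimal sketch of that pattern only. ScratchBackedResource is a hypothetical stand-in, not a PDFBox type, so the example runs without PDFBox on the classpath; with PDFBox available, the same pattern should apply to a COSDocument instance.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

final class CloseSketch {
    // Hypothetical stand-in for a Closeable resource such as COSDocument.
    static final class ScratchBackedResource implements Closeable {
        private boolean closed;

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() throws IOException {
            // A real implementation would release its scratch storage here.
            closed = true;
        }
    }

    // Opens the stand-in, uses it inside try-with-resources, and returns it
    // so the caller can confirm that close() ran when the block exited.
    static ScratchBackedResource useAndClose() {
        ScratchBackedResource resource = new ScratchBackedResource();
        try (ScratchBackedResource open = resource) {
            // ... work with the open resource here ...
            if (open.isClosed()) {
                throw new IllegalStateException("resource closed too early");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return resource;
    }

    public static void main(String[] args) {
        System.out.println("closed after use: " + useAndClose().isClosed());
    }
}
```

    try-with-resources calls close() even when the body throws, which is why it is usually preferable to a manual close() call at the end of a method.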
    • Constructor Detail

      • COSDocument

        public COSDocument​(RandomAccess scratchFileValue,
                           boolean forceParsingValue)
        Constructor that will use the given random access file for storage of the PDF streams. The close() method closes this file, but the client remains responsible for deleting any underlying storage that the file writes to.
        Parameters:
        scratchFileValue - the random access file to use for storage
        forceParsingValue - flag to skip malformed or otherwise unparseable document content where possible
      • COSDocument

        public COSDocument​(java.io.File scratchDir,
                           boolean forceParsingValue)
                    throws java.io.IOException
        Constructor that will use a temporary file in the given directory for storage of the PDF streams. The temporary file is automatically removed when this document gets closed.
        Parameters:
        scratchDir - directory for the temporary file, or null to use the system default
        forceParsingValue - flag to skip malformed or otherwise unparseable document content where possible
        Throws:
        java.io.IOException - if the temporary file cannot be created
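        The lifecycle described for this constructor (a temporary file in the given directory, or in the system default when the directory is null, removed on close) can be sketched with the JDK alone. TempScratch is a hypothetical stand-in that only models this behaviour; it is not how PDFBox implements its scratch file.

```java
import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class TempScratchSketch {
    // Hypothetical stand-in: owns a temporary file and deletes it on close().
    static final class TempScratch implements Closeable {
        private final Path file;

        TempScratch(File scratchDir) throws IOException {
            // A null directory falls back to the system default temp directory.
            this.file = (scratchDir == null)
                    ? Files.createTempFile("scratch", ".tmp")
                    : Files.createTempFile(scratchDir.toPath(), "scratch", ".tmp");
        }

        Path path() {
            return file;
        }

        @Override
        public void close() throws IOException {
            Files.deleteIfExists(file);
        }
    }

    // Returns true if the file existed while open and was gone after close().
    static boolean createdThenRemoved() {
        try {
            Path seen;
            boolean existedWhileOpen;
            try (TempScratch scratch = new TempScratch(null)) {
                seen = scratch.path();
                existedWhileOpen = Files.exists(seen);
            }
            return existedWhileOpen && !Files.exists(seen);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("created then removed: " + createdThenRemoved());
    }
}
```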
      • COSDocument

        public COSDocument()
        Constructor that stores stream data in memory.
      • COSDocument

        public COSDocument​(java.io.File scratchDir)
                    throws java.io.IOException
        Constructor that will create a scratch file in the given directory.
        Parameters:
        scratchDir - The directory to store a scratch file.
        Throws:
        java.io.IOException - If there is an error creating the temporary file.
      • COSDocument

        public COSDocument​(RandomAccess file)
        Constructor that will use the given random access file for storage of the PDF streams. The close() method closes this file, but the client remains responsible for deleting any underlying storage that the file writes to.
        Parameters:
        file - The random access file to use for storage.
    • Method Detail

      • getScratchFile

        public RandomAccess getScratchFile()
        Deprecated.
        Direct access to the scratch file will be removed.
        This will get the scratch file for this document.
        Returns:
        The scratch file.
      • createCOSStream

        public COSStream createCOSStream()
        Create a new COSStream using the underlying scratch file.
        Returns:
        the new COSStream
      • createCOSStream

        public COSStream createCOSStream​(COSDictionary dictionary)
        Create a new COSStream using the underlying scratch file.
        Parameters:
        dictionary - the corresponding dictionary
        Returns:
        the new COSStream
      • getObjectByType

        public COSObject getObjectByType​(java.lang.String type)
                                  throws java.io.IOException
        Deprecated.
        This will get the first dictionary object by type.
        Parameters:
        type - The type of the object.
        Returns:
        This will return an object with the specified type.
        Throws:
        java.io.IOException - If there is an error getting the object
      • getObjectByType

        public COSObject getObjectByType​(COSName type)
                                  throws java.io.IOException
        This will get the first dictionary object by type.
        Parameters:
        type - The type of the object.
        Returns:
        This will return an object with the specified type.
        Throws:
        java.io.IOException - If there is an error getting the object
      • getObjectsByType

        public java.util.List<COSObject> getObjectsByType​(java.lang.String type)
                                                   throws java.io.IOException
        This will get all dictionary objects by type.
        Parameters:
        type - The type of the object.
        Returns:
        A list of all objects with the specified type.
        Throws:
        java.io.IOException - If there is an error getting the object
      • getObjectsByType

        public java.util.List<COSObject> getObjectsByType​(COSName type)
                                                   throws java.io.IOException
        This will get all dictionary objects by type.
        Parameters:
        type - The type of the object.
        Returns:
        A list of all objects with the specified type.
        Throws:
        java.io.IOException - If there is an error getting the object
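        The difference between the single-object and list-returning lookups can be illustrated with plain maps. In this sketch each dictionary is modelled as a Map<String, String> with a "Type" entry; the helper names and the choice to return null when nothing matches are assumptions of the example, not statements about the PDFBox implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ObjectsByTypeSketch {
    // Builds a tiny stand-in dictionary with a "Type" and a "Name" entry.
    static Map<String, String> dict(String type, String name) {
        Map<String, String> d = new HashMap<>();
        d.put("Type", type);
        d.put("Name", name);
        return d;
    }

    // Sample object list: two pages and one font.
    static List<Map<String, String>> sample() {
        List<Map<String, String>> objects = new ArrayList<>();
        objects.add(dict("Page", "P1"));
        objects.add(dict("Font", "F1"));
        objects.add(dict("Page", "P2"));
        return objects;
    }

    // Returns every dictionary whose "Type" entry equals the requested type.
    static List<Map<String, String>> objectsByType(
            List<Map<String, String>> objects, String type) {
        List<Map<String, String>> matches = new ArrayList<>();
        for (Map<String, String> candidate : objects) {
            if (type.equals(candidate.get("Type"))) {
                matches.add(candidate);
            }
        }
        return matches;
    }

    // Returns the first match, or null when nothing matches (sketch's choice).
    static Map<String, String> objectByType(
            List<Map<String, String>> objects, String type) {
        for (Map<String, String> candidate : objects) {
            if (type.equals(candidate.get("Type"))) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(objectsByType(sample(), "Page").size() + " pages");
        System.out.println("first font: " + objectByType(sample(), "Font").get("Name"));
    }
}
```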
      • print

        public void print()
        This will print contents to stdout.
      • setVersion

        public void setVersion​(float versionValue)
        This will set the version of this PDF document and update the header string.
        Parameters:
        versionValue - The version of the PDF document.
      • getVersion

        public float getVersion()
        This will get the version of this PDF document.
        Returns:
        This document's version.
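        Since setVersion(float) also updates the header string, it may help to see how a version number and a header line relate. The sketch below assumes the usual PDF header form, "%PDF-" followed by the version (for example "%PDF-1.4"). The helper names are hypothetical, and the code does not reproduce how COSDocument rewrites its header internally.

```java
import java.util.Locale;

final class HeaderVersionSketch {
    static final String PREFIX = "%PDF-";

    // Extracts the version number from a header line such as "%PDF-1.4".
    static float versionOf(String header) {
        if (!header.startsWith(PREFIX)) {
            throw new IllegalArgumentException("not a PDF header: " + header);
        }
        return Float.parseFloat(header.substring(PREFIX.length()).trim());
    }

    // Builds a header line for the given version, with one decimal digit.
    static String headerFor(float version) {
        return PREFIX + String.format(Locale.ROOT, "%.1f", version);
    }

    public static void main(String[] args) {
        System.out.println(headerFor(1.7f));
        System.out.println(versionOf("%PDF-1.4"));
    }
}
```

        Locale.ROOT is used so that the decimal separator is always a dot, regardless of the default locale.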
      • setDecrypted

        public void setDecrypted()
        Signals that the document is decrypted completely. Needed e.g. by NonSequentialPDFParser to circumvent additional decryption later on.
      • isDecrypted

        public boolean isDecrypted()
        Indicates whether an encrypted PDF has already been decrypted after parsing. This is only meaningful if the NonSequentialPDFParser is used.
        Returns:
        true if the PDF is decrypted.
      • isEncrypted

        public boolean isEncrypted()
        This will tell if this is an encrypted document.
        Returns:
        true if this document is encrypted.
      • getEncryptionDictionary

        public COSDictionary getEncryptionDictionary()
        This will get the encryption dictionary if the document is encrypted or null if the document is not encrypted.
        Returns:
        The encryption dictionary.
      • getSignatureInterface

        public SignatureInterface getSignatureInterface()
        This will return the signature interface.
        Returns:
        the signature interface
      • setEncryptionDictionary

        public void setEncryptionDictionary​(COSDictionary encDictionary)
        This will set the encryption dictionary. It should only be called when encrypting the document.
        Parameters:
        encDictionary - The encryption dictionary.
      • getSignatureDictionaries

        public java.util.List<COSDictionary> getSignatureDictionaries()
                                                               throws java.io.IOException
        This will return a list of signature dictionaries as COSDictionary.
        Returns:
        list of signature dictionaries as COSDictionary
        Throws:
        java.io.IOException - if no document catalog can be found
      • getSignatureFields

        public java.util.List<COSDictionary> getSignatureFields​(boolean onlyEmptyFields)
                                                         throws java.io.IOException
        This will return a list of signature fields.
        Parameters:
        onlyEmptyFields - if true, only empty signature fields will be returned
        Returns:
        list of signature fields as COSDictionary
        Throws:
        java.io.IOException - if no document catalog can be found
      • getDocumentID

        public COSArray getDocumentID()
        This will get the document ID.
        Returns:
        The document id.
      • setDocumentID

        public void setDocumentID​(COSArray id)
        This will set the document ID.
        Parameters:
        id - The document id.
      • setSignatureInterface

        public void setSignatureInterface​(SignatureInterface sigInterface)
        Set the signature interface to the given value.
        Parameters:
        sigInterface - the signature interface
      • getCatalog

        public COSObject getCatalog()
                             throws java.io.IOException
        This will get the document catalog. Maybe this should move to an object at PDFEdit level.
        Returns:
        The catalog, which is the root of all document activities.
        Throws:
        java.io.IOException - If no catalog can be found.
      • getObjects

        public java.util.List<COSObject> getObjects()
        This will get a list of all available objects.
        Returns:
        A list of all objects.
      • getTrailer

        public COSDictionary getTrailer()
        This will get the document trailer.
        Returns:
        the document trailer dict
      • setTrailer

        public void setTrailer​(COSDictionary newTrailer)
        This will set the document trailer. Note from the source (MIT added): maybe this should not be supported, as the trailer is a persistence construct.
        Parameters:
        newTrailer - the document trailer dictionary
      • accept

        public java.lang.Object accept​(ICOSVisitor visitor)
                                throws COSVisitorException
        Visitor pattern double-dispatch method.
        Specified by:
        accept in class COSBase
        Parameters:
        visitor - The object to notify when visiting this object.
        Returns:
        any object, depending on the visitor implementation, or null
        Throws:
        COSVisitorException - If an error occurs while visiting this object.
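        For readers unfamiliar with double dispatch, the following minimal sketch shows the general shape of an accept method: the node calls back into the visitor method that matches its own concrete type, and the visitor decides what to return. All type and method names here (Visitor, Node, Doc, Name, visitDocument, visitName) are hypothetical and do not mirror the ICOSVisitor interface.

```java
final class VisitorSketch {
    // Hypothetical visitor with one callback per concrete node type.
    interface Visitor {
        Object visitDocument(Doc doc);

        Object visitName(Name name);
    }

    interface Node {
        Object accept(Visitor visitor);
    }

    static final class Doc implements Node {
        @Override
        public Object accept(Visitor visitor) {
            // Second dispatch: selects the callback for this concrete type.
            return visitor.visitDocument(this);
        }
    }

    static final class Name implements Node {
        @Override
        public Object accept(Visitor visitor) {
            return visitor.visitName(this);
        }
    }

    // A visitor that returns a short label for whatever it visits.
    static final class Describer implements Visitor {
        @Override
        public Object visitDocument(Doc doc) {
            return "document";
        }

        @Override
        public Object visitName(Name name) {
            return "name";
        }
    }

    static String describe(Node node) {
        return (String) node.accept(new Describer());
    }

    public static void main(String[] args) {
        System.out.println(describe(new Doc()));
        System.out.println(describe(new Name()));
    }
}
```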
      • close

        public void close()
                   throws java.io.IOException
        This will close all storage and delete the temporary files.
        Specified by:
        close in interface java.lang.AutoCloseable
        Specified by:
        close in interface java.io.Closeable
        Throws:
        java.io.IOException - If there is an error closing resources.
      • finalize

        protected void finalize()
                         throws java.io.IOException
        Warns in the finalizer if the PDF document was not closed. The method also closes the document just in case, to avoid abandoned temporary files. It is still a good idea to close the PDF document as early as possible to conserve resources.
        Overrides:
        finalize in class java.lang.Object
        Throws:
        java.io.IOException - if an error occurs while closing the temporary files
      • setWarnMissingClose

        public void setWarnMissingClose​(boolean warn)
        Controls whether this instance shall issue a warning if the PDF document wasn't closed properly through a call to the close() method. If the PDF document is held in a cache governed by soft references it is impossible to reliably close the document before the warning is raised. By default, the warning is enabled.
        Parameters:
        warn - true enables the warning, false disables it.
      • getHeaderString

        public java.lang.String getHeaderString()
        Returns:
        The current headerString, which may have been updated by calls to setVersion(float).
      • setHeaderString

        public void setHeaderString​(java.lang.String header)
        Parameters:
        header - The headerString to set.
      • getOriginalHeaderString

        public java.lang.String getOriginalHeaderString()
        Get the original headerString from the PDF file. Unlike getHeaderString(), this value is not affected when the document catalog specifies a different header value.
        Returns:
        the original header string.
      • dereferenceObjectStreams

        public void dereferenceObjectStreams()
                                      throws java.io.IOException
        This method will search the list of objects for objects of type ObjStm. If it finds any, it will parse out all of the objects from the stream that it contains.
        Throws:
        java.io.IOException - If there is an error parsing the stream.
      • getObjectFromPool

        public COSObject getObjectFromPool​(COSObjectKey key)
                                    throws java.io.IOException
        This will get an object from the pool.
        Parameters:
        key - The object key.
        Returns:
        The object in the pool or a new one if it has not been parsed yet.
        Throws:
        java.io.IOException - If there is an error getting the proxy object.
      • removeObject

        public COSObject removeObject​(COSObjectKey key)
        Removes an object from the object pool.
        Parameters:
        key - the object key
        Returns:
        the object that was removed or null if the object was not found
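        The pool semantics described for getObjectFromPool and removeObject (return the pooled object or create a new placeholder, and return null when removing an unknown key) resemble a get-or-create map. The sketch below models that with a HashMap. Key and PooledObject are hypothetical stand-ins for COSObjectKey and COSObject, and the code is not taken from PDFBox.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class ObjectPoolSketch {
    // Hypothetical key: object number plus generation number.
    static final class Key {
        final long number;
        final int generation;

        Key(long number, int generation) {
            this.number = number;
            this.generation = generation;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Key)) {
                return false;
            }
            Key k = (Key) other;
            return number == k.number && generation == k.generation;
        }

        @Override
        public int hashCode() {
            return Objects.hash(number, generation);
        }
    }

    // Hypothetical placeholder whose content stays null until it is parsed.
    static final class PooledObject {
        final Key key;
        Object content;

        PooledObject(Key key) {
            this.key = key;
        }
    }

    private final Map<Key, PooledObject> pool = new HashMap<>();

    // Returns the pooled object, creating an empty placeholder on first use.
    PooledObject getFromPool(Key key) {
        return pool.computeIfAbsent(key, PooledObject::new);
    }

    // Returns the removed object, or null if the key was not in the pool.
    PooledObject remove(Key key) {
        return pool.remove(key);
    }

    // Exercises the pool: same key yields same instance; removal works once.
    static boolean demo() {
        ObjectPoolSketch p = new ObjectPoolSketch();
        PooledObject first = p.getFromPool(new Key(3, 0));
        PooledObject second = p.getFromPool(new Key(3, 0));
        boolean sameInstance = first == second;
        boolean removedOnce = p.remove(new Key(3, 0)) == first;
        boolean goneAfterwards = p.remove(new Key(3, 0)) == null;
        return sameInstance && removedOnce && goneAfterwards;
    }

    public static void main(String[] args) {
        System.out.println("pool behaves as described: " + demo());
    }
}
```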
      • addXRefTable

        public void addXRefTable​(java.util.Map<COSObjectKey,​java.lang.Long> xrefTableValues)
        Populate XRef HashMap with given values. Each entry maps ObjectKeys to byte offsets in the file.
        Parameters:
        xrefTableValues - xref table entries to be added
      • getXrefTable

        public java.util.Map<COSObjectKey,​java.lang.Long> getXrefTable()
        Returns the xrefTable, which is a mapping of ObjectKeys to byte offsets in the file.
        Returns:
        mapping of ObjectKeys to byte offsets
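        As an illustration of a table that maps object keys to byte offsets, the sketch below keeps a Map from a hypothetical Key (object number and generation) to a Long offset. It assumes that entries added later replace earlier entries for the same key, which is what a plain Map.putAll does; whether COSDocument behaves identically is not stated in this page.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class XrefTableSketch {
    // Hypothetical stand-in for COSObjectKey.
    static final class Key {
        final long number;
        final int generation;

        Key(long number, int generation) {
            this.number = number;
            this.generation = generation;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Key)) {
                return false;
            }
            Key k = (Key) other;
            return number == k.number && generation == k.generation;
        }

        @Override
        public int hashCode() {
            return Objects.hash(number, generation);
        }
    }

    private final Map<Key, Long> xref = new HashMap<>();

    // Merges a batch of entries; later batches win for duplicate keys.
    void addXRefTable(Map<Key, Long> values) {
        xref.putAll(values);
    }

    // Returns the byte offset for an object, or null if it is unknown.
    Long offsetOf(long number, int generation) {
        return xref.get(new Key(number, generation));
    }

    int size() {
        return xref.size();
    }

    // Adds two batches; the second one overrides the offset of object 1 0.
    static XrefTableSketch demo() {
        XrefTableSketch table = new XrefTableSketch();
        Map<Key, Long> first = new HashMap<>();
        first.put(new Key(1, 0), 9L);
        first.put(new Key(2, 0), 74L);
        table.addXRefTable(first);
        Map<Key, Long> second = new HashMap<>();
        second.put(new Key(1, 0), 15L);
        table.addXRefTable(second);
        return table;
    }

    public static void main(String[] args) {
        XrefTableSketch table = demo();
        System.out.println("object 1 0 at offset " + table.offsetOf(1, 0));
        System.out.println("entries: " + table.size());
    }
}
```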
      • setStartXref

        public void setStartXref​(long startXrefValue)
        This method sets the startxref value of the document. It is only needed for incremental updates.
        Parameters:
        startXrefValue - the value for startXref
      • getStartXref

        public long getStartXref()
        Returns the startXref position of the parsed document. It is only needed for incremental updates.
        Returns:
        a long with the old position of the startxref
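        For context, a PDF file normally ends with the keyword startxref, the byte offset of the most recent cross-reference section, and the %%EOF marker. The sketch below reads that offset from the tail of a file held in a string. The helper name and the -1 return value for a missing or malformed marker are assumptions of this example, which is unrelated to how the PDFBox parser locates the value.

```java
final class StartXrefSketch {
    private static final String KEYWORD = "startxref";

    // Returns the offset that follows the last "startxref" keyword,
    // or -1 if the keyword or the number is missing.
    static long startXref(String fileTail) {
        int idx = fileTail.lastIndexOf(KEYWORD);
        if (idx < 0) {
            return -1;
        }
        // Skip the keyword and any whitespace around the number.
        String rest = fileTail.substring(idx + KEYWORD.length()).trim();
        int end = 0;
        while (end < rest.length() && Character.isDigit(rest.charAt(end))) {
            end++;
        }
        if (end == 0) {
            return -1;
        }
        return Long.parseLong(rest.substring(0, end));
    }

    public static void main(String[] args) {
        String tail = "trailer\n<< /Size 8 >>\nstartxref\n1234\n%%EOF\n";
        System.out.println("startxref offset: " + startXref(tail));
    }
}
```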
      • isXRefStream

        public boolean isXRefStream()
        Determines whether the trailer is an XRef stream or not.
        Returns:
        true if the trailer is an XRef stream