Class: Poppler::Document

Inherits:
Object
  • Object
Includes:
Enumerable
Defined in:
lib/poppler/document.rb

Instance Method Summary

Constructor Details

#initialize(*args) ⇒ Poppler::Document

Creates a new Poppler::Document reading the PDF contents from stream. The given Gio::InputStream must be seekable; otherwise G_IO_ERROR_NOT_SUPPORTED is returned. Possible errors include those in the POPPLER_ERROR, G_FILE_ERROR and G_IO_ERROR domains. In this Ruby wrapper the arguments are either a single options Hash (one of :data, :uri, :path, :stream with :length, or :file, plus an optional :password) or the positional pair (uri_or_data, password), as the source listing shows.

Parameters:

  • stream (Gio::InputStream)

    a Gio::InputStream to read from

  • length (Integer)

    the stream length, or -1 if not known

  • password (String)

    password to unlock the file with, or nil

  • cancellable (Gio::Cancellable)

    a Gio::Cancellable, or nil



# File 'lib/poppler/document.rb', line 24

def initialize(*args)
  if args.size == 1 and args[0].is_a?(Hash)
    options = args[0]
  else
    uri_or_data, password = args
    if pdf_data?(uri_or_data)
      options = {
        :data => uri_or_data,
        :password => password
      }
    else
      options = {
        :uri => ensure_uri(uri_or_data),
        :password => password
      }
    end
  end

  data = options[:data]
  uri = options[:uri]
  path = options[:path]
  stream = options[:stream]
  length = options[:length]
  file = options[:file]
  password = options[:password]

  if data
    if respond_to?(:initialize_new_from_bytes)
      initialize_new_from_bytes(data, password)
    else
      @file = Tempfile.new(["poppler", ".pdf"])
      @file.binmode
      @file.write(data)
      @file.close
      initialize_new_from_file(ensure_uri(@file.path), password)
    end
  elsif uri
    initialize_new_from_file(uri, password)
  elsif path
    uri = ensure_uri(path)
    initialize_new_from_file(uri, password)
  elsif stream
    if length.nil?
      raise(ArgumentError,
            "must specify :length for :stream: #{options.inspect}")
    end
    initialize_new_from_stream(stream, length, password)
  elsif file
    case file
    when String, Pathname
      initialize(path: file, password: password)
    else
      initialize_new_from_gfile(file, password)
    end
  else
    message =
      "must specify one of :data, :uri, :path, :stream or :file: " +
      options.inspect
    raise(ArgumentError, message)
  end
end
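
The constructor above accepts several call forms and checks the sources in a fixed order. The following is a minimal, standalone sketch of that argument handling only, written so it runs without loading Poppler. `looks_like_pdf_data?`, `normalize_document_args` and `pick_source` are hypothetical names; the `%PDF-` prefix check is an assumed stand-in for the private `pdf_data?` helper, whose implementation is not shown on this page, and the `ensure_uri` conversion is left out.

```ruby
# Standalone sketch: mirrors only the argument handling of
# Poppler::Document#initialize. Nothing here touches Poppler itself.

# Assumption: stand-in for the private pdf_data? helper, which is not
# shown on this page. Here, any String starting with "%PDF-" is data.
def looks_like_pdf_data?(value)
  value.is_a?(String) && value.start_with?("%PDF-")
end

# Hypothetical helper: turn positional arguments or a single Hash into
# the options Hash the constructor works with (ensure_uri is skipped).
def normalize_document_args(*args)
  return args[0] if args.size == 1 && args[0].is_a?(Hash)

  uri_or_data, password = args
  key = looks_like_pdf_data?(uri_or_data) ? :data : :uri
  { key => uri_or_data, :password => password }
end

# Hypothetical helper: report which source the constructor would use.
# The order matches the if/elsif chain: data, uri, path, stream, file.
def pick_source(options)
  source = [:data, :uri, :path, :stream, :file].find { |k| options[k] }
  if source.nil?
    raise ArgumentError,
          "must specify one of :data, :uri, :path, :stream or :file: " +
          options.inspect
  end
  if source == :stream && options[:length].nil?
    raise ArgumentError,
          "must specify :length for :stream: #{options.inspect}"
  end
  source
end

p pick_source(normalize_document_args("file:///tmp/a.pdf"))
p pick_source(normalize_document_args("%PDF-1.7\n", "secret"))
p pick_source(normalize_document_args(path: "a.pdf"))
```

When more than one source key is present, the first one in the chain wins, so passing both :data and :uri uses :data.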

Instance Method Details

#attachments ⇒ GLib::List<Poppler::Attachment>

Returns a GLib::List containing Poppler::Attachments. These attachments are unowned, and must be unreffed, and the list must be freed with g_list_free().

Returns:

#author ⇒ String

The author of the document

Returns:

  • (String)

    author

#author=(author) ⇒ String

The author of the document

Parameters:

  • author (String)

Returns:

  • (String)

    author

#create_dests_tree ⇒ GLib::Tree

Creates a balanced binary tree of all named destinations in document.

The tree keys are strings in the form returned by poppler_named_dest_to_bytestring(), each containing a destination name. The tree values are the Poppler::Dest objects containing the named destinations. The return value must be freed with g_tree_destroy().

Returns:

  • (GLib::Tree)

    the GLib::Tree, or nil

#creation_date ⇒ Integer

The date the document was created as seconds since the Epoch, or -1

Returns:

  • (Integer)

    creation-date
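
The value is a plain Integer count of seconds since the Epoch, with -1 meaning that no date is available. A minimal sketch of turning such a value into a Ruby Time follows; `epoch_to_time` is a hypothetical helper, not part of the binding.

```ruby
# Hypothetical helper: convert an epoch-seconds value, such as the one
# returned by #creation_date, into a UTC Time. -1 (or nil) maps to nil.
def epoch_to_time(seconds)
  return nil if seconds.nil? || seconds == -1

  Time.at(seconds).utc
end

p epoch_to_time(0)    # start of the Epoch, in UTC
p epoch_to_time(-1)   # no date available
```

The same conversion applies to the modification date, which uses the same encoding.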

#creation_date=(creation_date) ⇒ Integer

The date the document was created as seconds since the Epoch, or -1

Parameters:

  • creation_date (Integer)

Returns:

  • (Integer)

    creation-date

#creation_date_time ⇒ GLib::DateTime

Returns the date the document was created as a GLib::DateTime

Returns:

  • (GLib::DateTime)

    the date the document was created, or nil

#creation_date_time=(creation_datetime) ⇒ nil

Sets the document's creation date. If creation_datetime is nil, CreationDate entry is removed from the document's Info dictionary.

Parameters:

  • creation_datetime (GLib::DateTime)

    A new creation GLib::DateTime

Returns:

  • (nil)

#creation_datetime ⇒ GLib::DateTime

The GLib::DateTime the document was created.

Returns:

  • (GLib::DateTime)

    creation-datetime

#creation_datetime=(creation_datetime) ⇒ GLib::DateTime

The GLib::DateTime the document was created.

Parameters:

  • creation_datetime (GLib::DateTime)

Returns:

  • (GLib::DateTime)

    creation-datetime

#creator ⇒ String

The creator of the document. See also poppler_document_get_creator()

Returns:

  • (String)

    creator

#creator=(creator) ⇒ String

The creator of the document. See also poppler_document_get_creator()

Parameters:

  • creator (String)

Returns:

  • (String)

    creator

#each ⇒ Object



# File 'lib/poppler/document.rb', line 88

def each
  return to_enum(__method__) unless block_given?

  n_pages.times do |i|
    yield get_page(i)
  end
end
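
Because the class includes Enumerable and `each` yields `get_page(i)` for every index below `n_pages`, the usual Enumerable methods work on a document. A minimal sketch of that pattern follows, using a hypothetical `FakeDocument` stand-in (plain strings play the role of Poppler::Page objects) so that it runs without a PDF file.

```ruby
# FakeDocument is a hypothetical stand-in for Poppler::Document; it only
# reproduces the n_pages / get_page / each contract shown above.
class FakeDocument
  include Enumerable

  def initialize(pages)
    @pages = pages
  end

  def n_pages
    @pages.size
  end

  def get_page(index)
    @pages[index]
  end

  # Same shape as Poppler::Document#each: return an Enumerator when no
  # block is given, otherwise yield each page in index order.
  def each
    return to_enum(__method__) unless block_given?

    n_pages.times do |i|
      yield get_page(i)
    end
  end
end

doc = FakeDocument.new(["cover", "contents", "chapter 1"])
puts doc.map(&:upcase).inspect         # Enumerable#map is built on each
puts doc.each_with_index.to_a.inspect  # pages paired with their index
puts doc.each.next                     # external enumeration
```

Returning an Enumerator when no block is given is what makes chained calls such as `doc.each.with_index` possible.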

#find_dest(link_name) ⇒ Poppler::Dest

Creates a Poppler::Dest for the named destination link_name in document.

Note that named destinations are bytestrings, not strings. Unless link_name was returned by a Poppler function (e.g. it is Poppler::Dest.named_dest), it needs to be converted to a string using poppler_named_dest_from_bytestring() before being passed to this function.

The returned value must be freed with poppler_dest_free().

Parameters:

  • link_name (String)

    a named destination

Returns:

  • (Poppler::Dest)

    a new Poppler::Dest destination, or nil if link_name is not a destination.

#format ⇒ String

The PDF version as string. See also poppler_document_get_pdf_version_string()

Returns:

  • (String)

    format

#format=(format) ⇒ String

The PDF version as string. See also poppler_document_get_pdf_version_string()

Parameters:

  • format (String)

Returns:

  • (String)

    format

#format_major ⇒ Integer

The PDF major version number. See also poppler_document_get_pdf_version()

Returns:

  • (Integer)

    format-major

#format_major=(format_major) ⇒ Integer

The PDF major version number. See also poppler_document_get_pdf_version()

Parameters:

  • format_major (Integer)

Returns:

  • (Integer)

    format-major

#format_minor ⇒ Integer

The PDF minor version number. See also poppler_document_get_pdf_version()

Returns:

  • (Integer)

    format-minor

#format_minor=(format_minor) ⇒ Integer

The PDF minor version number. See also poppler_document_get_pdf_version()

Parameters:

  • format_minor (Integer)

Returns:

  • (Integer)

    format-minor

#get_form_field(id) ⇒ Poppler::FormField

Returns the Poppler::FormField for the given id, or nil if no such field is found. It must be freed with g_object_unref().

Parameters:

  • id (Integer)

    an id of a Poppler::FormField

Returns:

#get_id(permanent_id, update_id) ⇒ Boolean

Returns the PDF file identifier represented as two byte string arrays of size 32. permanent_id is the permanent identifier that is built based on the file contents at the time it was originally created, so that this identifier never changes. update_id is the update identifier that is built based on the file contents at the time it was last updated.

Note that returned strings are not null-terminated, they have a fixed size of 32 bytes.

Parameters:

  • permanent_id (String)

    location to store an allocated string, use g_free() to free the returned string

  • update_id (String)

    location to store an allocated string, use g_free() to free the returned string

Returns:

  • (Boolean)

    true if the document contains an id, false otherwise

#get_page(index) ⇒ Poppler::Page Also known as: []

Returns the Poppler::Page indexed at index. This object is owned by the caller.

Parameters:

  • index (Integer)

    a page index

Returns:

#get_page_by_label(label) ⇒ Poppler::Page

Returns the Poppler::Page referenced by label. This object is owned by the caller. label is a human-readable string representation of the page number, and can be document specific. Typically, it is a value such as "iii" or "3".

By default, "1" refers to the first page.

Parameters:

  • label (String)

    a page label

Returns:

#get_pdf_version(major_version, minor_version) ⇒ nil

Updates values referenced by major_version & minor_version with the major and minor PDF versions of document.

Parameters:

  • major_version (Integer)

    return location for the PDF major version number

  • minor_version (Integer)

    return location for the PDF minor version number

Returns:

  • (nil)

#get_print_page_ranges(n_ranges) ⇒ Array<Poppler::PageRange>

Returns the suggested page ranges to print as an array of Poppler::PageRanges, together with the number of ranges. A nil result means that the document does not specify page ranges for printing.

Parameters:

  • n_ranges (Integer)

    return location for number of ranges

Returns:

  • (Array<Poppler::PageRange>)

    an array of Poppler::PageRanges or nil. Free the array when it is no longer needed.

#has_attachments ⇒ Boolean

Returns true if document has any attachments.

Returns:

  • (Boolean)

    true, if document has attachments.

#has_javascript ⇒ Boolean

Returns whether document has any javascript in it.

Returns:

  • (Boolean)

#index_iter ⇒ Object



# File 'lib/poppler/document.rb', line 106

def index_iter
  IndexIter.new(self)
end

#initialize_raw ⇒ Poppler::Document

Creates a new Poppler::Document reading the PDF contents from stream. The given Gio::InputStream must be seekable; otherwise G_IO_ERROR_NOT_SUPPORTED is returned. Possible errors include those in the POPPLER_ERROR, G_FILE_ERROR and G_IO_ERROR domains.

Parameters:

  • stream (Gio::InputStream)

    a Gio::InputStream to read from

  • length (Integer)

    the stream length, or -1 if not known

  • password (String)

    password to unlock the file with, or nil

  • cancellable (Gio::Cancellable)

    a Gio::Cancellable, or nil

Returns:



# File 'lib/poppler/document.rb', line 23

#is_linearized ⇒ Boolean

Returns whether document is linearized or not. Linearization of PDF enables efficient incremental access of the PDF file in a network environment.

Returns:

  • (Boolean)

    true if document is linearized, false otherwise

#keywords ⇒ String

The keywords associated with the document

Returns:

  • (String)

    keywords

#keywords=(keywords) ⇒ String

The keywords associated with the document

Parameters:

  • keywords (String)

Returns:

  • (String)

    keywords

#linearized=(linearized) ⇒ Boolean

Whether document is linearized. See also poppler_document_is_linearized()

Parameters:

  • linearized (Boolean)

Returns:

  • (Boolean)

    linearized

#linearized? ⇒ Boolean

Whether document is linearized. See also poppler_document_is_linearized()

Returns:

  • (Boolean)

    linearized

#metadata ⇒ String

Document metadata in XML format, or nil

Returns:

  • (String)

    metadata

#metadata=(metadata) ⇒ String

Document metadata in XML format, or nil

Parameters:

  • metadata (String)

Returns:

  • (String)

    metadata

#mod_date ⇒ Integer

The date the document was most recently modified as seconds since the Epoch, or -1

Returns:

  • (Integer)

    mod-date

#mod_date=(mod_date) ⇒ Integer

The date the document was most recently modified as seconds since the Epoch, or -1

Parameters:

  • mod_date (Integer)

Returns:

  • (Integer)

    mod-date

#mod_datetime ⇒ GLib::DateTime

The GLib::DateTime the document was most recently modified.

Returns:

  • (GLib::DateTime)

    mod-datetime

#mod_datetime=(mod_datetime) ⇒ GLib::DateTime

The GLib::DateTime the document was most recently modified.

Parameters:

  • mod_datetime (GLib::DateTime)

Returns:

  • (GLib::DateTime)

    mod-datetime

#modification_date ⇒ Poppler::time_t

Returns the date the document was most recently modified as seconds since the Epoch

Returns:

  • (Poppler::time_t)

    the date the document was most recently modified, or -1

#modification_date=(modification_date) ⇒ nil

Sets the document's modification date. If modification_date is -1, ModDate entry is removed from the document's Info dictionary.

Parameters:

  • modification_date (Poppler::time_t)

    A new modification date

Returns:

  • (nil)

#modification_date_time ⇒ GLib::DateTime

Returns the date the document was most recently modified as a GLib::DateTime

Returns:

  • (GLib::DateTime)

    the date the document was modified, or nil

#modification_date_time=(modification_datetime) ⇒ nil

Sets the document's modification date. If modification_datetime is nil, ModDate entry is removed from the document's Info dictionary.

Parameters:

  • modification_datetime (GLib::DateTime)

    A new modification GLib::DateTime

Returns:

  • (nil)

#n_attachments ⇒ Integer

Returns the number of attachments in a loaded document. See also poppler_document_get_attachments()

Returns:

  • (Integer)

    Number of attachments

#n_pages ⇒ Integer Also known as: size

Returns the number of pages in a loaded document.

Returns:

  • (Integer)

    Number of pages

#n_signatures ⇒ Integer

Returns how many digital signatures document contains. PDF digital signatures ensure that the content has not been altered since the last edit and that it was produced by someone the user can trust.

Returns:

  • (Integer)

    The number of signatures found in the document

#page_layout ⇒ Poppler::PageLayout

The page layout that should be used when the document is opened

Returns:

#page_layout=(page_layout) ⇒ Poppler::PageLayout

The page layout that should be used when the document is opened

Parameters:

Returns:

#page_mode ⇒ Poppler::PageMode

The mode that should be used when the document is opened

Returns:

#page_mode=(page_mode) ⇒ Poppler::PageMode

The mode that should be used when the document is opened

Parameters:

Returns:

#pdf_conformance ⇒ Poppler::PDFConformance

Returns the conformance level of the document as Poppler::PDFConformance.

Returns:

#pdf_part ⇒ Poppler::PDFPart

Returns the part of the conforming standard that the document adheres to as a Poppler::PDFPart.

Returns:

#pdf_subtype ⇒ Poppler::PDFSubtype

Returns the subtype of document as a Poppler::PDFSubtype.

Returns:

#pdf_subtype_string ⇒ String

Returns the PDF subtype version of document as a string.

Returns:

  • (String)

    a newly allocated string containing the PDF subtype version of document, or nil

#pdf_version_string ⇒ String

Returns the PDF version of document as a string (e.g. PDF-1.6)

Returns:

  • (String)

    a newly allocated string containing the PDF version of document, or nil
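
The string has the form given in the example above (PDF-1.6). A small sketch of splitting it into major and minor numbers in plain Ruby follows; `parse_pdf_version` is a hypothetical helper, and the regular expression assumes only the `PDF-<major>.<minor>` form. The format_major and format_minor properties expose the same numbers directly.

```ruby
# Hypothetical helper: split a version string such as "PDF-1.6" into
# [major, minor]. Returns nil for nil or for any other format.
def parse_pdf_version(version_string)
  match = /\APDF-(\d+)\.(\d+)\z/.match(version_string.to_s)
  return nil unless match

  [match[1].to_i, match[2].to_i]
end

p parse_pdf_version("PDF-1.6")
p parse_pdf_version(nil)
```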

#permissions ⇒ Poppler::Permissions

Flags specifying which operations are permitted when the document is opened

Returns:

#permissions=(permissions) ⇒ Poppler::Permissions

Flags specifying which operations are permitted when the document is opened

Parameters:

Returns:

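#print_duplex ⇒ Poppler::PrintDuplex

Returns print-duplex.

Returns:

#print_duplex=(print_duplex) ⇒ Poppler::PrintDuplex

Parameters:

Returns:

#print_n_copies ⇒ Integer

Suggested number of copies to be printed for this document

Returns:

  • (Integer)

    print-n-copies

#print_n_copies=(print_n_copies) ⇒ Integer

Suggested number of copies to be printed for this document

Parameters:

  • print_n_copies (Integer)

Returns:

  • (Integer)

    print-n-copies

#print_scaling ⇒ Poppler::PrintScaling

Returns print-scaling.

Returns:

#print_scaling=(print_scaling) ⇒ Poppler::PrintScaling

Parameters:

Returns: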

#producer ⇒ String

The producer of the document. See also poppler_document_get_producer()

Returns:

  • (String)

    producer

#producer=(producer) ⇒ String

The producer of the document. See also poppler_document_get_producer()

Parameters:

  • producer (String)

Returns:

  • (String)

    producer

#reset_form(fields, exclude_fields) ⇒ nil

Resets the form fields specified by fields if exclude_fields is false. Resets all others if exclude_fields is true. All form fields are reset regardless of the exclude_fields flag if fields is empty.

Parameters:

  • fields (GLib::List<String>)

    list of fields to reset

  • exclude_fields (Boolean)

    whether to reset all fields except those in fields

Returns:

  • (nil)
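
The three cases described for reset_form can be easier to see as a selection rule. A minimal sketch in plain Ruby follows, operating on field names only; `fields_to_reset` and the sample field names are hypothetical, and no Poppler call is made.

```ruby
# Hypothetical helper: given every field name in a form, work out which
# ones a reset_form(fields, exclude_fields) call would reset.
def fields_to_reset(all_fields, fields, exclude_fields)
  # An empty list resets everything, whatever the flag says.
  return all_fields.dup if fields.nil? || fields.empty?

  if exclude_fields
    all_fields - fields   # everything except the listed fields
  else
    all_fields & fields   # only the listed fields
  end
end

all_fields = ["name", "email", "phone"]
p fields_to_reset(all_fields, ["email"], false)
p fields_to_reset(all_fields, ["email"], true)
p fields_to_reset(all_fields, [], true)
```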

#save(uri) ⇒ Boolean

Saves document. Any change made in the document such as form fields filled, annotations added or modified will be saved. If error is set, false will be returned. Possible errors include those in the #G_FILE_ERROR domain.

Parameters:

  • uri (String)

    uri of file to save

Returns:

  • (Boolean)

    true, if the document was successfully saved



# File 'lib/poppler/document.rb', line 97

def save(uri)
  save_raw(ensure_uri(uri))
end
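
`save` passes its argument through the private `ensure_uri` helper before calling `save_raw`, so a URI is what finally reaches Poppler. The helper's implementation is not shown on this page; the following is only a guess at the kind of conversion involved, written with the Ruby standard library. `to_file_uri` is a hypothetical name, and its rule (pass through anything that already has a URI scheme, otherwise build a file:// URI from the absolute POSIX path) is an assumption.

```ruby
require "erb"

# Hypothetical stand-in for the private ensure_uri helper. Assumes
# POSIX-style paths; the real helper may behave differently.
def to_file_uri(path_or_uri)
  value = path_or_uri.to_s
  # Already has a scheme such as file: or https:, so leave it alone.
  return value if value.match?(/\A[a-zA-Z][a-zA-Z\d+.\-]*:/)

  absolute = File.expand_path(value)
  # Percent-encode each path segment but keep the "/" separators.
  escaped = absolute.split("/").map { |part| ERB::Util.url_encode(part) }
  "file://" + escaped.join("/")
end

puts to_file_uri("/tmp/my file.pdf")
puts to_file_uri("file:///tmp/a.pdf")
```

Under this assumption, a plain path and the equivalent file:// URI both end up as the same argument to `save_raw`.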

#save_a_copy(uri) ⇒ Boolean

Saves a copy of the original document. Any change made in the document such as form fields filled by the user will not be saved. If error is set, false will be returned. Possible errors include those in the #G_FILE_ERROR domain.

Parameters:

  • uri (String)

    uri of file to save

Returns:

  • (Boolean)

    true, if the document was successfully saved



# File 'lib/poppler/document.rb', line 102

def save_a_copy(uri)
  save_a_copy_raw(ensure_uri(uri))
end

#save_a_copy_raw ⇒ Boolean

Saves a copy of the original document. Any change made in the document such as form fields filled by the user will not be saved. If error is set, false will be returned. Possible errors include those in the #G_FILE_ERROR domain.

Parameters:

  • uri (String)

    uri of file to save

Returns:

  • (Boolean)

    true, if the document was successfully saved



# File 'lib/poppler/document.rb', line 101

#save_raw ⇒ Boolean

Saves document. Any change made in the document such as form fields filled, annotations added or modified will be saved. If error is set, false will be returned. Possible errors include those in the #G_FILE_ERROR domain.

Parameters:

  • uri (String)

    uri of file to save

Returns:

  • (Boolean)

    true, if the document was successfully saved



# File 'lib/poppler/document.rb', line 96

#save_to_fd(fd, include_changes) ⇒ Boolean

Saves document. Any change made in the document such as form fields filled, annotations added or modified will be saved if include_changes is true, or discarded if include_changes is false.

Note that this function takes ownership of fd; you must not operate on it again, nor close it.

If error is set, false will be returned. Possible errors include those in the #G_FILE_ERROR domain.

Parameters:

  • fd (Integer)

    a valid file descriptor open for writing

  • include_changes (Boolean)

    whether to include user changes (e.g. form fills)

Returns:

  • (Boolean)

    true, if the document was successfully saved

#sign(signing_data, cancellable, callback, user_data) ⇒ nil

Signs document using signing_data.

Parameters:

  • signing_data (Poppler::SigningData)

    a Poppler::SigningData

  • cancellable (Gio::Cancellable)

    a Gio::Cancellable

  • callback (Gio::AsyncReadyCallback)

    a Gio::AsyncReadyCallback

  • user_data (GObject)

    user data used by callback function

Returns:

  • (nil)

#sign_finish(result) ⇒ Boolean

Finishes the signing operation started by #sign and returns its status or error.

Parameters:

  • result (Gio::AsyncResult)

    a Gio::AsyncResult

Returns:

  • (Boolean)

    true if the document was signed successfully; otherwise false, and error is set.

#signature_fields ⇒ GLib::List<Poppler::FormField>

Returns a GLib::List containing all signature Poppler::FormFields in the document.

Returns:

#subject ⇒ String

The subject of the document

Returns:

  • (String)

    subject

#subject=(subject) ⇒ String

The subject of the document

Parameters:

  • subject (String)

Returns:

  • (String)

    subject

#subtype ⇒ Poppler::PDFSubtype

Document PDF subtype type

Returns:

#subtype=(subtype) ⇒ Poppler::PDFSubtype

Document PDF subtype type

Parameters:

Returns:

#subtype_conformance ⇒ Poppler::PDFConformance

Document PDF subtype conformance

Returns:

#subtype_conformance=(subtype_conformance) ⇒ Poppler::PDFConformance

Document PDF subtype conformance

Parameters:

Returns:

#subtype_part ⇒ Poppler::PDFPart

Document PDF subtype part

Returns:

#subtype_part=(subtype_part) ⇒ Poppler::PDFPart

Document PDF subtype part

Parameters:

Returns:

#subtype_string ⇒ String

Document PDF subtype. See also poppler_document_get_pdf_subtype_string()

Returns:

  • (String)

    subtype-string

#subtype_string=(subtype_string) ⇒ String

Document PDF subtype. See also poppler_document_get_pdf_subtype_string()

Parameters:

  • subtype_string (String)

Returns:

  • (String)

    subtype-string

#title ⇒ String

The document's title or nil

Returns:

  • (String)

    title

#title=(title) ⇒ String

The document's title or nil

Parameters:

  • title (String)

Returns:

  • (String)

    title

#viewer_preferences ⇒ Poppler::ViewerPreferences

Returns viewer-preferences.

Returns:

#viewer_preferences=(viewer_preferences) ⇒ Poppler::ViewerPreferences

Parameters:

Returns: