Class: Rpdfium::Document

Inherits:
Object
Includes:
Enumerable
Defined in:
lib/rpdfium/document.rb

Overview

Document-level wrapper. Exposes:

  • opening from a path, an IO, or raw bytes, and page access by index

  • metadata (Title, Author, etc.)

  • permissions

  • outline (bookmarks)

  • attachments

  • form environment (lazy)

Constant Summary

META_KEYS =
%w[Title Author Subject Keywords Creator Producer
CreationDate ModDate Trapped].freeze
PERMISSIONS =

Permission bits as defined by the PDF spec (Table 22, §7.6.3.2)

{
  print:       1 << 2,
  modify:      1 << 3,
  copy:        1 << 4,
  annotate:    1 << 5,
  fill_forms:  1 << 8,
  extract_acc: 1 << 9,
  assemble:    1 << 10,
  print_hq:    1 << 11
}.freeze
FORM_TYPES =

Form type

{
  Raw::FORMTYPE_NONE      => :none,
  Raw::FORMTYPE_ACRO_FORM => :acroform,
  Raw::FORMTYPE_XFA_FULL  => :xfa_full,
  Raw::FORMTYPE_XFA_FOREGROUND => :xfa_foreground
}.freeze

Constructor Details

#initialize(input, password: nil) ⇒ Document

Returns a new instance of Document.



# File 'lib/rpdfium/document.rb', line 30

def initialize(input, password: nil)
  Rpdfium.init!
  @password = password
  @source   = input
  handle, retain_buffer = load_handle(input, password)
  if handle.null?
    code = Rpdfium.last_error_code
    msg  = Rpdfium.last_error_message
    # 4 is PDFium's FPDF_ERR_PASSWORD (password required or incorrect).
    raise PasswordError, msg if code == 4

    raise LoadError, "Failed to load PDF: #{msg}"
  end
  # State shared between the instance and its finalizer. It is wrapped in a
  # mutable Hash because the finalizer closure and an explicit close() must
  # see the same :closed flag; otherwise whichever runs second would call
  # FPDF_CloseDocument on an already-freed handle and PDFium would segfault.
  @state = {
    handle: handle,
    retain_buffer: retain_buffer,
    closed: false
  }
  @form_env = nil
  @page_cache = {}
  # IMPORTANT: the finalizer captures @state (a Hash), NOT self. Capturing
  # self would prevent the GC from ever collecting the Document. The
  # finalizer also does NOT touch @page_cache: each Page has its own
  # finalizer, and the order in which finalizers run is
  # non-deterministic in Ruby.
  ObjectSpace.define_finalizer(self, self.class.finalizer(@state))
end

Instance Attribute Details

#source ⇒ Object (readonly)

Returns the value of attribute source.



# File 'lib/rpdfium/document.rb', line 17

def source
  @source
end

Class Method Details

.finalizer(state) ⇒ Object



# File 'lib/rpdfium/document.rb', line 61

def self.finalizer(state)
  proc do
    next if state[:closed]
    next if state[:handle].null?

    Raw.FPDF_CloseDocument(state[:handle])
    state[:closed] = true
    state[:retain_buffer] = nil
  end
end
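
The shared-state pattern used by the constructor, .finalizer and #close can be tried without PDFium. The following is a minimal sketch, assuming a hypothetical FakeHandle that only counts releases and a hypothetical Resource wrapper; neither is part of Rpdfium.

```ruby
# Hypothetical stand-in for a native handle: it only counts releases.
class FakeHandle
  attr_reader :release_count

  def initialize
    @release_count = 0
  end

  def release
    @release_count += 1
  end
end

# Hypothetical wrapper shaped like Rpdfium::Document.
class Resource
  # Built in a class method so the proc closes over `state` only and never
  # over the instance, which would keep the instance alive forever.
  def self.finalizer(state)
    proc do
      next if state[:closed]

      state[:handle].release
      state[:closed] = true
    end
  end

  def initialize(handle)
    @state = { handle: handle, closed: false }
    ObjectSpace.define_finalizer(self, self.class.finalizer(@state))
  end

  def close
    return if @state[:closed]

    @state[:handle].release
    @state[:closed] = true
    ObjectSpace.undefine_finalizer(self)
  end
end

handle = FakeHandle.new
resource = Resource.new(handle)
resource.close
resource.close # the second call sees :closed and does nothing
puts handle.release_count
```

Because both code paths consult the same Hash, whichever runs second finds the flag already set and skips the release.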

.open(input, password: nil, &block) ⇒ Object



# File 'lib/rpdfium/document.rb', line 19

def self.open(input, password: nil, &block)
  doc = new(input, password: password)
  return doc unless block_given?

  begin
    yield doc
  ensure
    doc.close
  end
end
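
The block form of .open closes the document even when the block raises, and returns the block's value. A minimal sketch of that control flow, assuming a hypothetical ScopedDocument class that stands in for the real Document and loads nothing:

```ruby
# Hypothetical stand-in: tracks only whether close was called.
class ScopedDocument
  def initialize
    @closed = false
  end

  def close
    @closed = true
  end

  def closed?
    @closed
  end

  # Same shape as Rpdfium::Document.open, minus the input and password.
  def self.open
    doc = new
    return doc unless block_given?

    begin
      yield doc
    ensure
      doc.close
    end
  end
end

seen = nil
begin
  ScopedDocument.open do |doc|
    seen = doc
    raise ArgumentError, "simulated failure inside the block"
  end
rescue ArgumentError
  # The error still propagates, but `ensure` has already closed the document.
end
puts seen.closed?

manual = ScopedDocument.open # no block: the caller must close it
puts manual.closed?
manual.close
```

Without a block the caller owns the document and is responsible for calling close.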

Instance Method Details

#attachments ⇒ Object

Returns an Array with one Attachment for each attachment in the document.


# File 'lib/rpdfium/document.rb', line 172

def attachments
  n = Raw.FPDFDoc_GetAttachmentCount(@state[:handle])
  Array.new(n) { |i| Attachment.new(self, i) }
end

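#close ⇒ Object

Closes the form environment and any cached pages, then the document itself. Calling it on an already closed document does nothing.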


# File 'lib/rpdfium/document.rb', line 179

def close
  return if @state[:closed]

  # Order: close the form env and cached pages first, then the document.
  @form_env&.close
  @page_cache.each_value(&:close)
  @page_cache.clear
  Raw.FPDF_CloseDocument(@state[:handle]) unless @state[:handle].null?
  @state[:handle] = FFI::Pointer::NULL
  @state[:retain_buffer] = nil
  @state[:closed] = true
  ObjectSpace.undefine_finalizer(self)
end

#closed? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/rpdfium/document.rb', line 193

def closed?
  @state[:closed]
end

#each ⇒ Object



# File 'lib/rpdfium/document.rb', line 95

def each
  return enum_for(:each) unless block_given?

  page_count.times { |i| yield page(i) }
end
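
Because the class includes Enumerable and #each returns an Enumerator when called without a block, the usual collection methods work on pages. A minimal sketch, assuming a hypothetical PagedDocument whose "pages" are plain strings:

```ruby
# Hypothetical stand-in: pages are strings instead of Page objects.
class PagedDocument
  include Enumerable

  def initialize(pages)
    @pages = pages
  end

  def page_count
    @pages.size
  end

  def page(index)
    @pages.fetch(index)
  end

  # Same body as Rpdfium::Document#each.
  def each
    return enum_for(:each) unless block_given?

    page_count.times { |i| yield page(i) }
  end
end

doc = PagedDocument.new(%w[cover contents chapter-1])
puts doc.map(&:upcase).inspect       # Enumerable#map is built on each
puts doc.each_with_index.to_a.inspect
puts doc.each.next                   # external iteration via the Enumerator
```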

#file_version ⇒ Object



# File 'lib/rpdfium/document.rb', line 114

def file_version
  buf = FFI::MemoryPointer.new(:int)
  return nil if Raw.FPDF_GetFileVersion(@state[:handle], buf) == 0

  v = buf.read_int
  # PDFium returns 14 for PDF 1.4, 17 for PDF 1.7
  "#{v / 10}.#{v % 10}"
end
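
The integer-to-string conversion in #file_version can be checked in isolation. A minimal sketch, assuming a hypothetical format_pdf_version helper that takes the raw integer directly instead of reading it from PDFium:

```ruby
# Hypothetical helper: same arithmetic as the last line of #file_version.
def format_pdf_version(raw)
  "#{raw / 10}.#{raw % 10}"
end

puts format_pdf_version(14) # prints 1.4
puts format_pdf_version(17) # prints 1.7
puts format_pdf_version(20) # prints 2.0
```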

#form_env ⇒ Object

Lazy form environment. Required to:

  • read FormFieldType/Value/Name on widget annotations

  • render form fields on top of the page (FFLDraw)



# File 'lib/rpdfium/document.rb', line 160

def form_env
  @form_env ||= Form::Environment.new(self) if has_forms?
end
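
The one-line body of #form_env combines memoization with a guard: the environment is built at most once, and the method returns nil when the document has no forms. A minimal sketch of that behavior, assuming a hypothetical FormAwareDocument whose environment is a plain Object:

```ruby
# Hypothetical stand-in: counts how often the environment is built.
class FormAwareDocument
  attr_reader :builds

  def initialize(has_forms)
    @has_forms = has_forms
    @form_env = nil
    @builds = 0
  end

  def has_forms?
    @has_forms
  end

  # Same shape as Rpdfium::Document#form_env. Ruby parses this as
  # `(@form_env ||= build_env) if has_forms?`.
  def form_env
    @form_env ||= build_env if has_forms?
  end

  private

  def build_env
    @builds += 1
    Object.new
  end
end

with_forms = FormAwareDocument.new(true)
env = with_forms.form_env
puts env.equal?(with_forms.form_env) # the same object is returned both times
puts with_forms.builds

without_forms = FormAwareDocument.new(false)
puts without_forms.form_env.inspect  # the guard fails, so the result is nil
puts without_forms.builds
```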

#form_type ⇒ Object



# File 'lib/rpdfium/document.rb', line 149

def form_type
  FORM_TYPES[Raw.FPDF_GetFormType(@state[:handle])] || :unknown
end

#handle ⇒ Object



# File 'lib/rpdfium/document.rb', line 72

def handle
  @state[:handle]
end

#has_forms? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/rpdfium/document.rb', line 153

def has_forms?
  form_type != :none
end

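#metadata ⇒ Object

Returns a Hash of the entries listed in META_KEYS, keyed by downcased symbol. Entries with an empty value are omitted.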


# File 'lib/rpdfium/document.rb', line 107

def metadata
  META_KEYS.each_with_object({}) do |key, h|
    v = Raw.read_utf16_string(:FPDF_GetMetaText, @state[:handle], key)
    h[key.downcase.to_sym] = v unless v.empty?
  end
end
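
The shape of the Hash returned by #metadata follows from the each_with_object loop: keys are downcased symbols (so CreationDate becomes :creationdate) and empty values are dropped. A minimal sketch, assuming a hypothetical collect_metadata helper and a made-up RAW_INFO Hash in place of the PDFium lookup:

```ruby
# Copy of META_KEYS from Rpdfium::Document.
META_KEYS = %w[Title Author Subject Keywords Creator Producer
               CreationDate ModDate Trapped].freeze

# Made-up sample data standing in for the PDFium lookup.
RAW_INFO = {
  "Title"        => "Annual Report",
  "Author"       => "",
  "CreationDate" => "D:20240101120000Z"
}.freeze

# Hypothetical helper: same loop as #metadata, reading from a Hash.
# Missing keys are treated as empty strings.
def collect_metadata(info)
  META_KEYS.each_with_object({}) do |key, h|
    v = info.fetch(key, "")
    h[key.downcase.to_sym] = v unless v.empty?
  end
end

meta = collect_metadata(RAW_INFO)
puts meta.keys.inspect
puts meta[:title]
```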

#outline ⇒ Object

Returns the document outline (bookmarks).


# File 'lib/rpdfium/document.rb', line 166

def outline
  Outline.from_document(self)
end

#page(index) ⇒ Object Also known as: []

Raises:

  • (PageError)

# File 'lib/rpdfium/document.rb', line 85

def page(index)
  ensure_open!
  raise PageError, "Page index #{index} out of range" unless (0...page_count).cover?(index)

  # Pages are cacheable: reloading them is expensive, and the objects are
  # immutable from the application's point of view (in read-only mode).
  @page_cache[index] ||= Page.new(self, index)
end
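
The bounds check and the `||=` cache in #page can be exercised without PDFium. A minimal sketch, assuming a hypothetical CachingDocument whose pages are strings and a locally defined PageError standing in for the library's error class:

```ruby
# Hypothetical stand-in for the library's PageError.
class PageError < StandardError; end

# Hypothetical stand-in: counts how many pages were actually loaded.
class CachingDocument
  attr_reader :loads

  def initialize(page_count)
    @page_count = page_count
    @page_cache = {}
    @loads = 0
  end

  # Same check and cache lookup as Rpdfium::Document#page.
  def page(index)
    raise PageError, "Page index #{index} out of range" unless (0...@page_count).cover?(index)

    @page_cache[index] ||= load_page(index)
  end

  private

  def load_page(index)
    @loads += 1
    "page-#{index}"
  end
end

doc = CachingDocument.new(3)
first  = doc.page(1)
second = doc.page(1)
puts first.equal?(second) # the cached object is returned, not a copy
puts doc.loads

begin
  doc.page(3) # valid indexes are 0, 1 and 2
rescue PageError => e
  puts e.message
end
```

The exclusive range also rejects negative indexes, so `page(-1)` raises rather than counting from the end.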

#page_count ⇒ Object Also known as: size, length

Returns the number of pages in the document.


# File 'lib/rpdfium/document.rb', line 78

def page_count
  ensure_open!
  Raw.FPDF_GetPageCount(@state[:handle])
end

#page_label(index) ⇒ Object



# File 'lib/rpdfium/document.rb', line 101

def page_label(index)
  Raw.read_utf16_string(:FPDF_GetPageLabel, @state[:handle], index)
end

#permissions ⇒ Object



# File 'lib/rpdfium/document.rb', line 135

def permissions
  bits = Raw.FPDF_GetDocPermissions(@state[:handle])
  PERMISSIONS.transform_values { |mask| (bits & mask) == mask }
end
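
The mask test in #permissions can be tried on a plain integer. A minimal sketch, assuming a hypothetical decode_permissions helper and a made-up bit field; the table is copied from the PERMISSIONS constant:

```ruby
# Copy of the PERMISSIONS table from Rpdfium::Document.
PERMISSIONS = {
  print:       1 << 2,
  modify:      1 << 3,
  copy:        1 << 4,
  annotate:    1 << 5,
  fill_forms:  1 << 8,
  extract_acc: 1 << 9,
  assemble:    1 << 10,
  print_hq:    1 << 11
}.freeze

# Hypothetical helper: same expression as #permissions, but the raw bit
# field is passed in instead of being read from PDFium.
def decode_permissions(bits)
  PERMISSIONS.transform_values { |mask| (bits & mask) == mask }
end

# Made-up value with only the print and copy bits set.
sample_bits = (1 << 2) | (1 << 4)
flags = decode_permissions(sample_bits)
puts flags.select { |_name, allowed| allowed }.keys.inspect # prints [:print, :copy]
```

The PDF spec numbers bits from 1, so spec bit 3 (print) corresponds to `1 << 2` here.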