Module: Rpdfium::Raw

Extended by:
FFI::Library
Defined in:
lib/rpdfium/raw.rb

Overview

Layer 1: raw FFI bindings to the PDFium C API. 1:1 mapping with the original names. Use the wrapper classes for application code. PDFium “Experimental” APIs are marked in the comments: in theory they could change, in practice they have been stable for years.

Defined Under Namespace

Classes: FPDF_FORMFILLINFO, FPDF_IMAGEOBJ_METADATA, FS_MATRIX, FS_POINTF, FS_QUADPOINTSF, FS_RECTF, FS_SIZEF

Constant Summary

FPDFBitmap_Unknown =

Constants

Bitmap formats

0
FPDFBitmap_Gray =
1
FPDFBitmap_BGR =
2
FPDFBitmap_BGRx =
3
FPDFBitmap_BGRA =
4
FPDF_ANNOT =

Render flags (bit fields)

0x01
FPDF_LCD_TEXT =
0x02
FPDF_NO_NATIVETEXT =
0x04
FPDF_GRAYSCALE =
0x08
FPDF_REVERSE_BYTE_ORDER =

→ RGBA instead of BGRA

0x10
FPDF_NO_GDIPLUS =
0x40
FPDF_PRINTING =
0x800
FPDF_RENDER_NO_SMOOTHTEXT =
0x1000
FPDF_RENDER_NO_SMOOTHIMAGE =
0x2000
FPDF_RENDER_NO_SMOOTHPATH =
0x4000
PAGEOBJ_UNKNOWN =

Page object types

0
PAGEOBJ_TEXT =
1
PAGEOBJ_PATH =
2
PAGEOBJ_IMAGE =
3
PAGEOBJ_SHADING =
4
PAGEOBJ_FORM =
5
SEGMENT_UNKNOWN =

Path segment types

-1
SEGMENT_LINETO =
0
SEGMENT_BEZIERTO =
1
SEGMENT_MOVETO =
2
FILLMODE_NONE =

Path fill mode

0
FILLMODE_ALTERNATE =
1
FILLMODE_WINDING =
2
TEXT_RENDERMODE_FILL =

Text render modes

0
TEXT_RENDERMODE_STROKE =
1
TEXT_RENDERMODE_FILL_STROKE =
2
TEXT_RENDERMODE_INVISIBLE =
3
FPDF_ANNOT_UNKNOWN =

Annotation subtypes (PDF spec 12.5.6)

0
FPDF_ANNOT_TEXT =
1
FPDF_ANNOT_LINK =
2
FPDF_ANNOT_FREETEXT =
3
FPDF_ANNOT_LINE =
4
FPDF_ANNOT_SQUARE =
5
FPDF_ANNOT_CIRCLE =
6
FPDF_ANNOT_HIGHLIGHT =
9
FPDF_ANNOT_UNDERLINE =
10
FPDF_ANNOT_SQUIGGLY =
11
FPDF_ANNOT_STRIKEOUT =
12
FPDF_ANNOT_STAMP =
13
FPDF_ANNOT_INK =
15
FPDF_ANNOT_POPUP =
16
FPDF_ANNOT_FILEATTACHMENT =
17
FPDF_ANNOT_WIDGET =
20
FPDF_ANNOT_REDACT =
27
ANNOT_SUBTYPE_NAMES =
{
  FPDF_ANNOT_TEXT => "Text", FPDF_ANNOT_LINK => "Link",
  FPDF_ANNOT_FREETEXT => "FreeText", FPDF_ANNOT_LINE => "Line",
  FPDF_ANNOT_SQUARE => "Square", FPDF_ANNOT_CIRCLE => "Circle",
  FPDF_ANNOT_HIGHLIGHT => "Highlight", FPDF_ANNOT_UNDERLINE => "Underline",
  FPDF_ANNOT_SQUIGGLY => "Squiggly", FPDF_ANNOT_STRIKEOUT => "StrikeOut",
  FPDF_ANNOT_STAMP => "Stamp", FPDF_ANNOT_INK => "Ink",
  FPDF_ANNOT_POPUP => "Popup",
  FPDF_ANNOT_FILEATTACHMENT => "FileAttachment",
  FPDF_ANNOT_WIDGET => "Widget", FPDF_ANNOT_REDACT => "Redact"
}.freeze
FPDF_FORMFIELD_UNKNOWN =

Form field types (for widget annotations)

0
FPDF_FORMFIELD_PUSHBUTTON =
1
FPDF_FORMFIELD_CHECKBOX =
2
FPDF_FORMFIELD_RADIOBUTTON =
3
FPDF_FORMFIELD_COMBOBOX =
4
FPDF_FORMFIELD_LISTBOX =
5
FPDF_FORMFIELD_TEXTFIELD =
6
FPDF_FORMFIELD_SIGNATURE =
7
FPDF_MATCHCASE =

Search flags

0x01
FPDF_MATCHWHOLEWORD =
0x02
FPDF_CONSECUTIVE =
0x04
FORMTYPE_NONE =

Form types (FPDF_GetFormType)

0
FORMTYPE_ACRO_FORM =
1
FORMTYPE_XFA_FULL =
2
FORMTYPE_XFA_FOREGROUND =
3
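
The render and search flags above are bit fields, so they are combined with bitwise OR and tested with bitwise AND. A minimal sketch, assuming the values listed above: the constants are redefined locally so the snippet runs without loading the native library, and `RenderFlagDemo`, `set?` and `subtype_name` are hypothetical names used only for illustration.

```ruby
# Illustration only: values copied from the constant list above.
module RenderFlagDemo
  FPDF_ANNOT              = 0x01
  FPDF_LCD_TEXT           = 0x02
  FPDF_REVERSE_BYTE_ORDER = 0x10
  FPDF_PRINTING           = 0x800

  # Subset of the subtype table, enough to show a lookup with a fallback.
  SUBTYPE_NAMES = { 1 => "Text", 2 => "Link", 9 => "Highlight" }.freeze

  # Render annotations and ask for RGBA byte order in a single flags word.
  FLAGS = FPDF_ANNOT | FPDF_REVERSE_BYTE_ORDER

  # True when +flag+ is present in the +flags+ bit field.
  def self.set?(flags, flag)
    (flags & flag) != 0
  end

  # Name for a subtype code, or "Unknown" for codes missing from the table.
  def self.subtype_name(code)
    SUBTYPE_NAMES.fetch(code, "Unknown")
  end
end

puts format("0x%02x", RenderFlagDemo::FLAGS)                                # 0x11
puts RenderFlagDemo.set?(RenderFlagDemo::FLAGS, RenderFlagDemo::FPDF_ANNOT)    # true
puts RenderFlagDemo.set?(RenderFlagDemo::FLAGS, RenderFlagDemo::FPDF_PRINTING) # false
puts RenderFlagDemo.subtype_name(9)                                         # Highlight
```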

Class Method Summary

Class Method Details

.attach_function(name, *_args) ⇒ Object

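Override of attach_function for the case where a symbol cannot be bound, for example because the library failed to load. The listing below calls super and, if binding fails with FFI::NotFoundError or RuntimeError, defines a stub with the same name that raises Rpdfium::LoadError when called. A missing symbol therefore surfaces at call time rather than while the module is being loaded.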



# File 'lib/rpdfium/raw.rb', line 87

def self.attach_function(name, *args)
  super
rescue FFI::NotFoundError, RuntimeError => e
  define_singleton_method(name) do |*_a|
    raise Rpdfium::LoadError,
          "PDFium symbol #{name} not available: #{e.message}"
  end
end
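
The stub-on-failure pattern can be tried without PDFium or the ffi gem. The sketch below is an assumption-laden stand-in: `FakeBinder` plays the role of FFI::Library, and `DemoRaw` and `DemoLoadError` are hypothetical names. Only the super/rescue/stub shape mirrors the listing above.

```ruby
# Stand-in for FFI::Library: binding succeeds only for known symbols.
module FakeBinder
  KNOWN = %i[known_symbol].freeze

  def attach_function(name, *_args)
    raise "symbol #{name} not found" unless KNOWN.include?(name)

    define_singleton_method(name) { :ok }
  end
end

class DemoLoadError < StandardError; end

module DemoRaw
  extend FakeBinder

  # Try the real binding first; on failure, define a stub that raises later.
  def self.attach_function(name, *args)
    super
  rescue RuntimeError => e
    define_singleton_method(name) do |*_a|
      raise DemoLoadError, "symbol #{name} not available: #{e.message}"
    end
  end
end

DemoRaw.attach_function(:known_symbol)
DemoRaw.attach_function(:missing_symbol) # does not raise here

puts DemoRaw.known_symbol # ok
begin
  DemoRaw.missing_symbol
rescue DemoLoadError => err
  puts err.message
end
```

The benefit of this shape is that loading the module never fails because of one absent symbol; only code paths that actually call it see the error.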

.candidate_paths ⇒ Object

Builds the list of candidates that `ffi_lib` will try in order.

WARNING: FFI auto-appends the platform's natural extension (.dylib on macOS, .so on Linux, .dll on Windows) when the supplied name does not already end with it. So passing `libpdfium.so` on macOS makes FFI look for `libpdfium.so.dylib`, which is absurd but documented. To avoid this, system_library_names is filtered by host OS.

Additionally, ENV["PDFIUM_LIBRARY_PATH"] and Rpdfium::Binary.library_path are absolute, explicit paths: if the library is not found there, we do NOT fall back to system names. We immediately return a single-element array, so ffi_lib either succeeds right away or raises a clear LoadError, which is what a user who supplied an explicit path wants.



# File 'lib/rpdfium/raw.rb', line 28

def self.candidate_paths
  explicit = ENV["PDFIUM_LIBRARY_PATH"]
  return [explicit] if explicit && !explicit.empty?

  if defined?(Rpdfium::Binary) && Rpdfium::Binary.respond_to?(:library_path)
    path = Rpdfium::Binary.library_path
    return [path] if path && !path.empty?
  end

  system_library_names
end
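
The precedence described above (environment variable, then bundled binary, then system names) can be restated as a pure function that is easy to exercise. This is a sketch, not the module's code: `resolve_candidates` and its keyword arguments are hypothetical, and the paths are made-up examples.

```ruby
# Hypothetical restatement of the lookup order as a pure function.
def resolve_candidates(env:, bundled_path:, system_names:)
  explicit = env["PDFIUM_LIBRARY_PATH"]
  return [explicit] if explicit && !explicit.empty?         # explicit path wins
  return [bundled_path] if bundled_path && !bundled_path.empty?

  system_names                                               # last resort
end

names = %w[pdfium libpdfium libpdfium.so]

p resolve_candidates(env: { "PDFIUM_LIBRARY_PATH" => "/opt/pdfium/libpdfium.so" },
                     bundled_path: "/gems/bin/libpdfium.so", system_names: names)
# ["/opt/pdfium/libpdfium.so"]

p resolve_candidates(env: { "PDFIUM_LIBRARY_PATH" => "" },
                     bundled_path: "/gems/bin/libpdfium.so", system_names: names)
# ["/gems/bin/libpdfium.so"]

p resolve_candidates(env: {}, bundled_path: nil, system_names: names)
# ["pdfium", "libpdfium", "libpdfium.so"]
```

Note that an empty environment variable is treated the same as an unset one, matching the `!explicit.empty?` check in the listing.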

.host_os ⇒ Object



# File 'lib/rpdfium/raw.rb', line 56

def self.host_os
  case RbConfig::CONFIG["host_os"]
  when /darwin/         then :macos
  when /linux/          then :linux
  when /mswin|mingw|cygwin/ then :windows
  end
end
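
To see how the patterns above classify a platform, the same `case` can be applied to sample strings instead of RbConfig. The function name `classify_host_os` and the sample strings are assumptions chosen for illustration. They resemble typical `host_os` values but are not taken from the source.

```ruby
# Same regular expressions as the listing, applied to an explicit string.
def classify_host_os(host_os_string)
  case host_os_string
  when /darwin/             then :macos
  when /linux/              then :linux
  when /mswin|mingw|cygwin/ then :windows
  end
end

p classify_host_os("darwin23")    # :macos
p classify_host_os("linux-gnu")   # :linux
p classify_host_os("mingw32")     # :windows
p classify_host_os("freebsd14.0") # nil
```

An unrecognized platform yields nil, in which case system_library_names takes its `else` branch and offers only the extension-less names.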

.load_error ⇒ Object



# File 'lib/rpdfium/raw.rb', line 68

def self.load_error;     @load_error;    end

.native_loaded? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/rpdfium/raw.rb', line 67

def self.native_loaded?; @native_loaded; end

.read_ascii_string(method_name, *args) ⇒ Object

Same two-call convention, but for the few APIs that return 7-bit ASCII bytes instead of UTF-16LE (e.g. FPDFAction_GetURIPath).



# File 'lib/rpdfium/raw.rb', line 977

def self.read_ascii_string(method_name, *args)
  args_probe = args + [FFI::Pointer::NULL, 0]
  n_bytes = send(method_name, *args_probe)
  return "" if n_bytes <= 1 # only the null terminator or an error

  buf = FFI::MemoryPointer.new(:uchar, n_bytes)
  args_real = args + [buf, n_bytes]
  send(method_name, *args_real)
  buf.read_bytes(n_bytes).delete("\x00").force_encoding("UTF-8")
end

.read_utf16_string(method_name, *args) ⇒ Object

Helper for reading UTF-16LE strings that PDFium returns as raw bytes.

PDFium convention: most Get*Text/Get*Name calls return `unsigned long` (the number of BYTES, terminator included). The function is called first with a NULL buffer and length 0 to obtain the size, then again with an allocated buffer of that size.



# File 'lib/rpdfium/raw.rb', line 964

def self.read_utf16_string(method_name, *args)
  args_probe = args + [FFI::Pointer::NULL, 0]
  n_bytes = send(method_name, *args_probe)
  return "" if n_bytes <= 2 # only the null terminator or an error

  buf = FFI::MemoryPointer.new(:uchar, n_bytes)
  args_real = args + [buf, n_bytes]
  send(method_name, *args_real)
  utf16_bytes_to_utf8(buf.read_bytes(n_bytes))
end
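
The two-call convention can be illustrated without native code. In the sketch below, `fake_get_title` is a hypothetical stand-in for a PDFium getter and a plain Ruby String plays the role of the FFI buffer. Only the probe-then-fill sequence is meant to match the listing above.

```ruby
# Hypothetical getter: reports the bytes needed (terminator included) and
# fills the buffer only when it is large enough.
def fake_get_title(buffer, buflen)
  data = "Hi".encode("UTF-16LE").b + "\x00\x00".b
  buffer[0, data.bytesize] = data if buffer && buflen >= data.bytesize
  data.bytesize
end

def read_with_two_calls
  needed = fake_get_title(nil, 0)  # first call: probe for the size
  return "" if needed <= 2         # only the terminator, or an error

  buffer = "\x00".b * needed       # allocate exactly what was requested
  fake_get_title(buffer, needed)   # second call: fill the buffer
  buffer.force_encoding("UTF-16LE").encode("UTF-8").delete("\x00")
end

p fake_get_title(nil, 0) # 6
p read_with_two_calls    # "Hi"
```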

.system_library_names ⇒ Object

“System” names filtered by host OS. We keep `pdfium` / `libpdfium` (without extension) first: FFI auto-appends the right extension. Names with an extension are included ONLY if they match the host OS, so we avoid the double-extension bug.



# File 'lib/rpdfium/raw.rb', line 44

def self.system_library_names
  base = %w[pdfium libpdfium]
  host = host_os
  ext_specific = case host
                 when :macos   then %w[libpdfium.dylib]
                 when :linux   then %w[libpdfium.so]
                 when :windows then %w[pdfium.dll libpdfium.dll]
                 else []
                 end
  base + ext_specific
end

.utf16_bytes_to_utf8(bytes) ⇒ Object

PDFium returns UTF-16LE (little-endian) bytes with a null terminator.



# File 'lib/rpdfium/raw.rb', line 989

def self.utf16_bytes_to_utf8(bytes)
  bytes.force_encoding("UTF-16LE")
       .encode("UTF-8", invalid: :replace, undef: :replace)
       .delete("\x00")
end
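
A quick way to check the conversion is to build the bytes by hand. The sketch below uses a hypothetical top-level copy named `utf16le_bytes_to_utf8`; it calls `dup` first so the caller's string keeps its encoding, which is a small departure from the listing above.

```ruby
# Reinterpret raw bytes as UTF-16LE, transcode to UTF-8, drop the terminator.
def utf16le_bytes_to_utf8(bytes)
  bytes.dup.force_encoding("UTF-16LE")
       .encode("UTF-8", invalid: :replace, undef: :replace)
       .delete("\x00")
end

# "Café" as PDFium would hand it back: UTF-16LE bytes plus a 2-byte terminator.
raw = "Caf\u00E9".encode("UTF-16LE").b + "\x00\x00".b

p raw.bytesize                # 10
p utf16le_bytes_to_utf8(raw)  # "Café"
p raw.encoding                # #<Encoding:ASCII-8BIT>, left untouched by dup
```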