Class: Rpdfium::Search

Inherits:
Object
Includes:
Enumerable
Defined in:
lib/rpdfium/search/search.rb

Overview

In-page text search built on the FPDFText_Find* functions. It keeps the search state (a cursor) and supports forward and backward search.

Example:

page.search("totale").each_match { |m| p m[:rects], m[:text] }

Constructor Details

#initialize(page, query, match_case: false, whole_word: false, start_index: 0) ⇒ Search

Returns a new instance of Search.

Raises:

  • (Error) if FPDFText_FindStart returns a null handle

# File 'lib/rpdfium/search/search.rb', line 12

def initialize(page, query, match_case: false, whole_word: false, start_index: 0)
  @page = page
  @query = query
  @start_index = start_index
  flags = 0
  flags |= Raw::FPDF_MATCHCASE      if match_case
  flags |= Raw::FPDF_MATCHWHOLEWORD if whole_word

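  # Switch to binary before appending the terminator: Ruby refuses to
  # concatenate a UTF-16LE string with a binary one.
  utf16 = query.encode("UTF-16LE").b + "\x00\x00".b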
  @query_buf = FFI::MemoryPointer.new(:uchar, utf16.bytesize)
  @query_buf.put_bytes(0, utf16)

  handle = Raw.FPDFText_FindStart(@page.text_page.handle, @query_buf,
                                   flags, start_index)
  raise Error, "FindStart failed" if handle.null?

  @state = { handle: handle, closed: false }
  ObjectSpace.define_finalizer(self, self.class.finalizer(@state))
end
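
The two preparation steps in the constructor (building the flag bitmask and producing a NUL-terminated UTF-16LE query) can be tried in isolation. The following is a minimal sketch, not the library's code: `search_flags` and `utf16le_cstring` are hypothetical helper names, and the flag values are assumed to match those in PDFium's `fpdf_text.h`.

```ruby
# Flag values assumed to match PDFium's public fpdf_text.h header.
FPDF_MATCHCASE      = 0x1
FPDF_MATCHWHOLEWORD = 0x2

# Hypothetical helper: combine the keyword options into a flag bitmask.
def search_flags(match_case: false, whole_word: false)
  flags = 0
  flags |= FPDF_MATCHCASE      if match_case
  flags |= FPDF_MATCHWHOLEWORD if whole_word
  flags
end

# Hypothetical helper: encode a query as NUL-terminated UTF-16LE bytes.
# The terminator is appended before transcoding, so it becomes two zero bytes.
def utf16le_cstring(query)
  (query + "\u0000").encode("UTF-16LE").b
end

bytes = utf16le_cstring("ab")
p bytes.bytes
p search_flags(match_case: true, whole_word: true)
```

Appending the terminator before transcoding avoids mixing encodings in one concatenation, and the result can be copied into a native buffer of `bytes.bytesize` bytes.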

Class Method Details

.finalizer(state) ⇒ Object



# File 'lib/rpdfium/search/search.rb', line 32

def self.finalizer(state)
  proc do
    next if state[:closed]
    next if state[:handle].null?

    Raw.FPDFText_FindClose(state[:handle])
    state[:closed] = true
  end
end
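
The finalizer is built in a class method from the shared state hash alone, so the proc never holds a reference to the instance it cleans up. A proc that captured `self` would keep the object reachable and stop it from being collected. The sketch below illustrates the pattern with stand-ins: `TrackedSearch` and `CLOSED_HANDLES` are hypothetical, and appending to the array stands in for the native `FPDFText_FindClose` call.

```ruby
# Stand-in for Raw.FPDFText_FindClose: records which handles were released.
CLOSED_HANDLES = []

class TrackedSearch
  # Build the finalizer from the shared state hash only. The proc is created
  # in a class method, so it cannot capture the instance it will clean up.
  def self.finalizer(state)
    proc do
      next if state[:closed]

      CLOSED_HANDLES << state[:handle]
      state[:closed] = true
    end
  end

  def initialize(handle)
    @state = { handle: handle, closed: false }
    ObjectSpace.define_finalizer(self, self.class.finalizer(@state))
  end
end

# Exercise the proc directly instead of waiting for the garbage collector.
state   = { handle: 42, closed: false }
cleanup = TrackedSearch.finalizer(state)
cleanup.call(0) # finalizers are called with the object id
cleanup.call(0) # no-op: state[:closed] is already set
p CLOSED_HANDLES
```

Because the state hash is shared between the object and the proc, an explicit close that sets `state[:closed]` also disarms a finalizer that has already been registered.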

Instance Method Details

#close ⇒ Object



# File 'lib/rpdfium/search/search.rb', line 68

def close
  return if @state[:closed]

  Raw.FPDFText_FindClose(@state[:handle]) unless @state[:handle].null?
  @state[:handle] = FFI::Pointer::NULL
  @state[:closed] = true
  ObjectSpace.undefine_finalizer(self)
end
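
Since `#close` returns early once the state is marked closed, it can be called from an `ensure` clause without checking whether the search was already released. A minimal sketch of that usage follows. `FakeSearch` and `with_search` are hypothetical names introduced for illustration and are not part of Rpdfium.

```ruby
# Stand-in for a Search-like object whose #close is idempotent.
class FakeSearch
  attr_reader :close_calls

  def initialize
    @closed = false
    @close_calls = 0
  end

  def close
    return if @closed

    @close_calls += 1 # stands in for the native FPDFText_FindClose call
    @closed = true
  end

  def closed?
    @closed
  end
end

# Hypothetical helper: yield the search and always close it afterwards.
def with_search(search)
  yield search
ensure
  search.close
end

search = FakeSearch.new
result = with_search(search) { |s| s.closed? ? :closed : :open }
search.close # safe: a repeated close does nothing
p result, search.close_calls
```

Releasing the handle deterministically this way means the finalizer only acts as a safety net for searches that were never closed.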

#current_match ⇒ Object



# File 'lib/rpdfium/search/search.rb', line 57

def current_match
  idx = Raw.FPDFText_GetSchResultIndex(@state[:handle])
  n   = Raw.FPDFText_GetSchCount(@state[:handle])
  {
    char_index: idx,
    length:     n,
    text:       extract_text(idx, n),
    rects:      extract_rects(idx, n)
  }
end

#each_match ⇒ Object Also known as: each

Iterates over all matches in the forward direction. Each match is yielded as a hash with :char_index, :length, :text, and :rects (an array of top-down bounding boxes, one per line of text).



# File 'lib/rpdfium/search/search.rb', line 48

def each_match
  return enum_for(:each_match) unless block_given?

  while Raw.FPDFText_FindNext(@state[:handle]) == 1
    yield current_match
  end
end
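
Because `each_match` is aliased to `each` and the class includes Enumerable, methods such as `first`, `map`, and `to_a` work on a search. The iteration is driven by a stateful cursor, so a second pass presumably resumes where the first one stopped instead of starting over. The sketch below models that with a pure-Ruby stand-in. `CursorSearch` is hypothetical, and its array-backed cursor only approximates the native `FPDFText_FindNext` behavior.

```ruby
# Stand-in for Search: a forward-only cursor over precomputed matches.
class CursorSearch
  include Enumerable

  def initialize(matches)
    @matches = matches
    @pos = 0 # cursor state, advanced by every iteration
  end

  def each_match
    return enum_for(:each_match) unless block_given?

    while @pos < @matches.length
      match = @matches[@pos]
      @pos += 1
      yield match
    end
  end
  alias each each_match
end

matches = [
  { char_index: 3,  length: 6, text: "totale" },
  { char_index: 40, length: 6, text: "totale" }
]
search = CursorSearch.new(matches)

first_index = search.first[:char_index]          # stops after one match
rest        = search.map { |m| m[:char_index] }  # continues from the cursor
again       = search.to_a                        # cursor is exhausted
p first_index, rest, again
```

If every match is needed more than once, collecting them with `to_a` on a fresh search is the simpler option under this assumption.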

#handle ⇒ Object



# File 'lib/rpdfium/search/search.rb', line 42

def handle
  @state[:handle]
end