Module: Rpdfium::Binary::Downloader

Defined in:
lib/rpdfium/binary/downloader.rb

Overview

Downloads and installs a pdfium-binaries asset. Uses only the standard library (net/http, openssl, zlib, rubygems/package). Supports HTTP redirects, optional SHA256 verification, and atomicity via a temporary directory.

Constant Summary

RELEASE_BASE =
"https://github.com/bblanchon/pdfium-binaries/releases"

Class Method Summary

Class Method Details

.asset_url(platform_key, pdfium_build) ⇒ Object

Builds the asset URL for a given platform and build. If pdfium_build == "latest", the "latest/download/…" URL is used; otherwise the build-specific "/download/chromium/<N>/…" URL is used.



# File 'lib/rpdfium/binary/downloader.rb', line 24

def asset_url(platform_key, pdfium_build)
  asset = "pdfium-#{platform_key}.tgz"
  if pdfium_build == "latest"
    "#{RELEASE_BASE}/latest/download/#{asset}"
  else
    "#{RELEASE_BASE}/download/chromium/#{pdfium_build}/#{asset}"
  end
end
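
As a quick illustration of the two URL shapes, the sketch below reproduces the method body as a standalone function. The platform key `linux-x64` and the build number `0000` are placeholders assumed for the example, not values taken from this module.

```ruby
# Standalone copy of the logic above, for illustration only.
RELEASE_BASE = "https://github.com/bblanchon/pdfium-binaries/releases"

def asset_url(platform_key, pdfium_build)
  asset = "pdfium-#{platform_key}.tgz"
  if pdfium_build == "latest"
    "#{RELEASE_BASE}/latest/download/#{asset}"
  else
    "#{RELEASE_BASE}/download/chromium/#{pdfium_build}/#{asset}"
  end
end

# "linux-x64" and "0000" are placeholders chosen for this example.
puts asset_url("linux-x64", "latest")
puts asset_url("linux-x64", "0000")
```

Note that the comparison is against the string "latest"; a symbol such as `:latest` would fall through to the build-specific branch.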

.download(url, dest, redirect_limit: 10) ⇒ Object

HTTP download with redirect handling.



# File 'lib/rpdfium/binary/downloader.rb', line 58

def download(url, dest, redirect_limit: 10)
  raise DownloadError, "too many redirects" if redirect_limit <= 0

  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port,
                  use_ssl: uri.scheme == "https",
                  read_timeout: 120, open_timeout: 30) do |http|
    req = Net::HTTP::Get.new(uri.request_uri)
    req["User-Agent"] = "rpdfium-binary/#{Rpdfium::Binary::VERSION}"
    http.request(req) do |res|
      case res
      when Net::HTTPSuccess
        File.open(dest, "wb") { |f| res.read_body { |c| f.write(c) } }
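      when Net::HTTPRedirection
        # Resolve against the current URL in case the Location header is relative.
        location = URI.join(url, res["location"]).to_s
        download(location, dest, redirect_limit: redirect_limit - 1)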
      else
        raise DownloadError, "HTTP #{res.code} from #{url}"
      end
    end
  end
rescue SocketError, Errno::ECONNREFUSED, Net::OpenTimeout => e
  raise DownloadError, "Cannot reach #{uri.host}: #{e.message}"
end
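
The redirect budget can be exercised without a network by replacing the HTTP call with a lookup table. The sketch below is not the module's code: `follow`, `FAKE_RESPONSES`, the `example.test` URLs, and the local `DownloadError` class are stand-ins invented for the illustration. It resolves `Location` values with `URI.join`, on the assumption that a server may send a relative redirect target.

```ruby
require "uri"

# Local stand-in for the module's error class (assumption for this sketch).
class DownloadError < StandardError; end

# Hypothetical responses: a URL maps either to a redirect or to a body.
FAKE_RESPONSES = {
  "https://example.test/latest/pdfium.tgz" => { location: "/builds/0000/pdfium.tgz" },
  "https://example.test/builds/0000/pdfium.tgz" => { body: "tarball bytes" },
  "https://example.test/loop" => { location: "https://example.test/loop" }
}.freeze

# Follows redirects recursively, decrementing the budget on each hop.
def follow(url, redirect_limit: 10)
  raise DownloadError, "too many redirects" if redirect_limit <= 0

  res = FAKE_RESPONSES.fetch(url) { raise DownloadError, "HTTP 404 from #{url}" }
  return res[:body] if res.key?(:body)

  # URI.join leaves absolute targets untouched and resolves relative ones.
  follow(URI.join(url, res[:location]).to_s, redirect_limit: redirect_limit - 1)
end

puts follow("https://example.test/latest/pdfium.tgz")
```

A URL that redirects to itself uses up the budget and ends in `DownloadError` rather than looping forever.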

.expected_sha256(platform_key, pdfium_build) ⇒ Object

Map of expected SHA256 values. The platform-specific gem ships its own SHA values as part of package_data (a YAML file loaded at build time). The generic gem can use values supplied via ENV or leave them nil (download without verification, not recommended).



# File 'lib/rpdfium/binary/downloader.rb', line 37

def expected_sha256(platform_key, pdfium_build)
  return nil unless defined?(SHA256_DB)

  SHA256_DB.dig(pdfium_build, platform_key)
end
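
The `dig(pdfium_build, platform_key)` call implies a nested hash keyed first by build and then by platform. A minimal sketch of that shape follows; the build `"0000"`, the platform key, and the digest string are placeholders, and the real `SHA256_DB` (when defined) may be loaded differently.

```ruby
# Placeholder data: the build number, platform key, and digest are made up.
SHA256_DB = {
  "0000" => { "linux-x64" => "ab" * 32 }
}.freeze

def expected_sha256(platform_key, pdfium_build)
  return nil unless defined?(SHA256_DB)

  SHA256_DB.dig(pdfium_build, platform_key)
end

p expected_sha256("linux-x64", "0000")   # the placeholder digest
p expected_sha256("linux-x64", "latest") # nil: no entry for this build
```

Because unknown builds return nil, a caller that passes the result straight to `fetch!` as `sha256:` skips verification for them.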

.extract!(tgz_path, dest) ⇒ Object

Gzip tarball extraction. The bblanchon tarballs have the following layout:

include/      -> headers (not needed)
lib/libpdfium.so | bin/pdfium.dll
args.gn, build.txt, ...
LICENSE, LICENSES.chromium.html  -> always include these


# File 'lib/rpdfium/binary/downloader.rb', line 99

def extract!(tgz_path, dest)
  File.open(tgz_path, "rb") do |io|
    gz = Zlib::GzipReader.new(io)
    Gem::Package::TarReader.new(gz).each do |entry|
      next unless entry.file?

      name = entry.full_name
      # Skip suspicious paths
      next if name.include?("..")

      # Keep only:
      #  - the native binary (in lib/ or bin/)
      #  - license files
      keep = name.match?(%r{(?:^|/)lib/lib(?:pdfium)\.[^/]+$}) ||
             name.match?(%r{(?:^|/)bin/pdfium\.dll$}) ||
             name.match?(/LICENSE/i)
      next unless keep

      target = File.join(dest, name)
      FileUtils.mkdir_p(File.dirname(target))
      File.open(target, "wb") { |f| f.write(entry.read) }
      # Preserve executable permissions
      File.chmod(entry.header.mode & 0o7777, target) rescue nil
    end
  end
end
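
To see which tarball entries survive the filter, the three patterns can be tried against sample names. In the sketch below, `keep_entry?` is a helper name invented for the illustration and the entry names are examples, not a listing of a real tarball.

```ruby
# Same ".." check and three patterns as in extract!, wrapped in a helper.
def keep_entry?(name)
  return false if name.include?("..")

  name.match?(%r{(?:^|/)lib/lib(?:pdfium)\.[^/]+$}) ||
    name.match?(%r{(?:^|/)bin/pdfium\.dll$}) ||
    name.match?(/LICENSE/i)
end

# Example entry names (assumed for illustration).
%w[
  lib/libpdfium.so
  lib/libpdfium.dylib
  bin/pdfium.dll
  LICENSE
  include/fpdfview.h
  ../lib/libpdfium.so
].each do |name|
  puts format("%-22s %s", name, keep_entry?(name) ? "kept" : "skipped")
end
```

The `..` check is a substring test, so it also rejects harmless names that merely contain two consecutive dots.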

.fetch!(url:, dest_dir:, library_name:, sha256: nil) ⇒ Object

Downloads the asset, verifies the checksum (when one is given), and extracts only the needed files (libpdfium*, LICENSE) into dest_dir/lib/ and dest_dir/.



# File 'lib/rpdfium/binary/downloader.rb', line 45

def fetch!(url:, dest_dir:, library_name:, sha256: nil)
  FileUtils.mkdir_p(dest_dir)
  Dir.mktmpdir("rpdfium-binary-") do |tmp|
    tarball = File.join(tmp, "pdfium.tgz")
    download(url, tarball)
    verify_sha256!(tarball, sha256) if sha256
    extract!(tarball, tmp)
    install_into!(tmp, dest_dir, library_name)
  end
end
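
The block form of `Dir.mktmpdir` is what keeps a failed download or checksum from leaving partial files in the destination: intermediate work happens in a directory that is removed when the block exits, and the destination receives a file only in the final step. A minimal sketch of that pattern, with `stage_then_install` as an invented helper rather than part of this module:

```ruby
require "tmpdir"
require "fileutils"

# Runs the given block against a scratch path, then copies the result into
# dest_dir. The scratch directory is deleted when the mktmpdir block exits,
# whether or not the block raised.
def stage_then_install(dest_dir, filename)
  FileUtils.mkdir_p(dest_dir)
  Dir.mktmpdir("rpdfium-sketch-") do |tmp|
    staged = File.join(tmp, filename)
    yield staged # stands in for download + verify + extract
    FileUtils.cp(staged, File.join(dest_dir, filename))
  end
end

Dir.mktmpdir do |dest|
  stage_then_install(dest, "libpdfium.so") { |path| File.write(path, "fake binary") }
  puts File.read(File.join(dest, "libpdfium.so"))
end
```

The final copy is a plain `FileUtils.cp`, so this guards against partial results from earlier steps, not against an interruption during the copy itself.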

.install_into!(staging, dest_dir, library_name) ⇒ Object

Moves the needed files into the final destination:

dest_dir/<library_filename>     ← libpdfium.{so,dylib} | pdfium.dll
dest_dir/LICENSE.pdfium         ← LICENSE
dest_dir/LICENSES_third_party*  ← if present

Raises:

(DownloadError)



# File 'lib/rpdfium/binary/downloader.rb', line 130

def install_into!(staging, dest_dir, library_name)
  # Look for the binary in staging (it may be under lib/ or bin/)
  candidates = Dir[File.join(staging, "**", library_name)]
  if candidates.empty?
    # On macOS the file in lib/ may have a different name (e.g. with
    # versioning), so try a more permissive fallback.
    candidates = Dir[File.join(staging, "**", "libpdfium*")] +
                 Dir[File.join(staging, "**", "pdfium.dll")]
  end
  raise DownloadError, "no PDFium binary found in tarball" if candidates.empty?

  chosen = candidates.first
  FileUtils.cp(chosen, File.join(dest_dir, library_name))

  # Also save the licenses
  # Dir[File.join(staging, "**", "*LICENSE*")].each do |lic|
  #   FileUtils.cp(lic, File.join(dest_dir, File.basename(lic)))
  # end
end
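
The two-stage lookup (exact name first, permissive patterns second) can be tried against a scratch directory. In this sketch `find_binary` is an invented helper name and `libpdfium.1.dylib` is a hypothetical versioned file name used only to trigger the fallback.

```ruby
require "tmpdir"
require "fileutils"

# Exact-name lookup first, then the permissive patterns used above.
# Returns nil when nothing matches.
def find_binary(staging, library_name)
  exact = Dir[File.join(staging, "**", library_name)]
  return exact.first unless exact.empty?

  (Dir[File.join(staging, "**", "libpdfium*")] +
   Dir[File.join(staging, "**", "pdfium.dll")]).first
end

Dir.mktmpdir do |staging|
  FileUtils.mkdir_p(File.join(staging, "lib"))
  # Hypothetical versioned file name; the exact lookup will miss it.
  File.write(File.join(staging, "lib", "libpdfium.1.dylib"), "")
  puts find_binary(staging, "libpdfium.dylib")
end
```

Whichever file is found, the method above copies it to `dest_dir` under `library_name`, so a versioned source file ends up with the unversioned name.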

.verify_sha256!(path, expected) ⇒ Object

Integrity verification.

Raises:

(ChecksumError)



# File 'lib/rpdfium/binary/downloader.rb', line 84

def verify_sha256!(path, expected)
  actual = Digest::SHA256.file(path).hexdigest
  return if actual.casecmp(expected).zero?

  raise ChecksumError, "SHA256 mismatch for #{path}\n" \
                        "  expected: #{expected}\n" \
                        "  actual:   #{actual}"
end
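
The check above can be tried on a throwaway file. The sketch below defines a local `ChecksumError` as a stand-in for the module's class and shortens the error message; the file contents are arbitrary. It shows that the comparison ignores the case of the expected digest.

```ruby
require "digest"
require "tempfile"

# Local stand-in for the module's error class (assumption for this sketch).
class ChecksumError < StandardError; end

def verify_sha256!(path, expected)
  actual = Digest::SHA256.file(path).hexdigest
  return if actual.casecmp(expected).zero?

  raise ChecksumError, "SHA256 mismatch for #{path}"
end

Tempfile.create("rpdfium-sketch") do |f|
  f.write("hello")
  f.flush
  digest = Digest::SHA256.hexdigest("hello")
  verify_sha256!(f.path, digest.upcase) # passes: comparison ignores case
  begin
    verify_sha256!(f.path, "0" * 64)    # deliberately wrong digest
  rescue ChecksumError => e
    puts e.message
  end
end
```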