Module: Ucode::Fetch::CodeCharts

Defined in:
lib/ucode/fetch/code_charts.rb

Overview

Downloads per-block Code Charts PDFs from unicode.org/charts/PDF/.

URL pattern: `www.unicode.org/charts/PDF/U<XXXX>.pdf`, where `XXXX` is the block’s first codepoint in hexadecimal, zero-padded to 4 digits (5–6 digits for planes > 0).
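
The padding rule above can be sketched as follows. This is a minimal illustration, not the module's own `hex_pad`: the helper names `chart_filename` and `chart_url`, the `https://` scheme, and the uppercase hex digits are assumptions made for the example.

```ruby
# Hypothetical helpers (not part of Ucode): build the chart filename and URL
# for a block's first codepoint.
def chart_filename(first_cp)
  # %04X pads to at least four uppercase hex digits and never truncates,
  # so supplementary-plane codepoints keep their natural 5-6 digits.
  format("U%04X.pdf", first_cp)
end

# The base URL default is an assumption; the real value comes from
# Ucode.configuration.charts_base_url.
def chart_url(first_cp, base: "https://www.unicode.org/charts/PDF")
  "#{base}/#{chart_filename(first_cp)}"
end

puts chart_filename(0x0370)   # BMP block: padded to 4 digits
puts chart_filename(0x1F600)  # plane 1 block: 5 digits, no padding needed
puts chart_url(0x0000)
```

Using a minimum-width format rather than a fixed width means one expression covers both the BMP and the higher planes.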

Class Method Details

.call(version, block_first_cps:, force: false) ⇒ Integer

Returns number of PDFs downloaded.

Parameters:

  • version (String)

    used as the on-disk cache namespace. PDFs are not versioned on unicode.org, so this argument only selects the cache directory and does not affect the download URL.

  • block_first_cps (Array<Integer>)

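    first codepoint of each block to download. Required: the method iterates the list directly, so passing nil raises an error. A caller without a list is expected to derive one from `Parsers::Blocks`, for example via `.first_cps_from` (the PDF URL is `U<hex>.pdf`).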

  • force (Boolean) (defaults to: false)

    re-download even if cached.

Returns:

  • (Integer)

    number of PDFs downloaded.



# File 'lib/ucode/fetch/code_charts.rb', line 20

def call(version, block_first_cps:, force: false)
  # Make sure the per-version cache directory and its PDF subdirectory exist.
  Cache.ensure_version_dir!(version)
  pdfs_dir = Cache.pdfs_dir(version)
  pdfs_dir.mkpath

  downloaded = 0
  block_first_cps.each do |first_cp|
    filename = "U#{hex_pad(first_cp)}.pdf"
    dest = pdfs_dir.join(filename)
    # Skip files that are already cached unless a re-download is forced.
    next if dest.exist? && !force

    url = "#{Ucode.configuration.charts_base_url}/#{filename}"
    Http.get(url, dest: dest)
    downloaded += 1
  end
  # Only files actually fetched in this run are counted.
  downloaded
end
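
The caching behaviour of the loop above can be exercised without network access. The sketch below is an assumption-laden stand-in, not the module itself: `download_missing` and `demo_counts` are hypothetical names, a lambda replaces `Http.get`, and a temporary directory replaces `Cache.pdfs_dir`.

```ruby
require "pathname"
require "tmpdir"

# Hypothetical stand-in for the loop in .call: the block passed as `fetch`
# replaces Http.get so the sketch runs offline.
def download_missing(dir, first_cps, force: false, &fetch)
  downloaded = 0
  first_cps.each do |cp|
    dest = dir.join(format("U%04X.pdf", cp))
    next if dest.exist? && !force # cached and not forced: skip

    fetch.call(dest)
    downloaded += 1
  end
  downloaded
end

# Runs three passes over the same directory and returns the three counts.
def demo_counts
  Dir.mktmpdir do |tmp|
    dir = Pathname.new(tmp)
    fake_fetch = ->(dest) { dest.write("%PDF-stub") }
    cps = [0x0000, 0x0370, 0x1F600]
    [
      download_missing(dir, cps, &fake_fetch),              # cold cache
      download_missing(dir, cps, &fake_fetch),              # everything cached
      download_missing(dir, cps, force: true, &fake_fetch)  # forced refresh
    ]
  end
end

p demo_counts # => [3, 0, 3]
```

The return value counts downloads performed in the current run, so a second run over a warm cache reports 0 rather than the number of files on disk.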

.first_cps_from(blocks) ⇒ Array<Integer>

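Build the list of block first codepoints from a parsed Blocks index. The caller passes the output of `Ucode::Parsers::Blocks.each_record`. As implemented, the method calls `range_first` on every element, so `blocks` must be a collection of records that respond to `range_first` rather than a bare `block_id => first_cp` hash.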

Parameters:

  • blocks (Enumerable)

    parsed block records; each element must respond to `range_first`.

Returns:

  • (Array<Integer>)

    first-cp values



# File 'lib/ucode/fetch/code_charts.rb', line 44

def first_cps_from(blocks)
  blocks.map(&:range_first)
end
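
A minimal sketch of the input this method expects, assuming a record type with a `range_first` reader. The `BlockRecord` struct and its other fields are hypothetical; only `range_first` is taken from the source above.

```ruby
# Hypothetical record shape: only `range_first` comes from the source;
# `name` and `range_last` are assumptions for illustration.
BlockRecord = Struct.new(:name, :range_first, :range_last)

# Same body as the documented method, repeated here so the sketch is
# self-contained.
def first_cps_from(blocks)
  blocks.map(&:range_first)
end

SAMPLE_BLOCKS = [
  BlockRecord.new("Basic Latin", 0x0000, 0x007F),
  BlockRecord.new("Greek and Coptic", 0x0370, 0x03FF),
  BlockRecord.new("Emoticons", 0x1F600, 0x1F64F)
].freeze

# Print the first codepoints in the same hex form used by the chart filenames.
puts first_cps_from(SAMPLE_BLOCKS).map { |cp| format("%04X", cp) }.join(", ")
```

The resulting array is what `.call` takes as `block_first_cps:`.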