Class: Bundler::Spinel::Enricher
- Inherits: Object
  - Object
  - Bundler::Spinel::Enricher
- Defined in: lib/bundler/spinel/enricher.rb
Overview
Fetches public gem metadata from rubygems.org (description, total downloads, latest version and its date, homepage, license) for a list of gems and writes it to a sidecar meta.jsonl, one JSON line per gem. The catalog uses this metadata to show real descriptions, sort by popularity, and filter out low-signal or test gems with a downloads floor.
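As a rough illustration of how a catalog could consume the sidecar, the sketch below parses JSONL text, applies a downloads floor, and sorts by popularity. The "name", "downloads", and "info" keys, the helper names, and the sample values are all invented for illustration; this page does not show the real record fields.

```ruby
require "json"

# Invented sample data. The key names are assumptions, not the real schema.
SAMPLE_JSONL = <<~JSONL
  {"name":"alpha","downloads":5000,"info":"First sample gem"}
  {"name":"scratch-test","downloads":42,"info":"test"}
  {"name":"beta","downloads":90000,"info":"Second sample gem"}
JSONL

# Parse one JSON object per line, skipping blank lines.
def parse_jsonl(text)
  text.each_line.map(&:strip).reject(&:empty?).map { |line| JSON.parse(line) }
end

# Keep gems at or above the downloads floor, most downloaded first.
def catalog_entries(records, floor:)
  records
    .select { |r| r.fetch("downloads", 0) >= floor }
    .sort_by { |r| -r["downloads"] }
end

entries = catalog_entries(parse_jsonl(SAMPLE_JSONL), floor: 1_000)
entries.each { |r| puts "#{r['name']}: #{r['info']}" }
```

With a floor of 1,000 the low-download sample entry is dropped and the remaining two are listed most downloaded first.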
Append-only and resumable: a re-run skips gems already recorded, so a flaky network just needs another pass. Transient responses (anything other than 200 or 404) are left unrecorded so the next run retries them. The file is committed alongside the survey, so the deploy build renders the catalog offline, with no network access at build time.
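The append-only, resumable behaviour described above can be sketched as follows. This is not the class's own code: `recorded_names` and `append_missing` are hypothetical helpers, and the sketch assumes each record carries a "name" key, which this page does not confirm. A nil record stands in for a transient failure and is simply not written, so the next pass retries it.

```ruby
require "json"
require "set"
require "tempfile"

# Hypothetical helper: collect gem names already present in a JSONL sidecar.
# Assumes each record has a "name" key (an assumption, not the documented schema).
def recorded_names(path)
  return Set.new unless File.exist?(path)

  File.foreach(path).each_with_object(Set.new) do |line, seen|
    line = line.strip
    next if line.empty?
    begin
      seen << JSON.parse(line)["name"]
    rescue JSON::ParserError
      # A torn line from an interrupted run is ignored, so that gem is retried.
    end
  end
end

# Append records only for names not yet recorded. The block plays the role of
# the fetch; returning nil means "leave unrecorded and retry next run".
def append_missing(path, names)
  have = recorded_names(path)
  todo = names.uniq.reject { |n| have.include?(n) }
  File.open(path, "a") do |f|
    todo.each do |name|
      rec = yield(name)
      next unless rec
      f.puts(JSON.generate(rec))
      f.flush
    end
  end
  todo
end

sidecar = Tempfile.new(["meta", ".jsonl"])
# First pass: "beta" fails transiently and is not written.
first = append_missing(sidecar.path, %w[alpha beta]) { |n| n == "beta" ? nil : { "name" => n } }
# Second pass: only "beta" is attempted again.
second = append_missing(sidecar.path, %w[alpha beta]) { |n| { "name" => n } }
puts first.inspect   # ["alpha", "beta"]
puts second.inspect  # ["beta"]
```

Because the file itself is the only state, nothing else needs to be persisted between runs.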
Constant Summary
- HOST = "rubygems.org"
Instance Method Summary
- #initialize(out:, jobs: 8) ⇒ Enricher (constructor)
  A new instance of Enricher.
- #run(names, progress: $stderr) ⇒ Object
  names: Array<String>.
Constructor Details
#initialize(out:, jobs: 8) ⇒ Enricher
Returns a new instance of Enricher.
    # File 'lib/bundler/spinel/enricher.rb', line 20
    def initialize(out:, jobs: 8)
      @out = out
      @jobs = jobs
      @write = Mutex.new
    end
Instance Method Details
#run(names, progress: $stderr) ⇒ Object
names: Array<String>. Appends one JSON line to the output file for each newly fetched gem; names already recorded are skipped.
    # File 'lib/bundler/spinel/enricher.rb', line 27
    def run(names, progress: $stderr)
      have = existing
      todo = names.uniq.reject { |n| have.include?(n) }
      queue = Queue.new
      todo.each { |n| queue << n }
      total = todo.size
      done = 0
      progress&.puts("[enrich] #{have.size} already recorded, #{total} to fetch")
      File.open(@out, "a") do |f|
        # At least one worker, at most @jobs, never more than there are names.
        workers = Array.new([@jobs, [total, 1].max].min) do
          Thread.new do
            http = open_http
            until queue.empty?
              name = (queue.pop(true) rescue break)
              rec = fetch(http, name)
              @write.synchronize do
                # IO#puts returns nil, so `puts(...) && flush` would never flush.
                if rec
                  f.puts(JSON.generate(rec))
                  f.flush
                end
                done += 1
                progress&.print("\r[enrich] #{done}/#{total} #{name.ljust(30)}")
              end
            end
            http.finish if http.started?
          end
        end
        workers.each(&:join)
      end
      progress&.puts("\r[enrich] #{done}/#{total} done#{' ' * 30}")
    end
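The worker-pool pattern used by #run (a shared Queue drained by a bounded number of threads, with writes serialised through a Mutex) can be tried in isolation with the sketch below. It is a simplified stand-in, not the class itself: `fake_fetch` and `drain` are hypothetical names, and the network call is replaced by a stub that returns nil for names starting with "flaky" to mimic a transient failure.

```ruby
require "json"
require "stringio"

# Hypothetical stand-in for the network call: returns a record Hash, or nil
# to signal a transient failure that should stay unrecorded.
def fake_fetch(name)
  name.start_with?("flaky") ? nil : { "name" => name }
end

# Drain a queue of names with a bounded pool of threads, serialising writes
# to the shared IO through a Mutex. Returns how many names were processed.
def drain(names, io, jobs: 4)
  queue = Queue.new
  names.each { |n| queue << n }
  lock = Mutex.new
  done = 0

  workers = Array.new([jobs, [names.size, 1].max].min) do
    Thread.new do
      loop do
        name = begin
          queue.pop(true) # non-blocking pop raises ThreadError when empty
        rescue ThreadError
          break
        end
        rec = fake_fetch(name)
        lock.synchronize do
          if rec
            io.puts(JSON.generate(rec))
            io.flush
          end
          done += 1
        end
      end
    end
  end
  workers.each(&:join)
  done
end

out = StringIO.new
count = drain(%w[alpha beta flaky-gamma delta], out, jobs: 2)
puts "processed #{count}, recorded #{out.string.lines.size}"
```

The non-blocking pop avoids a worker hanging forever when another thread takes the last item between an emptiness check and the pop. Line order in the output depends on thread scheduling, so consumers should not rely on it.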