Class: Iev::Exporter

Inherits:
Object
Defined in:
lib/iev/exporter.rb

Overview

Exports IEV data to Glossarist YAML format.

Automatically detects input format from file extension:

.xlsx / .xls   → Excel IEV export
.sqlite3 / .sqlite / .db → SQLite database
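
The extension-based detection described above could be sketched as follows. The two constants mirror the ones documented on this page; `detect_format`, its return symbols, and the case-insensitive comparison are assumptions made for illustration, not the library's actual code.

```ruby
require "pathname"

# Extension lists copied from the constants documented on this page.
XLSX_EXTENSIONS = %w[.xlsx .xls].freeze
SQLITE_EXTENSIONS = %w[.sqlite3 .sqlite .db].freeze

# Hypothetical helper: map a path to a format symbol by its extension.
# Lower-casing the extension is an assumption, not documented behaviour.
def detect_format(path)
  ext = Pathname.new(path).extname.downcase
  return :xlsx if XLSX_EXTENSIONS.include?(ext)
  return :sqlite if SQLITE_EXTENSIONS.include?(ext)

  raise ArgumentError, "unsupported input format: #{ext.inspect}"
end

detect_format("data.xlsx")   # => :xlsx
detect_format("/tmp/iev.db") # => :sqlite
```

Raising on an unknown extension keeps the failure at construction time, which is consistent with the constructor calling `validate_input!` before anything else.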

Examples:

Programmatic usage

exporter = Iev::Exporter.new("data.xlsx", output_dir: "/tmp/output")
collection = exporter.export

With filters

Iev::Exporter.new("data.sqlite3",
  output_dir: "/tmp/output",
  only_concepts: "103-%",
  only_languages: "en,fr",
).export
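
With a progress callback

The `on_progress` option takes a callable that receives `(current, total)`. The sketch below shows one possible callback; `format_progress` and the stand-in loop are hypothetical and only simulate the calls, since how often the exporter invokes the callback is not documented here.

```ruby
# Hypothetical formatter for a (current, total) progress update.
def format_progress(current, total)
  percent = total.zero? ? 100 : (current * 100 / total)
  "#{current}/#{total} concepts (#{percent}%)"
end

# A callable with the documented (current, total) signature.
on_progress = ->(current, total) { warn format_progress(current, total) }

# Stand-in loop. With the real class, the lambda would instead be passed
# as `on_progress:` to Iev::Exporter.new.
total = 4
1.upto(total) { |current| on_progress.call(current, total) }
```

Writing to standard error with `warn` keeps progress output separate from anything the caller prints to standard output.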

Constant Summary

XLSX_EXTENSIONS = %w[.xlsx .xls].freeze
SQLITE_EXTENSIONS = %w[.sqlite3 .sqlite .db].freeze

Constructor Details

#initialize(input_path, output_dir: Dir.pwd, only_concepts: nil, only_languages: nil, fetch_relaton_links: false, on_progress: nil) ⇒ Exporter

Returns a new instance of Exporter.

Parameters:

  • input_path (String, Pathname)

    path to Excel or SQLite file

  • output_dir (String, Pathname) (defaults to: Dir.pwd)

    destination for YAML files

  • only_concepts (String, nil) (defaults to: nil)

    SQL LIKE pattern for IEVREF filtering

  • only_languages (String, nil) (defaults to: nil)

    comma-separated language codes

  • fetch_relaton_links (Boolean) (defaults to: false)

    fetch source URLs via Relaton

  • on_progress (Proc, nil) (defaults to: nil)

    callback invoked with (current, total) during the build step
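
To make the two filter formats concrete, here is a minimal sketch of how a SQL LIKE pattern such as `"103-%"` and a comma-separated language list could be interpreted in plain Ruby. Both helpers are hypothetical: the exporter may hand the pattern straight to SQLite, and SQL case-sensitivity rules are ignored here.

```ruby
# Hypothetical helper: translate a SQL LIKE pattern into an anchored Regexp.
# "%" matches any run of characters, "_" matches exactly one character.
def like_to_regexp(pattern)
  body = pattern.each_char.map do |ch|
    case ch
    when "%" then ".*"
    when "_" then "."
    else Regexp.escape(ch)
    end
  end.join
  Regexp.new("\\A#{body}\\z", Regexp::MULTILINE)
end

# Hypothetical helper: split a comma-separated language list,
# tolerating spaces and empty entries.
def parse_languages(value)
  value.to_s.split(",").map(&:strip).reject(&:empty?)
end

matcher = like_to_regexp("103-%")
matcher.match?("103-01-02") # => true
matcher.match?("113-01-02") # => false
parse_languages("en, fr")   # => ["en", "fr"]
```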



# File 'lib/iev/exporter.rb', line 32

def initialize(input_path, output_dir: Dir.pwd,
               only_concepts: nil, only_languages: nil,
               fetch_relaton_links: false,
               on_progress: nil)
  @input_path = Pathname.new(input_path)
  validate_input!

  @output_dir = Pathname.new(output_dir)
  @fetch_relaton_links = fetch_relaton_links
  @on_progress = on_progress
  @filters = {
    only_concepts: only_concepts,
    only_languages: only_languages,
  }.compact
end
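
The `.compact` call at the end of the constructor means that filters left at `nil` never appear in `filters`. A small standalone sketch of that behaviour, where `build_filters` is a hypothetical stand-in for the relevant lines of the constructor:

```ruby
# Hypothetical stand-in for the filter-building lines of #initialize.
def build_filters(only_concepts: nil, only_languages: nil)
  {
    only_concepts: only_concepts,
    only_languages: only_languages,
  }.compact # Hash#compact drops every key whose value is nil
end

build_filters(only_concepts: "103-%").keys # => [:only_concepts]
build_filters.empty?                       # => true
```

Callers can therefore test `filters.empty?` to check whether any filtering was requested.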

Instance Attribute Details

#filters ⇒ Object (readonly)

Returns the value of attribute filters.



# File 'lib/iev/exporter.rb', line 24

def filters
  @filters
end

#input_path ⇒ Object (readonly)

Returns the value of attribute input_path.



# File 'lib/iev/exporter.rb', line 24

def input_path
  @input_path
end

#output_dir ⇒ Object (readonly)

Returns the value of attribute output_dir.



# File 'lib/iev/exporter.rb', line 24

def output_dir
  @output_dir
end

#stats ⇒ Hash? (readonly)

Returns the stats from the last export, or nil if #export has not run yet.

Returns:

  • (Hash, nil)

    hash with :concept_count, :localized_count, and :elapsed_seconds, or nil if #export has not run yet



# File 'lib/iev/exporter.rb', line 66

def stats
  @stats
end

Instance Method Details

#export ⇒ Glossarist::ManagedConceptCollection

Runs the export pipeline (load → transform → save), records timing and counts in #stats, and returns the collection.

Returns:

  • (Glossarist::ManagedConceptCollection)


# File 'lib/iev/exporter.rb', line 50

def export
  start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  dataset = load_dataset
  collection = build_collection(dataset)
  save_collection(collection)
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_time

  @stats = {
    concept_count: collection.count,
    localized_count: localized_count(collection),
    elapsed_seconds: elapsed,
  }
  collection
end
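
The timing in `export` uses the monotonic clock, so the elapsed value is not affected by system clock adjustments during a long run. A self-contained sketch of the same pattern follows; `measure` and `run_pipeline` are hypothetical, and a de-duplicated array stands in for the real Glossarist collection.

```ruby
# Hypothetical helper: time a block with the monotonic clock and
# return both the block's result and the elapsed seconds.
def measure
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
end

# Stand-in pipeline: the "collection" here is just a de-duplicated array,
# not a Glossarist::ManagedConceptCollection.
def run_pipeline(rows)
  collection, elapsed = measure { rows.uniq }
  { concept_count: collection.count, elapsed_seconds: elapsed }
end

stats = run_pipeline(%w[103-01-01 103-01-02 103-01-01])
stats[:concept_count] # => 2
```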