Class: Woods::Unblocked::Exporter

Inherits:
Object
  • Object
Defined in:
lib/woods/unblocked/exporter.rb

Overview

Orchestrates syncing Woods extraction data to an Unblocked collection.

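Reads extraction output from disk via IndexReader, converts units to condensed Markdown documents, and pushes them to the collection via the Unblocked Documents API. All syncs are idempotent: documents are upserted by URI, so re-running a sync overwrites existing documents instead of duplicating them.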

Examples:

exporter = Exporter.new(index_dir: "tmp/woods")
stats = exporter.sync_all
# => { synced: 940, skipped: 5060, errors: [] }
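
The idempotency claim in the overview can be illustrated with a minimal sketch. FakeCollection and its upsert method are hypothetical stand-ins, not part of Woods or the Unblocked API; they only model "upsert by URI" semantics, and the example URI is made up.

```ruby
# Hypothetical in-memory stand-in for an Unblocked collection.
# The real Documents API is not shown on this page; this class only
# models the "upsert by URI" behavior described in the overview.
class FakeCollection
  attr_reader :documents

  def initialize
    @documents = {}
  end

  # Insert the document, or overwrite the one already stored under the same URI.
  def upsert(uri:, title:, body:)
    @documents[uri] = { title: title, body: body }
  end
end

collection = FakeCollection.new
doc = { uri: 'https://example.com/repo/app/models/user.rb', title: 'User', body: '# User' }

# Pushing the same document twice still leaves one entry for that URI.
2.times { collection.upsert(**doc) }
puts collection.documents.size
```

Because the URI is the key, a repeated sync converges on the same collection state rather than accumulating copies.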

Constant Summary

MAX_ERRORS =
100
FULL_SYNC_TYPES =

Unit types to sync, in priority order. All units are synced for these types.

%w[
  model controller service job mailer manager decorator concern serializer
  graphql graphql_type graphql_mutation graphql_resolver graphql_query
].freeze
PARTIAL_SYNC_TYPES =

Unit types where only the most-connected units are synced. Each entry: [type, max_count]

[
  ['poro', 100],
  ['lib', 50]
].freeze

Constructor Details

#initialize(index_dir:, config: Woods.configuration, client: nil, reader: nil, output: $stdout) ⇒ Exporter

Returns a new instance of Exporter.

Parameters:

  • index_dir (String)

    Path to extraction output directory

  • config (Configuration) (defaults to: Woods.configuration)

    Woods configuration (default: global config)

  • client (Client, nil) (defaults to: nil)

    Unblocked API client (auto-created from config if nil)

  • reader (Object, nil) (defaults to: nil)

    IndexReader instance (auto-created if nil)

  • output (IO) (defaults to: $stdout)

    Progress output stream (default: $stdout)

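Raises:

  • (ConfigurationError)

    if unblocked_collection_id, unblocked_repo_url, or unblocked_api_token is not set in the configuration
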
# File 'lib/woods/unblocked/exporter.rb', line 44

def initialize(index_dir:, config: Woods.configuration, client: nil, reader: nil, output: $stdout)
  @collection_id = config.unblocked_collection_id
  raise ConfigurationError, 'unblocked_collection_id is required' unless @collection_id

  repo_url = config.unblocked_repo_url
  raise ConfigurationError, 'unblocked_repo_url is required' unless repo_url

  api_token = config.unblocked_api_token
  raise ConfigurationError, 'unblocked_api_token is required' unless api_token

  budget = ENV.fetch('UNBLOCKED_DAILY_BUDGET', RateLimiter::DEFAULT_BUDGET.to_s).to_i
  limiter = RateLimiter.new(daily_budget: budget)

  @client = client || Client.new(api_token: api_token, rate_limiter: limiter)
  @reader = reader || build_reader(index_dir)
  @builder = DocumentBuilder.new(repo_url: repo_url)
  @output = output
end
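
The daily budget lookup in the constructor relies on String#to_i, which is worth seeing in isolation. The sketch below is a standalone approximation: DEFAULT_BUDGET here is a placeholder value (the real RateLimiter::DEFAULT_BUDGET is not shown on this page), daily_budget is a hypothetical helper, and a plain Hash stands in for ENV.

```ruby
# Placeholder value; the real RateLimiter::DEFAULT_BUDGET is not shown on this page.
DEFAULT_BUDGET = 1000

# Mirrors the constructor's lookup: use the variable if set, otherwise the
# default, then coerce the string with to_i. A Hash stands in for ENV here.
def daily_budget(env)
  env.fetch('UNBLOCKED_DAILY_BUDGET', DEFAULT_BUDGET.to_s).to_i
end

puts daily_budget({})                                 # 1000
puts daily_budget('UNBLOCKED_DAILY_BUDGET' => '250')  # 250
puts daily_budget('UNBLOCKED_DAILY_BUDGET' => 'abc')  # 0
puts daily_budget('UNBLOCKED_DAILY_BUDGET' => '')     # 0
```

Since to_i returns 0 for non-numeric or empty strings, a mistyped UNBLOCKED_DAILY_BUDGET silently becomes a budget of 0 rather than raising. How RateLimiter treats a zero budget is not documented here.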

Instance Method Details

#sync_all ⇒ Hash

Sync all configured unit types to the Unblocked collection.

Returns:

  • (Hash)

    { synced: Integer, skipped: Integer, errors: Array<String> }



# File 'lib/woods/unblocked/exporter.rb', line 66

def sync_all
  synced = 0
  skipped = 0
  errors = []

  FULL_SYNC_TYPES.each do |type|
    result = sync_type(type)
    synced += result[:synced]
    skipped += result[:skipped]
    errors.concat(result[:errors])
  end

  PARTIAL_SYNC_TYPES.each do |type, max_count|
    result = sync_type_partial(type, max_count)
    synced += result[:synced]
    skipped += result[:skipped]
    errors.concat(result[:errors])
  end

  { synced: synced, skipped: skipped, errors: cap_errors(errors) }
end
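
The accumulation pattern in sync_all can be sketched on its own. The combine helper and the sample results below are hypothetical, and the body of cap_errors is an assumption: the real method is not shown on this page, so truncating to MAX_ERRORS entries is only a guess based on the constant's name.

```ruby
MAX_ERRORS = 100

# Assumption: the real cap_errors is not shown; here it keeps at most
# MAX_ERRORS messages and drops the rest.
def cap_errors(errors)
  errors.first(MAX_ERRORS)
end

# Hypothetical helper that folds per-type result hashes into one stats
# hash, the way sync_all does inline.
def combine(results)
  totals = { synced: 0, skipped: 0, errors: [] }
  results.each do |result|
    totals[:synced] += result[:synced]
    totals[:skipped] += result[:skipped]
    totals[:errors].concat(result[:errors])
  end
  totals.merge(errors: cap_errors(totals[:errors]))
end

# Made-up per-type results for illustration.
results = [
  { synced: 3, skipped: 1, errors: [] },
  { synced: 0, skipped: 2, errors: ['Foo: request timed out'] }
]
stats = combine(results)
p stats
```

Under this assumption the errors array in the returned stats never exceeds MAX_ERRORS entries, while the synced and skipped totals remain exact.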

#sync_type(type) ⇒ Hash

Sync all units of a given type.

Parameters:

  • type (String)

    Unit type (e.g. “model”, “controller”)

Returns:

  • (Hash)

    { synced: Integer, skipped: Integer, errors: Array<String> }



# File 'lib/woods/unblocked/exporter.rb', line 92

def sync_type(type)
  units = @reader.list_units(type: type)
  log "  #{type}: #{units.size} units"

  sync_units(units)
end

#sync_type_partial(type, max_count) ⇒ Hash

Sync the top N most-connected units of a type (by dependent count).

Parameters:

  • type (String)

    Unit type

  • max_count (Integer)

    Maximum units to sync

Returns:

  • (Hash)

    { synced: Integer, skipped: Integer, errors: Array<String> }



# File 'lib/woods/unblocked/exporter.rb', line 104

def sync_type_partial(type, max_count)
  units = @reader.list_units(type: type)
  return empty_stats if units.empty?

  # Load full data to sort by dependent count
  units_with_data = units.filter_map do |entry|
    data = @reader.find_unit(entry['identifier'])
    next unless data

    dep_count = (data['dependents'] || []).size
    { entry: entry, data: data, dep_count: dep_count }
  end

  top_units = units_with_data.sort_by { |u| -u[:dep_count] }.first(max_count)
  skipped_count = [units.size - max_count, 0].max

  log "  #{type}: #{top_units.size}/#{units.size} units (top by dependents)"

  result = sync_unit_data(top_units.map { |u| [u[:entry], u[:data]] })
  result[:skipped] += skipped_count
  result
end
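
The ranking step can be tried in isolation. In this sketch the units array is made-up data standing in for what @reader.find_unit returns; only the 'identifier' and 'dependents' keys are taken from the method above.

```ruby
# Hypothetical data: each hash stands in for one unit's loaded data.
units = [
  { 'identifier' => 'Money',     'dependents' => %w[Order Invoice Cart] },
  { 'identifier' => 'Slugifier', 'dependents' => nil },
  { 'identifier' => 'Paginator', 'dependents' => %w[UsersController] },
  { 'identifier' => 'Clock',     'dependents' => %w[Order Job] }
]
max_count = 2

# A missing dependents list counts as zero, as in sync_type_partial.
ranked = units.map { |u| { id: u['identifier'], dep_count: (u['dependents'] || []).size } }

# Negating the count sorts in descending order; first(max_count) keeps the top N.
top = ranked.sort_by { |u| -u[:dep_count] }.first(max_count)
skipped = [units.size - max_count, 0].max

puts top.map { |u| u[:id] }.inspect  # ["Money", "Clock"]
puts skipped                          # 2
```

Two details of the method above are worth keeping in mind. sort_by is not guaranteed to be stable, so units with equal dependent counts may be selected in any order. The skipped count is derived from units.size, so entries that find_unit cannot load are only reflected in it when enough other units remain to fill max_count.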