Class: Woods::Unblocked::Exporter

Inherits: Object

Defined in: lib/woods/unblocked/exporter.rb

Overview

Orchestrates syncing Woods extraction data to an Unblocked collection.

Reads extraction output from disk via IndexReader, converts units to condensed Markdown documents, and pushes them to the Unblocked Documents API. All syncs are idempotent: documents are upserted by URI.
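The idempotency claim follows from keying each document by its URI. A minimal sketch of that idea, using a hypothetical in-memory `FakeCollection` in place of the real Unblocked Documents API (the class, its `upsert` signature, and the example URI are assumptions made for illustration only):

```ruby
# Hypothetical in-memory stand-in for a remote collection keyed by URI.
# It is NOT the Unblocked API; it only illustrates upsert semantics.
class FakeCollection
  attr_reader :documents

  def initialize
    @documents = {}
  end

  # Insert the document, or overwrite whatever is already stored under +uri+.
  def upsert(uri:, title:, body:)
    @documents[uri] = { title: title, body: body }
  end
end

collection = FakeCollection.new
doc = { uri: 'https://example.com/repo/app/models/user.rb', title: 'User', body: '# User model' }

# Pushing the same document twice still leaves exactly one entry.
2.times { collection.upsert(**doc) }
puts collection.documents.size
```

Because the URI is the key, repeating a sync replaces documents in place instead of accumulating duplicates.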

Examples:

exporter = Woods::Unblocked::Exporter.new(index_dir: "tmp/woods")
stats = exporter.sync_all
# => { synced: 940, skipped: 5060, errors: [] }

Constant Summary

MAX_ERRORS =

FULL_SYNC_TYPES =

Unit types to sync, in priority order. All units of these types are synced.

%w[
  model controller service job mailer manager decorator concern serializer
  graphql graphql_type graphql_mutation graphql_resolver graphql_query
].freeze

PARTIAL_SYNC_TYPES =

Unit types where only the most-connected units are synced. Each entry is [type, max_count].

[
  ['poro', 100],
  ['lib', 50]
].freeze

Instance Method Summary

#initialize(index_dir:, config: Woods.configuration, client: nil, reader: nil, output: $stdout) ⇒ Exporter (constructor)

A new instance of Exporter.

#sync_all ⇒ Hash

Sync all configured unit types to the Unblocked collection.

#sync_type(type) ⇒ Hash

Sync all units of a given type.

#sync_type_partial(type, max_count) ⇒ Hash

Sync the top N most-connected units of a type (by dependent count).

Constructor Details

#initialize(index_dir:, config: Woods.configuration, client: nil, reader: nil, output: $stdout) ⇒ `Exporter`

Returns a new instance of Exporter.

Parameters:

index_dir (String): Path to the extraction output directory.

config (Configuration) (defaults to: Woods.configuration): Woods configuration.

client (Client, nil) (defaults to: nil): Unblocked API client (auto-created from config if nil).

reader (Object, nil) (defaults to: nil): IndexReader instance (auto-created if nil).

output (IO) (defaults to: $stdout): Progress output stream.

Raises:

(ConfigurationError): If required config is missing (unblocked_collection_id, unblocked_repo_url, or unblocked_api_token).

# File 'lib/woods/unblocked/exporter.rb', line 44

def initialize(index_dir:, config: Woods.configuration, client: nil, reader: nil, output: $stdout)
  @collection_id = config.unblocked_collection_id
  raise ConfigurationError, 'unblocked_collection_id is required' unless @collection_id

  repo_url = config.unblocked_repo_url
  raise ConfigurationError, 'unblocked_repo_url is required' unless repo_url

  api_token = config.unblocked_api_token
  raise ConfigurationError, 'unblocked_api_token is required' unless api_token

  budget = ENV.fetch('UNBLOCKED_DAILY_BUDGET', RateLimiter::DEFAULT_BUDGET.to_s).to_i
  limiter = RateLimiter.new(daily_budget: budget)

  @client = client || Client.new(api_token: api_token, rate_limiter: limiter)
  @reader = reader || build_reader(index_dir)
  @builder = DocumentBuilder.new(repo_url: repo_url)
  @output = output
end
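The daily budget is read from the environment as a string and converted with `to_i`. A small sketch of that lookup in isolation; `SKETCH_DEFAULT_BUDGET` and its value are hypothetical stand-ins, since the real `RateLimiter::DEFAULT_BUDGET` is not shown on this page:

```ruby
# Hypothetical default used only for this sketch; the real
# RateLimiter::DEFAULT_BUDGET value is not documented here.
SKETCH_DEFAULT_BUDGET = 1000

# Mirrors the ENV.fetch(...).to_i pattern from the constructor.
# +env+ accepts ENV or any Hash-like object that responds to #fetch.
def daily_budget(env = ENV)
  env.fetch('UNBLOCKED_DAILY_BUDGET', SKETCH_DEFAULT_BUDGET.to_s).to_i
end

puts daily_budget({})                                    # falls back to the default
puts daily_budget({ 'UNBLOCKED_DAILY_BUDGET' => '250' }) # explicit override
puts daily_budget({ 'UNBLOCKED_DAILY_BUDGET' => 'abc' }) # String#to_i yields 0 here
```

One consequence worth knowing: `String#to_i` returns 0 for non-numeric input, so a mistyped `UNBLOCKED_DAILY_BUDGET` would produce a zero budget under this pattern instead of raising.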

Instance Method Details

#sync_all ⇒ `Hash`

Sync all configured unit types to the Unblocked collection.

Returns:

(Hash): { synced: Integer, skipped: Integer, errors: Array<String> }

# File 'lib/woods/unblocked/exporter.rb', line 66

def sync_all
  synced = 0
  skipped = 0
  errors = []

  FULL_SYNC_TYPES.each do |type|
    result = sync_type(type)
    synced += result[:synced]
    skipped += result[:skipped]
    errors.concat(result[:errors])
  end

  PARTIAL_SYNC_TYPES.each do |type, max_count|
    result = sync_type_partial(type, max_count)
    synced += result[:synced]
    skipped += result[:skipped]
    errors.concat(result[:errors])
  end

  { synced: synced, skipped: skipped, errors: cap_errors(errors) }
end
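The aggregation in `sync_all` can be tried in isolation. The sketch below sums per-type result hashes and then truncates the error list. `cap_errors` is a private helper whose body is not shown on this page, and the value of `MAX_ERRORS` is not listed either, so both `SKETCH_MAX_ERRORS` and the "keep the first N messages" behaviour are assumptions:

```ruby
# Hypothetical cap; the real MAX_ERRORS value is not shown on this page.
SKETCH_MAX_ERRORS = 3

# Sum the counters and concatenate the error lists of several result hashes.
def merge_stats(results)
  totals = { synced: 0, skipped: 0, errors: [] }
  results.each do |result|
    totals[:synced] += result[:synced]
    totals[:skipped] += result[:skipped]
    totals[:errors].concat(result[:errors])
  end
  totals
end

# Assumed behaviour of a cap_errors-style helper: keep the first N messages.
def cap_errors(errors)
  errors.first(SKETCH_MAX_ERRORS)
end

results = [
  { synced: 3, skipped: 0, errors: ['a failed'] },
  { synced: 1, skipped: 4, errors: ['b failed', 'c failed', 'd failed'] }
]

totals = merge_stats(results)
stats = totals.merge(errors: cap_errors(totals[:errors]))
p stats
```

Capping only at the end keeps the counters exact while bounding the size of the returned error list.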

#sync_type(type) ⇒ `Hash`

Sync all units of a given type.

Parameters:

type (String): Unit type (e.g. "model", "controller").

Returns:

(Hash): { synced: Integer, skipped: Integer, errors: Array<String> }

# File 'lib/woods/unblocked/exporter.rb', line 92

def sync_type(type)
  units = @reader.list_units(type: type)
  log "  #{type}: #{units.size} units"

  sync_units(units)
end
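Since the constructor accepts a `reader:` argument, a duck-typed stand-in may be handy in tests. The sketch below implements only the two reader calls visible in the source on this page, `list_units(type:)` and `find_unit(identifier)`. The `FakeReader` class and the `'type'` key on each entry are assumptions; only the `'identifier'` key appears in the source shown here:

```ruby
# Hypothetical stand-in for IndexReader. It exposes only the two methods
# that the exporter source on this page is seen calling.
class FakeReader
  def initialize(units)
    @units = units
  end

  # Entries are Hashes; the 'type' key is an assumption made for this sketch.
  def list_units(type:)
    @units.select { |unit| unit['type'] == type }
  end

  # Returns the matching entry, or nil when the identifier is unknown.
  def find_unit(identifier)
    @units.find { |unit| unit['identifier'] == identifier }
  end
end

reader = FakeReader.new([
  { 'identifier' => 'User', 'type' => 'model' },
  { 'identifier' => 'UsersController', 'type' => 'controller' }
])

puts reader.list_units(type: 'model').size
p reader.find_unit('Missing')
```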

#sync_type_partial(type, max_count) ⇒ `Hash`

Sync the top N most-connected units of a type (by dependent count).

Parameters:

type (String): Unit type.

max_count (Integer): Maximum number of units to sync.

Returns:

(Hash): { synced: Integer, skipped: Integer, errors: Array<String> }

# File 'lib/woods/unblocked/exporter.rb', line 104

def sync_type_partial(type, max_count)
  units = @reader.list_units(type: type)
  return empty_stats if units.empty?

  # Load full data to sort by dependent count
  units_with_data = units.filter_map do |entry|
    data = @reader.find_unit(entry['identifier'])
    next unless data

    dep_count = (data['dependents'] || []).size
    { entry: entry, data: data, dep_count: dep_count }
  end

  top_units = units_with_data.sort_by { |u| -u[:dep_count] }.first(max_count)
  skipped_count = [units.size - max_count, 0].max

  log "  #{type}: #{top_units.size}/#{units.size} units (top by dependents)"

  result = sync_unit_data(top_units.map { |u| [u[:entry], u[:data]] })
  result[:skipped] += skipped_count
  result
end
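The selection step above can be exercised on plain data. This sketch ranks sample units by the size of their `'dependents'` array, treating a missing array as empty, and computes the skipped count the same way the method does. The sample identifiers and dependent lists are made up for illustration:

```ruby
# Made-up sample data in the shape the method reads:
# a Hash with 'identifier' and an optional 'dependents' array.
units = [
  { 'identifier' => 'A', 'dependents' => %w[x y z] },
  { 'identifier' => 'B', 'dependents' => nil },
  { 'identifier' => 'C', 'dependents' => %w[x] },
  { 'identifier' => 'D', 'dependents' => %w[x y] }
]
max_count = 2

# Sort descending by dependent count; nil counts as zero dependents.
top = units.sort_by { |unit| -(unit['dependents'] || []).size }.first(max_count)

# Units beyond the cap are reported as skipped, never as a negative number.
skipped = [units.size - max_count, 0].max

top_ids = top.map { |unit| unit['identifier'] }
p top_ids
puts skipped
```

Note that `sort_by` is not guaranteed to be stable, so units with equal dependent counts may come out in either order.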
