Class: Relaton::Iso::DataFetcher

Inherits:
Core::DataFetcher
  • Object
Defined in:
lib/relaton/iso/data_fetcher.rb

Overview

Fetches all documents from the ISO website.

Instance Method Details

#fetch ⇒ void

This method returns an undefined value.

Goes through all ICS (International Classification for Standards) pages and fetches all documents.



# File 'lib/relaton/iso/data_fetcher.rb', line 46

def fetch # rubocop:disable Metrics/AbcSize
  Util.info "Scrapping ICS pages..."
  fetch_ics
  Util.info "(#{Time.now}) Scrapping documents..."
  fetch_docs
  iso_queue.save
  # index.sort! { |a, b| compare_docids a, b }
  index.save
  report_errors
end

#index ⇒ Object



# File 'lib/relaton/iso/data_fetcher.rb', line 26

def index
  @index ||= Relaton::Index.find_or_create :iso, file: "#{INDEXFILE}.yaml"
end

#iso_queue ⇒ Relaton::Iso::Queue

ISO has too many documents for GitHub Actions (GHA) to fetch in a single run, so the process is split across several runs. The iso_queue stores the paths of documents that have not yet been fetched.

Returns:
  • (Relaton::Iso::Queue)



# File 'lib/relaton/iso/data_fetcher.rb', line 37

def iso_queue
  @iso_queue ||= Relaton::Iso::Queue.new
end
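
The idea of carrying unfetched document paths from one run to the next can be sketched as follows. This is a minimal illustration only, not the actual Relaton::Iso::Queue implementation: the PersistentQueue class, the run_once helper, the one-path-per-line file format, and the sample paths are all hypothetical.

```ruby
require "tmpdir"

# Hypothetical stand-in for a persistent queue: pending document paths are
# kept in memory and written to a plain-text file, one path per line.
class PersistentQueue
  def initialize(file)
    @file = file
    @items = File.exist?(file) ? File.readlines(file, chomp: true) : []
  end

  # Add a path unless it is already pending.
  def add(path)
    @items << path unless @items.include?(path)
  end

  # Remove and return up to `limit` paths from the front of the queue.
  def take(limit)
    @items.shift(limit)
  end

  def size
    @items.size
  end

  # Write the remaining paths so that the next run can pick them up.
  def save
    File.write(@file, @items.join("\n"))
  end
end

# One "run": handle at most `limit` paths and persist whatever is left.
def run_once(file, limit)
  queue = PersistentQueue.new(file)
  handled = queue.take(limit)
  queue.save
  handled
end

# Usage: three pending paths are spread over two runs of two paths each.
file = File.join(Dir.mktmpdir, "queue.txt")
queue = PersistentQueue.new(file)
%w[/doc/1 /doc/2 /doc/3].each { |path| queue.add(path) }
queue.save

first_run = run_once(file, 2)
second_run = run_once(file, 2)
```

Saving the queue at the end of each run is what lets a later run resume where the previous one stopped, which matches the role of the iso_queue.save call in #fetch.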

#log_error(msg) ⇒ Object



# File 'lib/relaton/iso/data_fetcher.rb', line 22

def log_error(msg)
  Util.error msg
end

#mutex ⇒ Object



# File 'lib/relaton/iso/data_fetcher.rb', line 18

def mutex
  @mutex ||= Mutex.new
end

#queue ⇒ Queue

The queue stores the ICS page paths being fetched in the current run.

Returns:
  • (Queue)



# File 'lib/relaton/iso/data_fetcher.rb', line 14

def queue
  @queue ||= ::Queue.new
end
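
A thread-safe ::Queue and a Mutex are commonly combined so that several worker threads can drain a list of paths while sharing one result collection. The sketch below illustrates that pattern under assumptions: the process_paths helper, the worker count, and the placeholder "fetch" step are hypothetical and are not taken from the class itself.

```ruby
# Hypothetical sketch: worker threads drain a ::Queue of ICS page paths and
# collect results under a Mutex.
def process_paths(paths, workers: 3)
  queue = ::Queue.new
  mutex = Mutex.new
  results = []

  paths.each { |path| queue << path }
  queue.close # no more input; pop returns nil once the queue is empty

  threads = Array.new(workers) do
    Thread.new do
      while (path = queue.pop)
        fetched = "fetched #{path}" # placeholder for a real HTTP request
        mutex.synchronize { results << fetched } # guard the shared array
      end
    end
  end

  threads.each(&:join)
  results
end

# Usage: the order of results depends on thread scheduling.
pages = process_paths(%w[/ics/01 /ics/03 /ics/07])
```

The leading :: in ::Queue matters here: inside the Relaton::Iso namespace a bare Queue would resolve to Relaton::Iso::Queue, so the prefix selects Ruby's built-in thread-safe queue instead.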