Class: Relaton::Calconnect::Scraper
- Inherits:
- Object
- Object
- Relaton::Calconnect::Scraper
- Includes:
- Core::ArrayWrapper, Core::HashKeysSymbolizer
- Defined in:
- lib/relaton/calconnect/scraper.rb
Constant Summary
- RELEASE_ASSET_URL =
"https://github.com/%<owner>s/%<repo>s/releases/download/" \
"%<tag>s/%<asset_stem>s.zip".freeze
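The template uses named `%<name>s` references, which Ruby's `Kernel#format` fills from a hash. As a minimal sketch of how such a template could be expanded: the helper `release_asset_url` and the owner, repo, tag, and asset values below are hypothetical and chosen only for illustration; how the scraper itself derives them from a hit is not shown on this page.

```ruby
# Template copied from the constant summary.
RELEASE_ASSET_URL =
  "https://github.com/%<owner>s/%<repo>s/releases/download/" \
  "%<tag>s/%<asset_stem>s.zip".freeze

# Hypothetical helper (not part of the documented API): fills the named
# placeholders in the template using Kernel#format.
def release_asset_url(owner:, repo:, tag:, asset_stem:)
  format(RELEASE_ASSET_URL, owner: owner, repo: repo, tag: tag, asset_stem: asset_stem)
end

# Placeholder values, assumed for illustration only.
puts release_asset_url(
  owner: "example-org",
  repo: "example-repo",
  tag: "v1.0.0",
  asset_stem: "example-doc"
)
```

Named references keep the call site readable and raise a `KeyError` if a placeholder is left without a value, which may be preferable to silently producing a malformed URL.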
Instance Method Summary
- #initialize(errors = {}) ⇒ Scraper (constructor)
  A new instance of Scraper.
- #parse_page(hit) ⇒ Relaton::Calconnect::ItemData
  Parse an aggregate-index document entry: download the per-document GitHub release zip, extract the RXL, and parse it into a bibitem.
Constructor Details
#initialize(errors = {}) ⇒ Scraper
Returns a new instance of Scraper.
# File 'lib/relaton/calconnect/scraper.rb', line 17

def initialize(errors = {})
  @errors = errors
end
Instance Method Details
#parse_page(hit) ⇒ Relaton::Calconnect::ItemData
Parse an aggregate-index document entry: download the per-document GitHub release zip, extract the RXL, and parse it into a bibitem.
# File 'lib/relaton/calconnect/scraper.rb', line 29

def parse_page(hit)
  zip_data = download_release_zip hit
  rxl = extract_rxl zip_data, rxl_filename(hit)
  xml = normalize_rxl rxl
  Item.from_xml xml
end