Module: Iev
Defined in:
lib/iev/cli.rb,
lib/iev.rb,
lib/iev/db.rb,
lib/iev/cli/ui.rb,
lib/iev/version.rb,
lib/iev/db_cache.rb,
lib/iev/profiler.rb,
lib/iev/converter.rb,
lib/iev/db_writer.rb,
lib/iev/utilities.rb,
lib/iev/relaton_db.rb,
lib/iev/cli/command.rb,
lib/iev/iso_639_code.rb,
lib/iev/term_builder.rb,
lib/iev/source_parser.rb,
lib/iev/data_conversions.rb,
lib/iev/term_attrs_parser.rb,
lib/iev/cli/command_helper.rb,
lib/iev/supersession_parser.rb,
lib/iev/converter/mathml_to_asciimath.rb
Overview
© Copyright 2020 Ribose Inc.
Defined Under Namespace
Modules: Cli, Converter, DataConversions, Utilities
Classes: Db, DbCache, DbWriter, Iso639Code, Profiler, RelatonDb, SourceParser, SupersessionParser, TermAttrsParser, TermBuilder
Constant Summary
- VERSION = "0.3.8"
Class Method Summary
- .get(code, lang) ⇒ String?
  Scrape Electropedia for a term.
- .get_doc(code) ⇒ Object
Class Method Details
.get(code, lang) ⇒ String?
Scrape Electropedia for a term.
Returns an empty string if the code is not found, and nil if the language is not found.
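The three documented outcomes (a term, an empty string, or nil) can be told apart by the caller. The following is a minimal sketch, assuming the return values behave as described above; `describe_lookup` is a hypothetical helper for illustration and is not part of the gem.

```ruby
# Hypothetical helper, not part of the Iev gem: classifies a value shaped
# like the documented return of Iev.get.
def describe_lookup(result)
  if result.nil?
    :language_not_found   # documented as: language not found
  elsif result.empty?
    :code_not_found       # documented as: code not found
  else
    :found                # a non-empty term string
  end
end

p describe_lookup("electric field")
p describe_lookup("")
p describe_lookup(nil)
```

Checking for nil before calling `empty?` matters here, since nil does not respond to `empty?`.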
# File 'lib/iev.rb', line 45
def self.get(code, lang)
  doc = get_doc(code)
  xpath = "//table/tr/td/div/font[.=\"#{lang}\"]/../../"\
          "following-sibling::td[2]"
  a = doc&.at(xpath)&.children&.to_xml
  a&.sub(%r{<br/>.*$}, "")
    &.sub(/, <.*$/, "")
    &.gsub(/<[^<>]*>/, "")&.strip
end
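The cleanup chain at the end of `.get` can be tried without a network request. This sketch copies the three substitutions into a hypothetical helper, `clean_term_markup`, and applies them to an invented sample fragment; the sample markup is an assumption and may not match what Electropedia actually serves.

```ruby
# Hypothetical helper mirroring the substitution chain in Iev.get.
# It is not part of the gem; it only illustrates what each step removes.
def clean_term_markup(xml)
  xml&.sub(%r{<br/>.*$}, "")   # drop everything from the first <br/> to end of line
     &.sub(/, <.*$/, "")       # drop a trailing ", <tag>..." qualifier
     &.gsub(/<[^<>]*>/, "")    # strip any remaining tags
     &.strip                   # trim surrounding whitespace
end

# Invented sample input, for illustration only.
sample = "<b>electric field</b>, <i>symbol</i><br/>extra note"

puts clean_term_markup(sample)
p clean_term_markup(nil)
```

Because every step uses the safe navigation operator, a nil input passes through as nil, which is how a missing language ends up as a nil result.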
.get_doc(code) ⇒ Object
# File 'lib/iev.rb', line 55
def self.get_doc(code)
  url = "https://www.electropedia.org/iev/iev.nsf/"\
        "display?openform&ievref=#{code}"

  # Use Mechanize with User-Agent to avoid 403 Forbidden errors from bot detection
  agent = Mechanize.new
  agent.user_agent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"

  page = agent.get(url)
  page.parser # Nokogiri document
end
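To see which URL `.get_doc` requests for a given reference, the string construction can be isolated. This is a sketch only: `electropedia_url` and `ELECTROPEDIA_BASE` are hypothetical names, and the reference `103-01-02` is an illustrative value; the base URL and query string are copied from the method above.

```ruby
# Hypothetical names for illustration; only the URL pieces come from get_doc.
ELECTROPEDIA_BASE = "https://www.electropedia.org/iev/iev.nsf/"

def electropedia_url(code)
  # The code is interpolated as-is, matching get_doc (no escaping is applied).
  "#{ELECTROPEDIA_BASE}display?openform&ievref=#{code}"
end

puts electropedia_url("103-01-02")
```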