Warning

As of 2026-02-10, Electropedia is behind AWS WAF, which blocks simple HTTP clients. Scraping now uses Ferrum (headless Chrome via DevTools Protocol) to handle the WAF challenge. Chrome/Chromium must be installed for scraping to work.

Purpose

This library allows accessing data of the International Electrotechnical Vocabulary (IEV):

  • Access IEV terms remotely, via the Electropedia website (www.electropedia.org)

  • Read IEV terms from an offline IEV termbase in Glossarist format

  • Parse an IEV exported Excel file and convert its contents into a Glossarist termbase

Warning
The last feature is only meant for IEC-internal use. The IEV export files can only be obtained from the IEC IT department.
Note
The iev-data gem is superseded by this library, which fully incorporates its functionality.

Install

Add this line to your Gemfile:

gem 'iev'

And then execute:

$ bundle

Or install it yourself as:

$ gem install iev

Usage

The gem comes with the iev executable, which provides the following commands:

iev export FILE -o OUTPUT_DIR

Exports IEV data to Glossarist YAML format. Supports both Excel (.xlsx/.xls) and SQLite (.sqlite3/.sqlite/.db) input files. Format is detected automatically from the file extension.

iev xlsx2db FILE

Imports Excel to SQLite database.

iev fetch CODE

Fetches a single IEV concept and outputs YAML to stdout.

Warning
The IEV XLSX export files can only be obtained from the IEC Electropedia administrator.

Fetching IEV terms from cached data

# Get term designation (from local YAML or GitHub remote)

Iev.get("103-01-02", "en")
=> "functional"

# If code not found, returns nil (does not raise)
Iev.get("111-11-11", "en")
=> nil

# If language not found, returns nil
Iev.get("103-01-02", "eee")
=> nil

# Fetch full concept data (all languages)
# Raises Iev::DataSource::NotFoundError if code not found
Iev.fetch_concept("103-01-02")
=> { "id" => "103-01-02", "data" => { ... } }

# Fetch localized term data
# Raises Iev::DataSource::NotFoundError if code not found
Iev.fetch_term("103-01-02", "en")
=> { "term" => "functional", ... }

Scraping IEV terms from Electropedia

Requires Chrome/Chromium to be installed. Uses Ferrum (headless Chrome) to pass the AWS WAF challenge.

# Scrape concept data directly from electropedia.org
Iev.scrape_concept("103-01-02")
=> { "id" => "103-01-02", "data" => { "identifier" => "103-01-02", "localized_concepts" => { "eng" => { ... }, ... } } }

# Custom browser options (e.g., headless mode, window size)
scraper = Iev::Scraper.new(browser_opts: { headless: true, window_size: [1280, 800] })
concept = scraper.fetch_concept("103-01-02")

Converting IEV data to a Glossarist dataset

The export command converts an IEV Excel export or SQLite database into Glossarist YAML concept files:

# From an Excel export
$ iev export termbase.xlsx -o /path/to/output

# From a SQLite database
$ iev export termbase.sqlite3 -o /path/to/output

# With filters
$ iev export termbase.xlsx -o /output --only-concepts "103-%" --only-languages "en,fr"

The output directory will contain a concepts/ subdirectory with Glossarist concept and localized concept YAML files.

You can also use the Iev::Exporter class programmatically:

# Export from Excel
Iev::Exporter.new("termbase.xlsx", output_dir: "/path/to/output").export

# Export from SQLite with filters
collection = Iev::Exporter.new("termbase.sqlite3",
  output_dir: "/path/to/output",
  only_concepts: "103-%",
  only_languages: "en,fr",
).export
# collection is a Glossarist::ManagedConceptCollection

Structure of the IEV Excel export

The columns are:

IEVREF

concept ID of this term

LANGUAGE

ISO 639-1 code (two characters)

TERM

the designation of this concept in the language given by LANGUAGE

TERMATTRIBUTE

a multi-purpose field holding a list of attributes separated by ;. More details below.

SYNONYM1

a synonym of this term

SYNONYM1ATTRIBUTE

the TERMATTRIBUTE that applies to SYNONYM1

SYNONYM1STATUS

One of Preferred, Deprecated, nil.

SYNONYM2

second synonym of this term

SYNONYM2ATTRIBUTE

the TERMATTRIBUTE that applies to SYNONYM2

SYNONYM2STATUS

One of Preferred, Deprecated, nil.

SYNONYM3

third synonym of this term

SYNONYM3ATTRIBUTE

the TERMATTRIBUTE that applies to SYNONYM3

SYNONYM3STATUS

One of Preferred, Deprecated, nil.

SYMBOLE

Math symbol

DEFINITION

definition text that includes <note> and <example>

SOURCE

the document this term was taken from

PUBLICATIONDATE

YYYY-MM date of publication

STATUS

Only Standard for now

REPLACES

IEVREF for the deprecated term

Term field

  • Usually the text

  • If it is ..... (5 dots), the translation is not available.

  • If it is foobar (acronym) or foobar (akronim), it is an acronym: term.acronym = true.
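
The three cases above can be sketched in a few lines of Ruby. This is only an illustration: `classify_term`, `NOT_AVAILABLE` and `ACRONYM_SUFFIX` are hypothetical names, not part of the gem's API, and the sketch assumes the "not available" marker is exactly five ASCII dots.

```ruby
# Hypothetical helper for illustration; not part of the iev gem's API.
NOT_AVAILABLE = "....." # five ASCII dots (assumed spelling of the marker)
ACRONYM_SUFFIX = /\s*\((?:acronym|akronim)\)\z/ # suffixes listed above

# Classifies the raw TERM cell into designation text plus two flags.
def classify_term(raw)
  text = raw.to_s.strip
  return { designation: nil, available: false, acronym: false } if text == NOT_AVAILABLE

  acronym = text.match?(ACRONYM_SUFFIX)
  { designation: text.sub(ACRONYM_SUFFIX, ""), available: true, acronym: acronym }
end

p classify_term("foobar (acronym)")
p classify_term(".....")
```

Returning a hash with explicit flags keeps the "not available" case distinct from an empty designation.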

Term attribute field

The term attribute field can contain the following kinds of entries. Multiple entries are separated by ;, so split on ; before interpreting them.

f or m or n

term.grammar-gender ⇒ the given value, term.plurality ⇒ singular

n pl

term.grammar-gender ⇒ n, term.plurality ⇒ plural

m pl

term.grammar-gender ⇒ m, term.plurality ⇒ plural

f pl

term.grammar-gender ⇒ f, term.plurality ⇒ plural

pl

term.plurality ⇒ plural (else, singular)

(in Zusammensetzungen) f

term.compound-prefix ⇒ true, term.grammar-gender ⇒ f

(in Zusammensetzungen) m

term.compound-prefix ⇒ true, term.grammar-gender ⇒ m

m, (abgelehnt)

term.rejected ⇒ true, term.grammar-gender ⇒ m

f, (abgelehnt)

term.rejected ⇒ true, term.grammar-gender ⇒ f

(略語)

term.abbreviation ⇒ true

<…​>

this means the text (…​) inside is the domain of this term (which field this term applies in)

<相关条目:[SOMEIEVREF]>

SOMEIEVREF here represents the "related to" term. Add a relationship of this term to SOMEIEVREF.

Adjektiv, adj, 形容詞, 형용사

sets term.grammar-particle to adj

Präfix, (prefix), (préfixe), 接尾語, 접두사, (词头)

sets term.affix to prefix

CA

term.geographical_area ⇒ CA

US

term.geographical_area ⇒ US

noun, 名詞

term.grammar-particle ⇒ noun (all terms default to noun)

verb, 動詞

term.grammar-particle ⇒ verb

(sigle international), m

term.acronym = true, term.international = true, term.gender = 'm'
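
As a rough illustration of the rules above, the following sketch splits an attribute string at ; and interprets a subset of the entries. `parse_term_attribute` and its return keys are hypothetical and are not the gem's parser. The localized CJK keywords, the related-entry marker and the prefix keywords are deliberately left out.

```ruby
# Hypothetical sketch covering only some of the entries listed above.
def parse_term_attribute(raw)
  attrs = {}
  raw.to_s.split(";").map(&:strip).reject(&:empty?).each do |entry|
    case entry
    when /\A([fmn])(\s+pl)?\z/                      # "f", "m pl", ...
      attrs[:gender] = Regexp.last_match(1)
      attrs[:plurality] = Regexp.last_match(2) ? "plural" : "singular"
    when "pl"
      attrs[:plurality] = "plural"
    when /\A\(in Zusammensetzungen\)\s+([fmn])\z/   # German compound prefix
      attrs[:compound_prefix] = true
      attrs[:gender] = Regexp.last_match(1)
    when /\A([fmn]),\s*\(abgelehnt\)\z/             # German "rejected"
      attrs[:rejected] = true
      attrs[:gender] = Regexp.last_match(1)
    when "CA", "US"
      attrs[:geographical_area] = entry
    when "adj", "Adjektiv"
      attrs[:part_of_speech] = "adj"
    when "noun", "verb"
      attrs[:part_of_speech] = entry
    when /\A<(.+)>\z/                               # domain in angle brackets
      attrs[:domain] = Regexp.last_match(1)
    end
  end
  attrs[:part_of_speech] ||= "noun"     # all terms default to noun
  attrs[:plurality] ||= "singular"      # singular unless "pl" was seen
  attrs
end

p parse_term_attribute("f pl; CA")
```

Unknown entries are silently ignored here; a real parser would more likely log or collect them.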

Term definition field

We need to parse out all NOTEs and EXAMPLEs and normalize them.

Every HTML link of the form This links to <a href=IEV112-01-01>quantity</a> is parsed and replaced with a cross-reference: This links to {{quantity, IEV:112-01-01}}.
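
A minimal sketch of that link rewriting with a single regular expression follows. `IEV_LINK` and `expand_iev_links` are hypothetical names, and the pattern assumes the href is written as IEV followed by the concept ID, with or without quotes.

```ruby
# Hypothetical helper; the gem's own cross-reference expansion may differ.
# Group 1 is the concept ID, group 2 is the link text.
IEV_LINK = %r{<a href="?IEV([\d-]+)"?>(.*?)</a>}

def expand_iev_links(html)
  html.gsub(IEV_LINK, '{{\2, IEV:\1}}')
end

puts expand_iev_links("This links to <a href=IEV112-01-01>quantity</a>")
```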

Examples of NOTE and EXAMPLE parsing:

  1. Every <NOTE {N} - goes into a separate entry under notes:

quotient of two quantities of different dimensions, used as a multiplier to express the proportionality equation between them
<NOTE 1 – A coefficient is a quantity having a dimension other than one. Examples: Hall coefficient, damping coefficient, temperature coefficient, gyromagnetic coefficient.
<NOTE 2 – The term "modulus" is sometimes used instead of coefficient. Example: modulus of elasticity.
definition: quotient of two quantities of different dimensions, used as a multiplier to express the proportionality equation between them
notes:
  - A coefficient is a quantity having a dimension other than one. Examples: Hall coefficient, damping coefficient, temperature coefficient, gyromagnetic coefficient.
  - The term "modulus" is sometimes used instead of coefficient. Example: modulus of elasticity.
  2. A <NOTE - goes into notes:

quantity of dimension one defined by a combination of quantities
<NOTE – Characteristic numbers occur in particular in the theory of similarity. They carry the word "number" in their names. Examples: Reynolds number, Prandtl number.
definition: quantity of dimension one defined by a combination of quantities
notes:
  - Characteristic numbers occur in particular in the theory of similarity. They carry the word "number" in their names. Examples: Reynolds number, Prandtl number.
  3. Sometimes there are several Note {N} to entry: markers; they are handled identically to <NOTE -.

set of interrelated items that collectively fulfil a requirement
<p>Note 1 to entry: A system is considered to have a defined real or abstract boundary.
<p>Note 2 to entry: External resources (from outside the system boundary) may be required for the system to operate.
<p>Note 3 to entry: A system structure may be hierarchical, e.g. system, subsystem, component, etc.
<p>Note 4 to entry: Conditions of use and maintenance should be expressed or implied within the requirement.
definition: set of interrelated items that collectively fulfil a requirement
notes:
  - A system is considered to have a defined real or abstract boundary.
  - External resources (from outside the system boundary) may be required for the system to operate.
  - A system structure may be hierarchical, e.g. system, subsystem, component, etc.
  - Conditions of use and maintenance should be expressed or implied within the requirement.
  4. Parse EXAMPLE:

<a href=IEV112-01-01>quantity</a> which keeps the same value under particular circumstances, or which results from theoretical considerations
<p>EXAMPLE <a href=IEV103-05-26>time constant</a>, equilibrium constant for a chemical reaction, <a href=IEV112-03-09>fundamental physical constant</a>.

definition: {{quantity, IEV:112-01-01}} which keeps the same value under particular circumstances, or which results from theoretical considerations
examples:
  - {{time constant, IEV:103-05-26}}, equilibrium constant for a chemical reaction, {{fundamental physical constant, IEV:112-03-09}}.
  5. Remember to parse both EXAMPLE and Note {N} to entry:.

level of sub-division within a system hierarchy
<p>EXAMPLE System, subsystem, assembly, and component. <p>Note 1 to entry: From the maintenance perspective, the indenture level depends upon various factors, including the complexity of the item's construction, the accessibility of sub items, skill level of maintenance personnel, test equipment facilities, and safety considerations.
definition: level of sub-division within a system hierarchy
examples:
  - System, subsystem, assembly, and component.
notes:
  - From the maintenance perspective, the indenture level depends upon various factors, including the complexity of the item's construction, the accessibility of sub items, skill level of maintenance personnel, test equipment facilities, and safety considerations.
  6. Remember to parse both EXEMPLE and Note {N} à l’article: in French:

niveau de subdivision à l’intérieur de la hiérarchie d’un système
<p>EXEMPLE Système, sous-système, assemblage et composant. <p>Note 1 à l’article: Du point de vue de la maintenance, le niveau dans l’arborescence dépend de divers facteurs dont la complexité de la structure de l’entité, l’accessibilité aux sous-entités, le niveau de compétence du personnel de maintenance, les moyens de mesure et d’essai, et des considérations de sécurité.
definition: niveau de subdivision à l’intérieur de la hiérarchie d’un système
examples:
  - Système, sous-système, assemblage et composant.
notes:
  - Du point de vue de la maintenance, le niveau dans l’arborescence dépend de divers facteurs dont la complexité de la structure de l’entité, l’accessibilité aux sous-entités, le niveau de compétence du personnel de maintenance, les moyens de mesure et d’essai, et des considérations de sécurité.
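
The cases above can be approximated by splitting the text in front of each marker and then classifying every piece. The sketch below is not the gem's implementation; `split_definition` and the four constants are hypothetical names, and only the English and French markers shown above are handled.

```ruby
# Hypothetical sketch of the definition/notes/examples split.
# Split points: just before a marker, swallowing an optional leading <p>.
BOUNDARY = /(?:<p>\s*)?(?=<NOTE\b|Note\s+\d+\s|EX[AE]MPLE\s)/
NOTE_OLD = /\A<NOTE(?:\s+\d+)?\s*[–\-]\s*(.*)\z/m                    # "<NOTE 1 – ..." or "<NOTE – ..."
NOTE_NEW = /\ANote\s+\d+\s+(?:to entry|à l[’']article)\s*:\s*(.*)\z/m # English and French
EXAMPLE  = /\AEX[AE]MPLE\s+(.*)\z/m                                   # EXAMPLE or EXEMPLE

def split_definition(text)
  first, *rest = text.split(BOUNDARY)
  result = { definition: first.to_s.strip, notes: [], examples: [] }
  rest.each do |part|
    case part
    when NOTE_OLD, NOTE_NEW then result[:notes] << Regexp.last_match(1).strip
    when EXAMPLE then result[:examples] << Regexp.last_match(1).strip
    end
  end
  result
end

p split_definition("level of sub-division within a system hierarchy\n" \
                   "<p>EXAMPLE System, subsystem, assembly, and component. " \
                   "<p>Note 1 to entry: From the maintenance perspective.")
```

Splitting on a lookahead keeps each marker attached to the text that follows it, so the second pass can decide per piece whether it is a note or an example.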

Source field

Original:

IEC 60050-311:2001, 311-01-04

After parsing:

authoritative_source:
  ref: IEC 60050-311:2001, 311-01-04

Excel-to-Glossarist Column Mapping

This section provides a complete mapping from every IEV Excel export column to the corresponding Glossarist concept model field. The IEV Excel export has 19 columns (see Structure of the IEV Excel export). Each row represents one localized term entry (one language variant of one concept).

Glossarist Model Layers

The Glossarist model organizes concept data into two layers:

  • ManagedConcept — the concept entry itself (identity, domain classification, cross-concept relationships, lifecycle)

  • LocalizedConcept — a language-specific variant of a concept (designations, definition, notes, examples, sources)

One IEV Excel row produces one LocalizedConcept, which is attached to its ManagedConcept (identified by IEVREF).

Column-by-Column Mapping

The table below maps each of the 19 Excel columns to the Glossarist model.

Excel Column | Glossarist Path | Data Type | Notes

IEVREF

ManagedConceptData#id

String

The concept identifier (e.g. 103-01-02). Also set as LocalizedConcept#id and ConceptData#id. Used to group multiple language rows into one ManagedConcept. The IEVREF pattern AAA-BB-CC is also used to derive domain references (see Derived Fields (Not Directly From Excel Columns)).

LANGUAGE

ConceptData#language_code

String (ISO 639-2/3)

Two-character code (e.g. en, fr) converted to three-character ISO 639 code (e.g. eng, fra) via Iev::Iso639Code. This determines which language slot the localized concept fills.

TERM

Designation::Expression#designation

String

Primary term designation. Creates a Designation::Expression with normative_status: "preferred". If the value is ..... (5 dots, meaning "not available"), it is replaced with "NA". The term text undergoes MathML-to-AsciiMath conversion and cross-reference expansion.

TERMATTRIBUTE

(multiple designation fields)

Composite string

Parsed by TermAttrsParser into multiple designation attributes. See TERMATTRIBUTE Sub-Field Mapping for the full sub-mapping.

SYNONYM1

Designation::Expression#designation

String

Additional designation. Creates a Designation::Expression. Some synonyms contain multiple entries separated by <p>, <b>, <br> tags — each is split into a separate designation. normative_status comes from SYNONYM1STATUS.

SYNONYM1ATTRIBUTE

(multiple designation fields)

Composite string

Same parsing as TERMATTRIBUTE, applied to the SYNONYM1 designation. See TERMATTRIBUTE Sub-Field Mapping.

SYNONYM1STATUS

Designation::Expression#normative_status

String or nil

Maps to the synonym’s normative status. The value is lowercased. Known localized values are mapped: e.g. "obsoleto" to "deprecated", Cyrillic variants similarly. When nil, the synonym has no explicit status. Also used to derive LocalizedConcept#classification (see Derived Fields (Not Directly From Excel Columns)).

SYNONYM2

Designation::Expression#designation

String

Same pattern as SYNONYM1.

SYNONYM2ATTRIBUTE

(multiple designation fields)

Composite string

Same as SYNONYM1ATTRIBUTE.

SYNONYM2STATUS

Designation::Expression#normative_status

String or nil

Same as SYNONYM1STATUS.

SYNONYM3

Designation::Expression#designation

String

Same pattern as SYNONYM1.

SYNONYM3ATTRIBUTE

(multiple designation fields)

Composite string

Same as SYNONYM1ATTRIBUTE.

SYNONYM3STATUS

Designation::Expression#normative_status

String or nil

Same as SYNONYM1STATUS.

SYMBOLE

Designation::Symbol#designation

String

International math symbol. Creates a Designation::Symbol with international: true. If this column is empty, no symbol designation is created.

DEFINITION

ConceptData#definition, ConceptData#examples, ConceptData#notes

HTML string

The unified definition text is split by TermBuilder#split_definition, which uses regular expressions to detect the EXAMPLE, EXEMPLE, Note N to entry, Note N à l’article and NOTE markers. Each part becomes a DetailedDefinition object in the corresponding collection. The content undergoes MathML-to-AsciiMath conversion and cross-reference expansion.

SOURCE

ConceptData#sources (via ConceptSource)

HTML string

Parsed by SourceParser into one or more ConceptSource objects, each with type: "authoritative". The source string is split after normalization. Each source has: status (identical/modified/similar/related/not_equal), origin (a Citation with ref, locality, link, original), and optionally modification text. See SOURCE Column Parsing.

PUBLICATIONDATE

ConceptData#dates (via ConceptDate)

String (YYYY-MM or YYYY-MM-DD)

Converted to a full ISO 8601 datetime. Creates two ConceptDate entries: {type: "accepted", date: …​} and {type: "amended", date: …​}. Also sets ConceptData#review_date and ConceptData#review_decision_date to the same value.

STATUS

LocalizedConcept#entry_status

String

Only Standard is known; it maps to "valid". Lowercased and matched.

REPLACES

ConceptData#related (via RelatedConcept)

String

Parsed by SupersessionParser. Expected format: IEVREF:VERSION (e.g. 881-01-23:1983-01). Creates a RelatedConcept with type: "supersedes" and a Citation containing {source: "IEV", id: "…​", version: "…​"}.
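
To make the REPLACES row concrete, here is a small sketch of splitting the IEVREF:VERSION format. `parse_replaces` is a hypothetical helper that returns plain hashes; the actual SupersessionParser builds Glossarist objects, and its handling of malformed values is not described here.

```ruby
# Hypothetical sketch; returns plain hashes instead of Glossarist objects.
def parse_replaces(value)
  text = value.to_s.strip
  return nil if text.empty?

  # Split once at the first colon: "881-01-23:1983-01" -> id and version.
  id, version = text.split(":", 2)
  { type: "supersedes", ref: { source: "IEV", id: id, version: version } }
end

p parse_replaces("881-01-23:1983-01")
```

A value without a colon yields a nil version in this sketch.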

TERMATTRIBUTE Sub-Field Mapping

The TERMATTRIBUTE column is a composite string parsed by TermAttrsParser. It may contain multiple attributes separated by semicolons. The parser extracts them in order: gender, plurality, geographical area, part of speech, usage info, prefix.

Parsed Value | Glossarist Path | Notes

m, f, n

GrammarInfo#gender (via Designation::Expression#grammar_info)

Grammatical gender. May appear inside brackets: (m), [f].

pl

GrammarInfo#number (via Designation::Expression#grammar_info)

Plurality. pl maps to "plural". If gender was found but not pl, defaults to "singular".

adj, noun, verb

GrammarInfo#part_of_speech

Part of speech. Localized variants are mapped: German Adjektiv to adj, Japanese and Korean variants similarly.

Angle bracket text (ASCII or full-width)

Designation::Expression#usage_info

Usage info / domain indicator extracted from angle brackets. Full-width brackets used in some CJK terms.

Prefix keywords in multiple languages

Designation::Expression#prefix

Marks the designation as a prefix. Keywords include German, French, Japanese, Korean, Chinese, Portuguese variants.

Two-letter uppercase (e.g. CA, US)

Designation::Base#geographical_area

ISO 3166-1 alpha-2 country code.

SOURCE Column Parsing

The SOURCE column is the most complex field. It is parsed by SourceParser into one or more ConceptSource objects.

Relationship Status Detection

The parser detects the source relationship type from textual markers:

Marker | Status | Notes

Not-equal sign

not_equal

Definition differs from source.

Approximately-equal sign

similar

Definition is similar to source.

see, voir

related

Cross-reference to another definition.

MOD, modified, modifié

modified

Definition modified from source. Modification text is captured in ConceptSource#modification.

(default)

identical

No special marker found.
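
The marker table can be read as an ordered list of checks. The sketch below assumes the not-equal and approximately-equal markers are the characters ≠ and ≈, and that the checks run in table order; both are assumptions, and `detect_source_status` is a hypothetical name rather than a SourceParser method.

```ruby
# Hypothetical sketch of status detection from textual markers.
def detect_source_status(source)
  case source
  when /≠/ then "not_equal"                        # assumed symbol
  when /≈/ then "similar"                          # assumed symbol
  when /\b(?:see|voir)\b/i then "related"
  when /\bMOD\b|modified|modifié/i then "modified"
  else "identical"                                 # no special marker found
  end
end

puts detect_source_status("IEC 60050-151:2001, 151-12-05, MOD")
```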

Source Reference Extraction

The parser normalizes and extracts the source reference (e.g. IEC 60050-121), the clause locality (e.g. 151-12-05), and optionally resolves a URL via Relaton. Reference normalization handles many localized forms: CEI to IEC, UIT to ITU, VEI to IEV, etc.
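
As an illustration of the normalization step, this sketch maps the three localized prefixes named above to their English forms and then separates the reference from the clause locality at the first comma. The helper names are hypothetical, the real parser handles many more forms, and the comma-based split is an assumption.

```ruby
# Hypothetical sketch; only the three prefixes mentioned above are mapped.
LOCALIZED_PREFIXES = { "CEI" => "IEC", "UIT" => "ITU", "VEI" => "IEV" }.freeze

def normalize_reference(ref)
  # String#gsub with a hash replaces each match by its hash value.
  ref.gsub(/\b(?:CEI|UIT|VEI)\b/, LOCALIZED_PREFIXES)
end

def split_source(source)
  ref, locality = normalize_reference(source).split(/,\s*/, 2)
  { ref: ref, locality: locality }
end

p split_source("CEI 60050-311:2001, 311-01-04")
```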

Derived Fields (Not Directly From Excel Columns)

Some Glossarist model fields are derived from IEVREF or from combinations of columns during export:

Glossarist Path | Source | Notes

ManagedConceptData#domains

Derived from IEVREF

The IEVREF pattern AAA-BB-CC is split. Creates two ConceptReference objects with ref_type: "domain" and source: "urn:iec:std:iec:60050" (IEC URN per IEC URN specification): area-AAA and section-AAA-BB. For example, 103-01-02 produces area-103 + section-103-01.

LocalizedConcept#classification

SYNONYM1STATUS

Maps localized classification values: Chinese/Russian/Spanish "admitido" to "admitted", various forms of "preferred" similarly; other values lowercased as-is.

ConceptData#domain

Derived from IEVREF

The section-level domain URI (e.g. section-103-01), resolved from the SubjectAreas data. Falls back to area-level if section not found.

ConceptData#review_decision_event

Hard-coded

Always set to "published".

ConceptDate {type: "amended"}

PUBLICATIONDATE

A second date entry with type "amended" is created alongside the "accepted" date, using the same publication date value.

ManagedConcept#related

Derived from IEVREF

Hierarchy relations using broader/narrower. Regular IEV concepts have broader → section-AAA-BB. Section concepts have broader → area-AAA (from SubjectAreaConcepts) and narrower → child concepts (from Exporter). Area concepts have narrower → section-AAA-BB. Each RelatedConcept has both content (string) and ref (Citation with source "IEV" and id) set, so the glossarist RDF transform emits skos:broader/skos:narrower triples.

Glossarist Model Fields NOT Populated From IEV Excel

The following Glossarist model fields exist in the data model but are not populated from any IEV Excel column. They remain at their defaults:

Glossarist Field | Description | Default

ManagedConceptData#uri

External URI for the concept

nil

ManagedConceptData#sources

Managed-concept-level sources (distinct from localized sources)

empty

ManagedConcept#dates

Managed-concept-level dates (distinct from localized dates)

empty

ManagedConcept#status

Concept lifecycle status (draft/valid/retired etc.)

nil

ConceptData#release

Release version tag

nil

ConceptData#lineage_source_similarity

Lineage source similarity percentage

nil

ConceptData#script

ISO 15924 script code

nil

ConceptData#system

ISO 24229 conversion system code

nil

ConceptData#references

ConceptReference collection on localized concept

empty

ConceptData#entry_status

Entry status on ConceptData (duplicate of LocalizedConcept#entry_status)

nil

Concept#non_verb_rep

Non-verbal representations (images, tables, formulas)

empty

Designation::Base#language

Per-designation language override

nil

Designation::Base#script

Per-designation ISO 15924 script

nil

Designation::Base#system

Per-designation ISO 24229 system

nil

Designation::Base#international

International validity flag (set true only for SYMBOLE)

false

Designation::Base#absent

Explicitly absent designation flag

false

Designation::Base#pronunciation

Pronunciation entries (IPA, romanization, etc.)

empty

Designation::Base#sources

Per-designation bibliographic sources

empty

Designation::Base#term_type

ISO 12620 term type classification (24 values)

nil

Designation::Base#related

Designation-level relationships (abbreviated_form_for, short_form_for)

empty

Designation::Expression#field_of_application

Subject field / specific use

nil

Designation::Abbreviation#acronym

Acronym type flag

false

Designation::Abbreviation#initialism

Initialism type flag

false

Designation::Abbreviation#truncation

Truncation type flag

false

Designation::LetterSymbol

Letter symbol designation type (subclass of Symbol with text)

(not used)

Designation::GraphicalSymbol

Graphical symbol designation type (subclass of Symbol with text, image)

(not used)

LocalizedConcept#review_type

Review type

nil

Data copyright IEC. All others copyright Ribose.

Data Model

Concept Domains

Exported concepts use domains (a collection of ConceptReference objects) to represent the IEV subject area hierarchy. Each concept’s domains include references to its area (e.g. area-103) and section (e.g. section-103-01).

data:
  identifier: "103-01-01"
  domains:
    - concept_id: area-103
      source: urn:iec:std:iec:60050
      ref_type: domain
    - concept_id: section-103-01
      source: urn:iec:std:iec:60050
      ref_type: domain

The ref_type: domain distinguishes domain references from other ConceptReference types (local, urn, designation).
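
The two domain references in the YAML above follow mechanically from the IEVREF pattern AAA-BB-CC. A minimal sketch, using plain hashes in place of ConceptReference objects; `domain_references` and `IEC_60050_URN` are hypothetical names:

```ruby
# Hypothetical sketch; plain hashes stand in for ConceptReference objects.
IEC_60050_URN = "urn:iec:std:iec:60050"

def domain_references(ievref)
  area, section, _concept = ievref.split("-", 3)
  [
    { concept_id: "area-#{area}", source: IEC_60050_URN, ref_type: "domain" },
    { concept_id: "section-#{area}-#{section}", source: IEC_60050_URN, ref_type: "domain" },
  ]
end

domain_references("103-01-01").each { |ref| puts ref[:concept_id] }
```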

Subject Area Hierarchy

The SubjectAreaConcepts module creates area and section concepts that form a two-level hierarchy with symmetric broader/narrower linkages at the ManagedConcept#related level:

  • Area concepts (e.g. area-103) — domain reference to themselves, narrower relations to their sections

  • Section concepts (e.g. section-103-01) — domain references to both parent area and themselves, broader relation to parent area, narrower relations to child IEV concepts (added by Exporter)

  • Regular IEV concepts (e.g. 103-01-02) — broader relation to their section concept (added by Exporter)

All hierarchy RelatedConcept entries set both content (string, for YAML serialization) and ref (Citation with source: "IEV" and id, for RDF transformation via glossarist’s gloss ontology).
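
The symmetric linkage can be sketched as follows. This is an illustrative model only: `hierarchy_for` and `related` are hypothetical helpers, and plain hashes stand in for RelatedConcept and Citation objects.

```ruby
# Hypothetical sketch of the broader/narrower linkage described above.
def related(type, id)
  # Both the string content and the citation-style ref are set.
  { type: type, content: id, ref: { source: "IEV", id: id } }
end

def hierarchy_for(ievrefs)
  relations = Hash.new { |hash, key| hash[key] = [] }
  ievrefs.each do |ievref|
    area_no, section_no = ievref.split("-")
    area = "area-#{area_no}"
    section = "section-#{area_no}-#{section_no}"

    # Regular concept <-> section
    relations[ievref] << related("broader", section)
    relations[section] << related("narrower", ievref)

    # Section <-> area, added only once per section
    next if relations[section].any? { |r| r[:type] == "broader" }

    relations[section] << related("broader", area)
    relations[area] << related("narrower", section)
  end
  relations
end

hierarchy_for(%w[103-01-01 103-01-02]).each do |id, rels|
  puts "#{id}: #{rels.map { |r| "#{r[:type]} #{r[:content]}" }.join(', ')}"
end
```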

Separately, domains (classification via ConceptReference.domain(…​)) and ConceptData#domain (per-localization string) remain for classification/filtering — distinct from hierarchy.