Class: LighterpackParser::CategoryParser

Inherits:
Object
Defined in:
lib/lighterpack_parser/category_parser.rb

Overview

Parser for extracting category data from Lighterpack HTML documents.

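Instance Method Summary

  • #parse(category_element, item_parser:) ⇒ Category?

    Parse a single category element.

  • #parse_all(doc, item_parser:) ⇒ Array<Category>

    Parse all categories from a Lighterpack HTML document.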

Instance Method Details

#parse(category_element, item_parser:) ⇒ Category?

Parse a single category element.

Parameters:

  • category_element (Nokogiri::XML::Element)

    The category HTML element

  • item_parser (ItemParser)

    The parser to use for extracting items

Returns:

  • (Category, nil)

    The parsed category, or nil if the h2.lpCategoryName header is missing or its text is blank



# File 'lib/lighterpack_parser/category_parser.rb', line 28

def parse(category_element, item_parser:)
  # Category name is in h2.lpCategoryName
  category_header = category_element.at_css('h2.lpCategoryName')
  return nil unless category_header

  category_name = category_header.text.strip
  return nil if category_name.empty?

  # Description is typically in the category name itself (in parentheses)
  description = extract_description(category_name)

  # Find items in this category
  items = extract_items(category_element, item_parser: item_parser)

  Category.new(
    name: category_name,
    description: description,
    items: items
  )
end
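
The private helper extract_description is not shown on this page. The following is a minimal sketch, assuming it returns the text inside a trailing pair of parentheses and nil when there is none; the regular expression and method body are assumptions for illustration, not the library's code.

```ruby
# Hypothetical stand-in for the private helper; the real implementation may differ.
def extract_description(category_name)
  # Capture the text inside a pair of parentheses at the very end of the name.
  match = category_name.match(/\(([^()]*)\)\s*\z/)
  return nil unless match

  description = match[1].strip
  description.empty? ? nil : description
end

p extract_description('Shelter (3-season setup)')
p extract_description('Shelter')
```

Under this assumption the category name keeps its parenthesized suffix, which matches #parse passing the unmodified category_name to Category.new.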

#parse_all(doc, item_parser:) ⇒ Array<Category>

Parse all categories from a Lighterpack HTML document.

Parameters:

  • doc (Nokogiri::HTML::Document)

    The parsed HTML document

  • item_parser (ItemParser)

    The parser to use for extracting items

Returns:

  • (Array<Category>)

    The extracted categories in document order; elements for which #parse returns nil are skipped



# File 'lib/lighterpack_parser/category_parser.rb', line 11

def parse_all(doc, item_parser:)
  categories = []

  # Lighterpack structure: ul.lpCategories > li.lpCategory
  doc.css('ul.lpCategories > li.lpCategory').each do |category_element|
    category = parse(category_element, item_parser: item_parser)
    categories << category if category
  end

  categories
end
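
The accumulate-and-skip loop above has the same effect as Enumerable#filter_map (Ruby 2.7+). A minimal sketch of that equivalence follows; parse_name is a hypothetical stand-in for #parse that works on plain strings so the example runs without Nokogiri.

```ruby
# Hypothetical stand-in for #parse: returns nil for blank names.
parse_name = lambda do |raw|
  name = raw.strip
  name.empty? ? nil : name
end

raw_names = ['Shelter', '   ', 'Cooking (stove + pot)']

# Same shape as the loop in #parse_all.
categories = []
raw_names.each do |raw|
  category = parse_name.call(raw)
  categories << category if category
end

# Equivalent form: filter_map drops nil (and false) results.
compact = raw_names.filter_map { |raw| parse_name.call(raw) }

p categories
p compact
```

Both forms preserve input order, so either keeps categories in the order they appear in the document.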