Class: Yaml::Converter::Parser

Inherits:
Object
Defined in:
lib/yaml/converter/parser.rb

Overview

Tokenizes input YAML lines (with inline annotations) into structured tokens consumable by the StateMachine.

Input assumptions:

  • Comment titles: lines starting with `# ` (or consisting of a bare `#`) become title tokens.

  • Validation marker: a comment line starting with `# YAML validation:` becomes a validation token.

  • Separator lines (`---`) are recognized and currently ignored by the state machine.

  • Dash lines: any other line starting with `-` emits a dash_heading token in addition to its YAML content token.

  • Inline notes: text after `#note:` is captured as an out-of-band note token.

  • Other non-empty lines are treated as YAML content.
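The rules above can be sketched as a small stand-alone classifier. This is an illustration only: `classify` is a hypothetical helper that is not part of Yaml::Converter, and it returns just the token types a line would produce, following the `tokenize` source on this page.

```ruby
# Hypothetical helper (not part of the library): returns the token types
# that one raw input line would produce under the documented rules.
VALIDATION_PREFIX = "# YAML validation:"
NOTE_MARK = "#note:"

def classify(raw)
  line = raw.rstrip
  return [:blank] if line.empty?

  # Comment lines are either the validation marker or a title.
  if line.start_with?("# ") || line == "#"
    return [line.start_with?(VALIDATION_PREFIX) ? :validation : :title]
  end

  return [:separator] if line == "---"

  types = []
  # Dash lines get an extra :dash_heading before their :yaml_line.
  types << :dash_heading if line.start_with?("-")
  types << :yaml_line
  types << :note if line.include?(NOTE_MARK)
  types
end

sample = [
  "# Server settings\n",
  "# YAML validation: OK\n",
  "---\n",
  "port: 8080 #note: default port\n",
  "- item\n",
  "\n",
]

result = sample.map { |l| classify(l) }
result.each { |types| puts types.inspect }
```

Note that a line carrying an inline note yields two tokens, and a dash line also yields two, so the token count can exceed the line count.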

Constant Summary

Token =

Lightweight token structure used by the parser/state machine pipeline.

Class.new do
  attr_accessor :type, :text, :meta

  def initialize(type:, text:, meta: nil)
    @type = type
    @text = text
    @meta = meta
  end
end
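
A minimal sketch of how a Token might be built and read. The class body is copied from the definition above so the snippet runs on its own; the `{ line: 4 }` metadata is a made-up value, since the parser shown on this page never sets `meta`.

```ruby
# Copy of the Token definition above, repeated so this snippet is self-contained.
Token = Class.new do
  attr_accessor :type, :text, :meta

  def initialize(type:, text:, meta: nil)
    @type = type
    @text = text
    @meta = meta
  end
end

# :type and :text are required keywords; :meta defaults to nil.
note = Token.new(type: :note, text: "default port")
puts note.type.inspect
puts note.meta.inspect

# attr_accessor makes every field writable after construction.
note.meta = { line: 4 } # hypothetical metadata, for illustration only

# Omitting a required keyword raises ArgumentError.
missing_keyword_raised =
  begin
    Token.new(type: :blank)
    false
  rescue ArgumentError
    true
  end
```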
VALIDATION_PREFIX =

Comment line prefix indicating a validation status line will be injected

"# YAML validation:"
NOTE_MARK =

Inline note marker; the text to its right on a line is captured as a note

"#note:"
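
As a sketch of how the marker splits a line, the snippet below repeats the slicing used in `tokenize` outside the class. The sample lines are invented for illustration.

```ruby
NOTE_MARK = "#note:"

line = "port: 8080   #note: default port"

idx  = line.index(NOTE_MARK)                        # position of the first marker
base = line[0...idx].rstrip                         # YAML content left of the marker
note = line[(idx + NOTE_MARK.length)..].to_s.strip  # note text right of the marker

puts base.inspect
puts note.inspect

# Only the first marker splits the line; later ones stay inside the note text.
second = "a: 1 #note: first #note: second"
i = second.index(NOTE_MARK)
second_note = second[(i + NOTE_MARK.length)..].to_s.strip
puts second_note.inspect
```

Judging from the source on this page, the split is purely textual, so a `#note:` inside a quoted YAML string would presumably be split as well.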

Constructor Details

#initialize(options = {}) ⇒ Parser

Returns a new instance of Parser.

Parameters:

  • options (Hash) (defaults to: {})

    Reserved for future parsing options



# File 'lib/yaml/converter/parser.rb', line 39

def initialize(options = {})
  @options = options
end

Instance Attribute Details

#meta ⇒ Hash?

Returns the optional metadata bag (currently unused).

Returns:

  • (Hash, nil)

    Optional metadata bag (currently unused)




#text ⇒ String

Returns the payload string for this token.

Returns:

  • (String)

    Payload string for this token




#type ⇒ Symbol

Returns one of :blank, :title, :validation, :separator, :dash_heading, :yaml_line, :note.

Returns:

  • (Symbol)

    One of :blank, :title, :validation, :separator, :dash_heading, :yaml_line, :note




Instance Method Details

#tokenize(lines) ⇒ Array<Token>

Convert raw lines into token objects.

Parameters:

  • lines (Array<String>)

    Input lines; trailing whitespace, including newlines, is stripped from each line

Returns:

  • (Array<Token>)

    Sequence of tokens representing the document structure



# File 'lib/yaml/converter/parser.rb', line 47

def tokenize(lines)
  tokens = []
  lines.each do |raw|
    # Drop the trailing newline and any other trailing whitespace.
    line = raw.rstrip
    if line.empty?
      tokens << Token.new(type: :blank, text: "")
      next
    end

    # Comment lines: either the validation marker or a title.
    if line.start_with?("# ") || line == "#"
      if line.start_with?(VALIDATION_PREFIX)
        tokens << Token.new(type: :validation, text: line[2..])
      else
        content = (line == "#") ? "" : line[2..]
        tokens << Token.new(type: :title, text: content)
      end
      next
    end

    if line == "---"
      tokens << Token.new(type: :separator, text: line)
      next
    end

    # No `next` here: a dash line also falls through and is emitted
    # as a :yaml_line (and possibly a :note) below.
    if line.start_with?("-")
      tokens << Token.new(type: :dash_heading, text: line[1..].strip)
    end

    # Split off an inline note at the first NOTE_MARK, if present.
    note_idx = line.index(NOTE_MARK)
    if note_idx
      base = line[0...note_idx].rstrip
      note = line[(note_idx + NOTE_MARK.length)..].to_s.strip
      tokens << Token.new(type: :yaml_line, text: base)
      tokens << Token.new(type: :note, text: note)
    else
      tokens << Token.new(type: :yaml_line, text: line)
    end
  end
  tokens
end
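
The following sketch shows one way a consumer might walk the returned token array. It is not the library's StateMachine: the Struct-based `Token` is a stand-in so the snippet runs without the gem, and the tokens are built by hand to match what the listing above would emit for the lines `# Server settings`, `---`, `port: 8080 #note: default port`, and an empty line.

```ruby
# Stand-in for the parser's Token (an assumption, so this runs without the gem).
Token = Struct.new(:type, :text, :meta, keyword_init: true)

tokens = [
  Token.new(type: :title, text: "Server settings"),
  Token.new(type: :separator, text: "---"),
  Token.new(type: :yaml_line, text: "port: 8080"),
  Token.new(type: :note, text: "default port"),
  Token.new(type: :blank, text: ""),
]

# Reassemble the plain YAML content, leaving out titles, notes, and separators.
yaml_text = tokens.select { |t| t.type == :yaml_line }.map(&:text).join("\n")

# A :note token always directly follows the :yaml_line it was split from,
# so adjacent pairs recover which note belongs to which line.
annotated = tokens.each_cons(2)
                  .select { |a, b| a.type == :yaml_line && b.type == :note }
                  .map { |a, b| [a.text, b.text] }

puts yaml_text
puts annotated.inspect
```

Pairing by adjacency relies on the ordering visible in `tokenize`; a consumer that reorders or filters tokens first would need to carry that association some other way, for example through `meta`.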