Class: Kumi::Parser::Lexer

Inherits:
Object
Defined in:
lib/kumi/parser/lexer.rb

Overview

Turns source text into a flat array of Tokens in a single StringScanner pass. Inline whitespace is skipped. Newlines, which are significant statement separators, and comments are emitted as tokens so the parser can handle or ignore them uniformly. Every token carries its start offset, so all location and error-frame work is deferred to Source.
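To illustrate why a bare start offset is enough, here is a minimal sketch of how a source object could turn an offset into a line and column on demand. `SketchSource` and its `location` method are hypothetical names for illustration only, not Kumi's actual Source API. Offsets are treated as byte offsets, matching what `StringScanner#pos` reports, so the column is a byte column.

```ruby
# Hypothetical stand-in for Source; Kumi's real class may look different.
class SketchSource
  attr_reader :text

  def initialize(text)
    @text = text
    # Byte offsets at which each line begins; line 1 starts at offset 0.
    @line_starts = [0]
    text.each_byte.with_index { |byte, i| @line_starts << i + 1 if byte == 10 }
  end

  # Returns [line, column], both 1-based, for a byte offset.
  def location(offset)
    # First line start strictly past the offset; the offset lies on the line before it.
    idx = @line_starts.bsearch_index { |start| start > offset }
    line = idx || @line_starts.length
    [line, offset - @line_starts[line - 1] + 1]
  end
end

src = SketchSource.new("a = 1\nbb = 2\n")
src.location(9) # => [2, 4]
```

Because the line table is built once and queried by binary search, the lexer itself never needs to count lines while scanning.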

The lexer is deliberately context-free: it does not track whether it is inside `input do … end`. Disambiguation that used to live in the old tokenizer’s context stack is the parser’s job now.

Instance Method Summary

Constructor Details

#initialize(source) ⇒ Lexer

Returns a new instance of Lexer.



# File 'lib/kumi/parser/lexer.rb', line 17

def initialize(source)
  @source = source
  @ss = StringScanner.new(source.text)
  @tokens = []
end

Instance Method Details

#tokenize ⇒ Object



# File 'lib/kumi/parser/lexer.rb', line 23

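def tokenize
  until @ss.eos?
    skip_inline_whitespace
    # Trailing whitespace may have consumed the rest of the input.
    break if @ss.eos?

    # Record the start offset before scanning so the token can carry it.
    offset = @ss.pos
    scan_token(offset)
  end
  # Always terminate the stream with an :eof token at the final position.
  push(:eof, nil, @ss.pos)
  @tokens
end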
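The helpers `skip_inline_whitespace`, `scan_token`, and `push` are not shown on this page. The following is a minimal, self-contained sketch of how such a loop could be completed. `SketchLexer`, `SketchToken`, the token type names, and the regular expressions are all assumptions made for illustration, not Kumi's actual implementation.

```ruby
require "strscan"

# Hypothetical token shape; Kumi's real Token class may differ.
SketchToken = Struct.new(:type, :value, :offset)

class SketchLexer
  def initialize(text)
    @ss = StringScanner.new(text)
    @tokens = []
  end

  def tokenize
    until @ss.eos?
      @ss.skip(/[ \t\r]+/) # inline whitespace only; "\n" is left for scan_token
      break if @ss.eos?

      scan_token(@ss.pos)
    end
    push(:eof, nil, @ss.pos)
    @tokens
  end

  private

  # Tries each pattern at the current position and pushes exactly one token.
  def scan_token(offset)
    if @ss.scan(/\n/)
      push(:newline, "\n", offset)
    elsif @ss.scan(/#[^\n]*/)
      push(:comment, @ss.matched, offset)
    elsif @ss.scan(/\d+/)
      push(:integer, @ss.matched.to_i, offset)
    elsif @ss.scan(/[A-Za-z_]\w*/)
      push(:identifier, @ss.matched, offset)
    else
      # Fallback: consume one character so the loop always advances.
      push(:symbol, @ss.getch, offset)
    end
  end

  def push(type, value, offset)
    @tokens << SketchToken.new(type, value, offset)
  end
end

tokens = SketchLexer.new("input do # note\n  x 1\nend").tokenize
tokens.map(&:type)
# => [:identifier, :identifier, :comment, :newline, :identifier,
#     :integer, :newline, :identifier, :eof]
```

Note that `input` comes out as a plain identifier here. In this sketch, as in the design described in the overview, the lexer keeps no context stack and leaves the meaning of `input do … end` to the parser.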