Class: Kumi::Parser::Lexer

Inherits: Object

Defined in: lib/kumi/parser/lexer.rb

Overview

Turns source text into a flat array of Tokens in a single StringScanner pass. Inline whitespace is skipped. Newlines, which are significant statement separators, and comments are emitted as tokens so the parser can ignore them uniformly. Every token carries its start offset, so all location and error-frame work is deferred to Source.

The lexer is deliberately context-free: it does not track whether it is inside `input do … end`. Disambiguation that used to live in the old tokenizer’s context stack is the parser’s job now.
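Because each token records only a start offset, line and column numbers can be derived on demand. The following is a minimal sketch of that idea, not the actual Source implementation: `SketchSource` and `line_col` are hypothetical names, and offsets are assumed to be byte offsets, which is what `StringScanner#pos` returns.

```ruby
# Hypothetical stand-in for Source; the real class's API is not shown on this page.
class SketchSource
  attr_reader :text

  def initialize(text)
    @text = text
    # Byte offsets at which each line begins: 0, then one past every "\n" (byte 10).
    @line_starts = [0]
    text.bytes.each_with_index { |byte, i| @line_starts << i + 1 if byte == 10 }
  end

  # Converts a byte offset into a 1-based [line, column] pair.
  def line_col(offset)
    # Find the first line start beyond the offset; the offset lies on the line before it.
    idx = @line_starts.bsearch_index { |start| start > offset } || @line_starts.size
    [idx, offset - @line_starts[idx - 1] + 1]
  end
end

src = SketchSource.new("a = 1\nbb = 2\n")
src.line_col(0) # => [1, 1]
src.line_col(9) # => [2, 4]
```

Building the line-start table once keeps each lookup to a binary search, so the lexer itself never has to count lines while scanning.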

Instance Method Summary

#initialize(source) ⇒ Lexer (constructor)

A new instance of Lexer.

#tokenize ⇒ Object

Constructor Details

#initialize(source) ⇒ `Lexer`

Returns a new instance of Lexer.

# File 'lib/kumi/parser/lexer.rb', line 17

def initialize(source)
  @source = source
  @ss = StringScanner.new(source.text)
  @tokens = []
end

Instance Method Details

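#tokenize ⇒ `Object`

Scans the whole input and returns the accumulated token array, which always ends with an :eof token positioned at the end of the input.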

# File 'lib/kumi/parser/lexer.rb', line 23

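def tokenize
  until @ss.eos?
    skip_inline_whitespace
    # Trailing whitespace may have consumed the rest of the input.
    break if @ss.eos?

    # Capture the start position before the scanner advances.
    offset = @ss.pos
    scan_token(offset)
  end
  # Always terminate the stream with an :eof token at the final position.
  push(:eof, nil, @ss.pos)
  @tokens
end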
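
The loop above can be tried in isolation with stand-in helpers. The sketch below is only an illustration under stated assumptions: `SketchToken`, `SketchLexer`, the token kinds, and the regular expressions are all hypothetical, because the real `skip_inline_whitespace`, `scan_token`, and `push` are not shown on this page.

```ruby
require "strscan"

# Hypothetical token shape: kind, value, and start offset.
SketchToken = Struct.new(:kind, :value, :offset)

class SketchLexer
  def initialize(text)
    @ss = StringScanner.new(text)
    @tokens = []
  end

  # Same control flow as the tokenize method documented above.
  def tokenize
    until @ss.eos?
      skip_inline_whitespace
      break if @ss.eos?

      scan_token(@ss.pos)
    end
    push(:eof, nil, @ss.pos)
    @tokens
  end

  private

  # Assumed behaviour: spaces, tabs, and carriage returns are skipped; "\n" is not.
  def skip_inline_whitespace
    @ss.skip(/[ \t\r]+/)
  end

  # Assumed token rules, chosen only to make the sketch runnable.
  def scan_token(offset)
    if @ss.scan(/\n/)
      push(:newline, "\n", offset)
    elsif (text = @ss.scan(/#[^\n]*/))
      push(:comment, text, offset)
    elsif (text = @ss.scan(/\d+/))
      push(:integer, text.to_i, offset)
    elsif (text = @ss.scan(/[A-Za-z_]\w*/))
      push(:identifier, text, offset)
    else
      # getch always consumes one character, so the loop cannot stall.
      push(:unknown, @ss.getch, offset)
    end
  end

  def push(kind, value, offset)
    @tokens << SketchToken.new(kind, value, offset)
  end
end

tokens = SketchLexer.new("total 42 # sum\nx").tokenize
tokens.map(&:kind)
# => [:identifier, :integer, :comment, :newline, :identifier, :eof]
tokens.map(&:offset)
# => [0, 6, 9, 14, 15, 16]
```

The second `eos?` check matters when the input ends in inline whitespace: without it, `scan_token` would be called with nothing left to scan.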