Class: PackratParser

Inherits:
Object
Defined in:
lib/packrat_parser.rb,
lib/packrat_parser/base.rb,
lib/packrat_parser/parser.rb,
lib/packrat_parser/result.rb,
lib/packrat_parser/version.rb

Overview

A small packrat / PEG parser-combinator library whose grammar rules can be written with the for ... then comprehension from the Ruby fork.

Defined Under Namespace

Classes: Failure, ParseError, Parser, Rule, Success

Constant Summary

VERSION =
"0.1.0"

Class Method Details

.method_added(name) ⇒ Object

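Rewrite every method defined on a subclass so that calling it returns a lazy Rule wrapping the original method body. Two guards apply: the base class's own infrastructure (and initialize) is never rewritten, and the hook does not re-enter while the replacement is being installed, because define_method itself fires method_added. The first method rewritten also becomes the default start symbol unless one has already been set.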



# File 'lib/packrat_parser/base.rb', line 56

def self.method_added(name)
  return if self == PackratParser
  return if name == :initialize
  return if @__defining_rule

  @__defining_rule = true
  @start_symbol ||= name
  begin
    body = instance_method(name)
    define_method(name) do
      Rule.new(self, name, body)
    end
  ensure
    @__defining_rule = false
  end
end
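
The hook-and-guard pattern above can be tried in isolation. The following is a minimal sketch, not the library's code: LazyBase, Thunk and Greeter are hypothetical names, and Thunk only stands in for Rule by remembering the receiver, the method name and the original unbound method.

```ruby
# Hypothetical stand-in for Rule: remembers the receiver, the method name,
# and the original (unbound) body so it can be run later.
Thunk = Struct.new(:owner, :name, :body) do
  def force(*args)
    body.bind(owner).call(*args)
  end
end

class LazyBase
  def self.method_added(name)
    return if self == LazyBase       # leave the base class alone
    return if name == :initialize
    return if @__defining            # define_method below re-fires this hook

    @__defining = true
    begin
      body = instance_method(name)   # capture the original body first
      define_method(name) { Thunk.new(self, name, body) }
    ensure
      @__defining = false
    end
  end
end

class Greeter < LazyBase
  def hello
    "hello"
  end
end

thunk = Greeter.new.hello
puts thunk.class   # Thunk
puts thunk.force   # hello
```

Without the @__defining guard, the define_method call would trigger method_added again for the same name and recurse.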

.parse(input) ⇒ Object

Convenience: parse input with a fresh instance.



# File 'lib/packrat_parser/base.rb', line 48

def self.parse(input)
  new.parse(input)
end

.skip_whitespace(pattern = /\s+/) ⇒ Object

Enable implicit whitespace skipping (Scala's RegexParsers mode). When set, every term skips leading whitespace matching pattern before attempting its match, and parse also consumes trailing whitespace before requiring full input consumption. Off by default (terminals match exactly).

class CalcParser < PackratParser
  skip_whitespace              # default /\s+/
  # skip_whitespace(/[ \t]+/)  # or a custom pattern
end


# File 'lib/packrat_parser/base.rb', line 36

def self.skip_whitespace(pattern = /\s+/)
  @__whitespace = pattern
end

.start_symbol(name = nil) ⇒ Object

Set the rule the parser starts from, or return the current one when called without an argument. If no start symbol is set explicitly, the first method defined on the subclass is used.



# File 'lib/packrat_parser/base.rb', line 19

def self.start_symbol(name = nil)
  if name
    @start_symbol = name
  else
    @start_symbol
  end
end
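
How the explicit setting interacts with the first-method default can be sketched as follows. MiniGrammar, Implicit and Explicit are hypothetical names, and the hook here only records the start symbol; it leaves out the rule rewriting that the real method_added performs.

```ruby
class MiniGrammar
  # Combined setter/getter, same shape as start_symbol above.
  def self.start_symbol(name = nil)
    name ? (@start_symbol = name) : @start_symbol
  end

  # Simplified hook: only remembers the first method defined.
  def self.method_added(name)
    @start_symbol ||= name
  end
end

class Implicit < MiniGrammar
  def expr; end
  def atom; end
end

class Explicit < MiniGrammar
  start_symbol :atom   # set before any rule is defined, so ||= keeps it
  def expr; end
  def atom; end
end

p Implicit.start_symbol  # :expr
p Explicit.start_symbol  # :atom
```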

.whitespace ⇒ Object

The configured whitespace pattern, or nil when skipping is disabled. Inherited by subclasses so a base parser can turn the mode on once.



# File 'lib/packrat_parser/base.rb', line 42

def self.whitespace
  return @__whitespace if defined?(@__whitespace)
  superclass.respond_to?(:whitespace) ? superclass.whitespace : nil
end
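
The superclass lookup can be seen with a small stand-alone hierarchy. This is a sketch with hypothetical class names (BaseGrammar, Spaced, Child, Tabs); the two class methods are copied from the listings above.

```ruby
class BaseGrammar
  def self.skip_whitespace(pattern = /\s+/)
    @__whitespace = pattern
  end

  def self.whitespace
    return @__whitespace if defined?(@__whitespace)
    superclass.respond_to?(:whitespace) ? superclass.whitespace : nil
  end
end

class Spaced < BaseGrammar
  skip_whitespace                  # turn the mode on once
end

class Child < Spaced; end          # inherits /\s+/ from Spaced

class Tabs < Spaced
  skip_whitespace(/[ \t]+/)        # overrides for this subclass only
end

p BaseGrammar.whitespace  # nil
p Child.whitespace        # /\s+/
p Tabs.whitespace         # /[ \t]+/
```

Because the pattern lives in a class-level instance variable, each class stores its own setting and the lookup walks up the hierarchy only when a class has none.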

Instance Method Details

#__memo ⇒ Object

Per-input packrat memo table, keyed by [rule_name, pos].



# File 'lib/packrat_parser/base.rb', line 74

def __memo
  @__memo ||= {}
end
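
How Rule consults this table is not shown in this section. As an illustration only, a table keyed by [rule_name, pos] could be used like this; memoized is a hypothetical helper, not part of the library.

```ruby
# Hypothetical helper: run the block at most once per [rule_name, pos] pair.
def memoized(memo, rule_name, pos)
  key = [rule_name, pos]
  return memo[key] if memo.key?(key)   # key? also caches nil results
  memo[key] = yield
end

memo  = {}
calls = 0
digits = lambda do |input, pos|
  memoized(memo, :digits, pos) do
    calls += 1
    input[pos..][/\A\d+/]
  end
end

digits.call("123+4", 0)   # computed
digits.call("123+4", 0)   # served from the memo table
digits.call("123+4", 4)   # different position, computed again
puts calls                # 2
```

Caching per rule and position is what keeps a packrat parser from re-running the same rule at the same offset after backtracking.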

#__skip_ws(ws, input, pos) ⇒ Object

Advance pos past whitespace matched by the anchored regexp ws (nil when skipping is disabled). Returns the new position.



# File 'lib/packrat_parser/base.rb', line 114

def __skip_ws(ws, input, pos)
  return pos unless ws
  (m = ws.match(input, pos)) ? pos + m[0].length : pos
end
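
The \G anchor matters here because Regexp#match with a start position otherwise searches forward from that position. A small sketch, using a top-level copy of the method under the hypothetical name skip_ws:

```ruby
def skip_ws(ws, input, pos)
  return pos unless ws
  (m = ws.match(input, pos)) ? pos + m[0].length : pos
end

anchored = /\G(?:\s+)/

p skip_ws(anchored, "a  b", 1)  # 3: the two spaces start exactly at index 1
p skip_ws(anchored, "a  b", 0)  # 0: index 0 is "a", so nothing is skipped
p skip_ws(/\s+/,    "a  b", 0)  # 2: unanchored search finds the spaces at index 1, giving a wrong position
p skip_ws(nil,      "a  b", 1)  # 1: skipping disabled
```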

#parse(input) ⇒ Object

Parse input starting from the configured start symbol. Returns the parsed value on success; raises ParseError on failure or on leftover input.

Raises:
  • (ParseError)



# File 'lib/packrat_parser/base.rb', line 127

def parse(input)
  @__memo = {}
  name = self.class.start_symbol
  raise ParseError.new("no start symbol defined", 0) unless name

  result = send(name).call(input, 0)
  unless result.success?
    raise ParseError.new(result.message, result.pos)
  end
  # The last terminal skips only *leading* whitespace, so trailing whitespace
  # after the final token is left for parse to consume before requiring that
  # all input was used.
  ws = self.class.whitespace
  end_pos = __skip_ws(ws && /\G(?:#{ws})/, input, result.pos)
  if end_pos < input.length
    raise ParseError.new("unexpected trailing input", end_pos)
  end
  result.value
end
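
The end-of-input check can be sketched on its own. Everything in ParseSketch is hypothetical: the ParseError definition is assumed (the listing only shows that it is constructed with a message and a position), and check_consumed takes the position where the start rule stopped instead of running a grammar.

```ruby
module ParseSketch
  # Assumed shape: the real ParseError is only known to take (message, pos).
  class ParseError < StandardError
    attr_reader :pos

    def initialize(message, pos)
      super("#{message} at #{pos}")
      @pos = pos
    end
  end

  # end_pos is where the start rule stopped; ws is the optional pattern.
  def self.check_consumed(input, end_pos, ws = nil)
    if ws && (m = /\G(?:#{ws})/.match(input, end_pos))
      end_pos += m[0].length           # consume trailing whitespace
    end
    raise ParseError.new("unexpected trailing input", end_pos) if end_pos < input.length
    end_pos
  end
end

p ParseSketch.check_consumed("1+2  ", 3, /\s+/)   # 5

begin
  ParseSketch.check_consumed("1+2 x", 3, /\s+/)
rescue ParseSketch::ParseError => e
  puts e.message                                  # unexpected trailing input at 4
end
```

With whitespace skipping off, the same trailing spaces count as leftover input and the error is reported at position 3.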

#pure(value) ⇒ Object

A parser that succeeds with value without consuming any input (monadic unit / Scala's success).



121
122
123
# File 'lib/packrat_parser/base.rb', line 121

def pure(value)
  Parser.new { |_input, pos| Success.new(value, pos) }
end
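
A self-contained sketch of the same idea, assuming minimal stand-ins for Parser and Success (both defined here inside a hypothetical PureSketch module; the library's own classes are not shown in this section):

```ruby
module PureSketch
  Success = Struct.new(:value, :pos)

  # Hypothetical stand-in for Parser: wraps a block taking (input, pos).
  class Parser
    def initialize(&block)
      @block = block
    end

    def call(input, pos)
      @block.call(input, pos)
    end
  end

  def self.pure(value)
    Parser.new { |_input, pos| Success.new(value, pos) }
  end
end

result = PureSketch.pure(:unit).call("abc", 2)
p result.value  # :unit
p result.pos    # 2
```

The position comes back unchanged, which is what makes pure usable as a neutral element when sequencing parsers.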

#term(pattern) ⇒ Object

A terminal parser. A String matches that exact literal at the current position; a Regexp is matched anchored at the current position. The matched substring is the parser's value.

When the class enables skip_whitespace, leading whitespace is consumed before the match is attempted, mirroring Scala's RegexParsers.



# File 'lib/packrat_parser/base.rb', line 84

def term(pattern)
  ws = self.class.whitespace
  ws = /\G(?:#{ws})/ if ws
  case pattern
  when String
    Parser.new do |input, pos|
      pos = __skip_ws(ws, input, pos)
      if input[pos, pattern.length] == pattern
        Success.new(pattern, pos + pattern.length)
      else
        Failure.new(pos, "expected #{pattern.inspect}")
      end
    end
  when Regexp
    anchored = /\G(?:#{pattern})/
    Parser.new do |input, pos|
      pos = __skip_ws(ws, input, pos)
      if (m = anchored.match(input, pos))
        Success.new(m[0], pos + m[0].length)
      else
        Failure.new(pos, "expected #{pattern.inspect}")
      end
    end
  else
    raise ArgumentError, "term expects a String or Regexp, got #{pattern.class}"
  end
end
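
The matching behaviour can be exercised without the rest of the library. The sketch below is an approximation under stated assumptions: TermSketch and its Success/Failure structs are hypothetical stand-ins, the terminal is returned as a plain lambda rather than a Parser, whitespace skipping is passed as a ws: keyword instead of being read from the class, and String literals are turned into escaped regexps for brevity where the method above compares substrings directly.

```ruby
module TermSketch
  Success = Struct.new(:value, :pos)
  Failure = Struct.new(:pos, :message)

  def self.term(pattern, ws: nil)
    ws &&= /\G(?:#{ws})/
    anchored =
      if pattern.is_a?(Regexp)
        /\G(?:#{pattern})/
      else
        /\G#{Regexp.escape(pattern)}/   # swapped in for the substring compare
      end

    lambda do |input, pos|
      if ws && (m = ws.match(input, pos))
        pos += m[0].length              # skip leading whitespace only
      end
      if (m = anchored.match(input, pos))
        Success.new(m[0], pos + m[0].length)
      else
        Failure.new(pos, "expected #{pattern.inspect}")
      end
    end
  end
end

number = TermSketch.term(/\d+/)
p number.call("12+3", 0).to_a        # ["12", 2]
puts number.call("12+3", 2).message  # expected /\d+/

plus = TermSketch.term("+", ws: /\s+/)
p plus.call("12  + 3", 2).to_a       # ["+", 5]
```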