Class: PackratParser
- Inherits:
- Object
- Defined in:
- lib/packrat_parser.rb,
lib/packrat_parser/base.rb,
lib/packrat_parser/parser.rb,
lib/packrat_parser/result.rb,
lib/packrat_parser/version.rb
Overview
A small packrat / PEG parser-combinator library whose grammar rules can be
written with the for ... then comprehension from the Ruby fork.
Defined Under Namespace
Classes: Failure, ParseError, Parser, Rule, Success
Constant Summary
- VERSION = "0.1.0"
Class Method Summary
- .method_added(name) ⇒ Object
  Rewrite every method defined on a subclass into a rule that returns a lazy Rule.
- .parse(input) ⇒ Object
  Convenience: parse input with a fresh instance.
- .skip_whitespace(pattern = /\s+/) ⇒ Object
  Enable implicit whitespace skipping (Scala's RegexParsers mode).
- .start_symbol(name = nil) ⇒ Object
  Set (or read) the rule the parser starts from.
- .whitespace ⇒ Object
  The configured whitespace pattern, or nil when skipping is disabled.
Instance Method Summary
- #__memo ⇒ Object
  Per-input packrat memo table, keyed by [rule_name, pos].
- #__skip_ws(ws, input, pos) ⇒ Object
  Advance pos past whitespace matched by the anchored regexp ws (nil when skipping is disabled).
- #parse(input) ⇒ Object
  Parse input starting from the configured start symbol.
- #pure(value) ⇒ Object
  A parser that succeeds with value without consuming any input (monadic unit / Scala's success).
- #term(pattern) ⇒ Object
  A terminal parser.
Class Method Details
.method_added(name) ⇒ Object
Rewrite every method defined on a subclass into a rule that returns a lazy Rule. Guards against rewriting the base class's own infrastructure and against re-entering while we install the replacement (define_method itself fires method_added).
# File 'lib/packrat_parser/base.rb', line 56

def self.method_added(name)
  return if self == PackratParser
  return if name == :initialize
  return if @__defining_rule

  @__defining_rule = true
  @start_symbol ||= name
  begin
    body = instance_method(name)
    define_method(name) do
      Rule.new(self, name, body)
    end
  ensure
    @__defining_rule = false
  end
end
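The hook pattern described above can be tried in isolation. The following is a minimal sketch, not the library's code: HookBase, HookGrammar and HookRule are hypothetical stand-ins (HookRule only records what it was given), used to show that the re-entrancy guard keeps define_method from triggering a second rewrite and that the first method defined becomes the start symbol.

```ruby
# Hypothetical stand-in for the library's Rule; it only records its arguments.
HookRule = Struct.new(:owner, :name, :body)

class HookBase
  def self.method_added(name)
    return if self == HookBase       # never rewrite the base class itself
    return if name == :initialize
    return if @__defining_rule       # define_method below re-fires this hook

    @__defining_rule = true
    @start_symbol ||= name           # first method defined wins
    begin
      body = instance_method(name)   # capture the original as an UnboundMethod
      define_method(name) { HookRule.new(self, name, body) }
    ensure
      @__defining_rule = false
    end
  end

  def self.start_symbol
    @start_symbol
  end
end

class HookGrammar < HookBase
  def expr
    :expr_body
  end

  def number
    :number_body
  end
end

grammar = HookGrammar.new
rule = grammar.number
puts rule.name.inspect                    # :number
puts rule.body.class                      # UnboundMethod
puts rule.body.bind(grammar).call.inspect # :number_body (the original body)
puts HookGrammar.start_symbol.inspect     # :expr
```

Capturing the body with instance_method before calling define_method is what keeps the original definition reachable after the method name has been replaced.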
.parse(input) ⇒ Object
Convenience: parse input with a fresh instance.
# File 'lib/packrat_parser/base.rb', line 48

def self.parse(input)
  new.parse(input)
end
.skip_whitespace(pattern = /\s+/) ⇒ Object
Enable implicit whitespace skipping (Scala's RegexParsers mode). When set,
every term skips leading whitespace matching pattern before attempting
its match, and parse also consumes trailing whitespace before requiring
full input consumption. Off by default (terminals match exactly).
class CalcParser < PackratParser
skip_whitespace # default /\s+/
# skip_whitespace(/[ \t]+/) # or a custom pattern
end
# File 'lib/packrat_parser/base.rb', line 36

def self.skip_whitespace(pattern = /\s+/)
  @__whitespace = pattern
end
.start_symbol(name = nil) ⇒ Object
Set (or read) the rule the parser starts from. If omitted, the first defined method is used as the start symbol.
# File 'lib/packrat_parser/base.rb', line 19

def self.start_symbol(name = nil)
  if name
    @start_symbol = name
  else
    @start_symbol
  end
end
.whitespace ⇒ Object
The configured whitespace pattern, or nil when skipping is disabled. Inherited by subclasses so a base parser can turn the mode on once.
# File 'lib/packrat_parser/base.rb', line 42

def self.whitespace
  return @__whitespace if defined?(@__whitespace)
  superclass.respond_to?(:whitespace) ? superclass.whitespace : nil
end
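To illustrate the inheritance behaviour, here is a small sketch that copies the two class methods onto a hypothetical WsBase class (so it runs without the library) and checks the lookup from several subclasses. All class names below are made up for the example.

```ruby
# Hypothetical base class carrying copies of the two class methods.
class WsBase
  def self.skip_whitespace(pattern = /\s+/)
    @__whitespace = pattern
  end

  def self.whitespace
    # A class-level instance variable is not inherited, so fall back to the
    # superclass when this class never set its own pattern.
    return @__whitespace if defined?(@__whitespace)
    superclass.respond_to?(:whitespace) ? superclass.whitespace : nil
  end
end

class SpacedBase < WsBase
  skip_whitespace              # turn the mode on once, default /\s+/
end

class ChildParser < SpacedBase; end   # inherits /\s+/ from SpacedBase
class StrictParser < WsBase; end      # never enabled: nil

class TabsAndSpaces < SpacedBase
  skip_whitespace(/[ \t]+/)    # overrides the inherited pattern
end

p ChildParser.whitespace       # /\s+/
p StrictParser.whitespace      # nil
p TabsAndSpaces.whitespace     # /[ \t]+/
```

Because the check uses defined? rather than truthiness, a class that explicitly stores nil would stop the lookup there instead of consulting its superclass.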
Instance Method Details
#__memo ⇒ Object
Per-input packrat memo table, keyed by [rule_name, pos].
# File 'lib/packrat_parser/base.rb', line 74

def __memo
  @__memo ||= {}
end
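How Rule consults this table is not shown on this page. As a rough sketch of the general packrat idea only, the hypothetical MemoDemo class below caches each rule result under a [rule_name, pos] key so that re-trying the same rule at the same position does no new work. The memoized helper, the digits rule and the calls counter are all assumptions made for illustration.

```ruby
class MemoDemo
  attr_reader :calls

  def initialize
    @calls = Hash.new(0)   # how often each rule body actually ran
  end

  def memo
    @memo ||= {}
  end

  # Run the block at most once per (rule_name, pos) pair.
  # key? is used so that a cached nil (a failed match) also counts as a hit.
  def memoized(rule_name, pos)
    key = [rule_name, pos]
    return memo[key] if memo.key?(key)

    @calls[rule_name] += 1
    memo[key] = yield
  end

  # Toy rule: one or more digits at pos; returns [value, next_pos] or nil.
  def digits(input, pos)
    memoized(:digits, pos) do
      m = /\G\d+/.match(input, pos)
      m && [m[0], pos + m[0].length]
    end
  end
end

demo = MemoDemo.new
p demo.digits("123+4", 0)   # ["123", 3]
p demo.digits("123+4", 0)   # ["123", 3], served from the memo
p demo.digits("123+4", 4)   # ["4", 5], new position so the rule runs again
p demo.calls[:digits]       # 2
```

The table is only valid for one input string, which is consistent with parse resetting @__memo at the start of every call.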
#__skip_ws(ws, input, pos) ⇒ Object
Advance pos past whitespace matched by the anchored regexp ws (nil when
skipping is disabled). Returns the new position.
# File 'lib/packrat_parser/base.rb', line 114

def __skip_ws(ws, input, pos)
  return pos unless ws
  (m = ws.match(input, pos)) ? pos + m[0].length : pos
end
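The requirement that ws be anchored matters because Regexp#match with a start position otherwise scans forward for the first match. The sketch below copies the method body into a standalone skip_ws function (a hypothetical name) and compares an anchored pattern, built the same way term and parse build theirs, with an unanchored one.

```ruby
# Standalone copy of the method body, for experimentation only.
def skip_ws(ws, input, pos)
  return pos unless ws
  (m = ws.match(input, pos)) ? pos + m[0].length : pos
end

pattern  = /\s+/
anchored = /\G(?:#{pattern})/   # \G pins the match to the start position
input    = "a   b"              # "a", three spaces, "b"

p skip_ws(anchored, input, 1)   # 4: the three spaces are skipped
p skip_ws(anchored, input, 0)   # 0: no whitespace at position 0
p skip_ws(nil, input, 1)        # 1: skipping disabled

# Unanchored, the match is found later in the string (at index 1), but its
# length is still added to the starting position, giving a wrong result.
p skip_ws(pattern, input, 0)    # 3
```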
#parse(input) ⇒ Object
Parse input starting from the configured start symbol. Returns the parsed
value on success; raises ParseError on failure or on leftover input.
# File 'lib/packrat_parser/base.rb', line 127

def parse(input)
  @__memo = {}
  name = self.class.start_symbol
  raise ParseError.new("no start symbol defined", 0) unless name
  result = send(name).call(input, 0)
  unless result.success?
    raise ParseError.new(result., result.pos)
  end

  # The last terminal skips only *leading* whitespace, so trailing whitespace
  # after the final token is left for parse to consume before requiring that
  # all input was used.
  ws = self.class.whitespace
  end_pos = __skip_ws(ws && /\G(?:#{ws})/, input, result.pos)
  if end_pos < input.length
    raise ParseError.new("unexpected trailing input", end_pos)
  end
  result.value
end
#pure(value) ⇒ Object
A parser that succeeds with value without consuming any input (monadic
unit / Scala's success).
# File 'lib/packrat_parser/base.rb', line 121

def pure(value)
  Parser.new { |_input, pos| Success.new(value, pos) }
end
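A quick way to see what "without consuming any input" means is to run the same one-liner against stand-in classes. PureParser and PureSuccess below are hypothetical minimal substitutes for the library's Parser and Success, assumed to take a block and (value, pos) respectively, as the listing suggests.

```ruby
# Hypothetical stand-ins, just enough to execute the body of pure.
PureSuccess = Struct.new(:value, :pos)

class PureParser
  def initialize(&block)
    @block = block
  end

  def call(input, pos)
    @block.call(input, pos)
  end
end

def demo_pure(value)
  PureParser.new { |_input, pos| PureSuccess.new(value, pos) }
end

result = demo_pure(42).call("abc", 1)
p result.value   # 42
p result.pos     # 1: the position is handed back unchanged
```

The input is ignored entirely, so the parser succeeds even at the end of the string or on empty input.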
#term(pattern) ⇒ Object
A terminal parser. A String matches that exact literal at the current position; a Regexp is matched anchored at the current position. The matched substring is the parser's value.
When the class enables skip_whitespace, leading whitespace is consumed
before the match is attempted, mirroring Scala's RegexParsers.
# File 'lib/packrat_parser/base.rb', line 84

def term(pattern)
  ws = self.class.whitespace
  ws = /\G(?:#{ws})/ if ws
  case pattern
  when String
    Parser.new do |input, pos|
      pos = __skip_ws(ws, input, pos)
      if input[pos, pattern.length] == pattern
        Success.new(pattern, pos + pattern.length)
      else
        Failure.new(pos, "expected #{pattern.inspect}")
      end
    end
  when Regexp
    anchored = /\G(?:#{pattern})/
    Parser.new do |input, pos|
      pos = __skip_ws(ws, input, pos)
      if (m = anchored.match(input, pos))
        Success.new(m[0], pos + m[0].length)
      else
        Failure.new(pos, "expected #{pattern.inspect}")
      end
    end
  else
    raise ArgumentError, "term expects a String or Regexp, got #{pattern.class}"
  end
end
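The behaviour of both terminal kinds, with and without whitespace skipping, can be explored with a simplified standalone version. This is a sketch under assumptions: demo_term returns a plain lambda instead of a Parser, takes the whitespace pattern as a keyword argument instead of reading it from the class, and uses hypothetical TermSuccess and TermFailure structs (the reason field name is invented).

```ruby
# Hypothetical result types; the field name `reason` is an assumption.
TermSuccess = Struct.new(:value, :pos)
TermFailure = Struct.new(:pos, :reason)

# Simplified terminal: returns a lambda taking (input, pos).
def demo_term(pattern, ws: nil)
  unless pattern.is_a?(String) || pattern.is_a?(Regexp)
    raise ArgumentError, "term expects a String or Regexp, got #{pattern.class}"
  end

  ws = /\G(?:#{ws})/ if ws
  # A String is escaped so it matches literally; both kinds are anchored.
  source   = pattern.is_a?(Regexp) ? pattern : Regexp.escape(pattern)
  anchored = /\G(?:#{source})/

  lambda do |input, pos|
    if ws && (skipped = ws.match(input, pos))
      pos += skipped[0].length       # consume leading whitespace first
    end
    if (m = anchored.match(input, pos))
      TermSuccess.new(m[0], pos + m[0].length)
    else
      TermFailure.new(pos, "expected #{pattern.inspect}")
    end
  end
end

plus   = demo_term("+")
number = demo_term(/\d+/, ws: /\s+/)
strict = demo_term(/\d+/)

p plus.call("1+2", 1)       # success: value "+", next position 2
p number.call("  42 ", 0)   # success: value "42", next position 4
p strict.call("  42", 0)    # failure at position 0, nothing was skipped
```

Note that the trailing space after 42 is left unconsumed, which is why parse performs one more whitespace skip before checking for leftover input.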