Class: Rouge::Lexer (abstract)
- Inherits: Object
  - Object
  - Rouge::Lexer
- Includes: Token::Tokens
- Defined in: lib/rouge/lexer.rb
Overview
A lexer transforms text into a stream of `[token, chunk]` pairs.
Direct Known Subclasses
Rouge::Lexers::ConsoleLexer, Rouge::Lexers::Escape, Rouge::Lexers::PlainText, RegexLexer
Constant Summary
Constants included from Token::Tokens
Token::Tokens::Num, Token::Tokens::Str
Instance Attribute Summary
-
#options ⇒ Object
readonly
Class Method Summary
-
.aliases(*args) ⇒ Object
Used to specify alternate names this lexer class may be found by.
-
.all ⇒ Object
A list of all lexers.
- .assert_utf8!(str) ⇒ Object
-
.continue_lex(*a, &b) ⇒ Object
In case #continue_lex is called statically, we simply begin a new lex from the beginning, since there is no state.
- .debug_enabled? ⇒ Boolean
-
.demo(arg = :absent) ⇒ Object
Specify or get a small demo string for this lexer.
-
.demo_file(arg = :absent) ⇒ Object
Specify or get the path name containing a small demo for this lexer (can be overridden by Lexer.demo).
-
.desc(arg = :absent) ⇒ Object
Specify or get this lexer’s description.
-
.detect?(text) ⇒ Boolean
abstract
Return true if there is an in-text indication (such as a shebang or DOCTYPE declaration) that this lexer should be used.
-
.detectable? ⇒ Boolean
Determine if a lexer has a method named :detect? defined in its singleton class.
- .disable_debug! ⇒ Object
- .eager_load! ⇒ Object
- .enable_debug! ⇒ Object
-
.filenames(*fnames) ⇒ Object
Specify a list of filename globs associated with this lexer.
-
.find(name) ⇒ Class<Rouge::Lexer>?
Given a name in string, return the correct lexer class.
-
.find_fancy(str, code = nil, default_options = {}) ⇒ Object
Find a lexer, with fancy shiny features.
-
.guess(info = {}, &fallback) ⇒ Class<Rouge::Lexer>
Guess which lexer to use based on a hash of info.
- .guess_by_filename(fname) ⇒ Object
- .guess_by_mimetype(mt) ⇒ Object
- .guess_by_source(source) ⇒ Object
-
.guesses(info = {}) ⇒ Object
Guess which lexer to use based on a hash of info.
- .lazy(auto: true, &block) ⇒ Object
-
.lex(stream, opts = {}, &b) ⇒ Object
Lexes `stream` with the given options.
-
.lookup_fancy(str, code = nil, default_options = {}) ⇒ Object
Same as ::find_fancy, except instead of returning an instantiated lexer, returns a pair of [lexer_class, options], so that you can modify or provide additional options to the lexer.
-
.mimetypes(*mts) ⇒ Object
Specify a list of mimetypes associated with this lexer.
- .option(name, desc) ⇒ Object
- .option_docs ⇒ Object
- .skip_auto_load? ⇒ Boolean
-
.tag(t = nil) ⇒ Object
Used to specify or get the canonical name of this lexer class.
-
.title(t = nil) ⇒ Object
Specify or get this lexer’s title.
Instance Method Summary
- #as_bool(val) ⇒ Object
- #as_lexer(val) ⇒ Object
- #as_list(val) ⇒ Object
- #as_string(val) ⇒ Object
- #as_token(val) ⇒ Object
- #bool_option(name, &default) ⇒ Object
-
#continue_lex(string) {|last_token, last_val| ... } ⇒ Object
Continue the lex from the current state without resetting.
- #eager_load! ⇒ Object
- #hash_option(name, defaults, &val_cast) ⇒ Object
-
#initialize(opts = {}) ⇒ Lexer
constructor
Create a new lexer with the given options.
-
#lex(string, opts = nil, &b) ⇒ Object
Given a string, yield [token, chunk] pairs.
- #lexer_option(name, &default) ⇒ Object
- #list_option(name, &default) ⇒ Object
-
#reset! ⇒ Object
abstract
Called after each lex is finished.
-
#stream_tokens(stream, &b) ⇒ Object
abstract
Yield `[token, chunk]` pairs, given a prepared input stream.
- #string_option(name, &default) ⇒ Object
-
#tag ⇒ Object
Delegated to Lexer.tag.
- #token_option(name, &default) ⇒ Object
-
#with(opts = {}) ⇒ Object
Returns a new lexer with the given options set.
Methods included from Token::Tokens
Constructor Details
#initialize(opts = {}) ⇒ Lexer
Create a new lexer with the given options. Individual lexers may specify extra options. The only current globally accepted option is `:debug`.
# File 'lib/rouge/lexer.rb', line 346
def initialize(opts={})
  @options = {}
  opts.each { |k, v| @options[k.to_s] = v }

  eager_load! unless self.class.skip_auto_load?
  @debug = Lexer.debug_enabled? && bool_option('debug')
end
Instance Attribute Details
#options ⇒ Object (readonly)
# File 'lib/rouge/lexer.rb', line 336
def options
  @options
end
Class Method Details
.aliases(*args) ⇒ Object
Used to specify alternate names this lexer class may be found by.
# File 'lib/rouge/lexer.rb', line 284
def aliases(*args)
  args.map!(&:to_s)
  args.each { |arg| Lexer.register(arg, self) }
  (@aliases ||= []).concat(args)
end
.all ⇒ Object
Returns a list of all lexers.
# File 'lib/rouge/lexer.rb', line 138
def all
  @all ||= registry.values.uniq
end
.assert_utf8!(str) ⇒ Object
# File 'lib/rouge/lexer.rb', line 318
def assert_utf8!(str)
  encoding = str.encoding
  return if encoding == Encoding::US_ASCII || encoding == Encoding::UTF_8 || encoding == Encoding::BINARY

  raise EncodingError.new(
    "Bad encoding: #{str.encoding.names.join(',')}. " +
    "Please convert your string to UTF-8."
  )
end
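The encoding check above can be reproduced outside Rouge to see which strings would pass it. The following is a minimal sketch, not Rouge code: `lexable_encoding?` is a hypothetical helper that mirrors the accepted encodings but returns a boolean instead of raising, and the Latin-1 input is an assumed example that is transcoded with Ruby's `String#encode`.

```ruby
# Hypothetical helper (not part of Rouge): mirrors the encodings that
# assert_utf8! accepts, but returns true/false instead of raising.
def lexable_encoding?(str)
  [Encoding::US_ASCII, Encoding::UTF_8, Encoding::BINARY].include?(str.encoding)
end

# A Latin-1 string would be rejected as-is...
latin1 = "caf\xE9".b.force_encoding(Encoding::ISO_8859_1)

# ...so transcode it to UTF-8 before handing it to a lexer.
utf8 = latin1.encode(Encoding::UTF_8)
```

Transcoding with `encode` converts the bytes, whereas `force_encoding` only relabels them, so `encode` is usually the safer choice when the source encoding is known.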
.continue_lex(*a, &b) ⇒ Object
In case #continue_lex is called statically, we simply begin a new lex from the beginning, since there is no state.
# File 'lib/rouge/lexer.rb', line 25
def continue_lex(*a, &b)
  lex(*a, &b)
end
.debug_enabled? ⇒ Boolean
# File 'lib/rouge/lexer.rb', line 212
def debug_enabled?
  (defined? @debug_enabled) ? true : false
end
.demo(arg = :absent) ⇒ Object
Specify or get a small demo string for this lexer.
# File 'lib/rouge/lexer.rb', line 131
def demo(arg=:absent)
  return @demo = arg unless arg == :absent

  @demo ||= File.read(demo_file, mode: 'rt:bom|utf-8')
end
.demo_file(arg = :absent) ⇒ Object
Specify or get the path name containing a small demo for this lexer (can be overridden by demo).
# File 'lib/rouge/lexer.rb', line 124
def demo_file(arg=:absent)
  return @demo_file = Pathname.new(arg) unless arg == :absent

  @demo_file ||= Pathname.new(File.join(__dir__, 'demos', tag))
end
.desc(arg = :absent) ⇒ Object
Specify or get this lexer’s description.
# File 'lib/rouge/lexer.rb', line 106
def desc(arg=:absent)
  if arg == :absent
    @desc
  else
    @desc = arg
  end
end
.detect?(text) ⇒ Boolean
Return true if there is an in-text indication (such as a shebang or DOCTYPE declaration) that this lexer should be used.
# File 'lib/rouge/lexer.rb', line 548
def self.detect?(text)
  false
end
.detectable? ⇒ Boolean
Determine if a lexer has a method named :detect? defined in its singleton class.
# File 'lib/rouge/lexer.rb', line 218
def detectable?
  return @detectable if defined?(@detectable)
  @detectable = singleton_methods(false).include?(:detect?)
end
.disable_debug! ⇒ Object
# File 'lib/rouge/lexer.rb', line 208
def disable_debug!
  remove_instance_variable :@debug_enabled if defined? @debug_enabled
end
.eager_load! ⇒ Object
# File 'lib/rouge/lexer.rb', line 236
def eager_load!
  return if @_loaded
  @_loaded = true

  lazy_procs.each { |b| instance_eval(&b) }
  superclass.eager_load! unless superclass == Lexer

  self
end
.enable_debug! ⇒ Object
# File 'lib/rouge/lexer.rb', line 204
def enable_debug!
  @debug_enabled = true
end
.filenames(*fnames) ⇒ Object
Specify a list of filename globs associated with this lexer.
If a filename glob is associated with more than one lexer, this can cause a Guesser::Ambiguous error to be raised in various guessing methods. These errors can be avoided by disambiguation. Filename globs are disambiguated in one of two ways. Either the lexer will define a `self.detect?` method (intended for use with shebangs and doctypes) or a manual rule will be specified in Guessers::Disambiguation.
# File 'lib/rouge/lexer.rb', line 303
def filenames(*fnames)
  (@filenames ||= []).concat(fnames)
end
.find(name) ⇒ Class<Rouge::Lexer>?
Given a name in string, return the correct lexer class.
# File 'lib/rouge/lexer.rb', line 32
def find(name)
  registry[name.to_s]
end
.find_fancy(str, code = nil, default_options = {}) ⇒ Object
Find a lexer, with fancy shiny features.
- The string you pass can include CGI-style options:
  Lexer.find_fancy('erb?parent=tex')
- You can pass the special name 'guess' so we guess for you, and you can pass a second argument of the code to guess by:
  Lexer.find_fancy('guess', "#!/bin/bash\necho Hello, world")
  If the code matches more than one lexer then Guesser::Ambiguous is raised.
This is used in the Redcarpet plugin as well as Rouge's own markdown lexer for highlighting internal code blocks.
# File 'lib/rouge/lexer.rb', line 91
def find_fancy(str, code=nil, default_options={})
  lexer_class, opts = lookup_fancy(str, code, default_options)

  lexer_class && lexer_class.new(opts)
end
.guess(info = {}, &fallback) ⇒ Class<Rouge::Lexer>
Guess which lexer to use based on a hash of info.
# File 'lib/rouge/lexer.rb', line 179
def guess(info={}, &fallback)
  lexers = guesses(info)

  return Lexers::PlainText if lexers.empty?
  return lexers[0] if lexers.size == 1

  if fallback
    yield(lexers)
  else
    raise Guesser::Ambiguous.new(lexers)
  end
end
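The three outcomes of guess (no candidate, exactly one, several) can be illustrated without Rouge. This is a standalone sketch under stated assumptions: `pick_lexer` and `AmbiguousGuess` are hypothetical names, and plain symbols stand in for lexer classes.

```ruby
# Hypothetical stand-in for Guesser::Ambiguous.
class AmbiguousGuess < StandardError; end

# Hypothetical helper (not part of Rouge): applies the same decision
# steps as Lexer.guess to an already-computed list of candidates.
def pick_lexer(candidates, &fallback)
  return :plaintext if candidates.empty?          # nothing matched
  return candidates[0] if candidates.size == 1    # unambiguous match

  if fallback
    yield(candidates)   # let the caller choose among several candidates
  else
    raise AmbiguousGuess, "ambiguous: #{candidates.inspect}"
  end
end

# Resolving an ambiguous guess by taking the first candidate.
chosen = pick_lexer([:c, :cpp]) { |lexers| lexers.first }
```

Passing a block is therefore the way to avoid the ambiguity error when more than one lexer matches.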
.guess_by_filename(fname) ⇒ Object
# File 'lib/rouge/lexer.rb', line 196
def guess_by_filename(fname)
  guess :filename => fname
end
.guess_by_mimetype(mt) ⇒ Object
# File 'lib/rouge/lexer.rb', line 192
def guess_by_mimetype(mt)
  guess :mimetype => mt
end
.guess_by_source(source) ⇒ Object
# File 'lib/rouge/lexer.rb', line 200
def guess_by_source(source)
  guess :source => source
end
.guesses(info = {}) ⇒ Object
Guess which lexer to use based on a hash of info.
This accepts the same arguments as Lexer.guess, but never raises an error. It returns a (possibly empty) list of potential lexers to use.
# File 'lib/rouge/lexer.rb', line 147
def guesses(info={})
  mimetype, filename, source = info.values_at(:mimetype, :filename, :source)
  custom_globs = info[:custom_globs]

  guessers = (info[:guessers] || []).dup

  guessers << Guessers::Mimetype.new(mimetype) if mimetype
  guessers << Guessers::GlobMapping.by_pairs(custom_globs, filename) if custom_globs && filename
  guessers << Guessers::Filename.new(filename) if filename
  guessers << Guessers::Modeline.new(source) if source
  guessers << Guessers::Source.new(source) if source
  guessers << Guessers::Disambiguation.new(filename, source) if source && filename

  Guesser.guess(guessers, Lexer.all)
end
.lazy(auto: true, &block) ⇒ Object
# File 'lib/rouge/lexer.rb', line 247
def lazy(auto: true, &block)
  @skip_auto_load = true unless auto
  lazy_procs << block
end
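The interplay between lazy and eager_load! can be sketched in isolation. The class below is an illustration, not Rouge code: `LazySketch` and `DemoLexer` are hypothetical names, `lazy_procs` is assumed to be a per-class array of stored blocks (its definition is not shown on this page), and the superclass chaining and `auto:` flag are left out for brevity.

```ruby
# Hypothetical base class: stores setup blocks and runs them once on demand.
class LazySketch
  # Assumed storage for deferred blocks, one array per class.
  def self.lazy_procs
    @lazy_procs ||= []
  end

  # Register a block to run later instead of at class-definition time.
  def self.lazy(&block)
    lazy_procs << block
  end

  # Run every deferred block exactly once, in the context of the class.
  def self.eager_load!
    return if @_loaded
    @_loaded = true

    lazy_procs.each { |b| instance_eval(&b) }
    self
  end
end

class DemoLexer < LazySketch
  # Not evaluated until eager_load! is called.
  lazy { @rules = [:keyword, :string] }

  def self.rules
    @rules
  end
end
```

Deferring the blocks keeps class definition cheap; in Rouge, the constructor triggers the load unless skip_auto_load? is true.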
.lex(stream, opts = {}, &b) ⇒ Object
Lexes `stream` with the given options. The lex is delegated to a new instance.
# File 'lib/rouge/lexer.rb', line 17
def lex(stream, opts={}, &b)
  new(opts).lex(stream, &b)
end
.lookup_fancy(str, code = nil, default_options = {}) ⇒ Object
Same as ::find_fancy, except instead of returning an instantiated lexer, returns a pair of [lexer_class, options], so that you can modify or provide additional options to the lexer.
Please note: the lexer class might be nil!
# File 'lib/rouge/lexer.rb', line 41
def lookup_fancy(str, code=nil, default_options={})
  if str && !str.include?('?') && str != 'guess'
    lexer_class = find(str)
    return [lexer_class, default_options]
  end

  name, opts = str ? str.split('?', 2) : [nil, '']

  # parse the options hash from a cgi-style string
  cgi_opts = Hash.new { |hash, key| hash[key] = [] }
  URI.decode_www_form(opts || '').each do |k, val|
    cgi_opts[k] << val
  end

  cgi_opts.transform_values! do |vals|
    case vals.size
    when 0 then true
    when 1 then vals[0]
    else vals
    end
  end

  opts = default_options.merge(cgi_opts)

  lexer_class = case name
  when 'guess', nil
    self.guess(:source => code, :mimetype => opts['mimetype'])
  when String
    self.find(name)
  end

  [lexer_class, opts]
end
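The CGI-style option parsing in lookup_fancy can be tried on its own with only the standard library. This is a minimal sketch, assuming a hypothetical helper `split_fancy` that performs just the string-splitting and option-merging steps and skips the lexer lookup entirely.

```ruby
require 'uri'

# Hypothetical helper (not part of Rouge): splits "name?key=val&..." into
# a name and an options hash, following the parsing steps in lookup_fancy.
def split_fancy(str, default_options = {})
  name, query = str.split('?', 2)

  # Collect every value seen for each key.
  cgi_opts = Hash.new { |hash, key| hash[key] = [] }
  URI.decode_www_form(query || '').each { |k, val| cgi_opts[k] << val }

  # A single value is unwrapped; repeated keys stay as arrays.
  cgi_opts.transform_values! { |vals| vals.size == 1 ? vals[0] : vals }

  [name, default_options.merge(cgi_opts)]
end

name, opts = split_fancy('erb?parent=tex')
```

Options parsed from the string take precedence over the defaults, because they are merged on top of `default_options`.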
.mimetypes(*mts) ⇒ Object
Specify a list of mimetypes associated with this lexer.
# File 'lib/rouge/lexer.rb', line 313
def mimetypes(*mts)
  (@mimetypes ||= []).concat(mts)
end
.option(name, desc) ⇒ Object
# File 'lib/rouge/lexer.rb', line 118
def option(name, desc)
  option_docs[name.to_s] = desc
end
.option_docs ⇒ Object
# File 'lib/rouge/lexer.rb', line 114
def option_docs
  @option_docs ||= InheritableHash.new(superclass.option_docs)
end
.skip_auto_load? ⇒ Boolean
# File 'lib/rouge/lexer.rb', line 252
def skip_auto_load?
  return true if @skip_auto_load
  return superclass.skip_auto_load? unless superclass == Lexer
  false
end
.tag(t = nil) ⇒ Object
Used to specify or get the canonical name of this lexer class.
# File 'lib/rouge/lexer.rb', line 268
def tag(t=nil)
  return @tag if t.nil?

  @tag = t.to_s
  Lexer.register(@tag, self)
end
.title(t = nil) ⇒ Object
Specify or get this lexer’s title. Meant to be human-readable.
# File 'lib/rouge/lexer.rb', line 98
def title(t=nil)
  if t.nil?
    t = tag.capitalize
  end
  @title ||= t
end
Instance Method Details
#as_bool(val) ⇒ Object
# File 'lib/rouge/lexer.rb', line 366
def as_bool(val)
  case val
  when nil, false, 0, '0', 'false', 'off'
    false
  when Array
    val.empty? ? true : as_bool(val.last)
  else
    true
  end
end
#as_lexer(val) ⇒ Object
# File 'lib/rouge/lexer.rb', line 394
def as_lexer(val)
  return as_lexer(val.last) if val.is_a?(Array)
  return val.new(@options) if val.is_a?(Class) && val < Lexer

  case val
  when Lexer
    val
  when String
    lexer_class = Lexer.find(val)
    lexer_class && lexer_class.new(@options)
  end
end
#as_list(val) ⇒ Object
# File 'lib/rouge/lexer.rb', line 383
def as_list(val)
  case val
  when Array
    val.flat_map { |v| as_list(v) }
  when String
    val.split(',')
  else
    []
  end
end
#as_string(val) ⇒ Object
# File 'lib/rouge/lexer.rb', line 377
def as_string(val)
  return as_string(val.last) if val.is_a?(Array)

  val ? val.to_s : nil
end
#as_token(val) ⇒ Object
# File 'lib/rouge/lexer.rb', line 407
def as_token(val)
  return as_token(val.last) if val.is_a?(Array)
  case val
  when Token
    val
  else
    Token[val]
  end
end
#bool_option(name, &default) ⇒ Object
# File 'lib/rouge/lexer.rb', line 417
def bool_option(name, &default)
  name_str = name.to_s

  if @options.key?(name_str)
    as_bool(@options[name_str])
  else
    default ? yield : false
  end
end
#continue_lex(string) {|last_token, last_val| ... } ⇒ Object
Continue the lex from the current state without resetting.
# File 'lib/rouge/lexer.rb', line 502
def continue_lex(string, &b)
  return enum_for(:continue_lex, string, &b) unless block_given?

  # consolidate consecutive tokens of the same type
  last_token = nil
  last_val = nil
  stream_tokens(string) do |tok, val|
    next if val.empty?

    if tok == last_token
      last_val << val
      next
    end

    yield(last_token, last_val) if last_token
    last_token = tok
    last_val = val
  end

  yield(last_token, last_val) if last_token
end
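The consolidation step in continue_lex, which merges adjacent chunks that share a token and drops empty chunks, can be seen on plain data. This is a standalone sketch: `consolidate` is a hypothetical helper that takes an array of pairs in place of stream_tokens, and symbols stand in for Rouge tokens.

```ruby
# Hypothetical helper (not part of Rouge): merges consecutive
# [token, chunk] pairs that share the same token and skips empty chunks.
def consolidate(pairs)
  return enum_for(:consolidate, pairs) unless block_given?

  last_token = nil
  last_val = nil
  pairs.each do |tok, val|
    next if val.empty?

    if tok == last_token
      last_val << val   # same token as before: extend the pending chunk
      next
    end

    # Token changed: emit the pending pair and start a new one.
    yield(last_token, last_val) if last_token
    last_token = tok
    last_val = val.dup  # copy so the caller's strings are not mutated
  end

  # Emit whatever is still pending at the end of the input.
  yield(last_token, last_val) if last_token
end

raw = [[:text, 'foo'], [:text, ' '], [:keyword, 'if'], [:text, ''], [:keyword, 'else']]
merged = consolidate(raw).to_a
```

Unlike the original, the sketch copies each chunk before appending to it, which is a deliberate difference to keep the example input intact.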
#eager_load! ⇒ Object
# File 'lib/rouge/lexer.rb', line 354
def eager_load!
  self.class.eager_load!
end
#hash_option(name, defaults, &val_cast) ⇒ Object
# File 'lib/rouge/lexer.rb', line 443
def hash_option(name, defaults, &val_cast)
  name = name.to_s
  out = defaults.dup

  base = @options.delete(name.to_s)
  base = {} unless base.is_a?(Hash)
  base.each { |k, v| out[k.to_s] = val_cast ? yield(v) : v }

  @options.keys.each do |key|
    next unless key =~ /(\w+)\[(\w+)\]/ and $1 == name
    value = @options.delete(key)

    out[$2] = val_cast ? yield(value) : value
  end

  out
end
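hash_option gathers values from two places: a nested hash stored under the option name, and flat keys of the form `name[key]`. The sketch below shows that behavior on a plain hash; `extract_hash_option` is a hypothetical helper that takes the options hash as an argument instead of reading `@options`, and it omits the value-casting block.

```ruby
# Hypothetical helper (not part of Rouge): collects `name` and `name[key]`
# entries from `options` into one hash, removing them from `options`.
def extract_hash_option(options, name, defaults)
  out = defaults.dup

  # A nested hash stored directly under the option name.
  base = options.delete(name)
  base = {} unless base.is_a?(Hash)
  base.each { |k, v| out[k.to_s] = v }

  # Flat keys such as "vars[a]" contribute the entry "a".
  options.keys.each do |key|
    next unless key =~ /(\w+)\[(\w+)\]/ && $1 == name
    out[$2] = options.delete(key)
  end

  out
end

opts = { 'vars[a]' => '1', 'vars' => { 'b' => '2' }, 'other' => 'x' }
vars = extract_hash_option(opts, 'vars', { 'c' => '3' })
```

Because matching entries are deleted as they are read, unrelated options such as `'other'` are the only ones left behind in the hash.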
#lex(string, opts = nil, &b) ⇒ Object
The use of :continue => true has been deprecated. A warning is issued if run with `$VERBOSE` set to true.
The use of arbitrary `opts` has never been supported, but we previously ignored them with no error. We now warn unconditionally.
Given a string, yield [token, chunk] pairs. If no block is given, an enumerator is returned.
# File 'lib/rouge/lexer.rb', line 479
def lex(string, opts=nil, &b)
  if opts
    if (opts.keys - [:continue]).size > 0
      # improper use of options hash
      warn('Improper use of Lexer#lex - this method does not receive options.' +
           ' This will become an error in a future version.')
    end

    if opts[:continue]
      warn '`lex :continue => true` is deprecated, please use #continue_lex instead'
      return continue_lex(string, &b)
    end
  end

  return enum_for(:lex, string) unless block_given?

  Lexer.assert_utf8!(string)
  reset!

  continue_lex(string, &b)
end
#lexer_option(name, &default) ⇒ Object
# File 'lib/rouge/lexer.rb', line 431
def lexer_option(name, &default)
  as_lexer(@options.delete(name.to_s, &default))
end
#list_option(name, &default) ⇒ Object
# File 'lib/rouge/lexer.rb', line 435
def list_option(name, &default)
  as_list(@options.delete(name.to_s, &default))
end
#reset! ⇒ Object
Called after each lex is finished. The default implementation is a noop.
# File 'lib/rouge/lexer.rb', line 465
def reset!
end
#stream_tokens(stream, &b) ⇒ Object
Yield `[token, chunk]` pairs, given a prepared input stream. This must be implemented.
# File 'lib/rouge/lexer.rb', line 536
def stream_tokens(stream, &b)
  raise 'abstract'
end
#string_option(name, &default) ⇒ Object
# File 'lib/rouge/lexer.rb', line 427
def string_option(name, &default)
  as_string(@options.delete(name.to_s, &default))
end
#tag ⇒ Object
Delegated to Lexer.tag.
# File 'lib/rouge/lexer.rb', line 525
def tag
  self.class.tag
end
#token_option(name, &default) ⇒ Object
# File 'lib/rouge/lexer.rb', line 439
def token_option(name, &default)
  as_token(@options.delete(name.to_s, &default))
end
#with(opts = {}) ⇒ Object
Returns a new lexer with the given options set. Useful for e.g. setting debug flags post hoc, or providing global overrides for certain options.
# File 'lib/rouge/lexer.rb', line 360
def with(opts={})
  new_options = @options.dup
  opts.each { |k, v| new_options[k.to_s] = v }
  self.class.new(new_options)
end