Class: Fast::Regexp

Inherits:
Object
Defined in:
lib/fast_regexp.rb,
lib/fast_regexp/version.rb

Overview

Façade over rust/regex with a transparent fallback to stdlib `::Regexp`.

`Fast::Regexp.new(pattern)` first tries to compile with rust/regex (fast, byte-based). If the pattern uses features rust/regex does not support (lookaround, backreferences, possessive quantifiers, etc.), we fall back to `::Regexp` so consumers don’t have to juggle two libraries.

Defined Under Namespace

Classes: MatchData

Constant Summary

NATIVE_EXTENSIONS =
%w[.bundle .so .rb].freeze
RUBY_FLAG_MAP =
{
  ::Regexp::IGNORECASE => "i",
  ::Regexp::EXTENDED => "x",
  ::Regexp::MULTILINE => "s" # Ruby's /m = dotall; in rust/regex that is (?s).
}.freeze
BACKENDS =
%i[auto fast stdlib].freeze
VERSION =
"0.6.1"

Constructor Details

#initialize(pattern, original: pattern, backend: :auto, **opts) ⇒ Regexp

Internal: use `Fast::Regexp.new` instead. `original` is the unmodified input (String or `::Regexp`) so we can build an accurate stdlib fallback. `backend:` forces a specific engine: `:auto` (default) tries rust/regex and falls back to stdlib, `:fast` raises if rust/regex rejects the pattern, and `:stdlib` skips rust/regex entirely.

Raises:

  • (ArgumentError) if `backend` is not one of `BACKENDS`


# File 'lib/fast_regexp.rb', line 76

def initialize(pattern, original: pattern, backend: :auto, **opts)
  raise ArgumentError, "backend must be one of #{BACKENDS.inspect}" unless BACKENDS.include?(backend)
  @pattern = pattern
  @backend = compile_backend(pattern, original, backend, opts)
end
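
The three `backend:` modes described above can be illustrated without the native extension. This is only a sketch under stated assumptions: `compile_backend` itself is not shown on this page, and `FakeFastCompileError`, `fake_fast_compile`, `UNSUPPORTED_BY_FAKE`, `SKETCH_BACKENDS`, and `pick_backend` are hypothetical names invented for illustration. The stand-in compiler uses a crude pattern check to imitate rust/regex rejecting lookaround, backreferences, and possessive quantifiers.

```ruby
# Hypothetical error raised by the stand-in "fast" compiler.
class FakeFastCompileError < StandardError; end

# Crude detector for lookaround, numeric backreferences, and possessive
# quantifiers. A real engine parses the pattern; this is only a toy.
UNSUPPORTED_BY_FAKE = /\(\?<?[=!]|\\[1-9]|[*+?]\+/

SKETCH_BACKENDS = %i[auto fast stdlib].freeze

# Stand-in for compiling with rust/regex (assumption, not the gem's API).
def fake_fast_compile(source)
  raise FakeFastCompileError, "unsupported syntax in #{source.inspect}" if source.match?(UNSUPPORTED_BY_FAKE)
  [:fast, source]
end

# Returns [engine, compiled] following the :auto / :fast / :stdlib rules.
def pick_backend(source, backend: :auto)
  raise ArgumentError, "backend must be one of #{SKETCH_BACKENDS.inspect}" unless SKETCH_BACKENDS.include?(backend)
  return [:stdlib, ::Regexp.new(source)] if backend == :stdlib

  fake_fast_compile(source)
rescue FakeFastCompileError
  raise if backend == :fast         # :fast never falls back
  [:stdlib, ::Regexp.new(source)]   # :auto falls back to stdlib
end

pick_backend('\w+').first       # => :fast
pick_backend('(?<=a)b').first   # => :stdlib
```

With `backend: :fast`, the second call would raise instead of falling back, which is useful when a silent fallback would hide a performance regression.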

Instance Attribute Details

#backend ⇒ Object (readonly)

Returns the value of attribute backend.



# File 'lib/fast_regexp.rb', line 67

def backend
  @backend
end

#pattern ⇒ Object (readonly) Also known as: to_s

Returns the value of attribute pattern.



# File 'lib/fast_regexp.rb', line 67

def pattern
  @pattern
end

Class Method Details

.create_many(**patterns) ⇒ Object

Bulk-compile a symbol-keyed hash of patterns. Handy for defining a set of regex constants in one shot:

RE = Fast::Regexp.create_many(word: '\w+', num: '\d+').freeze
RE[:word].match("hello")


# File 'lib/fast_regexp.rb', line 53

def create_many(**patterns)
  patterns.transform_values { |pat| new(pat) }
end

.locate_native(base, ruby_version: RUBY_VERSION) ⇒ Object

Precompiled native gems ship per-ABI subdirectories (`fast_regexp/4.0/…`), while the source-gem `rake compile` build lands flat (`fast_regexp/…`). Picks whichever exists for the current Ruby ABI, with the per-ABI path winning when both are present.



# File 'lib/fast_regexp.rb', line 19

def self.locate_native(base, ruby_version: RUBY_VERSION)
  abi = ruby_version[/\d+\.\d+/]
  candidates = [File.join(base, abi, "fast_regexp"), File.join(base, "fast_regexp")]
  candidates.find { |stem| NATIVE_EXTENSIONS.any? { |ext| File.exist?(stem + ext) } }
end
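
The precedence rule can be checked against a throwaway directory. The sketch below copies the lookup into a free-standing method (`locate_native_sketch`, `NATIVE_EXTS`, and `demo_locate_native` are illustration names, not part of the gem) and creates empty placeholder files in a temp dir. Note that the lookup returns the extension-less stem, not the full file name.

```ruby
require "tmpdir"
require "fileutils"

NATIVE_EXTS = %w[.bundle .so .rb].freeze # mirrors NATIVE_EXTENSIONS

# Same lookup as locate_native, as a free-standing method.
def locate_native_sketch(base, ruby_version: RUBY_VERSION)
  abi = ruby_version[/\d+\.\d+/]
  candidates = [File.join(base, abi, "fast_regexp"), File.join(base, "fast_regexp")]
  candidates.find { |stem| NATIVE_EXTS.any? { |ext| File.exist?(stem + ext) } }
end

def demo_locate_native
  Dir.mktmpdir do |base|
    # Only the flat source-gem build exists.
    FileUtils.touch(File.join(base, "fast_regexp.so"))
    flat_only = locate_native_sketch(base, ruby_version: "3.4.1")

    # Add a per-ABI build; it now takes precedence.
    FileUtils.mkdir_p(File.join(base, "3.4"))
    FileUtils.touch(File.join(base, "3.4", "fast_regexp.so"))
    both = locate_native_sketch(base, ruby_version: "3.4.1")

    # Strip the temp dir prefix so the result is stable across runs.
    [flat_only, both].map { |path| path.delete_prefix(base + "/") }
  end
end

demo_locate_native # => ["fast_regexp", "3.4/fast_regexp"]
```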

.new(pattern, **opts) ⇒ Object

Compile a pattern. Accepts a String or a `::Regexp` (flags are translated into a leading `(?…)` group so rust/regex sees them). Falls back to `::Regexp` if rust/regex rejects the pattern.



# File 'lib/fast_regexp.rb', line 43

def new(pattern, **opts)
  translated = pattern.is_a?(::Regexp) ? translate_regexp(pattern) : pattern
  allocate.tap { |re| re.send(:initialize, translated, original: pattern, **opts) }
end
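
The flag translation could look roughly like the following. `translate_regexp` is not shown on this page, so `translate_regexp_sketch` and `SKETCH_FLAG_MAP` are assumptions that only reuse the mapping from `RUBY_FLAG_MAP`; the real method may handle more cases.

```ruby
# Same mapping as RUBY_FLAG_MAP: Ruby's /m (dotall) becomes (?s).
SKETCH_FLAG_MAP = {
  ::Regexp::IGNORECASE => "i",
  ::Regexp::EXTENDED   => "x",
  ::Regexp::MULTILINE  => "s"
}.freeze

# Turn a ::Regexp into a source string with its flags as a leading
# inline group, e.g. /a.b/im -> "(?is)a.b". Other option bits
# (encoding flags) are ignored.
def translate_regexp_sketch(regexp)
  flags = SKETCH_FLAG_MAP.select { |bit, _| (regexp.options & bit) != 0 }.values.join
  flags.empty? ? regexp.source : "(?#{flags})#{regexp.source}"
end

translate_regexp_sketch(/a.b/im) # => "(?is)a.b"
translate_regexp_sketch(/\d+/)   # => "\\d+"
```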

Instance Method Details

#===(other) ⇒ Object



# File 'lib/fast_regexp.rb', line 99

def ===(other)
  return false unless other.respond_to?(:to_str)
  match?(other.to_str)
end

#=~(other) ⇒ Object

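Returns the byte offset of the first match, or `nil` when there is no match or `other` does not respond to `to_str`. rust/regex is byte-based, and the stdlib path also returns a byte offset for API consistency, so the result can differ from `::Regexp#=~`, which returns a character index.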



# File 'lib/fast_regexp.rb', line 106

def =~(other)
  return nil unless other.respond_to?(:to_str)
  m = match(other.to_str)
  m && m.byte_begin(0)
end
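
The difference between byte and character offsets only shows up with multibyte text. The sketch below uses stdlib `::Regexp` alone to compute both for the same match; `char_and_byte_offset` is a hypothetical helper, and the byte offset is derived from `pre_match.bytesize` rather than from the gem's `byte_begin`.

```ruby
# Returns [character_offset, byte_offset] of the first match, or nil.
def char_and_byte_offset(regexp, haystack)
  m = regexp.match(haystack)
  return nil unless m
  [m.begin(0), m.pre_match.bytesize]
end

# "é" and "ö" each occupy two bytes in UTF-8, so the offsets diverge.
char_and_byte_offset(/w/, "h\u00E9llo w\u00F6rld") # => [6, 7]
char_and_byte_offset(/w/, "hello world")           # => [6, 6]
```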

#captures_count ⇒ Object

Number of capturing groups in the pattern. On the fast path this comes from rust/regex directly. On the stdlib fallback we walk the source counting capturing groups while honoring escapes, character classes, and non-capturing / lookaround prefixes.



# File 'lib/fast_regexp.rb', line 164

def captures_count
  return @backend.captures_count if fast?
  count_stdlib_captures(@backend.source)
end
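
A source walk of this kind might be sketched as follows. `count_stdlib_captures` is not shown on this page, so `count_captures_sketch` is an assumption about the general approach, not the gem's implementation. It deliberately ignores nested character classes and Ruby's rule that unnamed groups stop capturing once named groups are present.

```ruby
# Count capturing groups in a regex source string.
def count_captures_sketch(source)
  count = 0
  in_class = false
  i = 0
  while i < source.length
    ch = source[i]
    if ch == "\\"
      i += 2 # skip the escaped character, e.g. \( or \]
      next
    elsif in_class
      in_class = false if ch == "]"
    elsif ch == "["
      in_class = true # parentheses inside [...] are literals
    elsif ch == "("
      rest = source[i + 1, 3].to_s
      if !rest.start_with?("?")
        count += 1 # plain capturing group
      elsif rest.match?(/\A\?(?:<[^=!]|')/)
        count += 1 # named group: (?<name>...) or (?'name'...)
      end
      # Anything else starting with (? does not capture:
      # (?:...), (?=...), (?!...), (?<=...), (?<!...), (?>...), (?i)
    end
    i += 1
  end
  count
end

count_captures_sketch('(a)(?:b)(?<n>c)') # => 2
count_captures_sketch('\(a\)[(](x)')     # => 1
count_captures_sketch('(?<=a)(b)(?!c)')  # => 1
```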

#fast? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/fast_regexp.rb', line 82

def fast? = @backend.is_a?(Native)

#gsub(haystack, replacement = nil, literal: false, &block) ⇒ Object



# File 'lib/fast_regexp.rb', line 145

def gsub(haystack, replacement = nil, literal: false, &block)
  haystack = coerce_string(haystack)
  if block
    raise ArgumentError, "wrong number of arguments (given 2, expected 1 with block)" if replacement
    fast? ? fast_gsub_with_block(haystack, &block) : stdlib_gsub_with_block(haystack, &block)
  else
    raise ArgumentError, "wrong number of arguments (given 1, expected 2)" if replacement.nil?
    replacement = coerce_string(replacement)
    if fast?
      @backend._native_gsub(haystack, replacement, literal)
    else
      haystack.gsub(@backend, stdlib_replacement(replacement, literal))
    end
  end
end

#inspect ⇒ Object



# File 'lib/fast_regexp.rb', line 173

def inspect = "#<Fast::Regexp #{@pattern.inspect}#{stdlib? ? " (stdlib)" : ""}>"

#match(haystack) ⇒ Object



# File 'lib/fast_regexp.rb', line 89

def match(haystack)
  haystack = coerce_string(haystack)
  raw = fast? ? @backend._native_match(haystack) : @backend.match(haystack)
  raw && MatchData.new(raw, haystack)
end

#match?(haystack) ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/fast_regexp.rb', line 95

def match?(haystack)
  @backend.match?(coerce_string(haystack))
end

#names ⇒ Object



# File 'lib/fast_regexp.rb', line 169

def names
  @backend.names
end

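#native ⇒ Object

Escape hatch for callers that need the underlying object directly: returns the rust/regex backend when `fast?` is true, otherwise `nil`. `#stdlib` is the counterpart for the `::Regexp` fallback.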



# File 'lib/fast_regexp.rb', line 86

def native = fast? ? @backend : nil

#scan(haystack) ⇒ Object



# File 'lib/fast_regexp.rb', line 112

def scan(haystack)
  @backend.scan(coerce_string(haystack))
end

#scan_matches(haystack) ⇒ Object



# File 'lib/fast_regexp.rb', line 116

def scan_matches(haystack)
  haystack = coerce_string(haystack)
  if fast?
    @backend.scan_matches(haystack).map { |m| MatchData.new(m, haystack) }
  else
    results = []
    haystack.scan(@backend) { results << MatchData.new(::Regexp.last_match, haystack) }
    results
  end
end
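
The stdlib branch relies on `String#scan` setting `Regexp.last_match` before each block call. A minimal stdlib-only sketch of that technique follows; `scan_match_data` is a hypothetical helper that returns plain `::MatchData` objects instead of wrapping them in `Fast::Regexp::MatchData`.

```ruby
# Collect one MatchData per non-overlapping match.
def scan_match_data(regexp, haystack)
  results = []
  # String#scan with a block updates Regexp.last_match for every hit.
  haystack.scan(regexp) { results << ::Regexp.last_match }
  results
end

hits = scan_match_data(/(\d+)-(\d+)/, "1-2 and 30-40")
hits.map { |m| [m[0], m.begin(0), m.captures] }
# => [["1-2", 0, ["1", "2"]], ["30-40", 8, ["30", "40"]]]
```

Unlike `String#scan` without a block, which returns only strings or capture arrays, this keeps positions and pre/post-match context for each hit.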

#stdlib ⇒ Object



# File 'lib/fast_regexp.rb', line 87

def stdlib = stdlib? ? @backend : nil

#stdlib? ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/fast_regexp.rb', line 83

def stdlib? = !fast?

#sub(haystack, replacement = nil, literal: false, &block) ⇒ Object



# File 'lib/fast_regexp.rb', line 127

def sub(haystack, replacement = nil, literal: false, &block)
  haystack = coerce_string(haystack)
  if block
    raise ArgumentError, "wrong number of arguments (given 2, expected 1 with block)" if replacement
    m = match(haystack)
    return haystack.dup unless m
    "#{m.pre_match}#{block.call(m)}#{m.post_match}"
  else
    raise ArgumentError, "wrong number of arguments (given 1, expected 2)" if replacement.nil?
    replacement = coerce_string(replacement)
    if fast?
      @backend._native_sub(haystack, replacement, literal)
    else
      haystack.sub(@backend, stdlib_replacement(replacement, literal))
    end
  end
end
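
The block form above rebuilds the result from `pre_match`, the block's return value, and `post_match`. The same technique can be sketched with stdlib `::Regexp` only; `sub_with_block_sketch` is a hypothetical helper, and it yields a plain `::MatchData` where the gem yields its own `MatchData` wrapper.

```ruby
# Replace the first match with the block's result. The block receives
# the match object, not the matched String as in String#sub.
def sub_with_block_sketch(regexp, haystack)
  m = regexp.match(haystack)
  return haystack.dup unless m
  "#{m.pre_match}#{yield(m)}#{m.post_match}"
end

sub_with_block_sketch(/(\w+)@(\w+)/, "mail bob@example now") { |m| "#{m[2]}:#{m[1]}" }
# => "mail example:bob now"
```

Passing the match object gives the block direct access to capture groups without relying on globals such as `$1`.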