Class: LexerKit::Core::Source
- Inherits: Object (hierarchy: Object, LexerKit::Core::Source)
- Defined in: lib/lexer_kit/core/source.rb
Overview
Source holds the input byte sequence and optional filename. It provides line/column conversion for diagnostics.
Instance Attribute Summary
- #bytes ⇒ Object (readonly)
  Returns the value of attribute bytes.
- #filename ⇒ Object (readonly)
  Returns the value of attribute filename.
Instance Method Summary
- #byte_offset_for_char_index(char_index) ⇒ Integer
  Convert character index to byte offset.
- #initialize(input, filename: nil) ⇒ Source (constructor)
  A new instance of Source.
- #inspect ⇒ Object
- #length ⇒ Integer (also: #size)
  Length in bytes.
- #line_col(byte_offset) ⇒ Array(Integer, Integer)
  Convert byte offset to line and column (1-based). Builds the line index if not already built.
- #line_count ⇒ Integer
  Get the number of lines.
- #line_index! ⇒ self
  Build the line index explicitly. Call this before using line_col or line_slice on large inputs.
- #line_slice(line) ⇒ String?
  Get the content of a specific line (1-based).
- #span(start, len) ⇒ Span
  Create a span for the given range.
- #span_for_char_index(char_index, len: 1) ⇒ Span
  Get the span for a character index.
- #span_for_line(line) ⇒ Span
  Get the span covering an entire line (1-based).
- #text(span) ⇒ String
  Extract text for a span.
Constructor Details
#initialize(input, filename: nil) ⇒ Source
Returns a new instance of Source.
    # File 'lib/lexer_kit/core/source.rb', line 12
    def initialize(input, filename: nil)
      @original_string = input.freeze
      @bytes = input.dup.force_encoding(Encoding::BINARY).freeze
      @filename = filename&.freeze
      @line_starts = nil
    end
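The two string operations in the constructor can be tried without the library. The following is a minimal sketch using only core String methods; `prepare` is a hypothetical helper introduced for illustration and is not part of LexerKit.

```ruby
# Stand-alone sketch mirroring the string handling in the constructor.
# `prepare` is a hypothetical helper, not a LexerKit method.
def prepare(input)
  original = input.freeze   # freezes the caller's object itself
  # Independent copy, reinterpreted byte by byte and then frozen.
  bytes = input.dup.force_encoding(Encoding::BINARY).freeze
  [original, bytes]
end

input = +"h\u00E9llo"       # "héllo" as an unfrozen UTF-8 string
original, bytes = prepare(input)

p original.length                      # => 5 (characters)
p bytes.bytesize                       # => 6 (bytes; é takes two in UTF-8)
p bytes.encoding == Encoding::BINARY   # => true
p input.frozen?                        # => true
```

Note the side effect visible in the last line: because the listing calls `input.freeze` rather than freezing a copy, the string passed in by the caller is frozen as well.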
Instance Attribute Details
#bytes ⇒ Object (readonly)
Returns the value of attribute bytes: the input as a frozen, BINARY-encoded copy.

    # File 'lib/lexer_kit/core/source.rb', line 8
    def bytes
      @bytes
    end
#filename ⇒ Object (readonly)
Returns the value of attribute filename: the frozen filename, or nil if none was given.

    # File 'lib/lexer_kit/core/source.rb', line 8
    def filename
      @filename
    end
Instance Method Details
#byte_offset_for_char_index(char_index) ⇒ Integer
Convert character index to byte offset. For BINARY input, returns char_index directly (O(1)). For other encodings (e.g. UTF-8), computes the byte offset by measuring the prefix before char_index, which is O(n); this is intended for error paths only.

    # File 'lib/lexer_kit/core/source.rb', line 129
    def byte_offset_for_char_index(char_index)
      if @original_string.encoding == Encoding::BINARY
        char_index
      else
        @original_string[0...char_index].bytesize
      end
    end
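To see why the two branches differ, here is a stand-alone sketch of the same conversion. It is a hypothetical free function that takes the string as an argument instead of reading @original_string, so it runs without LexerKit.

```ruby
# Hypothetical stand-alone version of the conversion shown above.
def byte_offset_for_char_index(string, char_index)
  if string.encoding == Encoding::BINARY
    char_index                        # one character is exactly one byte
  else
    string[0...char_index].bytesize   # O(n): measures the prefix in bytes
  end
end

utf8 = "h\u00E9llo"                   # "héllo": é occupies two bytes in UTF-8
binary = utf8.dup.force_encoding(Encoding::BINARY)

p byte_offset_for_char_index(utf8, 2)    # => 3
p byte_offset_for_char_index(binary, 2)  # => 2
```

In the UTF-8 case the character index 2 points at the first "l", which sits after "h" (1 byte) and "é" (2 bytes), hence byte offset 3.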
#inspect ⇒ Object
    # File 'lib/lexer_kit/core/source.rb', line 148
    def inspect
      filename_str = @filename ? " #{@filename.inspect}" : ""
      "#<LexerKit::Core::Source#{filename_str} #{length} bytes>"
    end
#length ⇒ Integer Also known as: size
Length in bytes.

    # File 'lib/lexer_kit/core/source.rb', line 21
    def length
      @bytes.bytesize
    end
#line_col(byte_offset) ⇒ Array(Integer, Integer)
Convert a byte offset to a 1-based [line, column] pair. The column is counted in bytes from the start of the line, not in characters. Builds the line index if it has not been built yet.

    # File 'lib/lexer_kit/core/source.rb', line 50
    def line_col(byte_offset)
      line_index! unless @line_starts

      # Binary search for line
      line = @line_starts.bsearch_index { |start| start > byte_offset }
      line ||= @line_starts.length

      line_start = @line_starts[line - 1]
      col = byte_offset - line_start + 1
      [line, col]
    end
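The lookup above relies on Array#bsearch_index in find-minimum mode: it returns the index of the first line start that lies beyond the offset, and that 0-based index equals the 1-based number of the line containing the offset. A minimal stand-alone sketch, assuming hypothetical free functions `line_starts_for` and `line_col` in place of the instance methods:

```ruby
# Hypothetical helpers mirroring line_index! and line_col without the class.
def line_starts_for(bytes)
  starts = [0]
  bytes.each_byte.with_index { |byte, pos| starts << (pos + 1) if byte == 0x0A }
  starts
end

def line_col(line_starts, byte_offset)
  # Index of the first line start past the offset; nil means "last line".
  line = line_starts.bsearch_index { |start| start > byte_offset } || line_starts.length
  [line, byte_offset - line_starts[line - 1] + 1]
end

starts = line_starts_for("ab\ncd\n")
p starts                # => [0, 3, 6]
p line_col(starts, 0)   # => [1, 1]
p line_col(starts, 2)   # => [1, 3]  (the LF itself belongs to line 1)
p line_col(starts, 4)   # => [2, 2]
p line_col(starts, 6)   # => [3, 1]  (end of input, on the empty last line)
```

Each lookup costs O(log n) in the number of lines once the index exists.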
#line_count ⇒ Integer
Get the number of lines.

    # File 'lib/lexer_kit/core/source.rb', line 84
    def line_count
      line_index! unless @line_starts
      @line_starts.length
    end
#line_index! ⇒ self
Build the line index. The index is not built at construction; line_col, line_count, line_slice and span_for_line build it lazily on first use. Call this explicitly before using them on large inputs so the O(n) scan happens at a predictable point. Only LF (0x0A) counts as a line break, and repeated calls are no-ops.

    # File 'lib/lexer_kit/core/source.rb', line 30
    def line_index!
      return self if @line_starts

      @line_starts = [0]
      pos = 0
      while pos < @bytes.bytesize
        byte = @bytes.getbyte(pos)
        if byte == 0x0A # LF
          @line_starts << (pos + 1)
        end
        pos += 1
      end
      @line_starts.freeze
      self
    end
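The resulting index is simply the byte offset at which each line begins. The sketch below reproduces the scan as a hypothetical free function, `build_line_starts`, to show what the index looks like for a few edge cases; it is an illustration, not the library's API.

```ruby
# Hypothetical stand-alone version of the scan: one entry per line,
# holding the byte offset of that line's first byte.
def build_line_starts(bytes)
  starts = [0]
  pos = 0
  while pos < bytes.bytesize
    starts << (pos + 1) if bytes.getbyte(pos) == 0x0A  # LF only
    pos += 1
  end
  starts.freeze
end

p build_line_starts("a\r\nb\nc")  # => [0, 3, 5]  (CR is ordinary content)
p build_line_starts("a\n")        # => [0, 2]     (trailing LF opens an empty last line)
p build_line_starts("")           # => [0]        (empty input still has one line)
```

Since line_count is the length of this array, an input ending in LF reports one more line than it has newline-terminated lines.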
#line_slice(line) ⇒ String?
Get the content of a specific line (1-based), without its trailing line terminator. Returns nil if the line number is out of range. The result is sliced from the byte buffer, so it is BINARY-encoded.

    # File 'lib/lexer_kit/core/source.rb', line 65
    def line_slice(line)
      line_index! unless @line_starts
      return nil if line < 1 || line > @line_starts.length

      start = @line_starts[line - 1]
      if line < @line_starts.length
        # Not the last line
        end_pos = @line_starts[line]
        content = @bytes.byteslice(start, end_pos - start)
        # Remove trailing newline
        content.chomp
      else
        # Last line
        @bytes.byteslice(start, @bytes.bytesize - start)
      end
    end
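A short stand-alone sketch of the slicing logic follows. It assumes a hypothetical free function that receives the byte buffer and a precomputed line-start array, so it can run without the class.

```ruby
# Hypothetical stand-alone version of line_slice.
def line_slice(bytes, line_starts, line)
  return nil if line < 1 || line > line_starts.length

  start = line_starts[line - 1]
  if line < line_starts.length
    # Slice up to the next line start, then drop the line terminator.
    bytes.byteslice(start, line_starts[line] - start).chomp
  else
    bytes.byteslice(start, bytes.bytesize - start)   # last line: rest of input
  end
end

bytes  = "a\r\nb\nc".b    # String#b returns a BINARY-encoded copy
starts = [0, 3, 5]        # line starts for this input

p line_slice(bytes, starts, 1)  # => "a"  (String#chomp also strips "\r\n")
p line_slice(bytes, starts, 2)  # => "b"
p line_slice(bytes, starts, 3)  # => "c"
p line_slice(bytes, starts, 4)  # => nil
```

Because String#chomp with no argument removes "\n", "\r\n" or "\r", CRLF-terminated lines come back without the carriage return.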
#span(start, len) ⇒ Span
Create a span for the given range.

    # File 'lib/lexer_kit/core/source.rb', line 93
    def span(start, len)
      Span.new(start, len)
    end
#span_for_char_index(char_index, len: 1) ⇒ Span
Get the byte span covering len characters starting at a character index. For BINARY input, O(1). For other encodings, O(n); intended for error paths only.

    # File 'lib/lexer_kit/core/source.rb', line 142
    def span_for_char_index(char_index, len: 1)
      byte_start = byte_offset_for_char_index(char_index)
      byte_end = byte_offset_for_char_index(char_index + len)
      Span.new(byte_start, byte_end - byte_start)
    end
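The conversion from a character range to a byte span can be sketched as follows. `SpanStub` is a hypothetical stand-in for the library's Span (only its start and length fields are assumed), and the function takes the string as an argument so the example is self-contained.

```ruby
# Hypothetical stand-in for Span: just a start offset and a length in bytes.
SpanStub = Struct.new(:start, :len)

# Hypothetical stand-alone version for non-BINARY input.
def span_for_char_index(string, char_index, len: 1)
  byte_start = string[0...char_index].bytesize
  byte_end   = string[0...(char_index + len)].bytesize
  SpanStub.new(byte_start, byte_end - byte_start)
end

text = "h\u00E9llo"                          # "héllo"
span = span_for_char_index(text, 1)          # the single character é
p [span.start, span.len]                     # => [1, 2]
p text.byteslice(span.start, span.len) == "\u00E9"  # => true
```

A one-character span can therefore be longer than one byte, which is why the length is derived from two offset lookups rather than copied from len.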
#span_for_line(line) ⇒ Span
Get the span covering an entire line (1-based), excluding its trailing LF. Line numbers below 1 are treated as 1; a line number past the last line yields an empty span at offset 0.

    # File 'lib/lexer_kit/core/source.rb', line 107
    def span_for_line(line)
      line_index! unless @line_starts
      line = [line, 1].max
      return Span.new(0, 0) if line > @line_starts.length

      start = @line_starts[line - 1]
      line_end = if line < @line_starts.length
        # Not the last line - span up to (but not including) newline
        @line_starts[line] - 1
      else
        # Last line
        @bytes.bytesize
      end

      Span.new(start, line_end - start)
    end
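The boundary handling is easiest to see on a small input. This is a stand-alone sketch under the same assumptions as the listing; `SpanStub` is a hypothetical stand-in for Span, and the function receives the buffer and a precomputed line-start array.

```ruby
# Hypothetical stand-in for Span: start offset and length in bytes.
SpanStub = Struct.new(:start, :len)

# Hypothetical stand-alone version of span_for_line.
def span_for_line(bytes, line_starts, line)
  line = [line, 1].max                          # clamp 0 and negatives to line 1
  return SpanStub.new(0, 0) if line > line_starts.length

  start = line_starts[line - 1]
  line_end = line < line_starts.length ? line_starts[line] - 1 : bytes.bytesize
  SpanStub.new(start, line_end - start)
end

bytes  = "a\r\nb\nc".b
starts = [0, 3, 5]

p span_for_line(bytes, starts, 1).to_a   # => [0, 2]
p span_for_line(bytes, starts, 3).to_a   # => [5, 1]
p span_for_line(bytes, starts, 0).to_a   # => [0, 2]  (clamped to line 1)
p span_for_line(bytes, starts, 9).to_a   # => [0, 0]  (past the end)
p bytes.byteslice(0, 2)                  # => "a\r"
```

The last line shows a difference from line_slice: only the LF is excluded from the span, so on CRLF input the span still covers the carriage return.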
#text(span) ⇒ String
Extract text for a span.

    # File 'lib/lexer_kit/core/source.rb', line 100
    def text(span)
      span.slice(@bytes)
    end