Class: RubyWorker::Parser
- Inherits: Object
- Hierarchy: Object, RubyWorker::Parser
- Defined in: lib/ruby_worker/parser.rb
Overview
Parser walks each requested Ruby file via the whitequark/parser gem and produces the ParsedFile messages the Go-side structural detectors consume.
Mirrors `workers/ts/src/parser.ts` and `workers/php/src/Parser.php` structurally: same six collectors, same hashing scheme, same block-extraction shape. Cross-language symmetry isn't accidental; it's how the canonical-hash detector clusters Ruby / TS / Go / PHP functions that share a structural shape.
## Hashing scheme
Two hashes per function:
* `hash` (language-specific): captures Ruby-flavored AST node types, including operator method names on `:send` nodes. Two Ruby methods with the same hash share AST shape modulo identifier names and literal values.
* `canonical_hash` (cross-language): same scheme using the universal token vocabulary defined in `core/pkg/structural/lang/canonical.go`.
Both hashes use SHA-1 truncated to 16 hex chars. Trivial bodies (≤2 nodes) short-circuit to `""`.
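The scheme above can be sketched with the standard library alone. This is a minimal illustration, not the worker's implementation: the array-based node form `[type, *children]`, the `OPERATORS` list, and the names `shape_tokens` and `structural_hash` are assumptions made for the example. Only SHA-1, the 16-character truncation, and the short-circuit for trivial bodies come from the description.

```ruby
require "digest"

HASH_HEX_LEN = 16        # value from the constant summary
TRIVIAL_NODE_LIMIT = 2   # hypothetical name for the "≤2 nodes" rule

# Hypothetical operator list; the real set is not shown on this page.
OPERATORS = %i[+ - * / % == != < > <= >=].freeze

# Collects node types depth-first. Identifier names and literal values are
# leaves (non-arrays) and are skipped, so only the structural shape remains.
# Assumed node form: [type, *children]; a :send node is
# [:send, receiver, method_name, *args].
def shape_tokens(node, out = [])
  return out unless node.is_a?(Array)

  type, *children = node
  operator = type == :send && OPERATORS.include?(children[1])
  out << (operator ? "#{type}:#{children[1]}" : type.to_s)
  children.each { |child| shape_tokens(child, out) }
  out
end

# SHA-1 over the token stream, truncated; trivial bodies hash to "".
def structural_hash(node)
  tokens = shape_tokens(node)
  return "" if tokens.length <= TRIVIAL_NODE_LIMIT

  Digest::SHA1.hexdigest(tokens.join(" "))[0, HASH_HEX_LEN]
end

add_one = [:def, :add_one, [:args, [:arg, :x]], [:send, [:lvar, :x], :+, [:int, 1]]]
bump    = [:def, :bump,    [:args, [:arg, :n]], [:send, [:lvar, :n], :+, [:int, 41]]]
puts structural_hash(add_one) == structural_hash(bump)
```

The two sample methods differ only in identifier names and the integer literal, so they produce the same hash; swapping `:+` for `:-` changes the token stream and therefore the hash.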
## Concerns
Per-file concerns are categorized into ConcernEvidenceRef entries tagged with one of the eight canonical categories. The classifier looks at:
* `:send` nodes whose method is a known state / network / io / config / dataaccess identifier
* `[]` accesses on session / cookies / ENV
* Rails.cache, Rails.application.config
* High-complexity methods → business
The taxonomy aligns with the Rails framework profile in `core/pkg/structural/framework/rails.go`.
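A rough sketch of the `:send` lookup and the complexity rule might look like the following. The identifier table, the array-based node form, the method names, and the `>=` comparison against `BUSINESS_COMPLEXITY` are all assumptions for illustration; the real identifier lists and the exact threshold semantics are not shown on this page.

```ruby
BUSINESS_COMPLEXITY = 8 # value from the constant summary

# Hypothetical identifier table mapping method names to categories.
CONCERN_BY_METHOD = {
  find_by: "dataaccess",
  http_get: "network",
  read_file: "io"
}.freeze

# Walks an assumed [type, *children] tree and records the category of every
# :send node whose method name (children[1]) appears in the table.
def classify_sends(node, found = [])
  return found unless node.is_a?(Array)

  type, *children = node
  category = CONCERN_BY_METHOD[children[1]] if type == :send
  found << category if category
  children.each { |child| classify_sends(child, found) }
  found
end

# Adds "business" when the method's complexity reaches the threshold
# (assumed to be an inclusive comparison).
def classify_method(body, complexity)
  categories = classify_sends(body)
  categories << "business" if complexity >= BUSINESS_COMPLEXITY
  categories.uniq
end

body = [:begin,
        [:send, [:const, nil, :User], :find_by, [:int, 1]],
        [:send, nil, :puts, [:str, "done"]]]
p classify_method(body, 9)
```

A table lookup keeps the classifier data-driven, which makes it easier to keep the categories aligned with a separate framework profile.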
## Error tolerance
Syntactically broken Ruby still yields a partial ParsedFile with whatever the parser salvaged. The `parse_error` field carries the SyntaxError's message verbatim. The parser gem itself has best-in-class error recovery (it's what RuboCop relies on for partial parses).
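The contract described above (always return a record, never raise, carry the error message verbatim) can be sketched without the parser gem. Everything here is a stand-in: the `ParsedFile` struct and its `path` and `functions` fields, the name `parse_one_sketch`, the regex-based collection of method names, and the use of CRuby's `RubyVM::InstructionSequence.compile` as the syntax check are assumptions for illustration only. Only the `parse_error` field name comes from the description.

```ruby
# Hypothetical record; the real ParsedFile is a proto message.
ParsedFile = Struct.new(:path, :functions, :parse_error, keyword_init: true)

def parse_one_sketch(path, source)
  # Stand-in for "whatever the parser salvaged": collect method names with a
  # line-based scan that still works when the file as a whole is broken.
  names = source.scan(/^[ \t]*def\s+(\w+)/).flatten
  result = ParsedFile.new(path: path, functions: names, parse_error: "")

  begin
    # CRuby raises SyntaxError here when the source does not parse.
    RubyVM::InstructionSequence.compile(source)
  rescue SyntaxError => e
    result.parse_error = e.message # carried verbatim
  end
  result
end

broken = parse_one_sketch("b.rb", "def hi\n  1\nend\ndef broken(\n")
puts broken.functions.inspect
puts broken.parse_error.empty?
```

Callers can then treat a non-empty `parse_error` as a signal that the accompanying data is partial, rather than handling an exception per file.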
Constant Summary
- BUSINESS_COMPLEXITY = 8
- MIN_BLOCK_STMTS = 3
- HASH_HEX_LEN = 16
Instance Method Summary
- #initialize ⇒ Parser (constructor): A new instance of Parser.
- #parse_files(repo_path, rel_paths) ⇒ Array<Object>: ParsedFile proto messages.
Constructor Details
#initialize ⇒ Parser
Returns a new instance of Parser.
    # File 'lib/ruby_worker/parser.rb', line 65

    def initialize
      @ruby_parser = ::Parser::CurrentRuby.new
      # Silence parser-gem warnings on stderr; they pollute
      # the gRPC server logs.
      @ruby_parser.diagnostics.all_errors_are_fatal = false
      @ruby_parser.diagnostics.ignore_warnings = true
    end
Instance Method Details
#parse_files(repo_path, rel_paths) ⇒ Array<Object>
Returns ParsedFile proto messages.
    # File 'lib/ruby_worker/parser.rb', line 76

    def parse_files(repo_path, rel_paths)
      rel_paths.map { |rel| parse_one(repo_path, rel) }
    end