Class: Rubino::Compression::PythonCodeSkeleton

Inherits:
LineSkeleton
  • Object
Defined in:
lib/rubino/compression/python_code_skeleton.rb

Overview

The Python strategy for LineSkeleton: an exact counterpart of RubyCodeSkeleton in which only the parser differs. Every import, comment, class structure and method SIGNATURE is kept VERBATIM; only LARGE function/method BODIES are elided behind a pointer (see LineSkeleton for the pointer format and the drill-in invariant).

Ruby has Prism built in; Python’s parser lives in the `python3` stdlib, so we SHELL OUT to a tiny embedded `ast` driver. This is an internal, read-only parse (`ast.parse` never executes the target source), and it deliberately does NOT route through the agent’s Shell tool/approval.

NO-OP FALLBACK (the user’s hard rule): if python3 is absent, errors, times out, or emits anything we can’t read as JSON, #collect_elisions returns nil and the caller sends the ORIGINAL output unchanged. There is no regex/indentation approximation anywhere: when `ast` can’t run, we do not guess.
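The fallback rule above can be sketched roughly as follows. This is a minimal illustration, not the actual body of #collect_elisions: the helper name `collect_ranges_or_nil` and its `command:` and `timeout:` parameters are hypothetical, introduced only so the failure paths are visible in one place. It assumes the driver is invoked as an argv array with the keep threshold appended and the source piped on STDIN.

```ruby
require "json"
require "open3"
require "timeout"

# Hypothetical helper (not the library's API): run a parser command, feed it
# the source on STDIN, and return an Array of [first_line, line_count] pairs.
# Every failure mode collapses to nil so the caller can send the original text.
def collect_ranges_or_nil(source, keep, command:, timeout: 5)
  Open3.popen2(*command, keep.to_s) do |stdin, stdout, wait_thr|
    Timeout.timeout(timeout) do
      stdin.write(source)
      stdin.close
      raw = stdout.read
      next nil unless wait_thr.value.success? # non-zero exit => no-op

      parsed = JSON.parse(raw) # "null" parses to nil; garbage raises
      well_formed = parsed.is_a?(Array) &&
                    parsed.all? { |r| r.is_a?(Array) && r.size == 2 && r.all?(Integer) }
      well_formed ? parsed : nil
    end
  rescue Timeout::Error
    # Kill the child so popen2's cleanup does not wait on a stuck process.
    Process.kill("KILL", wait_thr.pid)
    nil
  end
rescue StandardError
  # Missing binary (Errno::ENOENT), broken pipe, unreadable JSON, etc.
  nil
end

# Possible call site, assuming DRIVER holds the embedded script:
#   collect_ranges_or_nil(source, 10, command: ["python3", "-c", DRIVER])
```

Passing the command as an argv array avoids shell interpolation, and treating `null`, malformed JSON, a non-zero exit, a timeout and a missing binary identically keeps the "no guess" rule to a single code path.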

Constant Summary

PARSE_TIMEOUT_SECONDS =

Hard ceiling on the parse subprocess; a pathological/huge file must never stall a read. A timeout is just another no-op trigger.

5
DRIVER =

The embedded driver. Reads the target source from STDIN (never as an argv path, never executed) and prints, to STDOUT, a JSON array of [first_line, line_count] body ranges to elide, or `null` on ANY parse failure. The keep threshold is read from argv as an integer.

Parity with the Ruby skeletoner:

- only ast.FunctionDef / ast.AsyncFunctionDef BODIES are elided;
- ClassDef is structure → recurse into it so method signatures stay;
- decorators sit ABOVE node.lineno (the `def` line) so they're kept;
- require the body to start strictly below the `def` line (skip
  one-liners that can't round-trip), mirroring Ruby's distinct-lines
  rule;
- an elided body is pruned (we don't recurse into it) so a nested def
  inside it is never double-counted → ranges stay non-overlapping;
- a kept (small) body IS recursed into, to find nested big defs.
<<~PY
  import ast, sys, json

  try:
      keep = int(sys.argv[1])
  except (IndexError, ValueError):
      print("null")
      sys.exit(0)

  try:
      tree = ast.parse(sys.stdin.read())
  except Exception:
      print("null")
      sys.exit(0)

  out = []

  def visit(node):
      for child in ast.iter_child_nodes(node):
          if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)) and child.body:
              body_first = child.body[0].lineno
              body_last = child.end_lineno
              if body_first > child.lineno:
                  line_count = body_last - body_first + 1
                  if line_count > keep:
                      out.append([body_first, line_count])
                      continue  # prune: don't recurse into an elided body
          # not elided (other node, or a small/one-line def) → recurse
          visit(child)

  visit(tree)
  print(json.dumps(out))
PY
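To show how the `[first_line, line_count]` ranges emitted by the driver might be consumed, here is a small sketch that replaces each range with a single placeholder line. The method name `apply_elisions` and the `PLACEHOLDER` text are hypothetical; the real pointer format is defined by LineSkeleton and is not reproduced here. The sketch assumes 1-based line numbers and non-overlapping ranges, as the driver's pruning rule guarantees.

```ruby
# Hypothetical placeholder text; LineSkeleton defines the real pointer format.
PLACEHOLDER = "    ... (%d lines elided)"

# Copy `lines` verbatim, swapping each [first_line, line_count] range
# (1-based, non-overlapping) for one placeholder line.
def apply_elisions(lines, ranges)
  out = []
  cursor = 1 # 1-based number of the next line still to be copied
  ranges.sort.each do |first_line, line_count|
    out.concat(lines[(cursor - 1)...(first_line - 1)]) # kept lines before the body
    out << format(PLACEHOLDER, line_count)
    cursor = first_line + line_count                    # resume after the body
  end
  out.concat(lines[(cursor - 1)..] || [])               # tail after the last range
  out
end

sample = [
  "import os",
  "",
  "def big():",
  "    a = 1",
  "    b = 2",
  "    return a + b",
  "",
  "x = big()"
]
puts apply_elisions(sample, [[4, 3]])
```

In this sample the `def big():` signature on line 3 survives, and only the three body lines starting at line 4 are collapsed.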

Method Summary

Methods inherited from LineSkeleton

#build, #initialize

Constructor Details

This class inherits a constructor from Rubino::Compression::LineSkeleton