Class: OllamaAgent::Indexing::RepoScanner
- Inherits: Object
  - Object
    - OllamaAgent::Indexing::RepoScanner
- Defined in: lib/ollama_agent/indexing/repo_scanner.rb
Overview
Scans a repository and returns a file inventory with language tags. Language detection is extension-based (no external gems required). Used by ContextPacker to select relevant files for the agent context.
Defined Under Namespace
Classes: FileEntry
Constant Summary
- LANGUAGE_EXTENSIONS =
{
  ruby: %w[.rb .rake .gemspec],
  javascript: %w[.js .jsx .mjs .cjs],
  typescript: %w[.ts .tsx],
  python: %w[.py .pyw],
  go: %w[.go],
  rust: %w[.rs],
  java: %w[.java],
  kotlin: %w[.kt .kts],
  swift: %w[.swift],
  cpp: %w[.cpp .cc .cxx .hpp .hh .h],
  c: %w[.c .h],
  csharp: %w[.cs],
  php: %w[.php],
  elixir: %w[.ex .exs],
  erlang: %w[.erl .hrl],
  haskell: %w[.hs .lhs],
  scala: %w[.scala],
  clojure: %w[.clj .cljs .cljc],
  shell: %w[.sh .bash .zsh .fish],
  yaml: %w[.yml .yaml],
  json: %w[.json .jsonc],
  toml: %w[.toml],
  markdown: %w[.md .mdx .markdown],
  html: %w[.html .htm .xhtml],
  css: %w[.css .scss .sass .less],
  sql: %w[.sql],
  dockerfile: %w[Dockerfile],
  terraform: %w[.tf .tfvars],
  proto: %w[.proto]
}.freeze
- IGNORED_DIRS =
%w[
  .git .svn .hg .bzr
  node_modules vendor .bundle
  tmp log coverage .nyc_output
  dist build out target
  __pycache__ .pytest_cache .mypy_cache .tox venv env .venv
  .ollama_agent
  .idea .vscode .cursor
].freeze
- IGNORED_FILES =
%w[
  Gemfile.lock yarn.lock package-lock.json pnpm-lock.yaml
  .DS_Store Thumbs.db
  *.min.js *.min.css
].freeze
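The private helpers that consume these constants (build_ext_map, detect_language, ignored_file?) are not shown on this page, so the following is only a sketch of how such tables are commonly used, not the gem's implementation. It assumes a "last definition wins" inversion of a small, hypothetical subset of LANGUAGE_EXTENSIONS, a fallback lookup by bare file name for entries such as Dockerfile, and glob matching via File.fnmatch? for IGNORED_FILES. The names SAMPLE_EXTENSIONS, SAMPLE_IGNORED_FILES, sketch_ext_map, sketch_detect, and sketch_ignored? are made up for illustration.

```ruby
# Hypothetical subset of LANGUAGE_EXTENSIONS, for illustration only.
SAMPLE_EXTENSIONS = {
  ruby: %w[.rb .rake .gemspec],
  cpp: %w[.cpp .hpp .h],
  c: %w[.c .h],
  dockerfile: %w[Dockerfile]
}.freeze

# Hypothetical subset of IGNORED_FILES, including one glob pattern.
SAMPLE_IGNORED_FILES = %w[Gemfile.lock .DS_Store *.min.js].freeze

# Invert language => [extensions] into extension => language.
# Assumption: later entries overwrite earlier ones, so ".h" maps to :c here.
def sketch_ext_map(table)
  table.each_with_object({}) do |(lang, exts), map|
    exts.each { |ext| map[ext] = lang }
  end
end

# Look up by extension first, then by bare file name (covers "Dockerfile").
# Returns nil when nothing matches.
def sketch_detect(path, ext_map)
  ext_map[File.extname(path)] || ext_map[File.basename(path)]
end

# True when the basename matches any literal name or glob pattern.
def sketch_ignored?(basename, patterns = SAMPLE_IGNORED_FILES)
  patterns.any? { |pattern| File.fnmatch?(pattern, basename) }
end
```

Note that .h is listed under both cpp and c in LANGUAGE_EXTENSIONS, so the tag a header file receives depends on how the real build_ext_map resolves duplicates; the sketch simply lets the later entry win.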
Instance Method Summary
- #initialize(root:, exclude_dirs: nil, max_file_size: 1_048_576) ⇒ RepoScanner (constructor)
  A new instance of RepoScanner.
- #recently_modified(n: 20) ⇒ Object
  Files most recently modified.
- #scan(languages: nil) ⇒ Array<FileEntry>
  Scan the repository and return FileEntry objects.
- #stats ⇒ Object
  Summary statistics about the repository.
Constructor Details
#initialize(root:, exclude_dirs: nil, max_file_size: 1_048_576) ⇒ RepoScanner
Returns a new instance of RepoScanner.
# File 'lib/ollama_agent/indexing/repo_scanner.rb', line 58

def initialize(root:, exclude_dirs: nil, max_file_size: 1_048_576)
  @root = File.expand_path(root)
  @exclude_dirs = (exclude_dirs || []) + IGNORED_DIRS
  @max_file_size = max_file_size
  @ext_map = build_ext_map
end
Instance Method Details
#recently_modified(n: 20) ⇒ Object
Returns up to n files with the most recent modification times (default 20), newest first.
# File 'lib/ollama_agent/indexing/repo_scanner.rb', line 117

def recently_modified(n: 20)
  scan.max_by(n, &:modified_at)
end
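To illustrate what max_by(n, &:modified_at) returns, here is a small standalone sketch. RecentEntry is a hypothetical stand-in for FileEntry carrying only the two fields needed, and sketch_recent and RECENT_SAMPLE are illustrative names, not part of the gem.

```ruby
# Hypothetical stand-in for FileEntry with only the fields used here.
RecentEntry = Struct.new(:relative_path, :modified_at, keyword_init: true)

# Same selection as recently_modified, applied to an in-memory list.
# Enumerable#max_by(n) returns the n largest elements in descending order.
def sketch_recent(entries, n: 2)
  entries.max_by(n, &:modified_at)
end

RECENT_SAMPLE = [
  RecentEntry.new(relative_path: "a.rb", modified_at: Time.at(100)),
  RecentEntry.new(relative_path: "b.rb", modified_at: Time.at(300)),
  RecentEntry.new(relative_path: "c.rb", modified_at: Time.at(200))
].freeze

puts sketch_recent(RECENT_SAMPLE).map(&:relative_path).inspect
```

Because recently_modified calls scan with no arguments, each call walks the whole tree again; callers that need both results may prefer to call scan once and apply max_by themselves.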
#scan(languages: nil) ⇒ Array<FileEntry>
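Walks the tree under the root and returns FileEntry objects sorted by relative path. Ignored directories are pruned, and ignored files, files larger than max_file_size, and entries that raise an error while being inspected are skipped. Pass languages to restrict the result to those language tags.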
# File 'lib/ollama_agent/indexing/repo_scanner.rb', line 68

def scan(languages: nil)
  results = []
  Find.find(@root) do |path|
    basename = File.basename(path)

    # Prune ignored directories; descend into all others.
    if File.directory?(path)
      Find.prune if prune_dir?(path, basename)
      next
    end

    next unless File.file?(path)
    next if ignored_file?(basename)

    size = File.size(path)
    next if size > @max_file_size

    lang = detect_language(path)
    next if languages && !languages.map(&:to_sym).include?(lang)

    rel = path.sub("#{@root}/", "")
    results << FileEntry.new(
      path: path,
      relative_path: rel,
      language: lang,
      size: size,
      modified_at: File.mtime(path)
    )
  rescue StandardError
    # Skip any entry that raises while being inspected.
    next
  end
  results.sort_by(&:relative_path)
end
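The traversal pattern used by scan (Find.find with Find.prune for skipped directories) can be tried in isolation. The sketch below is not the gem's code: SKETCH_PRUNED_DIRS, sketch_walk, and sketch_walk_demo are hypothetical names, directories are pruned by basename only (the real prune_dir? is not shown on this page), and the demo builds a throwaway tree in a temporary directory.

```ruby
require "find"
require "fileutils"
require "tmpdir"

# Hypothetical, shortened ignore list for the demo.
SKETCH_PRUNED_DIRS = %w[.git node_modules].freeze

# Walk root, skip pruned directory names, and return the sorted
# relative paths of all regular files that remain.
def sketch_walk(root, pruned: SKETCH_PRUNED_DIRS)
  root = File.expand_path(root)
  results = []
  Find.find(root) do |path|
    if File.directory?(path)
      # Find.prune stops descent into this directory.
      Find.prune if pruned.include?(File.basename(path))
      next
    end
    next unless File.file?(path)

    results << path.sub("#{root}/", "")
  end
  results.sort
end

# Build a small temporary tree and walk it.
def sketch_walk_demo
  Dir.mktmpdir do |dir|
    FileUtils.mkdir_p(File.join(dir, "lib"))
    FileUtils.mkdir_p(File.join(dir, "node_modules", "pkg"))
    File.write(File.join(dir, "lib", "a.rb"), "puts 1\n")
    File.write(File.join(dir, "README.md"), "# demo\n")
    File.write(File.join(dir, "node_modules", "pkg", "index.js"), "")
    sketch_walk(dir)
  end
end
```

Calling sketch_walk_demo returns the files outside node_modules only, which mirrors how scan never visits anything beneath a pruned directory.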
#stats ⇒ Object
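Returns a Hash of summary statistics for the repository: total file count, total bytes, the root path, and per-language file and byte counts.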
# File 'lib/ollama_agent/indexing/repo_scanner.rb', line 104

def stats
  files = scan
  by_lang = files.group_by(&:language)
  {
    total_files: files.size,
    total_bytes: files.sum(&:size),
    root: @root,
    languages: by_lang.transform_values { |fs| { files: fs.size, bytes: fs.sum(&:size) } }
  }
end
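The shape of the returned Hash is easiest to see on a small in-memory list. In this sketch StatsEntry is a hypothetical stand-in for FileEntry, sketch_stats and STATS_SAMPLE are illustrative names, and the root key is omitted because no directory is scanned. It also assumes that files with no recognized language carry a nil language tag, which this page does not confirm.

```ruby
# Hypothetical stand-in for FileEntry with only the fields stats reads.
StatsEntry = Struct.new(:language, :size, keyword_init: true)

# Same aggregation as stats, applied to an in-memory list.
def sketch_stats(files)
  by_lang = files.group_by(&:language)
  {
    total_files: files.size,
    total_bytes: files.sum(&:size),
    languages: by_lang.transform_values { |fs| { files: fs.size, bytes: fs.sum(&:size) } }
  }
end

STATS_SAMPLE = [
  StatsEntry.new(language: :ruby, size: 10),
  StatsEntry.new(language: :ruby, size: 30),
  StatsEntry.new(language: nil, size: 5) # assumed tag for an unrecognized file
].freeze

puts sketch_stats(STATS_SAMPLE).inspect
```

Under that assumption, unrecognized files are grouped under a nil key in languages, so consumers iterating over the per-language breakdown may need to handle that key.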