Class: Archsight::Import::Handlers::Repository
- Inherits:
- Archsight::Import::Handler
- Inheritance chain: Object > Archsight::Import::Handler > Archsight::Import::Handlers::Repository
- Defined in:
- lib/archsight/import/handlers/repository.rb
Overview
Repository handler: clones or syncs a git repository, analyzes it, and generates a TechnologyArtifact.
Configuration:
import/config/path - Path where the git repository should be cloned
import/config/gitUrl - Git URL to clone from (if not already cloned)
import/config/archived - Optional "true" if repository is archived
import/config/visibility - Optional visibility (internal, public, open-source)
import/config/sccPath - Optional path to scc binary (default: scc)
import/config/fallbackTeam - Optional team name when no contributor match found
import/config/botTeam - Optional team name for bot-only repositories
import/config/corporateAffixes - Optional comma-separated corporate username affixes for team matching (e.g., "ionos,1and1")
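As a rough illustration of how these keys might be supplied and read, the sketch below keeps them in a plain Ruby hash and resolves them by short name. The `EXAMPLE_ANNOTATIONS` hash, the sample path and URL, and the standalone `example_config` helper are all assumptions made for this example; the real lookup is the `#config` method inherited from Archsight::Import::Handler, which may behave differently.

```ruby
# Hypothetical annotation set for one repository import (all values are made up).
EXAMPLE_ANNOTATIONS = {
  "import/config/path" => "/tmp/repos/example",
  "import/config/gitUrl" => "git@example.com:team/example.git",
  "import/config/sccPath" => "scc",
  "import/config/corporateAffixes" => "ionos,1and1"
}.freeze

# Illustrative stand-in for the inherited #config helper (an assumption):
# resolves a short key under the "import/config/" prefix, with an optional default.
def example_config(key, default: nil)
  EXAMPLE_ANNOTATIONS.fetch("import/config/#{key}", default)
end

path = example_config("path")
visibility = example_config("visibility", default: "internal")
affixes = example_config("corporateAffixes").split(",")

puts path        # /tmp/repos/example
puts visibility  # internal
puts affixes.inspect
```

Only `path` is checked as mandatory in `#execute`; the other keys are optional and fall back to defaults or are skipped.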
Instance Attribute Summary
Attributes inherited from Archsight::Import::Handler
#database, #import_resource, #progress, #resources_dir, #shared_writer
Instance Method Summary
- #access_denied_error?(message) ⇒ Boolean
  Check if error message indicates access denied.
- #clone_repository ⇒ Object
- #empty_repository? ⇒ Boolean
- #execute ⇒ Object
- #run_git(command, dir) ⇒ String
  Run a git command safely using array form to prevent shell injection.
- #sanitize_error(message) ⇒ Object
  Sanitize error message to prevent breaking TTY progress display.
- #sync_repository ⇒ Object
- #update_repository ⇒ Object
- #write_minimal_artifact(status:, reason:, error: nil, visibility: nil) ⇒ Object
  Write a minimal TechnologyArtifact for repositories that can’t be fully analyzed.
Methods inherited from Archsight::Import::Handler
#compute_config_hash, #config, #config_all, #import_yaml, #initialize, #resource_yaml, #resources_to_yaml, #self_marker, #write_generates_meta, #write_yaml
Constructor Details
This class inherits a constructor from Archsight::Import::Handler
Instance Method Details
#access_denied_error?(message) ⇒ Boolean
Check if error message indicates access denied
# File 'lib/archsight/import/handlers/repository.rb', line 163

def access_denied_error?(message)
  return false if message.nil?

  patterns = [
    /could not read from remote repository/i,
    /permission denied/i,
    /access denied/i,
    /authentication failed/i,
    /repository not found/i,
    /fatal: '.*' does not appear to be a git repository/i
  ]
  patterns.any? { |p| message.match?(p) }
end
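To get a feel for which messages the patterns listed above classify as access problems, the following standalone sketch copies them into a constant and applies them to a few sample strings. The constant name, the free-standing `access_denied?` method, and the sample messages are assumptions for illustration; they are not part of the handler.

```ruby
# Standalone copy of the patterns from the listing, for experimentation only.
ACCESS_DENIED_PATTERNS = [
  /could not read from remote repository/i,
  /permission denied/i,
  /access denied/i,
  /authentication failed/i,
  /repository not found/i,
  /fatal: '.*' does not appear to be a git repository/i
].freeze

# Returns true when any pattern matches; nil is treated as "not access denied".
def access_denied?(message)
  return false if message.nil?

  ACCESS_DENIED_PATTERNS.any? { |pattern| message.match?(pattern) }
end

puts access_denied?("fatal: Authentication failed for 'https://example.com/repo.git'") # true
puts access_denied?("Permission denied (publickey).")                                  # true
puts access_denied?("fatal: unable to access: Could not resolve host")                 # false
puts access_denied?(nil)                                                               # false
```

Matching is case-insensitive and substring-based, so a network failure such as an unresolved host is not treated as access denied and would be re-raised by `#execute`.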
#clone_repository ⇒ Object
# File 'lib/archsight/import/handlers/repository.rb', line 112

def clone_repository
  FileUtils.mkdir_p(File.dirname(@path))
  run_git(%w[git clone --quiet] + [@git_url, @path], Dir.pwd)
end
#empty_repository? ⇒ Boolean
# File 'lib/archsight/import/handlers/repository.rb', line 133

def empty_repository?
  # Check if HEAD exists (empty repos have no commits)
  _, _, status = Open3.capture3("git", "rev-parse", "HEAD", chdir: @path)
  !status.success?
end
#execute ⇒ Object
# File 'lib/archsight/import/handlers/repository.rb', line 24

def execute
  @path = config("path")
  @git_url = config("gitUrl")

  raise "Missing required config: path" unless @path

  # Clone or update the repository if gitUrl is provided
  if @git_url
    begin
      sync_repository
      if @skip_analysis
        return
      end
    rescue StandardError => e
      # Access denied or other git errors - create minimal artifact
      if access_denied_error?(e.message)
        progress.update("Access denied - creating minimal artifact")
        write_minimal_artifact(
          status: "inaccessible",
          reason: "Repository not accessible",
          error: e.message,
          visibility: "private"
        )
        return
      end
      raise
    end
  end

  raise "Directory not found: #{@path}" unless File.directory?(@path)
  raise "Not a git repository: #{@path}" unless File.directory?(File.join(@path, ".git"))

  # Check if empty repository (no code)
  progress.update("Analyzing code")
  scc_data = run_scc(@path)
  estimated_cost = scc_data["estimatedCost"]
  if !estimated_cost.nil? && estimated_cost.to_f.zero?
    progress.update("No analyzable code - creating minimal artifact")
    write_minimal_artifact(
      status: "no-code",
      reason: "No analyzable source code found"
    )
    return
  end

  # Run native git analytics
  progress.update("Analyzing git history")
  git_data = run_git_analytics(@path)

  # Analyze licenses and match teams in parallel (independent of each other)
  progress.update("Analyzing licenses & matching teams")
  license_data, team_result = run_license_and_team_analysis(@path, scc_data, git_data)

  # Build resource
  progress.update("Generating resource")
  resource = build_technology_artifact(@path, scc_data, git_data, team_result, license_data)

  # Write output with self-marker for caching
  yaml_content = YAML.dump(resource) + YAML.dump(self_marker)
  write_yaml(yaml_content)
end
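The "no analyzable code" branch in `#execute` depends on a small predicate over the scc result. The sketch below isolates that check so its edge cases are easier to see. The method name `no_analyzable_code?` and the sample hashes are assumptions; only the `"estimatedCost"` key and the comparison itself come from the listing.

```ruby
# Mirrors the check in #execute: a present estimatedCost that converts to zero
# means scc found nothing to analyze. A missing key does NOT count as "no code".
def no_analyzable_code?(scc_data)
  estimated_cost = scc_data["estimatedCost"]
  !estimated_cost.nil? && estimated_cost.to_f.zero?
end

puts no_analyzable_code?({ "estimatedCost" => 0 })       # true
puts no_analyzable_code?({ "estimatedCost" => "0.0" })   # true
puts no_analyzable_code?({ "estimatedCost" => 1234.5 })  # false
puts no_analyzable_code?({})                             # false
```

Because `to_f` is used, a numeric string is handled the same way as a number, and a result without the key falls through to the full analysis path.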
#run_git(command, dir) ⇒ String
Run a git command safely using array form to prevent shell injection
# File 'lib/archsight/import/handlers/repository.rb', line 143

def run_git(command, dir)
  out, err, status = Open3.capture3(*command, chdir: dir)
  raise "Git command failed: #{sanitize_error(err)}" unless status.success?

  out
end
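The array form matters because `Open3.capture3` only bypasses the shell when the command and its arguments are passed as separate strings. The sketch below demonstrates this with a hypothetical `run_command` helper that follows the same shape as `#run_git`; the Ruby interpreter stands in for git so the example does not depend on a git installation, and the "hostile" argument is made up.

```ruby
require "open3"
require "rbconfig"

# Runs a command given as an argv array. No shell is involved, so shell
# metacharacters inside arguments are passed through literally.
def run_command(command, dir)
  out, err, status = Open3.capture3(*command, chdir: dir)
  raise "Command failed: #{err.lines.first&.strip}" unless status.success?

  out
end

# An argument that would run a second command if it were interpolated into a shell string.
hostile = "repo; echo injected"
echoed = run_command([RbConfig.ruby, "-e", "print ARGV.first", hostile], Dir.pwd)
puts echoed # repo; echo injected
```

The child process receives the whole string as a single argument, which is the property `#clone_repository` relies on when it appends `@git_url` and `@path` to the command array.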
#sanitize_error(message) ⇒ Object
Sanitize error message to prevent breaking TTY progress display
# File 'lib/archsight/import/handlers/repository.rb', line 151

def sanitize_error(message)
  return "" if message.nil? || message.empty?

  # Take first meaningful line, strip ANSI codes and remote prefixes
  lines = message.lines.map(&:strip).reject { |l| l.empty? || l.start_with?("remote:") }
  first_line = lines.first || message.lines.first&.strip || ""
  # Truncate if too long
  first_line.length > 100 ? "#{first_line[0, 97]}..." : first_line
end
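The effect of the sanitizer is easiest to see on sample input. The sketch below repeats the method body from the listing as a free-standing function and feeds it a few made-up stderr strings. Note that, as listed, the method skips blank and `remote:` lines and truncates, but does not appear to remove ANSI escape sequences despite what the inline comment suggests.

```ruby
# Free-standing copy of the listed method body, for experimentation only.
def sanitize_error(message)
  return "" if message.nil? || message.empty?

  # Keep the first non-blank line that does not start with "remote:".
  lines = message.lines.map(&:strip).reject { |l| l.empty? || l.start_with?("remote:") }
  # Fall back to the very first line when every line was filtered out.
  first_line = lines.first || message.lines.first&.strip || ""
  # Cap the result at 100 characters (97 plus an ellipsis).
  first_line.length > 100 ? "#{first_line[0, 97]}..." : first_line
end

stderr = "remote: Repository not found.\nfatal: repository 'https://example.com/x.git/' not found\n"
puts sanitize_error(stderr)            # fatal: repository 'https://example.com/x.git/' not found
puts sanitize_error("x" * 150).length  # 100
puts sanitize_error("remote: denied\n") # remote: denied
puts sanitize_error(nil).inspect       # ""
```

The single-line, bounded result is what keeps a multi-line git error from disturbing the progress display.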
#sync_repository ⇒ Object
# File 'lib/archsight/import/handlers/repository.rb', line 90

def sync_repository
  if File.directory?(File.join(@path, ".git"))
    # Update existing repository
    progress.update("Updating repository")
    update_repository
  else
    # Clone new repository
    progress.update("Cloning repository")
    clone_repository
  end

  # Check if repository is empty (no commits)
  return unless empty_repository?

  progress.update("Empty repository - creating minimal artifact")
  write_minimal_artifact(
    status: "empty",
    reason: "Repository has no commits"
  )
  @skip_analysis = true
end
#update_repository ⇒ Object
# File 'lib/archsight/import/handlers/repository.rb', line 117

def update_repository
  run_git(%w[git fetch --quiet], @path)
  return if empty_repository? # Skip merge for empty repos

  # Check if update is needed
  current_head = run_git(%w[git rev-parse HEAD], @path).strip
  fetch_head = run_git(%w[git rev-parse FETCH_HEAD], @path).strip
  return if current_head == fetch_head # Already up-to-date

  run_git(%w[git merge --ff-only FETCH_HEAD], @path)
rescue StandardError => e
  # If merge fails (diverged history), reset to remote state
  progress.warn("Merge failed: #{e.message}, resetting to remote")
  run_git(%w[git reset --hard FETCH_HEAD], @path)
end
#write_minimal_artifact(status:, reason:, error: nil, visibility: nil) ⇒ Object
Write a minimal TechnologyArtifact for repositories that can’t be fully analyzed
# File 'lib/archsight/import/handlers/repository.rb', line 182

def write_minimal_artifact(status:, reason:, error: nil, visibility: nil)
  git_url = @git_url
  vis = visibility || config("visibility", default: "internal")

  annotations = {
    "artifact/type" => "repo",
    "repository/git" => git_url,
    "repository/visibility" => vis,
    "activity/status" => status,
    "activity/reason" => reason,
    "generated/script" => import_resource.name,
    "generated/at" => Time.now.utc.iso8601
  }

  annotations["repository/accessible"] = "false" if status == "inaccessible"
  annotations["repository/error"] = sanitize_error(error) if error

  resource = resource_yaml(
    kind: "TechnologyArtifact",
    name: repository_name(git_url),
    annotations: annotations,
    spec: {}
  )

  # Write output with self-marker for caching
  yaml_content = YAML.dump(resource) + YAML.dump(self_marker)
  write_yaml(yaml_content)
end
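To make the resulting annotation set concrete, the sketch below rebuilds just the annotations hash as a free-standing function. The method name `minimal_annotations`, its keyword arguments, and the sample values are assumptions for illustration; the real method additionally passes the error through `#sanitize_error` and wraps the hash in a resource via `#resource_yaml`, neither of which is reproduced here.

```ruby
require "time" # provides Time#iso8601

# Builds the annotation hash for a repository that could not be fully analyzed.
# Hypothetical stand-alone version: inputs that the handler reads from instance
# state (git URL, script name) are passed in explicitly.
def minimal_annotations(git_url:, status:, reason:, script:, visibility: "internal", error: nil)
  annotations = {
    "artifact/type" => "repo",
    "repository/git" => git_url,
    "repository/visibility" => visibility,
    "activity/status" => status,
    "activity/reason" => reason,
    "generated/script" => script,
    "generated/at" => Time.now.utc.iso8601
  }

  # Extra keys are added only for the cases that need them.
  annotations["repository/accessible"] = "false" if status == "inaccessible"
  annotations["repository/error"] = error if error
  annotations
end

inaccessible = minimal_annotations(
  git_url: "git@example.com:team/example.git",
  status: "inaccessible",
  reason: "Repository not accessible",
  script: "example-import",
  visibility: "private",
  error: "Permission denied (publickey)."
)
puts inaccessible["repository/accessible"] # false
puts inaccessible["repository/error"]      # Permission denied (publickey).
```

The three statuses used in the listing are `"inaccessible"`, `"empty"`, and `"no-code"`; only the first adds the `repository/accessible` marker.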