Class: Archsight::Import::LicenseAnalyzer
- Inherits: Object
  - Object
  - Archsight::Import::LicenseAnalyzer
- Defined in: lib/archsight/import/license_analyzer.rb
Overview
License detection and dependency license scanning for repositories
Detects the repository’s own license from LICENSE/COPYING files and SPDX headers, then scans dependency licenses using language-specific tools when available.
Constant Summary
- SPDX_PATTERNS =
SPDX patterns, matched against LICENSE/COPYING file content. Order matters: more specific patterns come first.
[
  { id: "Apache-2.0", re: /Apache License.*(?:Version 2|v2\.0)/mi },
  { id: "MIT", re: /\bMIT License\b|Permission is hereby granted, free of charge/mi },
  { id: "BSD-3-Clause", re: /BSD 3-Clause|Redistribution and use.*three conditions/mi },
  { id: "BSD-2-Clause", re: /BSD 2-Clause|Simplified BSD/mi },
  { id: "GPL-3.0", re: /GNU GENERAL PUBLIC LICENSE.*Version 3/mi },
  { id: "GPL-2.0", re: /GNU GENERAL PUBLIC LICENSE.*Version 2/mi },
  { id: "AGPL-3.0", re: /GNU AFFERO GENERAL PUBLIC LICENSE.*Version 3/mi },
  { id: "LGPL-3.0", re: /GNU LESSER GENERAL PUBLIC LICENSE.*Version 3/mi },
  { id: "LGPL-2.1", re: /GNU LESSER GENERAL PUBLIC LICENSE.*Version 2\.1/mi },
  { id: "MPL-2.0", re: /Mozilla Public License.*(?:Version 2|v2\.0)/mi },
  { id: "ISC", re: /\bISC License\b|ISC\s+license/mi },
  { id: "Unlicense", re: /\bThis is free and unencumbered software\b/mi },
  { id: "CC0-1.0", re: /Creative Commons.*CC0|CC0 1\.0 Universal/mi },
  { id: "BSL-1.0", re: /Boost Software License/mi },
  { id: "BUSL-1.1", re: /Business Source License.*1\.1/mi },
  { id: "EUPL-1.2", re: /European Union Public Licen[cs]e.*1\.2/mi }
].freeze
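The "order matters" note suggests a first-match scan over the table. The following is a minimal sketch of that idea, not the class's actual implementation: SAMPLE_PATTERNS is a three-entry copy of the table for illustration, and first_matching_spdx is a hypothetical helper name that does not appear in the source.

```ruby
# Illustrative subset of SPDX_PATTERNS, using the same entry shape.
SAMPLE_PATTERNS = [
  { id: "Apache-2.0", re: /Apache License.*(?:Version 2|v2\.0)/mi },
  { id: "MIT", re: /\bMIT License\b|Permission is hereby granted, free of charge/mi },
  { id: "GPL-3.0", re: /GNU GENERAL PUBLIC LICENSE.*Version 3/mi }
].freeze

# Hypothetical helper (not part of the documented API): returns the id of
# the first pattern whose regex matches the text, or nil when none match.
def first_matching_spdx(text, patterns = SAMPLE_PATTERNS)
  entry = patterns.find { |pattern| pattern[:re].match?(text) }
  entry && entry[:id]
end

p first_matching_spdx("MIT License\n\nPermission is hereby granted, free of charge")
# => "MIT"
p first_matching_spdx("Apache License\nVersion 2.0, January 2004")
# => "Apache-2.0"
p first_matching_spdx("All rights reserved.")
# => nil
```

Because the /m flag lets `.` match newlines in Ruby, a pattern such as `Apache License.*Version 2` can span the title line and the version line of a license file.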
- CATEGORIES =
License category classification
{
  "permissive" => %w[Apache-2.0 MIT BSD-3-Clause BSD-2-Clause ISC Unlicense CC0-1.0 BSL-1.0 0BSD Ruby],
  "copyleft" => %w[GPL-3.0 GPL-2.0 AGPL-3.0],
  "weak-copyleft" => %w[LGPL-3.0 LGPL-2.1 MPL-2.0 EUPL-1.2 CDDL-1.0],
  "source-available" => %w[BUSL-1.1],
  "proprietary" => %w[proprietary]
}.freeze
- CATEGORY_LOOKUP =
CATEGORIES.each_with_object({}) do |(cat, ids), h|
  ids.each { |id| h[id] = cat }
end.freeze
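CATEGORY_LOOKUP inverts CATEGORIES so that a license id maps directly to its category. A small sketch of the same inversion on a reduced table follows; SAMPLE_CATEGORIES, SAMPLE_LOOKUP and category_for are illustrative names, and the "unknown" fallback is an assumption rather than documented behavior.

```ruby
# Reduced copy of CATEGORIES for illustration.
SAMPLE_CATEGORIES = {
  "permissive" => %w[Apache-2.0 MIT],
  "copyleft" => %w[GPL-3.0 GPL-2.0 AGPL-3.0],
  "weak-copyleft" => %w[LGPL-3.0 MPL-2.0]
}.freeze

# Same inversion as CATEGORY_LOOKUP: license id => category name.
SAMPLE_LOOKUP = SAMPLE_CATEGORIES.each_with_object({}) do |(category, ids), lookup|
  ids.each { |id| lookup[id] = category }
end.freeze

# Hypothetical helper; falling back to "unknown" is an assumption.
def category_for(spdx_id)
  SAMPLE_LOOKUP.fetch(spdx_id, "unknown")
end

p category_for("MIT")       # => "permissive"
p category_for("AGPL-3.0")  # => "copyleft"
p category_for("WTFPL")     # => "unknown"
```

Building the inverted hash once keeps each lookup at constant cost instead of scanning every category list per license.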
- PROPRIETARY_RE =
Proprietary / copyright patterns — matched against the trimmed string
/
  \bcopyright\b |
  \bproprietary\b |
  \bUNLICENSED\b |
  \binternal\b |
  \ACustom:\s |
  \b[a-z0-9-]+\.(com|io|de|net|org|cloud)\b |
  \(c\)\s
/xi
- CUSTOM_LICENSE_VALUES =
Custom non-SPDX values we accept
Set.new(%w[NOASSERTION proprietary unknown]).freeze
- LICENSE_FILES =
License file names to search (in order of priority)
%w[
  LICENSE LICENSE.md LICENSE.txt
  LICENCE LICENCE.md LICENCE.txt
  COPYING COPYING.md COPYING.txt
].freeze
- ECOSYSTEM_MANIFESTS =
Manifest files that indicate an ecosystem
{
  "go" => %w[go.mod],
  "python" => %w[requirements.txt setup.py pyproject.toml Pipfile],
  "ruby" => %w[Gemfile Gemfile.lock],
  "java" => %w[pom.xml build.gradle build.gradle.kts],
  "nodejs" => %w[package.json],
  "rust" => %w[Cargo.toml]
}.freeze
- LANGUAGE_TO_ECOSYSTEM =
Map scc language names to ecosystem keys. When scc data is available, only ecosystems matching detected languages are probed.
{
  "Go" => "go",
  "Python" => "python",
  "Ruby" => "ruby",
  "Java" => "java",
  "Kotlin" => "java",
  "Groovy" => "java",
  "Scala" => "java",
  "JavaScript" => "nodejs",
  "TypeScript" => "nodejs",
  "JSX" => "nodejs",
  "TSX" => "nodejs",
  "Rust" => "rust"
}.freeze
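One way to read the two tables together: manifests decide which ecosystems are present, and scc languages (when supplied) narrow that set. The sketch below works under that assumption. The helper ecosystems_to_probe and the SAMPLE_ constants are hypothetical, and the real selection logic may differ.

```ruby
# Reduced copies of ECOSYSTEM_MANIFESTS and LANGUAGE_TO_ECOSYSTEM.
SAMPLE_MANIFESTS = {
  "go" => %w[go.mod],
  "ruby" => %w[Gemfile Gemfile.lock],
  "nodejs" => %w[package.json]
}.freeze

SAMPLE_LANGUAGE_TO_ECOSYSTEM = {
  "Go" => "go",
  "Ruby" => "ruby",
  "JavaScript" => "nodejs",
  "TypeScript" => "nodejs"
}.freeze

# Hypothetical helper: choose the ecosystems worth probing.
#   languages - array of scc language names, or nil when no scc data exists
#   files     - file names found in the repository root
def ecosystems_to_probe(languages, files)
  # Ecosystems with at least one manifest file present.
  candidates = SAMPLE_MANIFESTS.select { |_eco, manifests| (manifests & files).any? }.keys
  return candidates if languages.nil?

  # With scc data, keep only ecosystems backed by a detected language.
  allowed = languages.map { |lang| SAMPLE_LANGUAGE_TO_ECOSYSTEM[lang] }.compact.uniq
  candidates & allowed
end

p ecosystems_to_probe(nil, %w[go.mod package.json])            # => ["go", "nodejs"]
p ecosystems_to_probe(["TypeScript"], %w[go.mod package.json]) # => ["nodejs"]
```

Several languages map to one ecosystem key (for example JavaScript and TypeScript to "nodejs"), hence the `uniq` before intersecting.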
- @@command_cache =
Cache of resolved command variants per ecosystem. After the first repo probes which tool works, all subsequent repos reuse it. Example shape: { "go" => ["go-licenses", …args], "nodejs" => :none, … }
{}
- @@command_cache_mutex =
Mutex guarding access to @@command_cache.
Mutex.new
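The cache and mutex pair can be pictured as a probe-once memo shared across analyzer instances. The sketch below uses an instance-level cache instead of class variables to stay self-contained. SampleCommandCache, resolve and the tool name "example-tool" are all hypothetical, and storing :none for a failed probe follows the example shape in the cache description.

```ruby
# Hypothetical stand-in for the class-level cache and its mutex.
class SampleCommandCache
  def initialize
    @entries = {}
    @mutex = Mutex.new
  end

  # Returns the cached command for the ecosystem. The block runs only on the
  # first miss and should return a command array, or nil when no tool works.
  def resolve(ecosystem)
    @mutex.synchronize do
      return @entries[ecosystem] if @entries.key?(ecosystem)

      command = yield
      @entries[ecosystem] = command || :none
    end
  end

  # Mirrors reset_command_cache!: drop everything, under the lock.
  def reset!
    @mutex.synchronize { @entries = {} }
  end
end

cache = SampleCommandCache.new
probes = 0
2.times { cache.resolve("go") { probes += 1; ["example-tool", "--example-flag"] } }
p probes                            # => 1
p cache.resolve("nodejs") { nil }   # => :none
```

Recording :none for a failed probe means later repositories skip the probe entirely instead of retrying a tool that is not installed. This sketch holds the lock while the probe runs, which serializes concurrent probes; whether the real class does the same is not shown in the source.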
Class Method Summary
- .reset_command_cache! ⇒ Object
Reset the command cache (useful in tests).
Instance Method Summary
- #analyze ⇒ Object
- #initialize(repo_path, options = {}) ⇒ LicenseAnalyzer constructor
A new instance of LicenseAnalyzer.
Constructor Details
#initialize(repo_path, options = {}) ⇒ LicenseAnalyzer
Returns a new instance of LicenseAnalyzer.
# File 'lib/archsight/import/license_analyzer.rb', line 103
def initialize(repo_path, options = {})
  @repo_path = repo_path
  @options = options
  @languages = options[:languages]
end
Class Method Details
.reset_command_cache! ⇒ Object
Reset the command cache (useful in tests)
# File 'lib/archsight/import/license_analyzer.rb', line 110
def self.reset_command_cache!
  @@command_cache_mutex.synchronize { @@command_cache = {} } # rubocop:disable Style/ClassVars
end
Instance Method Details
#analyze ⇒ Object
# File 'lib/archsight/import/license_analyzer.rb', line 114
def analyze
  repo_license = detect_repo_license
  dep_data = scan_dependencies

  result = {}
  result["license_spdx"] = repo_license[:spdx]
  result["license_file"] = repo_license[:file] if repo_license[:file]
  result["license_category"] = repo_license[:category]

  result["dependency_count"] = dep_data[:count]
  result["dependency_ecosystems"] = dep_data[:ecosystems].join(",") if dep_data[:ecosystems].any?
  result["dependency_licenses"] = dep_data[:licenses].join(",") if dep_data[:licenses].any?
  result["dependency_copyleft"] = dep_data[:copyleft].to_s
  result["dependency_risk"] = dep_data[:risk]
  result["dependency_license_counts"] = dep_data[:license_counts] if dep_data[:license_counts].any?

  result
end