Class: Gitlab::SecretDetection::Core::Scanner
- Inherits:
-
Object
- Object
- Gitlab::SecretDetection::Core::Scanner
- Defined in:
- lib/gitlab/secret_detection/core/scanner.rb
Overview
Scanner is responsible for running the Secret Detection scan operation.
Constant Summary
- RulesetParseError =
RulesetParseError is raised when the ruleset file at the given path cannot be parsed.
Class.new(StandardError)
- RulesetCompilationError =
RulesetCompilationError is raised when the predefined rulesets fail to compile.
Class.new(StandardError)
- DEFAULT_SCAN_TIMEOUT_SECS =
Default time limit (in seconds) for running the scan operation per invocation.
180
- DEFAULT_PAYLOAD_TIMEOUT_SECS =
Default time limit (in seconds) for running the scan operation on a single payload.
30
- DEFAULT_PATTERN_MATCHER_TAGS =
Tags used for creating the default pattern matcher.
['gitlab_blocking'].freeze
- MAX_PROCS_PER_REQUEST =
Maximum number of child processes to spawn per request. Ref: gitlab.com/gitlab-org/gitlab/-/issues/430160
5
- MIN_CHUNK_SIZE_PER_PROC_BYTES =
Minimum cumulative size (in bytes) of the payloads required to spawn and run the scan within a new subprocess.
2_097_152
- RUN_IN_SUBPROCESS =
Whether to run the scan in subprocesses. Defaults to false.
false
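To make the two subprocess limits concrete, here is a minimal sketch of how they could bound the number of child processes for a request. The `SizingSketch` module and its `subprocess_count` helper are hypothetical and only restate the constant descriptions above; the gem's actual chunking logic is not shown on this page and may differ.

```ruby
# Hypothetical illustration only; not the gem's implementation.
module SizingSketch
  MAX_PROCS_PER_REQUEST = 5
  MIN_CHUNK_SIZE_PER_PROC_BYTES = 2_097_152 # 2 MiB

  module_function

  # payload_sizes: Array of payload sizes in bytes.
  def subprocess_count(payload_sizes)
    total_bytes = payload_sizes.sum
    # Assumption: one subprocess per full 2 MiB of cumulative payload data.
    by_size = total_bytes / MIN_CHUNK_SIZE_PER_PROC_BYTES
    # Never exceed the per-request cap; 0 means forking is not worthwhile.
    [by_size, MAX_PROCS_PER_REQUEST].min
  end
end
```

Under this assumption, a request carrying only a few kilobytes would stay in the parent process, while a very large request would still use at most five subprocesses.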
Instance Method Summary
-
#initialize(rules:, logger: Logger.new($stdout)) ⇒ Scanner
constructor
Initializes the instance with a logger, extracts keywords from the ruleset, and builds the default matchers.
-
#secrets_scan(payloads, timeout: DEFAULT_SCAN_TIMEOUT_SECS, payload_timeout: DEFAULT_PAYLOAD_TIMEOUT_SECS, raw_value_exclusions: [], rule_exclusions: [], tags: DEFAULT_PATTERN_MATCHER_TAGS, subprocess: RUN_IN_SUBPROCESS) ⇒ Object
Runs Secret Detection scan on the list of given payloads.
Constructor Details
#initialize(rules:, logger: Logger.new($stdout)) ⇒ Scanner
Initializes the instance with a logger and performs the following operations:
-
Extracts keywords from the parsed ruleset so that payloads can be matched against keywords before any regex operation.
-
Builds and compiles the rule regex patterns obtained from the ruleset that carry the DEFAULT_PATTERN_MATCHER_TAGS tags. Raises RulesetCompilationError if the regex pattern compilation fails.
# File 'lib/gitlab/secret_detection/core/scanner.rb', line 41

def initialize(rules:, logger: Logger.new($stdout))
  @logger = logger
  @rules = rules
  @keywords = create_keywords(rules)
  @default_keyword_matcher = build_keyword_matcher(
    tags: DEFAULT_PATTERN_MATCHER_TAGS,
    include_missing_tags: false
  )
  @default_pattern_matcher = build_pattern_matcher(
    tags: DEFAULT_PATTERN_MATCHER_TAGS,
    include_missing_tags: false
  ) # includes only gitlab_blocking rules
end
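The constructor's two steps (keyword extraction, then tag-filtered pattern compilation) can be illustrated with a minimal, self-contained sketch. Everything here is an assumption made for illustration: the rule hash shape (`:id`, `:tags`, `:keywords`, `:regex`), the `MatcherSketch` module, its helper names, and the example pattern are hypothetical, and Ruby's built-in `Regexp` stands in for whatever regex engine the gem actually uses.

```ruby
# Hypothetical illustration only; the real ruleset format is not shown here.
module MatcherSketch
  CompilationError = Class.new(StandardError)

  module_function

  # Keep only rules carrying at least one of the requested tags.
  def select_rules(rules, tags)
    rules.select { |rule| (rule[:tags] & tags).any? }
  end

  # One cheap matcher over all keywords, used to skip payloads
  # that cannot possibly match any rule.
  def keyword_matcher(rules)
    Regexp.union(rules.flat_map { |rule| rule[:keywords] }.uniq)
  end

  # Compile each rule's pattern once, up front, and fail loudly on bad input.
  def compile_patterns(rules)
    rules.to_h { |rule| [rule[:id], Regexp.new(rule[:regex])] }
  rescue RegexpError => e
    raise CompilationError, "failed to compile ruleset: #{e.message}"
  end
end

rules = [
  { id: 'gitlab_personal_access_token', tags: ['gitlab_blocking'],
    keywords: ['glpat-'], regex: 'glpat-[0-9a-zA-Z_-]{20}' },
  { id: 'example_other_rule', tags: ['example_tag'],
    keywords: ['token'], regex: 'token=[a-z]+' }
]

blocking = MatcherSketch.select_rules(rules, ['gitlab_blocking'])
keywords = MatcherSketch.keyword_matcher(blocking)
patterns = MatcherSketch.compile_patterns(blocking)
```

Checking the keyword matcher first means the more expensive rule patterns only run on payloads that contain at least one keyword.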
Instance Method Details
#secrets_scan(payloads, timeout: DEFAULT_SCAN_TIMEOUT_SECS, payload_timeout: DEFAULT_PAYLOAD_TIMEOUT_SECS, raw_value_exclusions: [], rule_exclusions: [], tags: DEFAULT_PATTERN_MATCHER_TAGS, subprocess: RUN_IN_SUBPROCESS) ⇒ Object
Runs a Secret Detection scan on the given list of payloads. Both the total scan duration and the duration for each payload are time-bound via timeout and payload_timeout respectively.
payloads-
Array of payloads, where each payload must have `id` and `data` properties.
timeout-
Number of seconds (floating-point values are accepted for sub-second limits) to limit the total scan duration.
payload_timeout-
Number of seconds (floating-point values are accepted for sub-second limits) to limit the scan duration on each payload.
raw_value_exclusions-
Array of raw values to exclude from the scan.
rule_exclusions-
Array of rules to exclude from the ruleset used for the scan. Each rule is represented by its ID, for example `gitlab_personal_access_token` for a GitLab Personal Access Token. By default, no rule is excluded from the ruleset.
tags-
Array of tag values used to filter the default ruleset when determining the rules used for the scan. For example, add `gitlab_blocking` to include only rules for Push Protection. Defaults to [`gitlab_blocking`] (DEFAULT_PATTERN_MATCHER_TAGS).
NOTE: Running the scan in fork mode (subprocess: true) primarily aims to reduce the memory consumption of the scan by offloading regex operations on large payloads to subprocesses. It does not guarantee lower overall latency: for smaller payloads in particular, the overhead of forking a new process adds to the total scan time. For more on subprocess-based execution, see gitlab.com/gitlab-org/gitlab/-/issues/430160.
Returns an instance of Gitlab::SecretDetection::Core::Response with the following structure:
status: One of the Core::Status values
results: [SecretDetection::Finding]
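As a rough illustration of how the payload shape and the two exclusion parameters could interact, consider the sketch below. It is not the gem's scan logic: the `ExclusionSketch` module, the `findings_for` helper, the finding hash shape, and the example pattern are all hypothetical, and it assumes raw value exclusions are compared against the matched value by exact equality.

```ruby
# Hypothetical illustration only; not the gem's implementation.
module ExclusionSketch
  # Each payload exposes `id` and `data`, as required by secrets_scan.
  Payload = Struct.new(:id, :data, keyword_init: true)

  module_function

  # patterns: Hash of rule ID => Regexp (patterns without capture groups).
  def findings_for(payloads, patterns, raw_value_exclusions: [], rule_exclusions: [])
    # Drop excluded rules before scanning so their patterns never run.
    active = patterns.reject { |rule_id, _regex| rule_exclusions.include?(rule_id) }

    payloads.flat_map do |payload|
      active.flat_map do |rule_id, regex|
        payload.data.scan(regex)
               .reject { |value| raw_value_exclusions.include?(value) }
               .map { |value| { payload_id: payload.id, rule_id: rule_id, value: value } }
      end
    end
  end
end

token_a = 'glpat-' + ('a' * 20)
token_b = 'glpat-' + ('b' * 20)
patterns = { 'gitlab_personal_access_token' => /glpat-[0-9a-zA-Z_-]{20}/ }
payloads = [ExclusionSketch::Payload.new(id: 'p1', data: "x=#{token_a}\ny=#{token_b}")]

all_findings = ExclusionSketch.findings_for(payloads, patterns)
without_b = ExclusionSketch.findings_for(payloads, patterns, raw_value_exclusions: [token_b])
without_rule = ExclusionSketch.findings_for(
  payloads, patterns, rule_exclusions: ['gitlab_personal_access_token']
)
```

In this sketch a rule exclusion removes a whole class of findings, while a raw value exclusion suppresses only one specific matched string.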
# File 'lib/gitlab/secret_detection/core/scanner.rb', line 83

def secrets_scan(
  payloads,
  timeout: DEFAULT_SCAN_TIMEOUT_SECS,
  payload_timeout: DEFAULT_PAYLOAD_TIMEOUT_SECS,
  raw_value_exclusions: [],
  rule_exclusions: [],
  tags: DEFAULT_PATTERN_MATCHER_TAGS,
  subprocess: RUN_IN_SUBPROCESS
)
  return Core::Response.new(Core::Status::INPUT_ERROR) unless validate_scan_input(payloads)

  # assign defaults since grpc passing zero timeout value to `Timeout.timeout(..)` makes it effectively useless.
  timeout = DEFAULT_SCAN_TIMEOUT_SECS unless timeout.positive?
  payload_timeout = DEFAULT_PAYLOAD_TIMEOUT_SECS unless payload_timeout.positive?
  tags = DEFAULT_PATTERN_MATCHER_TAGS if tags.empty?

  Timeout.timeout(timeout) do
    keyword_matcher = build_keyword_matcher(tags:)

    matched_payloads = filter_by_keywords(keyword_matcher, payloads)

    next Core::Response.new(Core::Status::NOT_FOUND) if matched_payloads.empty?

    scan_args = {
      payloads: matched_payloads,
      payload_timeout:,
      pattern_matcher: build_pattern_matcher(tags:),
      raw_value_exclusions:,
      rule_exclusions:
    }

    secrets = subprocess ? run_scan_within_subprocess(**scan_args) : run_scan(**scan_args)

    scan_status = overall_scan_status(secrets)

    Core::Response.new(scan_status, secrets)
  end
rescue Timeout::Error => e
  logger.error "Secret detection operation timed out: #{e}"

  Core::Response.new(Core::Status::SCAN_TIMEOUT)
end
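The timeout handling in the method above relies on a documented property of Ruby's standard library: `Timeout.timeout` runs the block with no time limit when given 0 or nil, which is why non-positive values are replaced with defaults first. A minimal standalone sketch of that pattern follows; the `TimeoutSketch` module, its helper names, and the `:scan_timeout` symbol are hypothetical stand-ins for the gem's own response objects.

```ruby
require 'timeout'

# Hypothetical illustration only; not the gem's implementation.
module TimeoutSketch
  module_function

  # Fall back to the default when the caller passes 0 or a negative value,
  # since Timeout.timeout(0) would run the block without any limit.
  def effective_timeout(requested, default)
    requested.positive? ? requested : default
  end

  # Run the block under a deadline and map an expiry to a status value,
  # mirroring the rescue clause in the method above.
  def run_with_deadline(seconds)
    Timeout.timeout(seconds) { yield }
  rescue Timeout::Error
    :scan_timeout
  end
end

fallback = TimeoutSketch.effective_timeout(0, 180)
fractional = TimeoutSketch.effective_timeout(0.5, 180)
timed_out = TimeoutSketch.run_with_deadline(0.05) { sleep 1; :done }
finished = TimeoutSketch.run_with_deadline(1) { :done }
```

Because the value of the block becomes the return value of `Timeout.timeout`, an early `next` inside the block (as used for the NOT_FOUND case) returns that value from the whole call.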