Module: StillActive::LockfileDependencyParser

Extended by:
LockfileDependencyParser
Included in:
LockfileDependencyParser
Defined in:
lib/helpers/lockfile_dependency_parser.rb

Overview

Side-effect-free Gemfile.lock parser: extracts each top-level dependency’s name, version, and source (type + URI) straight from the lockfile text.

We deliberately do NOT load the Gemfile (evaluating it is arbitrary code execution when the audited project is untrusted, e.g. CI on a pull request) and do NOT use Bundler::LockfileParser. The latter is not side-effect-free: a `PLUGIN SOURCE` block runs `Bundler::Plugin.from_lock` at parse time, which resolves against the on-disk plugin registry and can raise or activate an installed plugin. (Bundler’s own `@gemfile_parse` guard that neutralizes this is set only inside `Bundler::Plugin.gemfile_install`, never for a standalone parse.) This mirrors what OSV-Scanner and Trivy do for the same threat model, and what `LockfileIndexer` already does here. Refs #37.

Defined Under Namespace

Classes: Spec

Constant Summary

SOURCE_TYPES =

Lockfile source blocks and the source_type each maps to. PLUGIN SOURCE is recognized as a block (so its lines are consumed as inert data, not mis-read as specs) but yields no auditable gems.

{ "GEM" => :rubygems, "GIT" => :git, "PATH" => :path }.freeze
PLUGIN_SOURCE =
"PLUGIN SOURCE"
SECTION_HEADER =

A section header sits at column 0; Bundler emits them in SCREAMING form.

/\A[A-Z]/
SPEC_LINE =

A top-level spec is indented exactly 4 spaces: `    name (1.2.3)` or, for a platform gem, `    name (1.2.3-x86_64-linux)`. Nested deps (6 spaces) and the `specs:`/`remote:` option lines (2 spaces) do not match. We do NOT anchor the end of the line: Bundler’s grammar allows an optional trailing checksum on a spec line, and an audit tool must never silently drop a gem because of unexpected trailing content (that would be a false-negative evasion on a hand-crafted lockfile).

/\A {4}(\S+) \(([^-)]+)(?:-[^)]*)?\)/
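
As a quick check of the behavior described above, the sketch below runs a local copy of the `SPEC_LINE` pattern against a few lockfile-style lines. The sample lines, including the gem names, versions, and the trailing `sha256=...` text, are invented for illustration and are not taken from a real lockfile.

```ruby
# Local copy of the SPEC_LINE pattern documented above.
spec_line = /\A {4}(\S+) \(([^-)]+)(?:-[^)]*)?\)/

# Hypothetical sample lines (names, versions, and checksum text are invented).
plain    = "    rake (13.2.1)"                    # 4 spaces: top-level spec
platform = "    nokogiri (1.16.5-x86_64-linux)"   # 4 spaces: platform gem
trailing = "    rake (13.2.1) sha256=0123abcd"    # 4 spaces plus trailing content
nested   = "      rack (>= 2.0)"                  # 6 spaces: nested dependency
option   = "  specs:"                             # 2 spaces: option line

plain_match    = spec_line.match(plain)
platform_match = spec_line.match(platform)
trailing_match = spec_line.match(trailing)

# Capture 1 is the gem name, capture 2 is the version without the platform suffix.
puts plain_match[1]     # rake
puts plain_match[2]     # 13.2.1
puts platform_match[2]  # 1.16.5
puts trailing_match[2]  # 13.2.1

# Lines at other indentation levels are not treated as top-level specs.
puts spec_line.match?(nested)  # false
puts spec_line.match?(option)  # false
```

Because the pattern has no end anchor, the line with trailing content still yields the same name and version as the plain line.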
REMOTE_LINE =
/\A {2}remote: (.+)\z/
NESTED_DEP_LINE =

A spec’s nested runtime dep is indented exactly 6 spaces: `      name` or `      name (~> 1.0)`.

/\A {6}([^\s(!]+)/
DEPENDENCY_LINE =

A DEPENDENCIES entry is indented 2 spaces: `  name`, `  name (~> 1.0)`, or `  name!` (the `!` marks a pinned git/path source).

/\A {2}([^\s(!]+)/
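
The two name-extracting patterns differ only in indentation, which a short sketch can make concrete. It uses local copies of `NESTED_DEP_LINE` and `DEPENDENCY_LINE`; the gem names and requirements in the sample lines are hypothetical.

```ruby
# Local copies of the patterns documented above.
nested_dep_line = /\A {6}([^\s(!]+)/
dependency_line = /\A {2}([^\s(!]+)/

# Hypothetical DEPENDENCIES entries (2-space indent).
bare        = "  rake"
constrained = "  rails (~> 7.1)"
pinned      = "  my_private_gem!"

# Capture 1 stops at whitespace, "(" or "!", so only the bare name is kept.
puts dependency_line.match(bare)[1]         # rake
puts dependency_line.match(constrained)[1]  # rails
puts dependency_line.match(pinned)[1]       # my_private_gem

# Hypothetical nested runtime dependency (6-space indent).
nested = "      rack (>= 2.0)"
puts nested_dep_line.match(nested)[1]       # rack

# A 4-space spec line is neither a DEPENDENCIES entry nor a nested dep.
spec = "    rake (13.2.1)"
puts dependency_line.match?(spec)  # false
puts nested_dep_line.match?(spec)  # false
```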

Instance Method Summary

Instance Method Details

#parse(content) ⇒ Object

Parses lockfile text into { specs:, direct:, plugin_source? }. `specs` is every locked top-level spec (a Spec per gem); `direct` is the names from the DEPENDENCIES section; `plugin_source?` flags that a PLUGIN SOURCE block was present (and skipped).



# File 'lib/helpers/lockfile_dependency_parser.rb', line 53

def parse(content)
  # A hand-edited or re-encoded lockfile can carry a leading UTF-8 BOM.
  # Section headers anchor at column 0 (\A), so a BOM glued to "GEM" would
  # drop the entire first block: a silent false-negative, the exact evasion
  # this parser is written to avoid.
  content = content.delete_prefix("\uFEFF") # strip a leading UTF-8 BOM, written as an explicit escape
  specs = []
  direct = []
  section = nil
  source_type = nil
  remote = nil
  current_spec = nil
  plugin_source = false

  content.each_line do |raw|
    line = raw.chomp

    if line.match?(SECTION_HEADER)
      section = line
      source_type = SOURCE_TYPES[line]
      remote = nil
      current_spec = nil
      plugin_source ||= (line == PLUGIN_SOURCE)
      next
    end

    case section
    when "GEM", "GIT", "PATH"
      if (m = REMOTE_LINE.match(line))
        remote ||= m[1] # first remote wins, matching Bundler's remotes.first
      elsif (m = SPEC_LINE.match(line))
        current_spec = Spec.new(name: m[1], version: m[2], source_type: source_type, source_uri: remote, dependencies: [])
        specs << current_spec
      elsif current_spec && (m = NESTED_DEP_LINE.match(line))
        current_spec.dependencies << m[1]
      end
    when "DEPENDENCIES"
      if (m = DEPENDENCY_LINE.match(line))
        direct << m[1]
      end
    end
  end

  { specs: specs, direct: direct, plugin_source?: plugin_source }
end
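
The BOM comment at the top of `parse` can be illustrated in isolation. This is a minimal sketch, assuming the BOM is the single code point U+FEFF at the start of a UTF-8 string; it uses a local copy of the `SECTION_HEADER` pattern and an invented two-line lockfile fragment.

```ruby
# Local copy of the SECTION_HEADER pattern documented above.
section_header = /\A[A-Z]/

bom = "\uFEFF"

# Hypothetical lockfile fragment with a BOM glued to the first header.
raw = "#{bom}GEM\n  remote: https://rubygems.org/\n"

# With the BOM in place, the first character is not A-Z, so the header is missed.
with_bom_header = raw.each_line.first.chomp
puts with_bom_header.match?(section_header)  # false

# Stripping the prefix first restores the header at column 0.
cleaned = raw.delete_prefix(bom)
cleaned_header = cleaned.each_line.first.chomp
puts cleaned_header                          # GEM
puts cleaned_header.match?(section_header)   # true

# delete_prefix is a no-op when there is no BOM, so it is safe to apply always.
puts "GEM\n".delete_prefix(bom) == "GEM\n"   # true
```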