Class: SasLinter::Rules::InconsistentVariableCase

Inherits:
SasLinter::Rule
Defined in:
lib/sas_linter/rules/inconsistent_variable_case.rb

Overview

Flag identifiers that are spelled with inconsistent letter case across the file. SAS resolves variable references case-insensitively, so `myVar` and `MyVar` end up bound to the same column, but mixing the two within one program is sloppy and makes the source harder to grep, diff, and read.

The most-used spelling wins; every other casing is reported (and rewritten when autofix is on). Ties resolve to the first occurrence so the canonical form is reading-order deterministic.
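The selection rule above can be sketched in a few lines. This is a minimal illustration, not the rule's actual code: `canonical_spellings` is a hypothetical helper that works on a plain array of identifier strings rather than on lexer tokens.

```ruby
# Hypothetical helper, not part of SasLinter: given identifier spellings in
# reading order, map each case-insensitive name to its canonical spelling.
def canonical_spellings(names)
  counts = Hash.new(0)   # exact spelling => number of uses
  first_seen = {}        # exact spelling => index of first use

  names.each_with_index do |name, i|
    counts[name] += 1
    first_seen[name] ||= i
  end

  counts.keys.group_by(&:downcase).transform_values do |spellings|
    # Highest count wins; on a tie, the earliest first occurrence wins.
    spellings.min_by { |s| [-counts[s], first_seen[s]] }
  end
end

result = canonical_spellings(%w[myVar MyVar myVar AGE age])
# "myvar" maps to "myVar" (2 uses vs 1); "age" maps to "AGE" (tie, seen first).
```

Sorting on the pair `[-count, first_index]` keeps the tie-break explicit, so the result does not depend on hash iteration order.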

Skipped on purpose:

* identifiers immediately followed by `.` (format references like
  `agecat.`, library references like `work.foo`);
* identifiers immediately preceded by `.` (the column half of
  `lib.member` / `dataset.col`) — those name a column in another
  dataset, not a variable in the current step;
* `value` / `invalue` / `picture` themselves and the format name
  directly following them — these are proc-format definitions,
  not variable references. We match locally rather than tracking
  a `proc format ... run;` block because real-world SAS files
  meant to be `%include`d into a caller's data step often omit
  the terminating `run;`, so a state machine would never close.
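The skip conditions above only look at a token's immediate neighbours, which a small predicate can illustrate. This is a sketch under assumptions: `skipped?` and `FORMAT_KEYWORDS_SKETCH` are hypothetical names, and the input is a flat array of token texts with whitespace and comments already removed, not the lexer's real token hashes.

```ruby
# Hypothetical sketch of the local skip checks; not the rule's real code.
FORMAT_KEYWORDS_SKETCH = %w[value invalue picture].freeze

# texts: array of token texts (identifiers and punctuation), in order.
# i:     index of the identifier being considered.
def skipped?(texts, i)
  current = texts[i]
  prev = i.positive? ? texts[i - 1] : nil
  nxt = texts[i + 1]

  return true if nxt == "."    # `agecat.` or the `work` in `work.foo`
  return true if prev == "."   # the `foo` in `work.foo`
  # `value` / `invalue` / `picture` themselves ...
  return true if FORMAT_KEYWORDS_SKETCH.include?(current.downcase)
  # ... and the format name directly following them.
  return true if prev && FORMAT_KEYWORDS_SKETCH.include?(prev.downcase)

  false
end

texts = %w[value agecat low - high = old ; set work . foo ; x = Age ;]
skipped?(texts, 0)   # true:  the `value` keyword
skipped?(texts, 1)   # true:  format name after `value`
skipped?(texts, 9)   # true:  `work` is followed by `.`
skipped?(texts, 11)  # true:  `foo` is preceded by `.`
skipped?(texts, 15)  # false: `Age` is an ordinary variable reference
```

Because every check is local, the predicate needs no knowledge of where a `proc format` step begins or ends. In this sketch, the cost is that a variable that happens to be named `value` is skipped as well.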

Constant Summary

TT =
SasLexer::Lexer::TokenType
FORMAT_DEF_KEYWORDS =

Identifiers that introduce a format / informat / picture definition in a `proc format` step. The lexer types these as plain IDENTIFIERs (not keywords), so we recognize them by text.

%w[value invalue picture].freeze

Class Method Summary

Instance Method Summary

Methods inherited from SasLinter::Rule

all, #autofix?, description, fetch, from_config, inherited, #initialize, register, registry, rule_id, severity

Constructor Details

This class inherits a constructor from SasLinter::Rule

Class Method Details

.supports_autofix? ⇒ Boolean

Returns:

  • (Boolean)


43
44
45
# File 'lib/sas_linter/rules/inconsistent_variable_case.rb', line 43

def self.supports_autofix?
  true
end

Instance Method Details

#autofix(source) ⇒ Object



# File 'lib/sas_linter/rules/inconsistent_variable_case.rb', line 61

def autofix(source)
  return source if source.nil? || source.empty?

  # If a previous rule's autofix returned ASCII-8BIT (e.g.
  # EncodingIssues#autofix walks bytes and returns binary), tag
  # it UTF-8 before slicing. The lexer treats the bytes as UTF-8
  # and reports character offsets either way; only Ruby's
  # `String#[]=` cares about the encoding label, and it indexes
  # by bytes for ASCII-8BIT but by characters for UTF-8 — so a
  # binary tag plus any multi-byte sequence earlier in the file
  # would shift every replacement by the byte/char gap.
  src = source.encoding == Encoding::UTF_8 ? source : source.dup.force_encoding("UTF-8")

  lexer = SasLexer::Lexer.new
  begin
    all_tokens = lexer.tokenize(src)
  ensure
    lexer.free
  end
  tokens = all_tokens.reject do |t|
    t[:channel] == SasLexer::Lexer::TokenChannel::HIDDEN ||
      t[:channel] == SasLexer::Lexer::TokenChannel::COMMENT
  end

  edits = []
  each_inconsistent_use(tokens) do |token, canonical|
    edits << [token[:start], token[:end], canonical]
  end

  # Apply right-to-left so earlier offsets stay valid.
  out = src.dup
  edits.sort_by! { |start, _, _| -start }
  edits.each { |start, finish, repl| out[start...finish] = repl }
  out
end
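The encoding comment in `#autofix` describes a byte/character mismatch that is easy to reproduce with plain Ruby strings. The sketch below uses a made-up one-line SAS snippet and a hand-computed offset in place of lexer output; it shows only standard `String` behaviour, not SasLinter code.

```ruby
# A made-up source line; the "é" in the comment occupies two bytes in UTF-8.
src = "/* café */ x = Age;".dup.force_encoding(Encoding::UTF_8)
bin = src.dup.force_encoding(Encoding::ASCII_8BIT)  # same bytes, binary label

# Character offset of `Age`, standing in for what the lexer would report.
start  = src.index("Age")
finish = start + 3

# UTF-8 label: String#[]= indexes by character, so the edit lands on `Age`.
good = src.dup
good[start...finish] = "age"

# Binary label: String#[]= indexes by byte, so the same range starts one
# byte early (the extra byte of "é") and overwrites the wrong span.
bad = bin.dup
bad[start...finish] = "age"

# Retagging first, as #autofix does, restores character indexing.
fixed = bin.dup.force_encoding("UTF-8")
fixed[start...finish] = "age"
```

Here `good` and `fixed` end in `x = age;`, while `bad` ends in `x =agee;`: the replacement has shifted left by the one-byte gap between the byte and character offsets.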

#check(tokens, path:, all_tokens: nil, source: nil) ⇒ Object




# File 'lib/sas_linter/rules/inconsistent_variable_case.rb', line 47

def check(tokens, path:, all_tokens: nil, source: nil) # rubocop:disable Lint/UnusedMethodArgument
  findings = []
  each_inconsistent_use(tokens) do |token, canonical|
    findings << finding(
      line: token[:start_line],
      column: token[:start_column] + 1,
      message: "variable `#{token[:text]}` is spelled `#{canonical}` " \
               "elsewhere in this file — pick one case and stick with it.",
      path: path
    )
  end
  findings
end