sas-linter
A configurable lint engine for SAS source files. It is built on the sas-lexer gem (a Ruby FFI binding to Misha Perlov's Rust sas-lexer) and ships with twelve pluggable rules covering structural defects, cosmetic issues, and source-header conventions.
Installation
Add to your Gemfile:
gem "sas-linter"
Or install directly:
gem install sas-linter
CLI usage
# Run every rule on a single file
bin/sas_lint path/to/source.sas
# List all registered rules with their description and autofix capability
bin/sas_lint --list-rules
# Run only specific rules
bin/sas_lint --rules malformed_if_condition,identical_if_else_branches src/*.sas
# Use a YAML config (default: config/lint.yaml)
bin/sas_lint --config my-lint.yaml src/*.sas
# Lint without applying any autofixes the config requested
bin/sas_lint --no-autofix src/*.sas
Exit codes: 0 clean, 1 findings, 2 invalid args.
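In a CI step, the three exit codes above can be mapped to a pass/fail decision. The following is a minimal sketch; the helper names (describe_lint_exit, fail_build?) and the tolerate_findings switch are hypothetical and not part of sas-linter.

```ruby
# Meanings come from the exit-code list above; the helper names are hypothetical.
EXIT_MEANINGS = { 0 => "clean", 1 => "findings", 2 => "invalid args" }.freeze

# Human-readable label for a sas_lint exit status.
def describe_lint_exit(status)
  EXIT_MEANINGS.fetch(status) { "unexpected exit status #{status}" }
end

# Findings (status 1) can be tolerated, e.g. while adopting the linter;
# any other non-zero status always fails the build.
def fail_build?(status, tolerate_findings: false)
  return false if status.zero?
  return !tolerate_findings if status == 1

  true
end

# Typical wiring (not executed here):
#   system("bin/sas_lint", *Dir["src/*.sas"])
#   status = $?.exitstatus
#   warn describe_lint_exit(status)
#   exit 1 if fail_build?(status)
```

Keeping status 2 fatal even when findings are tolerated means a mistyped flag cannot pass silently as a clean run.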
YAML config
The example below shows every rule, its options, and their defaults. Rules omitted from the config default to enabled with no options, so adding a new rule to the gem won't silently disable it for users with existing configs. To suppress a rule, list it with enabled: false.
rules:
  # ── Structural / semantic rules ─────────────────────────────────────
  unreachable_inner_branch_value:
    enabled: true                        # default for every rule
  identical_if_else_branches:
    enabled: true
  malformed_if_condition:
    enabled: true
  commented_out_guard:
    enabled: true
  choose_one_template:
    enabled: true
  missing_assignment_semicolon:
    enabled: true
    autofix: false                       # rule supports autofix; off by default
  variable_value_out_of_known_range:
    enabled: true
    csv_paths:                           # empty list = rule is a no-op
      - metadata/variables.csv
      - metadata/variables-extra.csv
    name_column: "Variable"              # default
    values_column: "Acceptable Values"   # default
    name_match: case_insensitive         # case_insensitive | exact
    delimiter: ","                       # CSV column separator: "," | ";" | "\t"

  # ── Source-hygiene rules (all support autofix) ──────────────────────
  trailing_whitespace:
    enabled: true
    autofix: false
  tab_expansion:
    enabled: true
    autofix: false
    width: 8                             # tab stop width
  source_headers:
    enabled: true
    autofix: false                       # rewrap **…**; 90-char header rows when true
  line_endings:
    enabled: true
    autofix: false                       # collapse \r\r\n → \r\n; lone \r → dominant ending
  encoding_issues:
    enabled: true
    autofix: false
    use_defaults: false                  # apply built-in smart-quote / em-dash / Win-1252 map
    replacements:                        # project-specific byte→ASCII rewrites (run BEFORE defaults)
      "—": "--"
      "\x85": "Ö"
enabled and autofix are accepted on every rule. Options not listed above are ignored.
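The defaulting behavior described above (omitted rules stay enabled, enabled defaults to true, enabled: false suppresses a rule) can be illustrated with a small sketch. The effective_settings helper is hypothetical and only models the documented semantics; it is not how the gem necessarily resolves its config.

```ruby
require "yaml"

# Hypothetical helper, not part of sas-linter: computes the effective settings
# for each registered rule id under the semantics described in this section.
def effective_settings(config, registered_ids)
  configured = (config || {})["rules"] || {}
  registered_ids.to_h do |id|
    entry = configured[id.to_s] || {}                  # omitted rule => no options
    options = entry.reject { |key, _| key == "enabled" }
    [id, { enabled: entry.fetch("enabled", true), options: options }]
  end
end

config = YAML.safe_load(<<~CONFIG)
  rules:
    trailing_whitespace:
      enabled: false
    tab_expansion:
      width: 4
CONFIG

settings = effective_settings(
  config, %i[trailing_whitespace tab_expansion line_endings]
)

settings.each do |id, s|
  puts "#{id}: enabled=#{s[:enabled]} options=#{s[:options]}"
end
```

Here line_endings is absent from the config yet still resolves to enabled, which is the property that keeps newly added rules active for existing configs.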
Library usage
require "yaml"        # needed for YAML.load_file below
require "sas_linter"
linter = SasLinter.new # all registered rules
linter = SasLinter.new(rules: [:malformed_if_condition]) # subset by rule id
linter = SasLinter.from_config(YAML.load_file("lint.yaml"))
findings = linter.lint(source_string, path: "demo.sas")
findings.each { |f| puts f.to_s } # path:line:col: [rule_id] message
# Lint a file. If any rule has autofix enabled and changed the source,
# the file is rewritten in place.
findings = linter.lint_file("path/to/source.sas")
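When findings have already been turned into text, the path:line:col: [rule_id] message format noted above can be parsed back into fields, for example to count findings per rule. This is a sketch only: parse_finding and count_by_rule are hypothetical helpers, and the sample messages are illustrative, not real linter output.

```ruby
# Hypothetical helpers: parse strings shaped like the documented
# "path:line:col: [rule_id] message" format.
FINDING_LINE = /\A(?<path>.+?):(?<line>\d+):(?<col>\d+): \[(?<rule_id>\w+)\] (?<message>.*)\z/

# Returns a hash of fields, or nil when the text does not match the format.
def parse_finding(text)
  m = FINDING_LINE.match(text.chomp)
  return nil unless m

  { path: m[:path], line: m[:line].to_i, column: m[:col].to_i,
    rule_id: m[:rule_id].to_sym, message: m[:message] }
end

# Counts parsed findings per rule id, skipping lines that do not match.
def count_by_rule(lines)
  lines.filter_map { |l| parse_finding(l) }.map { |f| f[:rule_id] }.tally
end

# Illustrative input; the message texts are made up for this sketch.
sample = [
  "demo.sas:3:1: [trailing_whitespace] sample message",
  "demo.sas:9:14: [malformed_if_condition] sample message",
  "other.sas:1:20: [trailing_whitespace] sample message",
  "not a finding line"
]

counts = count_by_rule(sample)
counts.each { |rule_id, n| puts "#{rule_id}: #{n}" }
```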
Built-in rules
| rule id | description |
|---|---|
| unreachable_inner_branch_value | Outer if VAR in (S) then do; guards an inner branch whose comparison values aren't all in S. |
| identical_if_else_branches | if COND then S; else S; with identical bodies, almost always a copy-paste error. |
| commented_out_guard | SAS line-comment * if ... then do; pattern indicating a disabled outer validity guard. |
| choose_one_template | ** CHOOSE ONE OF THE BELOW STATEMENTS; banner indicating a broken-by-default source. |
| trailing_whitespace | Trailing spaces/tabs at end of line. |
| tab_expansion | Tab characters that should be spaces (configurable width). |
| source_headers | Restore the **...**; 90-char header convention to broken sources. |
| line_endings | Mixed or non-CRLF line terminators (configurable target). |
| encoding_issues | Smart-quote / em-dash / Win-1252 byte sequences that confuse downstream tooling. |
| malformed_if_condition | Empty conditions, missing operators, orphan then, unbalanced parens, etc. |
| missing_assignment_semicolon | Assignment statements followed by an inline ** comment but no terminating ;. |
| variable_value_out_of_known_range | if VAR = N / if VAR in (...) literals fall outside the variable's documented acceptable values. Loads the catalog from one or more CSVs with configurable column names and column separator (",", ";", or tab). |
bin/sas_lint --list-rules prints the same set with autofix capability.
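For variable_value_out_of_known_range, the catalog options (name_column, values_column, name_match, delimiter) map naturally onto Ruby's CSV library. The sketch below shows one way such a catalog could be read; load_catalog and lookup are hypothetical names, the values cell is kept as a raw string because its internal format is not documented here, and the sample row is invented for illustration.

```ruby
require "csv"

# Hypothetical loader; the keyword names mirror the YAML options, the rest is assumed.
def load_catalog(csv_text, name_column: "Variable", values_column: "Acceptable Values",
                 name_match: :case_insensitive, delimiter: ",")
  catalog = {}
  CSV.parse(csv_text, headers: true, col_sep: delimiter).each do |row|
    name = row[name_column]
    next if name.nil? || name.strip.empty?   # skip rows without a variable name

    key = name_match == :case_insensitive ? name.strip.downcase : name.strip
    catalog[key] = row[values_column].to_s   # raw cell; parsing it is out of scope
  end
  catalog
end

# Looks a variable up with the same matching mode used when loading.
def lookup(catalog, variable, name_match: :case_insensitive)
  catalog[name_match == :case_insensitive ? variable.downcase : variable]
end

# Invented sample data using ";" as the column separator.
csv_text = "Variable;Acceptable Values\nSEX;1, 2\n"

catalog = load_catalog(csv_text, delimiter: ";")
exact   = load_catalog(csv_text, delimiter: ";", name_match: :exact)

puts lookup(catalog, "sex").inspect
puts lookup(exact, "sex", name_match: :exact).inspect
```

With several csv_paths, the per-file hashes could be combined with Hash#merge; which file wins on a duplicate name is a policy choice this sketch leaves open.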
Writing a custom rule
Subclass SasLinter::Rule, declare an id, description, and severity, then implement #check:
class MyRule < SasLinter::Rule
  rule_id     :my_rule
  description "Flag occurrences of FOO in DATA steps."
  severity    :warning

  def check(tokens, path:, all_tokens: nil, source: nil)
    findings = []
    tokens.each do |t|
      next unless t[:text] == "FOO"
      findings << finding(line: t[:start_line], column: t[:start_column],
                          message: "FOO is forbidden", path: path)
    end
    findings
  end
end
Subclasses self-register on the rule registry via rule_id — once required, they're picked up by SasLinter.new (no rule list) and resolvable via SasLinter::Rule.fetch(:my_rule).
To support autofix, override self.supports_autofix? to return true and implement #autofix(source) to return the rewritten source.
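As an illustration of the two autofix hooks, here is a sketch of a hypothetical rule that flags extra blank lines at the end of a file and removes them. Several things are assumptions: the rule itself (trailing_blank_lines) does not ship with the gem, the source: keyword is assumed to carry the full file contents, line numbers are assumed to be 1-based, and the stand-in base class exists only so the sketch runs without the gem installed.

```ruby
# Stand-in base class so the sketch runs without the gem. It is NOT the real
# SasLinter::Rule and is skipped whenever sas_linter is already loaded.
unless defined?(SasLinter)
  class SasLinter
    class Rule
      def self.rule_id(id); @rule_id = id; end
      def self.description(text); @description = text; end
      def self.severity(level); @severity = level; end
      def self.supports_autofix?; false; end

      def finding(line:, column:, message:, path:)
        { line: line, column: column, message: message, path: path }
      end
    end
  end
end

# Hypothetical rule, not shipped with sas-linter.
class TrailingBlankLines < SasLinter::Rule
  rule_id     :trailing_blank_lines
  description "Flag extra blank lines at the end of a file."
  severity    :warning

  # First line terminator (captured) followed by one or more blank lines at EOF.
  EXTRA_BLANK_LINES = /(\r?\n)(?:[ \t]*\r?\n)+\z/

  def self.supports_autofix?
    true
  end

  # Assumes source: holds the whole file; reports the first extra blank line.
  def check(tokens, path:, all_tokens: nil, source: nil)
    return [] unless source && (offset = source =~ EXTRA_BLANK_LINES)

    line = source[0...offset].count("\n") + 2   # 1-based line of first blank line
    [finding(line: line, column: 1,
             message: "extra blank lines at end of file", path: path)]
  end

  # Keeps the final line terminator as written and drops the blank lines after it.
  def autofix(source)
    source.sub(EXTRA_BLANK_LINES, '\1')
  end
end

rule = TrailingBlankLines.new
puts rule.autofix("data a;\nrun;\n\n\n").inspect
```

Sharing one pattern between #check and #autofix keeps detection and repair in agreement, so a fixed file produces no finding on the next run.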
Testing
bundle install
bundle exec rake spec
License
GNU Affero General Public License v3.0 or later — chosen to match the upstream sas-lexer gem (which sas-linter requires at runtime). © Mon Ami, Inc.
Practical implications:
- Internal / personal use has no obligations beyond preserving notices.
- Redistribution (shipping the gem inside a binary, container image, or product) requires offering the complete corresponding source under AGPL-3.0.
- Network use (running sas-linter as a backend that users interact with remotely) triggers the AGPL's source-disclosure clause for those network users.
- Combined works with sas-linter must be licensed under AGPL-compatible terms.
If those terms don't fit your use case, run a standalone lint job (CLI / CI step) instead of embedding the linter in a redistributed product.