Module: Woods::Console::SqlNoiseStripper

Defined in:
lib/woods/console/sql_noise_stripper.rb

Overview

Strips SQL comments and string literals from a SQL string so that downstream checks (keyword scanning, table scanning) are not confused by content embedded inside comments or literals.

This is a shared utility used by SqlValidator and TableGate to avoid duplicating comment- and literal-stripping logic. All methods are module-level and stateless — pass a SQL string in, receive a stripped string out.
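As a rough sketch of why the stripping matters, the snippet below runs a hypothetical keyword check over stripped SQL. `NoiseStripperSketch` and `write_keyword?` are illustrative names, not part of Woods. The module body is a stand-in rebuilt from the patterns listed under Constant Summary, the keyword list is made up, and the order of the two calls (comments first, then literals) is an assumption rather than documented behavior of SqlValidator or TableGate.

```ruby
# Hypothetical stand-in for the real module, so the sketch runs on its own.
module NoiseStripperSketch
  LINE_COMMENT  = /--[^\n]*/
  BLOCK_COMMENT = %r{/\*.*?\*/}m
  SINGLE_QUOTED = /'(?:''|[^'])*'/m # PostgreSQL-style quoting only

  def self.strip_comments(sql)
    sql.gsub(LINE_COMMENT, '').gsub(BLOCK_COMMENT, '')
  end

  def self.strip_literals(sql)
    sql.gsub(SINGLE_QUOTED, "''")
  end

  # Hypothetical downstream check: is a write keyword present once
  # comments and literals have been stripped? (Call order is an assumption.)
  def self.write_keyword?(sql)
    cleaned = strip_literals(strip_comments(sql))
    cleaned.match?(/\b(?:INSERT|UPDATE|DELETE|DROP)\b/i)
  end
end

harmless = "SELECT 'DROP TABLE users' AS note /* DELETE later */ FROM t"
harmful  = "DELETE FROM t WHERE note = 'SELECT'"

puts NoiseStripperSketch.write_keyword?(harmless) # prints false
puts NoiseStripperSketch.write_keyword?(harmful)  # prints true
```

Without the stripping step, the first statement would be flagged because of keywords that appear only inside a literal and a comment.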

Examples:

Strip comments only

SqlNoiseStripper.strip_comments("SELECT 1 -- pick one\nFROM t")
# => "SELECT 1 \nFROM t"

Strip literals (PostgreSQL dialect)

SqlNoiseStripper.strip_literals("SELECT 'it''s ok' FROM t")
# => "SELECT '' FROM t"

Strip literals (MySQL dialect — backslash escapes)

SqlNoiseStripper.strip_literals("SELECT 'it\\'s ok' FROM t", dialect: :mysql)
# => "SELECT '' FROM t"

Constant Summary

LINE_COMMENT =
/--[^\n]*/
BLOCK_COMMENT =
%r{/\*.*?\*/}m

Used by .strip_comments, which strips SQL line comments (`-- ...`) and block comments (`/* ... */`). A line comment is removed up to, but not including, the newline, so that newline-separated statement structure is preserved for callers that check for multiple statements.

Block comments are treated as non-nested: a comment ends at the first `*/`. Some engines (PostgreSQL, for example) do accept nested block comments; for such input only the span up to the first `*/` is removed.

Returns:

  • (String)

    a new string with all SQL comments removed
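The two comment behaviors described here (the newline after a line comment is kept, and block comments do not nest) can be checked with a small sketch. `CommentPatternSketch` is a hypothetical wrapper that copies the two patterns and the method body shown on this page. It is not the real module.

```ruby
# Hypothetical wrapper around the two documented comment patterns.
module CommentPatternSketch
  LINE_COMMENT  = /--[^\n]*/
  BLOCK_COMMENT = %r{/\*.*?\*/}m

  def self.strip_comments(sql)
    sql.gsub(LINE_COMMENT, '').gsub(BLOCK_COMMENT, '')
  end
end

# The line comment is removed, but the newline between statements survives.
p CommentPatternSketch.strip_comments("SELECT 1; -- first\nSELECT 2;")
# prints "SELECT 1; \nSELECT 2;"

# The lazy block pattern stops at the first "*/", so a nested comment
# leaves its tail behind.
p CommentPatternSketch.strip_comments("SELECT /* outer /* inner */ tail */ 1")
# prints "SELECT  tail */ 1"
```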
DOLLAR_QUOTED =
/\$(\w*)\$.*?\$\1\$/m
SINGLE_QUOTED_POSTGRES =
/'(?:''|[^'])*'/m
SINGLE_QUOTED_MYSQL =
/'(?:\\.|''|[^'])*'/m

Used by .strip_literals, which strips single-quoted string literals and PostgreSQL dollar-quoted string literals from a SQL string, replacing each with an empty `''` placeholder so that the structure of the SQL is maintained for subsequent checks. The single-quote pattern depends on the dialect: `:mysql` additionally treats a backslash as an escape character. In the source shown under Class Method Details, the dollar-quoted pattern is applied regardless of the dialect argument.

Dollar-quoted strings are stripped before single-quoted strings so that stray apostrophes inside a dollar-quoted body do not confuse the single-quote scanner.

Returns:

  • (String)

    a new string with all string literals replaced by `''`

Raises:

  • (ArgumentError)

    if an unsupported dialect is provided
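To illustrate why the stripping order and the dialect matter, the sketch below applies the three literal patterns directly. `LiteralPatternSketch`, `dollar_first`, and `single_first` are hypothetical names introduced for the comparison. Only the regular expressions are taken from this page.

```ruby
# Hypothetical wrapper around the three documented literal patterns.
module LiteralPatternSketch
  DOLLAR_QUOTED          = /\$(\w*)\$.*?\$\1\$/m
  SINGLE_QUOTED_POSTGRES = /'(?:''|[^'])*'/m
  SINGLE_QUOTED_MYSQL    = /'(?:\\.|''|[^'])*'/m

  # Documented order: dollar-quoted bodies first, then single-quoted literals.
  def self.dollar_first(sql)
    sql.gsub(DOLLAR_QUOTED, "''").gsub(SINGLE_QUOTED_POSTGRES, "''")
  end

  # Reversed order, for comparison only.
  def self.single_first(sql)
    sql.gsub(SINGLE_QUOTED_POSTGRES, "''").gsub(DOLLAR_QUOTED, "''")
  end
end

sql = "SELECT $tag$it's here$tag$, 'x' FROM t"
p LiteralPatternSketch.dollar_first(sql) # prints "SELECT '', '' FROM t"
# Reversed order: the apostrophe in "it's" opens a bogus literal,
# so the result is mangled instead of cleanly stripped.
p LiteralPatternSketch.single_first(sql)

# A backslash-escaped quote is only understood by the MySQL pattern.
escaped = "SELECT 'it\\'s ok' FROM t"
p escaped.gsub(LiteralPatternSketch::SINGLE_QUOTED_MYSQL, "''")    # prints "SELECT '' FROM t"
p escaped.gsub(LiteralPatternSketch::SINGLE_QUOTED_POSTGRES, "''") # prints "SELECT ''s ok' FROM t"
```

The last line shows what happens when MySQL-style input is stripped with the PostgreSQL pattern: the literal ends at the escaped quote and part of its body leaks into the output.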

Class Method Summary

  • .strip_comments(sql) ⇒ Object
  • .strip_literals(sql, dialect: :postgres) ⇒ Object

Class Method Details

.strip_comments(sql) ⇒ Object

# File 'lib/woods/console/sql_noise_stripper.rb', line 41

def self.strip_comments(sql)
  out = sql.gsub(LINE_COMMENT, '')
  out.gsub(BLOCK_COMMENT, '')
end

.strip_literals(sql, dialect: :postgres) ⇒ Object

# File 'lib/woods/console/sql_noise_stripper.rb', line 73

def self.strip_literals(sql, dialect: :postgres)
  unless SUPPORTED_DIALECTS.include?(dialect)
    raise ArgumentError, "Unknown dialect #{dialect.inspect}. Supported: #{SUPPORTED_DIALECTS.inspect}"
  end

  # Strip dollar-quoted strings first so stray apostrophes inside them
  # do not interfere with the single-quote scanner.
  out = sql.gsub(DOLLAR_QUOTED, "''")

  pattern = dialect == :mysql ? SINGLE_QUOTED_MYSQL : SINGLE_QUOTED_POSTGRES
  out.gsub(pattern, "''")
end
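
A caller might handle the dialect check along the following lines. This is a sketch only: `DialectGuardSketch` is a hypothetical copy of the method above, and the value of `SUPPORTED_DIALECTS` is an assumption, since the constant is referenced in the source but its definition does not appear on this page.

```ruby
# Hypothetical copy of the documented method, so the sketch runs on its own.
module DialectGuardSketch
  # Assumption: the real value of SUPPORTED_DIALECTS is not shown on this page.
  SUPPORTED_DIALECTS     = %i[postgres mysql].freeze
  DOLLAR_QUOTED          = /\$(\w*)\$.*?\$\1\$/m
  SINGLE_QUOTED_POSTGRES = /'(?:''|[^'])*'/m
  SINGLE_QUOTED_MYSQL    = /'(?:\\.|''|[^'])*'/m

  def self.strip_literals(sql, dialect: :postgres)
    unless SUPPORTED_DIALECTS.include?(dialect)
      raise ArgumentError, "Unknown dialect #{dialect.inspect}. Supported: #{SUPPORTED_DIALECTS.inspect}"
    end

    out = sql.gsub(DOLLAR_QUOTED, "''")
    pattern = dialect == :mysql ? SINGLE_QUOTED_MYSQL : SINGLE_QUOTED_POSTGRES
    out.gsub(pattern, "''")
  end
end

# The default dialect is :postgres.
p DialectGuardSketch.strip_literals("SELECT 'a' FROM t") # prints "SELECT '' FROM t"

# An unsupported dialect raises before any stripping happens.
begin
  DialectGuardSketch.strip_literals("SELECT 'a' FROM t", dialect: :sqlite)
rescue ArgumentError => e
  puts e.message
  # prints: Unknown dialect :sqlite. Supported: [:postgres, :mysql]
end
```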