Class: HTM::SqlBuilder
- Inherits:
-
Object
- Object
- HTM::SqlBuilder
- Defined in:
- lib/htm/sql_builder.rb
Overview
SQL building utilities for constructing safe, parameterized queries
Provides class methods for building SQL conditions for:
-
Timeframe filtering (single range or multiple ranges)
-
Metadata filtering (JSONB containment)
-
Embedding sanitization and padding (SQL injection prevention)
-
LIKE pattern sanitization (wildcard injection prevention)
All methods use proper escaping and parameterization to prevent SQL injection.
Constant Summary collapse
- MAX_EMBEDDING_DIMENSION =
Maximum embedding dimension supported by pgvector with HNSW index
2000
Class Method Summary collapse
-
.apply_metadata(scope, metadata, column: "metadata") ⇒ Sequel::Dataset
Apply metadata filter to Sequel dataset.
-
.apply_timeframe(scope, timeframe, column: :created_at) ⇒ Sequel::Dataset
Apply timeframe filter to Sequel dataset.
-
.metadata_condition(metadata, table_alias: nil, column: "metadata") ⇒ String?
Build SQL condition for metadata filtering (JSONB containment).
-
.pad_embedding(embedding, target_dimension: MAX_EMBEDDING_DIMENSION) ⇒ Array<Numeric>
Pad embedding to target dimension.
-
.sanitize_embedding(embedding) ⇒ String
Sanitize embedding for SQL use.
-
.sanitize_like_pattern(pattern) ⇒ String
Sanitize a string for use in SQL LIKE patterns.
-
.timeframe_condition(timeframe, table_alias: nil, column: "created_at") ⇒ String?
Build SQL condition for timeframe filtering.
Class Method Details
.apply_metadata(scope, metadata, column: "metadata") ⇒ Sequel::Dataset
Apply metadata filter to Sequel dataset
169 170 171 172 173 |
# File 'lib/htm/sql_builder.rb', line 169 def (scope, , column: "metadata") return scope if .nil? || .empty? scope.where(Sequel.lit("#{column} @> ?::jsonb", .to_json)) end |
.apply_timeframe(scope, timeframe, column: :created_at) ⇒ Sequel::Dataset
Apply timeframe filter to Sequel dataset
131 132 133 134 135 136 137 138 139 140 141 142 143 |
# File 'lib/htm/sql_builder.rb', line 131 def apply_timeframe(scope, timeframe, column: :created_at) return scope if timeframe.nil? case timeframe when Range scope.where(column => timeframe) when Array conditions = timeframe.map { |range| Sequel.expr(column => range) } scope.where(Sequel.|(*conditions)) else scope end end |
.metadata_condition(metadata, table_alias: nil, column: "metadata") ⇒ String?
Build SQL condition for metadata filtering (JSONB containment)
152 153 154 155 156 157 158 159 160 |
# File 'lib/htm/sql_builder.rb', line 152 def (, table_alias: nil, column: "metadata") return nil if .nil? || .empty? prefix = table_alias ? "#{table_alias}." : "" full_column = "#{prefix}#{column}" = HTM.db.literal(.to_json) "(#{full_column} @> #{}::jsonb)" end |
.pad_embedding(embedding, target_dimension: MAX_EMBEDDING_DIMENSION) ⇒ Array<Numeric>
Pad embedding to target dimension
Pads embedding with zeros to reach the target dimension for pgvector compatibility.
77 78 79 80 81 |
# File 'lib/htm/sql_builder.rb', line 77 def (, target_dimension: MAX_EMBEDDING_DIMENSION) return if .length >= target_dimension + Array.new(target_dimension - .length, 0.0) end |
.sanitize_embedding(embedding) ⇒ String
Sanitize embedding for SQL use
Validates that all values are numeric and converts to safe PostgreSQL vector format. This prevents SQL injection by ensuring only valid numeric values are included.
44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 |
# File 'lib/htm/sql_builder.rb', line 44 def () unless .is_a?(Array) raise ArgumentError, "Embedding must be an Array, got #{.class}" end if .empty? raise ArgumentError, "Embedding cannot be empty" end # Find invalid values for detailed error message invalid_indices = [] .each_with_index do |v, i| unless v.is_a?(Numeric) && v.respond_to?(:finite?) && v.finite? invalid_indices << i end end unless invalid_indices.empty? sample = invalid_indices.first(5).map { |i| "index #{i}: #{[i].inspect}" }.join(", ") raise ArgumentError, "Embedding contains invalid values at #{sample}" end "[#{.map(&:to_f).join(',')}]" end |
.sanitize_like_pattern(pattern) ⇒ String
Sanitize a string for use in SQL LIKE patterns
Escapes SQL LIKE wildcards (% and _) to prevent pattern injection.
90 91 92 93 94 |
# File 'lib/htm/sql_builder.rb', line 90 def sanitize_like_pattern(pattern) return "" if pattern.nil? pattern.to_s.gsub(/[%_\\]/) { |match| "\\#{match}" } end |
.timeframe_condition(timeframe, table_alias: nil, column: "created_at") ⇒ String?
Build SQL condition for timeframe filtering
103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 |
# File 'lib/htm/sql_builder.rb', line 103 def timeframe_condition(timeframe, table_alias: nil, column: "created_at") return nil if timeframe.nil? prefix = table_alias ? "#{table_alias}." : "" full_column = "#{prefix}#{column}" case timeframe when Range begin_quoted = HTM.db.literal(timeframe.begin.iso8601) end_quoted = HTM.db.literal(timeframe.end.iso8601) "(#{full_column} BETWEEN #{begin_quoted} AND #{end_quoted})" when Array conditions = timeframe.map do |range| begin_quoted = HTM.db.literal(range.begin.iso8601) end_quoted = HTM.db.literal(range.end.iso8601) "(#{full_column} BETWEEN #{begin_quoted} AND #{end_quoted})" end "(#{conditions.join(' OR ')})" end end |