Class: Exwiw::Adapter::Base
- Inherits:
- Object
- Object
- Exwiw::Adapter::Base
- Defined in:
- lib/exwiw/adapter.rb
Direct Known Subclasses
MongodbAdapter, MysqlAdapter, PostgresqlAdapter, SqliteAdapter
Instance Attribute Summary
-
#connection_config ⇒ Object
readonly
Returns the value of attribute connection_config.
Class Method Summary
-
.table_config_class ⇒ Object
The config class that this adapter consumes.
Instance Method Summary
-
#build_query(table, dump_target, table_by_name) ⇒ Object
Adapter-specific query object (e.g. Exwiw::QueryAst::Select for SQL).
-
#commented_sql(query_ast) ⇒ Object
Comment-prefixed SELECT.
-
#default_bulk_insert_chunk_size ⇒ Object
Default bulk-insert chunk size when a table config does not set one.
-
#describe_query(query_ast) ⇒ Object
One-line, human-readable description of the extraction query, used by the Runner in error messages so a failure during INSERT/COPY generation (or query execution) can be traced back to the query that produced the data.
-
#dump_schema(ordered_tables, output_path) ⇒ Object
Write the leading schema-creation file for this adapter to `output_path`.
-
#dumpable?(_config) ⇒ Boolean
Whether the given config produces its own dump output and needs an independent processing pass.
-
#explain(_query_ast) ⇒ Object
Run the database-specific EXPLAIN for the given query and return the output as a single string for the `explain` subcommand to print.
-
#initialize(connection_config, logger) ⇒ Base
constructor
A new instance of Base.
-
#output_extension ⇒ Object
File extension used for dump output (e.g. ‘sql’ for SQL, ‘jsonl’ for MongoDB).
-
#post_insert_sql(_table) ⇒ Object
Optional SQL appended to the per-table insert-NNN-<table>.* file after the bulk INSERT statements.
-
#pre_insert_sql(_table) ⇒ Object
Optional SQL prepended to the per-table insert-NNN-<table>.* file before the bulk INSERT/COPY statements.
-
#query_comment_text(label = nil) ⇒ Object
Identifier text prepended to every query exwiw sends to the (often production) source DB, so the statement is recognizable in the processlist / slow-query log / db.currentOp() and can be killed if it runs long.
-
#schema_output_extension ⇒ Object
File extension used for the leading `insert-000-schema.*` file.
-
#sql_query_comment(query_ast) ⇒ Object
SQL block-comment form, prefixed to SELECT / EXPLAIN.
-
#supports_bulk_delete? ⇒ Boolean
Whether this adapter emits delete-NNN-*.sql files.
- #to_copy_from_stdin(_results, _table) ⇒ Object
-
#validate_as_dump_target!(_config) ⇒ Object
Hook for adapter-specific validation when this config is used as the dump_target.
-
#write_inserts(io, results, table, chunk_size) ⇒ Array(Integer, Integer)
Write the bulk INSERT/JSONL output for `results` to the open `io`, returning the statement count and the record count.
Constructor Details
#initialize(connection_config, logger) ⇒ Base
Returns a new instance of Base.
# File 'lib/exwiw/adapter.rb', line 38
def initialize(connection_config, logger)
  @connection_config = connection_config
  @logger = logger
end
Instance Attribute Details
#connection_config ⇒ Object (readonly)
Returns the value of attribute connection_config.
# File 'lib/exwiw/adapter.rb', line 36
def connection_config
  @connection_config
end
Class Method Details
.table_config_class ⇒ Object
The config class that this adapter consumes. Runner uses this to decide which Serdes type to load scenario JSON into. SQL adapters share the SQL-shaped TableConfig; non-SQL adapters override.
# File 'lib/exwiw/adapter.rb', line 46
def self.table_config_class
  TableConfig
end
Instance Method Details
#build_query(table, dump_target, table_by_name) ⇒ Object
Returns adapter-specific query object (e.g. Exwiw::QueryAst::Select for SQL).
# File 'lib/exwiw/adapter.rb', line 54
def build_query(table, dump_target, table_by_name)
  raise NotImplementedError
end
#commented_sql(query_ast) ⇒ Object
Comment-prefixed SELECT. Relies on the SQL adapter’s #compile_ast (dispatched to the subclass at runtime).
# File 'lib/exwiw/adapter.rb', line 214
def commented_sql(query_ast)
  "#{sql_query_comment(query_ast)} #{compile_ast(query_ast)}"
end
#default_bulk_insert_chunk_size ⇒ Object
Default bulk-insert chunk size when a table config does not set one. The Runner streams each chunk straight to the output file, so a non-nil value here bounds how much serialized output (and how many transient intermediate objects) live in memory at once. SQL adapters keep nil (one statement per table, as before); adapters whose output is large and built per-row (e.g. MongoDB JSONL) override with a positive value.
# File 'lib/exwiw/adapter.rb', line 122
def default_bulk_insert_chunk_size
  nil
end
#describe_query(query_ast) ⇒ Object
One-line, human-readable description of the extraction query, used by the Runner in error messages so a failure during INSERT/COPY generation (or query execution) can be traced back to the query that produced the data. SQL adapters expose the compiled, comment-prefixed SELECT; non-SQL adapters (e.g. MongodbAdapter) override or fall back to the query object’s own inspect output. Best-effort: never raise from here, since it runs on an error path.
# File 'lib/exwiw/adapter.rb', line 225
def describe_query(query_ast)
  if respond_to?(:compile_ast)
    commented_sql(query_ast)
  else
    query_ast.inspect
  end
rescue => e
  "<unavailable: #{e.class}: #{e.message}>"
end
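The three outcomes of #describe_query (compiled SQL, inspect fallback, and the rescued error path) can be sketched with stand-in classes. This is only an illustration: `SketchBase`, `SketchQuery`, the three `Sketch*Adapter` classes, and their `compile_ast` bodies are hypothetical and are not part of exwiw; the comment prefix is hard-coded to keep the example short.

```ruby
# Hypothetical query object; the real adapters use their own AST types.
SketchQuery = Struct.new(:table)

# Stand-in base class (NOT Exwiw::Adapter::Base) that mirrors only the
# describe_query logic documented above.
class SketchBase
  def commented_sql(query_ast)
    "/* exwiw */ #{compile_ast(query_ast)}"
  end

  def describe_query(query_ast)
    if respond_to?(:compile_ast)
      commented_sql(query_ast)
    else
      query_ast.inspect
    end
  rescue => e
    # Best-effort: report the failure as text instead of raising.
    "<unavailable: #{e.class}: #{e.message}>"
  end
end

# Hypothetical SQL-style adapter: defines compile_ast.
class SketchSqlAdapter < SketchBase
  def compile_ast(query_ast)
    "SELECT * FROM #{query_ast.table}"
  end
end

# Hypothetical non-SQL adapter: no compile_ast, so inspect is used.
class SketchNonSqlAdapter < SketchBase
end

# Hypothetical adapter whose compile step fails.
class SketchBrokenAdapter < SketchBase
  def compile_ast(_query_ast)
    raise ArgumentError, "bad ast"
  end
end

query = SketchQuery.new("shops")
puts SketchSqlAdapter.new.describe_query(query)
puts SketchNonSqlAdapter.new.describe_query(query)
puts SketchBrokenAdapter.new.describe_query(query)
```

The last call returns a placeholder string rather than raising, which matters because the method runs while another error is already being reported.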
#dump_schema(ordered_tables, output_path) ⇒ Object
Write the leading schema-creation file for this adapter to `output_path`. Default is a no-op; subclasses override to emit idempotent DDL so the generated dump can be applied to an empty database.
# File 'lib/exwiw/adapter.rb', line 76
def dump_schema(ordered_tables, output_path)
end
#dumpable?(_config) ⇒ Boolean
Whether the given config produces its own dump output and needs an independent processing pass. SQL adapters always do; non-SQL adapters may exclude e.g. embedded subdocument configs.
# File 'lib/exwiw/adapter.rb', line 87
def dumpable?(_config)
  true
end
#explain(_query_ast) ⇒ Object
Run the database-specific EXPLAIN for the given query and return the output as a single string for the `explain` subcommand to print. SQL adapters override; MongodbAdapter currently raises.
# File 'lib/exwiw/adapter.rb', line 186
def explain(_query_ast)
  raise NotImplementedError, "#{self.class.name} does not implement #explain"
end
#output_extension ⇒ Object
File extension used for dump output (e.g. ‘sql’ for SQL, ‘jsonl’ for MongoDB).
# File 'lib/exwiw/adapter.rb', line 59
def output_extension
  'sql'
end
#post_insert_sql(_table) ⇒ Object
Optional SQL appended to the per-table insert-NNN-<table>.* file after the bulk INSERT statements. Use to bring side-state in sync with the explicit IDs that were just inserted (e.g. PostgreSQL sequences). Default: nil (nothing appended).
# File 'lib/exwiw/adapter.rb', line 108
def post_insert_sql(_table)
  nil
end
#pre_insert_sql(_table) ⇒ Object
Optional SQL prepended to the per-table insert-NNN-<table>.* file before the bulk INSERT/COPY statements. Use for session-level setup required before loading data (e.g. MySQL FOREIGN_KEY_CHECKS). Default: nil (nothing prepended).
# File 'lib/exwiw/adapter.rb', line 100
def pre_insert_sql(_table)
  nil
end
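As a rough illustration of how a subclass might use the #pre_insert_sql and #post_insert_sql hooks, the sketch below overrides each one in a stand-in adapter. Everything here is an assumption made for the example: `SketchTable` (with `name` and `primary_key`), the `Sketch*Adapter` classes, and the exact SQL strings are hypothetical and may differ from what the real MysqlAdapter and PostgresqlAdapter emit.

```ruby
# Hypothetical table stand-in; the real table config class is not shown here.
SketchTable = Struct.new(:name, :primary_key)

# Stand-in for the documented defaults: both hooks return nil.
class SketchBaseAdapter
  def pre_insert_sql(_table)
    nil
  end

  def post_insert_sql(_table)
    nil
  end
end

# Hypothetical MySQL-style override: session setup before loading data.
class SketchMysqlAdapter < SketchBaseAdapter
  def pre_insert_sql(_table)
    "SET FOREIGN_KEY_CHECKS = 0;"
  end
end

# Hypothetical PostgreSQL-style override: resync the sequence with the
# explicit IDs that were just inserted.
class SketchPostgresqlAdapter < SketchBaseAdapter
  def post_insert_sql(table)
    "SELECT setval(pg_get_serial_sequence('#{table.name}', '#{table.primary_key}'), " \
      "COALESCE((SELECT MAX(#{table.primary_key}) FROM #{table.name}), 1));"
  end
end

table = SketchTable.new("shops", "id")
puts SketchMysqlAdapter.new.pre_insert_sql(table)
puts SketchPostgresqlAdapter.new.post_insert_sql(table)
```

An adapter that needs neither hook simply inherits the nil defaults, so nothing is prepended or appended.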
#query_comment_text(label = nil) ⇒ Object
Identifier text prepended to every query exwiw sends to the (often production) source DB, so the statement is recognizable in the processlist / slow-query log / db.currentOp() and can be killed if it runs long. Example: “exwiw table=shops”. `label` is “table=…” / “collection=…”. The version is intentionally omitted to keep the comment stable across releases (snapshots / diffs). Strips `*/` to avoid breaking out of the comment.
# File 'lib/exwiw/adapter.rb', line 197
def query_comment_text(label = nil)
  parts = ["exwiw"]
  parts << label if label
  parts.join(' ').gsub('*/', '')
end
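The effect of the `*/` stripping is easiest to see by running the method body on its own. The snippet below copies it into a stand-alone module so it runs without the gem; `QueryCommentSketch` is a hypothetical name used only for this example.

```ruby
# Stand-alone copy of the method body shown above; the module name is
# hypothetical and exists only so the example runs without exwiw.
module QueryCommentSketch
  def self.query_comment_text(label = nil)
    parts = ["exwiw"]
    parts << label if label
    parts.join(' ').gsub('*/', '')
  end
end

# No label: only the tool name is emitted.
puts QueryCommentSketch.query_comment_text
# Typical label for a SQL table.
puts QueryCommentSketch.query_comment_text("table=shops")
# A label containing a comment terminator has the terminator removed.
puts QueryCommentSketch.query_comment_text("table=sh*/ops")
```

The second and third calls print the same text, because the only difference between the labels is the `*/` sequence that gets removed.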
#schema_output_extension ⇒ Object
File extension used for the leading `insert-000-schema.*` file. SQL adapters emit `.sql` (CREATE TABLE IF NOT EXISTS …); MongodbAdapter overrides to `.js` (mongosh-runnable createCollection / createIndex).
# File 'lib/exwiw/adapter.rb', line 66
def schema_output_extension
  'sql'
end
#sql_query_comment(query_ast) ⇒ Object
SQL block-comment form, prefixed to SELECT / EXPLAIN.
# File 'lib/exwiw/adapter.rb', line 204
def sql_query_comment(query_ast)
  label = if query_ast.respond_to?(:from_table_name) && query_ast.from_table_name
    "table=#{query_ast.from_table_name}"
  end
  "/* #{query_comment_text(label)} */"
end
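A small sketch of how the `respond_to?` guard behaves for different query objects. `SqlCommentSketch` and `SketchSelect` are hypothetical stand-ins (the real query type is something like Exwiw::QueryAst::Select, which is not reproduced here); the two method bodies are copied from the listings on this page.

```ruby
# Stand-alone copies of the two documented methods; the module name is
# hypothetical.
module SqlCommentSketch
  def self.query_comment_text(label = nil)
    parts = ["exwiw"]
    parts << label if label
    parts.join(' ').gsub('*/', '')
  end

  def self.sql_query_comment(query_ast)
    label = if query_ast.respond_to?(:from_table_name) && query_ast.from_table_name
      "table=#{query_ast.from_table_name}"
    end
    "/* #{query_comment_text(label)} */"
  end
end

# Hypothetical query object exposing from_table_name.
SketchSelect = Struct.new(:from_table_name)

# Query with a table name: the label is included.
puts SqlCommentSketch.sql_query_comment(SketchSelect.new("shops"))
# Query whose table name is nil: label is omitted.
puts SqlCommentSketch.sql_query_comment(SketchSelect.new(nil))
# Object without from_table_name at all: label is omitted, no error.
puts SqlCommentSketch.sql_query_comment(Object.new)
```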
#supports_bulk_delete? ⇒ Boolean
Whether this adapter emits delete-NNN-*.sql files.
# File 'lib/exwiw/adapter.rb', line 80
def supports_bulk_delete?
  true
end
#to_copy_from_stdin(_results, _table) ⇒ Object
# File 'lib/exwiw/adapter.rb', line 112
def to_copy_from_stdin(_results, _table)
  raise NotImplementedError, "COPY format is not supported by #{self.class.name}"
end
#validate_as_dump_target!(_config) ⇒ Object
Hook for adapter-specific validation when this config is used as the dump_target. Default: nothing to validate.
# File 'lib/exwiw/adapter.rb', line 93
def validate_as_dump_target!(_config)
end
#write_inserts(io, results, table, chunk_size) ⇒ Array(Integer, Integer)
Write the bulk INSERT/JSONL output for `results` to the open `io`, returning the statement count and the record count as a two-element array. The Runner calls this once per table for the non-COPY path.
Default: build each chunk’s output as a full string via #to_bulk_insert and write it, separating statements with “\n”, exactly what the Runner used to inline. This keeps the dominant memory cost at one chunk’s serialized string (bounded by `chunk_size`), which is why MongoDB sets a positive default chunk size. Adapters whose output is a single large statement (the SQL adapters, where chunk_size is nil) override this to stream the statement to `io` in bounded buffers instead of holding the whole thing in memory.
record_count is tallied from the rows actually streamed here so the Runner no longer needs a separate upfront count query (MongoDB’s count_documents / the SQL adapters’ SELECT COUNT(*)) just to log the row count and decide whether an empty table can be skipped. That count was a second full pass over the same filter — a wasted COLLSCAN when the scope is unindexed; counting during the single streaming pass removes it.
Batches rows by accumulating into a buffer and flushing every chunk_size rows, rather than `results.each_slice(chunk_size)`. This is deliberate: `Enumerable#each_slice` calls `#size` on the receiver as an allocation hint, which for a streaming result (MongoDB’s StreamingResult) issues a `count_documents`, the very redundant count this single-pass design removes. Driving the buffer off `#each` keeps the result’s `#size` untouched, so the cursor is walked exactly once. The chunk boundaries and “\n” separators reproduce the each_slice output byte-for-byte.
chunk_size is always positive for callers of this default (MongoDB); the SQL adapters pass nil and override #write_inserts, so the unbounded nil-branch buffer is never reached here in practice.
# File 'lib/exwiw/adapter.rb', line 164
def write_inserts(io, results, table, chunk_size)
  statement_count = 0
  record_count = 0
  buffer = []
  flush = lambda do
    io.print("\n") if statement_count.positive?
    io.print(to_bulk_insert(buffer, table))
    statement_count += 1
    record_count += buffer.size
    buffer.clear
  end
  results.each do |row|
    buffer << row
    flush.call if chunk_size && buffer.size >= chunk_size
  end
  flush.call unless buffer.empty?
  [statement_count, record_count]
end
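The buffering behaviour can be exercised in isolation with a stand-in adapter and a result object that records calls to `#size`. This is a sketch under stated assumptions: `SketchAdapter`, its `to_bulk_insert` output format, and `SketchStreamingResult` are hypothetical; only the `write_inserts` body is taken from the listing above.

```ruby
require "stringio"

# Stand-in adapter (not a real exwiw class). write_inserts is copied from
# the listing above; to_bulk_insert is a hypothetical one-line serializer.
class SketchAdapter
  def to_bulk_insert(rows, table)
    "#{table}: #{rows.join(',')}"
  end

  def write_inserts(io, results, table, chunk_size)
    statement_count = 0
    record_count = 0
    buffer = []
    flush = lambda do
      io.print("\n") if statement_count.positive?
      io.print(to_bulk_insert(buffer, table))
      statement_count += 1
      record_count += buffer.size
      buffer.clear
    end
    results.each do |row|
      buffer << row
      flush.call if chunk_size && buffer.size >= chunk_size
    end
    flush.call unless buffer.empty?
    [statement_count, record_count]
  end
end

# Hypothetical streaming result that counts how often #size is called,
# standing in for a cursor where #size would trigger a count query.
class SketchStreamingResult
  attr_reader :size_calls

  def initialize(rows)
    @rows = rows
    @size_calls = 0
  end

  def each(&block)
    @rows.each(&block)
  end

  def size
    @size_calls += 1
    @rows.size
  end
end

io = StringIO.new
result = SketchStreamingResult.new([1, 2, 3, 4, 5])
statement_count, record_count = SketchAdapter.new.write_inserts(io, result, "shops", 2)

puts io.string
puts "statements=#{statement_count} records=#{record_count} size_calls=#{result.size_calls}"
```

With five rows and a chunk size of 2, the sketch writes three statements separated by newlines, counts five records, and never calls `#size` on the result.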