Class: DataDrain::Storage::S3
Overview
Storage adapter implementation for Amazon S3.
Instance Attribute Summary
Attributes inherited from Base
Instance Method Summary
- #build_path(bucket, folder_name, partition_path) ⇒ String
- #destroy_partitions(bucket, folder_name, partition_keys, partitions) ⇒ Integer
- #setup_duckdb(connection) ⇒ Object
  Loads the httpfs extension into DuckDB and injects the AWS credentials.
Methods inherited from Base
#initialize, #prepare_export_path
Constructor Details
This class inherits a constructor from DataDrain::Storage::Base
Instance Method Details
#build_path(bucket, folder_name, partition_path) ⇒ String
    # File 'lib/data_drain/storage/s3.rb', line 59
    def build_path(bucket, folder_name, partition_path)
      base = File.join(bucket, folder_name)
      base = File.join(base, partition_path) if partition_path && !partition_path.empty?
      "s3://#{base}/**/*.parquet"
    end
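As a quick illustration of the glob that build_path returns, the sketch below copies the method body into a standalone function and calls it with and without a partition path. The bucket, folder, and partition names are made up for the example.

```ruby
# Standalone copy of the documented method body, for illustration only.
def build_path(bucket, folder_name, partition_path)
  base = File.join(bucket, folder_name)
  base = File.join(base, partition_path) if partition_path && !partition_path.empty?
  "s3://#{base}/**/*.parquet"
end

# Hypothetical bucket, folder, and partition values.
full = build_path("my-bucket", "events", "year=2024/month=01")
root = build_path("my-bucket", "events", nil)

puts full # s3://my-bucket/events/year=2024/month=01/**/*.parquet
puts root # s3://my-bucket/events/**/*.parquet
```

A nil or empty partition_path falls back to a glob over the whole folder.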
#destroy_partitions(bucket, folder_name, partition_keys, partitions) ⇒ Integer
    # File 'lib/data_drain/storage/s3.rb', line 70
    def destroy_partitions(bucket, folder_name, partition_keys, partitions)
      client = Aws::S3::Client.new(
        region: @config.aws_region,
        access_key_id: @config.aws_access_key_id,
        secret_access_key: @config.aws_secret_access_key
      )

      regex_parts = partition_keys.map do |key|
        val = partitions[key]
        val.nil? || val.to_s.empty? ? "#{key}=[^/]+" : "#{key}=#{val}"
      end
      pattern_regex = Regexp.new("^#{folder_name}/#{regex_parts.join("/")}")

      objects_to_delete = []
      prefix = "#{folder_name}/"
      first_key = partition_keys.first
      prefix += "#{first_key}=#{partitions[first_key]}/" if partitions[first_key]

      client.list_objects_v2(bucket: bucket, prefix: prefix).each do |response|
        response.contents.each do |obj|
          objects_to_delete << { key: obj.key } if obj.key.match?(pattern_regex)
        end
      end

      delete_in_batches(client, bucket, objects_to_delete)
    end
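The key-matching step in destroy_partitions can be tried out without an S3 client. The sketch below pulls the regex construction into a helper named partition_pattern (a hypothetical name, not part of the library) and applies it to a few made-up object keys. A partition with a nil or empty value becomes the wildcard `[^/]+`.

```ruby
# Hypothetical helper mirroring the regex construction shown in destroy_partitions.
def partition_pattern(folder_name, partition_keys, partitions)
  parts = partition_keys.map do |key|
    val = partitions[key]
    # A missing or empty value matches any single path segment for that key.
    val.nil? || val.to_s.empty? ? "#{key}=[^/]+" : "#{key}=#{val}"
  end
  Regexp.new("^#{folder_name}/#{parts.join('/')}")
end

# Made-up object keys standing in for an S3 listing.
keys = [
  "events/year=2024/month=01/part-0.parquet",
  "events/year=2024/month=02/part-0.parquet",
  "events/year=2023/month=12/part-0.parquet"
]

pattern = partition_pattern("events", %i[year month], { year: 2024, month: nil })
matched = keys.select { |k| k.match?(pattern) }
puts matched
```

Here only the two year=2024 keys are selected. Values are interpolated into the pattern as-is, so a value containing regex metacharacters would be interpreted as part of the expression.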
#setup_duckdb(connection) ⇒ Object
Loads the httpfs extension into DuckDB and injects the AWS credentials. If aws_access_key_id and aws_secret_access_key are set, explicit credentials are used. Otherwise, credential_chain is used (IAM role, environment variables, ~/.aws/credentials).
    # File 'lib/data_drain/storage/s3.rb', line 14
    def setup_duckdb(connection)
      connection.query("INSTALL httpfs; LOAD httpfs;")
      create_s3_secret(connection)
    end
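The body of create_s3_secret is not shown on this page. As a rough sketch of the two credential modes described above, the function below builds a DuckDB CREATE SECRET statement for each case. The function name s3_secret_sql, the secret name data_drain_s3, and the keyword arguments are assumptions made for illustration; the actual implementation may differ.

```ruby
# Hypothetical sketch: the real create_s3_secret is not shown in this document.
# Doubles single quotes so a value can sit inside a SQL string literal.
def sql_quote(value)
  "'#{value.to_s.gsub("'", "''")}'"
end

def s3_secret_sql(region:, access_key_id: nil, secret_access_key: nil)
  if access_key_id && secret_access_key
    # Explicit credentials supplied by the configuration.
    "CREATE OR REPLACE SECRET data_drain_s3 (TYPE S3, " \
      "KEY_ID #{sql_quote(access_key_id)}, " \
      "SECRET #{sql_quote(secret_access_key)}, " \
      "REGION #{sql_quote(region)});"
  else
    # Let DuckDB resolve credentials (IAM role, env vars, ~/.aws/credentials).
    "CREATE OR REPLACE SECRET data_drain_s3 (TYPE S3, " \
      "PROVIDER CREDENTIAL_CHAIN, REGION #{sql_quote(region)});"
  end
end

explicit_sql = s3_secret_sql(region: "us-east-1",
                             access_key_id: "AKIAEXAMPLE",
                             secret_access_key: "example-secret")
chain_sql = s3_secret_sql(region: "us-east-1")

puts explicit_sql
puts chain_sql
```

The resulting string would be passed to connection.query after httpfs is loaded; depending on the DuckDB version, the credential_chain provider may also require the aws extension.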