fluent-plugin-gcs
A Fluentd output plugin that buffers events and uploads them to Google Cloud Storage.
Features
- Multiple formats: store objects as gzip, plain text, or JSON.
- Fast compression: optionally shell out to the external gzip binary, with automatic fallback to the pure-Ruby compressor.
- Flexible object keys: build paths from time slices, tags, hostnames, random tokens, and UUIDs.
- Server-side controls: set ACLs, storage class, customer-supplied encryption keys, and custom object metadata.
- Flexible auth: explicit credentials or Application Default Credentials on GCE / GKE / Cloud Run.
Requirements
| fluent-plugin-gcs | fluentd | ruby |
|---|---|---|
| >= 0.5.0 | >= 1.0 | >= 3.3 |
Installation
gem install fluent-plugin-gcs
Using td-agent / fluent-package:
fluent-gem install fluent-plugin-gcs
Quick start
The only required option is bucket; the example below also sets a path prefix and an hourly file buffer. On GCE, GKE, or Cloud Run, credentials are picked up automatically from the environment.
<match your.tag>
@type gcs
bucket YOUR_GCS_BUCKET_NAME
path logs/
<buffer time>
@type file
path /var/log/fluent/gcs
timekey 1h
timekey_wait 10m
timekey_use_utc true
</buffer>
</match>
This writes gzip-compressed objects such as logs/2024010112_0.gz, one per hourly time slice.
Configuration
Authentication
Provide credentials explicitly, or rely on Application Default Credentials when running on Google Cloud.
| Option | Type | Default | Description |
|---|---|---|---|
| project | string | nil | GCS project identifier |
| keyfile | string | nil | Path to a service account credentials JSON file |
| credentials_json | hash | nil | Service account credentials inline as JSON. Takes precedence over keyfile |
| client_retries | integer | nil | Number of retries on server error |
| client_timeout | integer | nil | Request timeout in seconds |
project is resolved in the following order: the project option, then the STORAGE_PROJECT / GOOGLE_CLOUD_PROJECT / GCLOUD_PROJECT environment variables, then GCE metadata.
keyfile is resolved in the following order: the keyfile option, the GOOGLE_CLOUD_KEYFILE / GCLOUD_KEYFILE (path) or GOOGLE_CLOUD_KEYFILE_JSON / GCLOUD_KEYFILE_JSON (inline) environment variables, the Cloud SDK's well-known path, then GCE metadata.
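The project lookup order above can be pictured with a small sketch. It is not the plugin's or the client library's actual code: `resolve_project`, the `env:` keyword, and the `metadata_lookup:` callable are hypothetical names used only to show the order of precedence.

```ruby
# Illustrative only: mirrors the documented lookup order for `project`.
PROJECT_ENV_VARS = %w[STORAGE_PROJECT GOOGLE_CLOUD_PROJECT GCLOUD_PROJECT].freeze

# option          - value of the `project` config option (may be nil)
# env             - hash-like environment, defaults to the process ENV
# metadata_lookup - stand-in for a GCE metadata query (assumption)
def resolve_project(option, env: ENV, metadata_lookup: -> { nil })
  # 1. The explicit config option wins.
  return option if option && !option.empty?

  # 2. Then the environment variables, in the documented order.
  PROJECT_ENV_VARS.each do |name|
    value = env[name]
    return value if value && !value.empty?
  end

  # 3. Finally, whatever the metadata server reports (nil if unavailable).
  metadata_lookup.call
end

puts resolve_project(nil, env: { "GOOGLE_CLOUD_PROJECT" => "from-env" })
# => from-env
```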
Object placement
| Option | Type | Default | Description |
|---|---|---|---|
| bucket | string | - | Required. GCS bucket name |
| path | string | "" | Path prefix for objects |
| object_key_format | string | %{path}%{time_slice}_%{index}.%{file_extension} | Template for object keys. See Object key format |
| hex_random_length | integer | 4 | Length of the %{hex_random} placeholder (max 32) |
| overwrite | bool | false | Overwrite the existing object instead of incrementing %{index} |
| blind_write | bool | false | Skip the existence check before writing (see below) |
Avoiding key collisions. When object_key_format contains %{index} (the default), the plugin checks GCS for an existing object and increments %{index} until it finds an unused key, so existing objects are never overwritten. This existence check requires the storage.objects.get permission.
blind_write skips that existence check, so the storage.objects.get permission is no longer needed. The trade-off is that %{index} stops working (it always stays 0), so you must keep keys unique another way, with %{hex_random} (unique per chunk) or %{uuid_flush} (unique per flush).
[!WARNING]
If a key collides with an existing object (which can happen with blind_write true, or with overwrite true), uploading it overwrites the existing object, and GCS requires the storage.objects.delete permission to do so. Without that permission the flush fails repeatedly and the buffer chunk is eventually lost. With blind_write true, include %{hex_random} or %{uuid_flush} in object_key_format to avoid collisions.
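The collision-avoidance behaviour for %{index} amounts to a probe loop. The sketch below is a simplified model rather than the plugin's implementation: `next_free_key`, the `exists:` callable (standing in for a GCS existence check), and the `limit:` guard are all hypothetical.

```ruby
# Illustrative only: find the first index whose key is not taken yet.
# `exists` stands in for the GCS lookup that needs storage.objects.get.
def next_free_key(exists:, limit: 1000)
  (0...limit).each do |index|
    key = yield(index)                  # build the candidate key for this index
    return key unless exists.call(key)  # stop at the first unused key
  end
  raise "no unused key within #{limit} attempts"
end

# Pretend two objects for this time slice are already in the bucket.
taken = ["logs/2024010112_0.gz", "logs/2024010112_1.gz"]

key = next_free_key(exists: ->(k) { taken.include?(k) }) do |index|
  "logs/2024010112_#{index}.gz"
end

puts key
# => logs/2024010112_2.gz
```

With blind_write there is no such probe, which is why the index never advances past 0 and a random or per-flush placeholder has to provide uniqueness instead.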
Format and compression
| Option | Type | Default | Description |
|---|---|---|---|
| store_as | enum | gzip | Object format. See the table below |
| command_parameter | string | (per format) | Override the default arguments for the compression command (gzip_command / lzo / lzma2 / zstd) |
| transcoding | bool | false | Enable decompressive transcoding (gzip only) |
| store_as | Compression | Requires | Default args | Extension | content_type |
|---|---|---|---|---|---|
| gzip | Ruby's built-in Zlib::GzipWriter | (none) | - | gz | application/gzip |
| gzip_command | External gzip. Faster for large chunks, falls back to Zlib::GzipWriter on failure | gzip command | (none) | gz | application/gzip |
| lzo | External lzop | lzop command | -qf1 | lzo | application/x-lzop |
| lzma2 | External xz | xz command | -qf0 | xz | application/x-xz |
| zstd | External zstd | zstd command | (none) | zst | application/x-zst |
| json | None (upload as JSON) | (none) | - | json | application/json |
| text | None (upload as text) | (none) | - | txt | text/plain |
The command-based formats (gzip_command, lzo, lzma2, zstd) stream the chunk through the command's stdin, with no intermediate temp file. Where a format has default arguments (see the table above), command_parameter overrides them. Separate multiple arguments with spaces; the value is split with shellsplit and is not evaluated by a shell:
store_as gzip_command
command_parameter -1 # single argument
store_as zstd
command_parameter -19 --long # multiple arguments, split on spaces
Quote a value that itself contains a space, the same way you would in a shell (command_parameter -o "with space").
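To see how such values are split, Ruby's standard Shellwords module can be tried directly. This only demonstrates shellsplit itself; how the plugin passes the resulting array to the command is not shown here.

```ruby
require "shellwords"

# Plain arguments are split on whitespace.
p "-19 --long".shellsplit
# => ["-19", "--long"]

# Quotes group a value that contains a space into one argument.
p '-o "with space"'.shellsplit
# => ["-o", "with space"]

# Nothing is expanded: shell variables stay literal text.
p "-S $HOME".shellsplit
# => ["-S", "$HOME"]
```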
gzip_command falls back to Zlib::GzipWriter if the gzip command fails. lzo / lzma2 / zstd have no fallback, so the command must be installed (checked at startup), and they are not compatible with transcoding, which is gzip-specific.
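The fallback behaviour of gzip_command can be approximated as follows. This is a sketch under assumptions, not the plugin's code: `gzip_with_fallback` is a hypothetical helper, and it buffers the whole output in memory, which a real implementation would likely avoid for large chunks.

```ruby
require "open3"
require "shellwords"
require "stringio"
require "zlib"

# Illustrative only: try the external gzip first, then fall back to Zlib.
def gzip_with_fallback(data, command_parameter = "")
  args = command_parameter.shellsplit
  # Passing an argument list (not one string) means no shell is involved.
  out, status = Open3.capture2("gzip", *args, "-c", stdin_data: data, binmode: true)
  raise "gzip exited with status #{status.exitstatus}" unless status.success?

  out
rescue StandardError
  # Covers a missing binary (Errno::ENOENT) as well as a non-zero exit.
  io = StringIO.new(String.new)
  Zlib::GzipWriter.wrap(io) { |gz| gz.write(data) }
  io.string
end

compressed = gzip_with_fallback("a log line\n" * 1000, "-1")
puts Zlib.gunzip(compressed).bytesize
# => 11000
```

Either path yields a valid gzip stream, so the object's extension and content type do not depend on which compressor ran.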
[!NOTE]
gzip_command_parameter is a deprecated alias of command_parameter, kept for backward compatibility with v0.4.x configs. New configs should use command_parameter.
The per-line format is configured with a <format> section (default out_file):
<format>
@type json
</format>
See the Formatter documentation for available types (out_file, json, ltsv, single_value, ...).
GCS object settings
| Option | Type | Default | Description |
|---|---|---|---|
| auto_create_bucket | bool | true | Create the bucket if it does not exist |
| acl | enum | nil | Predefined ACL for uploaded objects (see below) |
| storage_class | enum | nil | Storage class for uploaded objects (see below) |
| encryption_key | string | nil | Customer-supplied AES-256 key for server-side encryption |
acl accepts one of auth_read, owner_full, owner_read, private, project_private, public_read. Defaults to the bucket's default object ACL. See the access control documentation.
storage_class accepts one of dra, nearline, coldline, multi_regional, regional, standard. See the storage classes documentation.
encryption_key enables customer-supplied encryption; the encryption_key_sha256 is computed automatically.
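For context, GCS identifies a customer-supplied key by the SHA-256 digest of the raw key bytes, sent Base64-encoded alongside the key. The sketch below shows that relationship with a freshly generated key. It does not show how the plugin expects encryption_key to be encoded in the config (raw or Base64), which the text above does not specify.

```ruby
require "digest"
require "securerandom"

# A 32-byte (256-bit) key, generated here purely for illustration.
key = SecureRandom.random_bytes(32)

# The digest that accompanies the key: SHA-256 over the raw key bytes.
key_sha256 = Digest::SHA256.digest(key)

# Base64 without line breaks (pack("m0") is strict Base64).
puts [key].pack("m0")
puts [key_sha256].pack("m0")
```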
Object key format
object_key_format supports the following placeholders:
| Placeholder | Description |
|---|---|
| %{path} | The value of the path option |
| %{time_slice} | Time slice text derived from the <buffer> timekey |
| %{index} | Sequential number (from 0) within the same time slice |
| %{file_extension} | Inferred from store_as (gz / lzo / xz / zst / json / txt) |
| %{uuid_flush} | A UUID generated on every buffer flush |
| %{hex_random} | A random hex string per chunk, length set by hex_random_length |
| %{hostname} | The hostname of the running server |
The default is %{path}%{time_slice}_%{index}.%{file_extension}.
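Placeholder expansion can be modelled as a simple template substitution. The helper below is a hypothetical illustration (`expand_object_key` and the `values` hash are not part of the plugin), with example values chosen to match the Quick start output.

```ruby
require "securerandom"
require "socket"

# Illustrative only: replace each %{name} with the matching value.
def expand_object_key(template, values)
  template.gsub(/%\{(\w+)\}/) do
    name = Regexp.last_match(1).to_sym
    raise ArgumentError, "unknown placeholder: #{name}" unless values.key?(name)

    values[name].to_s
  end
end

values = {
  path: "logs/",
  time_slice: "2024010112",
  index: 0,
  file_extension: "gz",
  hostname: Socket.gethostname,
  uuid_flush: SecureRandom.uuid,
  hex_random: SecureRandom.hex(2) # hex(2) yields 4 characters, the default length
}

puts expand_object_key("%{path}%{time_slice}_%{index}.%{file_extension}", values)
# => logs/2024010112_0.gz
```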
Object metadata
Attach arbitrary x-goog-meta-* headers to uploaded objects with one or more <object_metadata> sections:
<object_metadata>
key KEY_1
value VALUE_1
</object_metadata>
<object_metadata>
key KEY_2
value VALUE_2
</object_metadata>
Examples
Partition by tag and date
<match app.**>
@type gcs
project YOUR_PROJECT
bucket YOUR_GCS_BUCKET_NAME
object_key_format %{path}%{time_slice}/%{hostname}_%{index}.%{file_extension}
path logs/${tag}/
<buffer tag,time>
@type file
path /var/log/fluent/gcs
timekey 1d
timekey_wait 10m
timekey_use_utc true
</buffer>
<format>
@type json
</format>
</match>
For the tag app.web on host web1, this writes objects such as logs/app.web/20240101/web1_0.gz.
Fine-grained 1-minute partitions
When timekey is under an hour, %{time_slice} automatically resolves to minute granularity (%Y%m%d%H%M).
<match app.**>
@type gcs
bucket YOUR_GCS_BUCKET_NAME
path logs/
<buffer time>
@type file
path /var/log/fluent/gcs
timekey 1m # 1 minute partition
timekey_wait 10s # short wait for late events
timekey_use_utc true
</buffer>
</match>
This writes objects such as logs/202401011230_0.gz, one (or more) per minute.
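The time slices in this README (20240101 for 1d, 2024010112 for 1h, 202401011230 for 1m) are consistent with a format chosen from the timekey length. The mapping below is a sketch of that rule, not code taken from the plugin: `time_slice_format` is a hypothetical name, and the sub-minute branch is an assumption that the text above does not confirm.

```ruby
# Illustrative only: pick a strftime format from the timekey (in seconds).
def time_slice_format(timekey)
  case timekey
  when 0...60        then "%Y%m%d%H%M%S" # assumption: second granularity
  when 60...3600     then "%Y%m%d%H%M"   # under an hour: minute granularity
  when 3600...86_400 then "%Y%m%d%H"     # under a day: hour granularity
  else                    "%Y%m%d"       # a day or more: day granularity
  end
end

t = Time.utc(2024, 1, 1, 12, 30)
puts t.strftime(time_slice_format(60))     # => 202401011230
puts t.strftime(time_slice_format(3600))   # => 2024010112
puts t.strftime(time_slice_format(86_400)) # => 20240101
```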
Fast compression with the external gzip
<match app.**>
@type gcs
bucket YOUR_GCS_BUCKET_NAME
path logs/
store_as gzip_command
command_parameter -1
<buffer time>
@type file
path /var/log/fluent/gcs
timekey 1h
timekey_wait 10m
</buffer>
</match>
Using the default object_key_format, this writes objects such as logs/2024010112_0.gz, one per hourly slice.
Cost-optimized cold storage
<match archive.**>
@type gcs
bucket YOUR_GCS_BUCKET_NAME
path archive/
storage_class coldline
acl project_private
<buffer time>
@type file
path /var/log/fluent/gcs-archive
timekey 1d
timekey_wait 1h
</buffer>
</match>
Using the default object_key_format, this writes objects such as archive/20240101_0.gz, one per day, stored in the Coldline class.
Write without the get permission (blind_write)
blind_write true skips the existence check, so the storage.objects.get permission is not required. Because %{index} does not work in this mode, include %{hex_random} or %{uuid_flush} to keep keys unique.
<match app.**>
@type gcs
bucket YOUR_GCS_BUCKET_NAME
path logs/
object_key_format %{path}%{time_slice}_%{hex_random}.%{file_extension}
blind_write true
<buffer time>
@type file
path /var/log/fluent/gcs
timekey 1h
timekey_wait 10m
timekey_use_utc true
</buffer>
</match>
This writes objects such as logs/2024010112_a1b2.gz, with a per-chunk random suffix instead of an incrementing index.
Development
bundle install
bundle exec rake test # run the test suite
bundle exec bundler-audit check --update # audit dependencies
gem build fluent-plugin-gcs.gemspec # build the gem
Author
Daichi HIRATA
License
Apache License 2.0. See LICENSE.txt.