Class: DurableHuggingfaceHub::DatasetCard

Inherits:
RepoCard
  • Object
Defined in:
lib/durable_huggingface_hub/repo_card.rb

Overview

Dataset card for documenting datasets.

Dataset cards provide information about datasets including:

  • Dataset description and structure

  • Data collection methodology

  • Intended use cases

  • Limitations and biases

Examples:

Create a dataset card

card = DatasetCard.new(
  text: "# My Dataset\n\nThis dataset contains...",
  data: {
    "license" => "cc-by-4.0",
    "language" => ["en", "es"],
    "task_categories" => ["text-classification"]
  }
)
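On the Hugging Face Hub, a card file is a Markdown document preceded by a YAML metadata block between "---" lines. As a rough sketch of how the text and data arguments above map onto that layout, the snippet below builds such a document with Ruby's standard YAML library. The render_card helper is hypothetical and not part of DurableHuggingfaceHub; the output of the inherited #to_s may differ in detail.

```ruby
require "yaml"

# Hypothetical helper (not part of the gem): joins metadata and body in the
# front-matter layout used by card files on the Hub.
def render_card(text:, data:)
  # YAML.dump already starts its output with a "---" line and ends with a newline,
  # so only the closing "---" delimiter has to be added by hand.
  "#{YAML.dump(data)}---\n\n#{text}\n"
end

rendered = render_card(
  text: "# My Dataset\n\nThis dataset contains...",
  data: {
    "license" => "cc-by-4.0",
    "language" => ["en", "es"],
    "task_categories" => ["text-classification"]
  }
)

puts rendered
```

The metadata keys are kept as strings, matching the way #validate looks them up (for example @data["license"]).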

Instance Attribute Summary

Attributes inherited from RepoCard

#data, #text

Class Method Summary

  • .default_repo_type ⇒ String

    Default repository type for dataset cards.

Instance Method Summary

  • #validate ⇒ Array<String>

    Validate dataset card metadata.

Methods inherited from RepoCard

from_hub, #initialize, load, parse, #push_to_hub, #save, #to_s, #update_metadata

Constructor Details

This class inherits a constructor from DurableHuggingfaceHub::RepoCard

Class Method Details

.default_repo_type ⇒ String

Returns the default repository type for dataset cards, which is "dataset".

Returns:

  • (String)

    Default repository type for dataset cards



# File 'lib/durable_huggingface_hub/repo_card.rb', line 339

def self.default_repo_type
  "dataset"
end

Instance Method Details

#validate ⇒ Array<String>

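Validates dataset card metadata. The license field is required and must be a String; language, if present, must be a String or an Array of Strings; task_categories, if present, must be an Array.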

Returns:

  • (Array<String>)

    List of validation error messages; empty when the metadata passes every check



# File 'lib/durable_huggingface_hub/repo_card.rb', line 346

def validate
  errors = []

  # Check for required fields
  errors << "license is required" unless @data["license"]

  # Validate license format
  if @data["license"] && !@data["license"].is_a?(String)
    errors << "license must be a string"
  end

  # Validate language format
  if @data["language"]
    if @data["language"].is_a?(String)
      # Single language
    elsif @data["language"].is_a?(Array)
      # Multiple languages
      @data["language"].each do |lang|
        errors << "language array elements must be strings" unless lang.is_a?(String)
      end
    else
      errors << "language must be a string or array of strings"
    end
  end

  # Validate task_categories format
  if @data["task_categories"] && !@data["task_categories"].is_a?(Array)
    errors << "task_categories must be an array"
  end

  errors
end
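
To make the return value concrete, the sketch below re-applies the rules from the listing above to a plain Hash, so it runs without the gem installed. dataset_card_errors is a hypothetical stand-in for #validate, not a library method; with the gem loaded, the equivalent call would presumably be card.validate on a DatasetCard instance.

```ruby
# Hypothetical stand-in for DatasetCard#validate: same checks, same messages,
# but operating on a metadata Hash passed in as an argument.
def dataset_card_errors(data)
  errors = []

  # license is required and must be a String
  license = data["license"]
  errors << "license is required" unless license
  errors << "license must be a string" if license && !license.is_a?(String)

  # language may be absent, a single String, or an Array of Strings
  language = data["language"]
  case language
  when nil, false, String
    # nothing to check
  when Array
    language.each do |lang|
      errors << "language array elements must be strings" unless lang.is_a?(String)
    end
  else
    errors << "language must be a string or array of strings"
  end

  # task_categories, if present, must be an Array
  categories = data["task_categories"]
  errors << "task_categories must be an array" if categories && !categories.is_a?(Array)

  errors
end

valid = {
  "license" => "cc-by-4.0",
  "language" => ["en", "es"],
  "task_categories" => ["text-classification"]
}
p dataset_card_errors(valid)
# prints []

invalid = { "language" => ["en", 42], "task_categories" => "text-classification" }
p dataset_card_errors(invalid)
# prints ["license is required", "language array elements must be strings", "task_categories must be an array"]
```

Because the method collects messages rather than raising, a caller can treat an empty array as "valid" and report all problems at once otherwise. Note that an invalid element is reported once per offending entry in the language array.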