Class: GPT2BPETables
- Inherits:
-
Object
- Object
- GPT2BPETables
- Defined in:
- lib/toy/io/bpe.rb
Overview
Load and hold the three lookup tables. Built once at startup.
Instance Attribute Summary collapse
-
#byte_chars ⇒ Object
Array<String>(256): byte → utf-8 char.
-
#char_bytes ⇒ Object
Array<String>(256): byte → utf-8 char.
-
#merge_rank ⇒ Object
Array<String>(256): byte → utf-8 char.
-
#punct_byte ⇒ Object
Array<String>(256): byte → utf-8 char.
-
#vocab_id ⇒ Object
Array<String>(256): byte → utf-8 char.
-
#vocab_tok ⇒ Object
Array<String>(256): byte → utf-8 char.
Instance Method Summary collapse
-
#initialize ⇒ GPT2BPETables
constructor
A new instance of GPT2BPETables.
Constructor Details
#initialize ⇒ GPT2BPETables
Returns a new instance of GPT2BPETables.
31 32 33 34 35 36 37 38 |
# File 'lib/toy/io/bpe.rb', line 31 def initialize @byte_chars = Array.new(256, "?") @char_bytes = {} @vocab_id = {} @vocab_tok = Array.new(50257, "") @merge_rank = {} @punct_byte = Array.new(256, 0) end |
Instance Attribute Details
#byte_chars ⇒ Object
Array<String>(256): byte → utf-8 char
23 24 25 |
# File 'lib/toy/io/bpe.rb', line 23 def byte_chars @byte_chars end |
#char_bytes ⇒ Object
Array<String>(256): byte → utf-8 char
23 24 25 |
# File 'lib/toy/io/bpe.rb', line 23 def char_bytes @char_bytes end |
#merge_rank ⇒ Object
Array<String>(256): byte → utf-8 char
23 24 25 |
# File 'lib/toy/io/bpe.rb', line 23 def merge_rank @merge_rank end |
#punct_byte ⇒ Object
Array<String>(256): byte → utf-8 char
23 24 25 |
# File 'lib/toy/io/bpe.rb', line 23 def punct_byte @punct_byte end |
#vocab_id ⇒ Object
Array<String>(256): byte → utf-8 char
23 24 25 |
# File 'lib/toy/io/bpe.rb', line 23 def vocab_id @vocab_id end |
#vocab_tok ⇒ Object
Array<String>(256): byte → utf-8 char
23 24 25 |
# File 'lib/toy/io/bpe.rb', line 23 def vocab_tok @vocab_tok end |