Class: Kotoshu::Documents::Document
- Inherits: Object
  - Object
  - Kotoshu::Documents::Document
- Defined in: lib/kotoshu/documents/document.rb
Overview
Abstract base class for documents.
Provides a unified interface for different document formats:
- Plain text
- Markdown
- AsciiDoc
- Code files (with syntax awareness)
Subclasses implement format-specific parsing and context retrieval.
Direct Known Subclasses
Constant Summary
- FORMATS = { text: 'Plain Text', markdown: 'Markdown', asciidoc: 'AsciiDoc', code: 'Code' }.freeze
  Supported document formats.
Instance Attribute Summary
- #content ⇒ Object (readonly)
  Returns the value of attribute content.
- #format ⇒ Object (readonly)
  Returns the value of attribute format.
- #language_code ⇒ Object (readonly)
  Returns the value of attribute language_code.
Class Method Summary
- .detect_format(content) ⇒ Symbol
  Detect format from content.
- .detect_language_from_path(path) ⇒ String
  Detect language code from file path.
- .from_file(path) ⇒ Document
  Create document from file.
- .from_string(content, language_code: 'en') ⇒ Document
  Create document from string with format detection.
Instance Method Summary
- #apply(corrections) ⇒ Document
  Apply corrections and return new document.
- #context_for(location, window: 5) ⇒ Models::Context
  Get context around a specific location.
- #get_node(path) ⇒ Object?
  Get node at a specific path (for structured formats).
- #initialize(content, format: :text, language_code: 'en') ⇒ Document (constructor)
  Create a new document.
- #line_count ⇒ Integer
  Get line count.
- #name ⇒ String
  Get document name (for display).
- #replace_node(location, new_text) ⇒ Document
  Replace text at a specific location.
- #text_nodes ⇒ Array<TextNode>
  Get all text nodes for spell checking.
- #word_count ⇒ Integer
  Get word count.
Constructor Details
#initialize(content, format: :text, language_code: 'en') ⇒ Document
Create a new document.
# File 'lib/kotoshu/documents/document.rb', line 102
def initialize(content, format: :text, language_code: 'en')
  raise ArgumentError, "Invalid format: #{format}" unless FORMATS.key?(format)

  @content = content
  @format = format
  @language_code = language_code
end
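The constructor's format check can be tried in isolation. The following is a minimal sketch, assuming a stand-in class that copies the constant and constructor from the listing above so it runs without the library installed; `DocumentSketch` is a hypothetical name and not part of Kotoshu.

```ruby
# Stand-in for Kotoshu::Documents::Document (hypothetical name), copying
# the FORMATS constant and the constructor from the listing above.
class DocumentSketch
  FORMATS = { text: 'Plain Text', markdown: 'Markdown', asciidoc: 'AsciiDoc', code: 'Code' }.freeze

  attr_reader :content, :format, :language_code

  def initialize(content, format: :text, language_code: 'en')
    raise ArgumentError, "Invalid format: #{format}" unless FORMATS.key?(format)

    @content = content
    @format = format
    @language_code = language_code
  end
end

# A known format key is accepted and stored as given.
doc = DocumentSketch.new('Hello world', format: :markdown, language_code: 'de')

# An unknown format key is rejected before any state is set.
error_message =
  begin
    DocumentSketch.new('Hello world', format: :html)
    nil
  rescue ArgumentError => e
    e.message
  end
```

Only the keys of `FORMATS` are checked, so the format must be passed as a Symbol; a String such as `'markdown'` would be rejected by the same guard.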
Instance Attribute Details
#content ⇒ Object (readonly)
Returns the value of attribute content.
# File 'lib/kotoshu/documents/document.rb', line 87
def content
  @content
end
#format ⇒ Object (readonly)
Returns the value of attribute format.
# File 'lib/kotoshu/documents/document.rb', line 87
def format
  @format
end
#language_code ⇒ Object (readonly)
Returns the value of attribute language_code.
# File 'lib/kotoshu/documents/document.rb', line 87
def language_code
  @language_code
end
Class Method Details
.detect_format(content) ⇒ Symbol
Detect format from content.
# File 'lib/kotoshu/documents/document.rb', line 178
def self.detect_format(content)
  return :markdown if content.start_with?('#')
  return :code if content.end_with?('.')
  :text
end
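To see what the heuristic returns for typical inputs, the sketch below copies the method body from the listing above into a free-standing method. The sample strings are made up for illustration.

```ruby
# Copy of the heuristic from the listing above, as a free-standing method.
def detect_format(content)
  return :markdown if content.start_with?('#')
  return :code if content.end_with?('.')

  :text
end

heading  = detect_format("# Title\n\nBody text.") # leading '#' wins over trailing '.'
sentence = detect_format('A plain sentence.')     # trailing '.' is classified as :code
fragment = detect_format('no trailing period')    # neither rule matches
```

The checks run in order, so a string that starts with `#` is reported as `:markdown` even when it ends with a period. As listed, prose that ends with a period is reported as `:code`, and `:asciidoc` is never returned.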
.detect_language_from_path(path) ⇒ String
Detect language code from file path.
# File 'lib/kotoshu/documents/document.rb', line 219
def self.detect_language_from_path(path)
  # Extract from path like "README.en.md" or "document.de.txt"
  if path =~ /\.([a-z]{2})\./i
    Regexp.last_match(1)
  else
    'en'
  end
end
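The pattern matches the first two-letter segment enclosed by dots. A small sketch, copying the method from the listing above into a free-standing method; the file names are illustrative only.

```ruby
# Copy of the method from the listing above, as a free-standing method.
def detect_language_from_path(path)
  if path =~ /\.([a-z]{2})\./i
    Regexp.last_match(1)
  else
    'en'
  end
end

readme   = detect_language_from_path('README.en.md')         # ".en." matches
german   = detect_language_from_path('docs/document.de.txt') # ".de." matches
fallback = detect_language_from_path('notes.txt')            # no ".xx." segment, default applies
upper    = detect_language_from_path('README.FR.md')         # /i matches, case is preserved
script   = detect_language_from_path('script.rb.bak')        # any two-letter segment matches
```

Because the capture is returned unchanged, an upper-case segment yields an upper-case code, and two-letter segments that are not language codes (such as `rb`) are returned as well.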
.from_file(path) ⇒ Document
Create document from file.
# File 'lib/kotoshu/documents/document.rb', line 188
def self.from_file(path)
  content = File.read(path, encoding: 'UTF-8')
  format = detect_format(content)
  language_code = detect_language_from_path(path)

  case format
  when :markdown
    MarkdownDocument.new(content, language_code: language_code)
  when :asciidoc
    AsciidocDocument.new(content, language_code: language_code)
  else
    PlainTextDocument.new(content, language_code: language_code)
  end
end
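The dispatch in `.from_file` can be sketched end to end with stand-in classes. Everything named `*Sketch` below is hypothetical and only mirrors the shape of the listing above; the real `MarkdownDocument`, `AsciidocDocument`, and `PlainTextDocument` classes are not shown on this page.

```ruby
require 'tmpdir'

# Stand-in base class (hypothetical) mirroring the class methods listed above.
class FileDocumentSketch
  attr_reader :content, :language_code

  def initialize(content, language_code: 'en')
    @content = content
    @language_code = language_code
  end

  def self.detect_format(content)
    return :markdown if content.start_with?('#')
    return :code if content.end_with?('.')

    :text
  end

  def self.detect_language_from_path(path)
    path =~ /\.([a-z]{2})\./i ? Regexp.last_match(1) : 'en'
  end

  def self.from_file(path)
    content = File.read(path, encoding: 'UTF-8')
    language_code = detect_language_from_path(path)

    case detect_format(content)
    when :markdown then MarkdownSketch.new(content, language_code: language_code)
    when :asciidoc then AsciidocSketch.new(content, language_code: language_code)
    else PlainTextSketch.new(content, language_code: language_code)
    end
  end
end

# Hypothetical subclasses standing in for the real document classes.
class MarkdownSketch < FileDocumentSketch; end
class AsciidocSketch < FileDocumentSketch; end
class PlainTextSketch < FileDocumentSketch; end

# Write two sample files into a temporary directory and load them.
markdown_doc, prose_doc = Dir.mktmpdir do |dir|
  md_path = File.join(dir, 'guide.de.md')
  txt_path = File.join(dir, 'notes.txt')
  File.write(md_path, "# Anleitung\n\nText")
  File.write(txt_path, 'A sentence that ends with a period.')
  [FileDocumentSketch.from_file(md_path), FileDocumentSketch.from_file(txt_path)]
end
```

In this sketch the second file is detected as `:code`, which has no branch of its own and therefore falls through to the plain-text class, matching the `else` arm of the listing.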
.from_string(content, language_code: 'en') ⇒ Document
Create document from string with format detection.
# File 'lib/kotoshu/documents/document.rb', line 208
def self.from_string(content, language_code: 'en')
  format = detect_format(content)
  new(content, format: format, language_code: language_code)
end
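Unlike `.from_file`, this method calls `new` on its receiver instead of choosing a subclass. A minimal sketch of that behavior, assuming stand-in classes (`StringDocumentSketch` and `NotesSketch` are hypothetical names) that copy the relevant listings:

```ruby
# Stand-in (hypothetical) copying the constructor, detect_format, and
# from_string from the listings on this page.
class StringDocumentSketch
  FORMATS = { text: 'Plain Text', markdown: 'Markdown', asciidoc: 'AsciiDoc', code: 'Code' }.freeze

  attr_reader :content, :format, :language_code

  def initialize(content, format: :text, language_code: 'en')
    raise ArgumentError, "Invalid format: #{format}" unless FORMATS.key?(format)

    @content = content
    @format = format
    @language_code = language_code
  end

  def self.detect_format(content)
    return :markdown if content.start_with?('#')
    return :code if content.end_with?('.')

    :text
  end

  def self.from_string(content, language_code: 'en')
    format = detect_format(content)
    new(content, format: format, language_code: language_code)
  end
end

# Hypothetical subclass that inherits the constructor unchanged.
class NotesSketch < StringDocumentSketch; end

doc = StringDocumentSketch.from_string("# Notes\n\nSome text", language_code: 'fr')
sub = NotesSketch.from_string('plain words')
```

Here `doc` is an instance of the base stand-in with format `:markdown`, while `sub` is a `NotesSketch` because `new` is resolved on the receiver. Whether the real subclasses accept a `format:` keyword is not shown on this page, so calling `.from_string` on them may behave differently.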
Instance Method Details
#apply(corrections) ⇒ Document
Apply corrections and return new document.
# File 'lib/kotoshu/documents/document.rb', line 149
def apply(corrections)
  raise NotImplementedError, "#{self.class} must implement #apply"
end
#context_for(location, window: 5) ⇒ Models::Context
Get context around a specific location.
# File 'lib/kotoshu/documents/document.rb', line 141
def context_for(location, window: 5)
  raise NotImplementedError, "#{self.class} must implement #context_for"
end
#get_node(path) ⇒ Object?
Get node at a specific path (for structured formats).
# File 'lib/kotoshu/documents/document.rb', line 123
def get_node(path)
  raise NotImplementedError, "#{self.class} must implement #get_node"
end
#line_count ⇒ Integer
Get line count.
# File 'lib/kotoshu/documents/document.rb', line 163
def line_count
  @content.lines.size
end
#name ⇒ String
Get document name (for display).
# File 'lib/kotoshu/documents/document.rb', line 170
def name
  "document"
end
#replace_node(location, new_text) ⇒ Document
Replace text at a specific location.
# File 'lib/kotoshu/documents/document.rb', line 132
def replace_node(location, new_text)
  raise NotImplementedError, "#{self.class} must implement #replace_node"
end
#text_nodes ⇒ Array<TextNode>
Get all text nodes for spell checking.
Subclasses implement format-specific text extraction.
# File 'lib/kotoshu/documents/document.rb', line 115
def text_nodes
  raise NotImplementedError, "#{self.class} must implement #text_nodes"
end
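The abstract methods on this page all follow the same pattern: the base class raises and a subclass overrides. The sketch below shows one possible override of `#text_nodes`. All names ending in `Sketch` are hypothetical, and the node structure is an assumption, since the real `TextNode` class is not shown here.

```ruby
# Stand-in base class (hypothetical) with one abstract method, as in the listing.
class AbstractDocumentSketch
  attr_reader :content

  def initialize(content)
    @content = content
  end

  def text_nodes
    raise NotImplementedError, "#{self.class} must implement #text_nodes"
  end
end

# Hypothetical node type; the fields are an assumption for illustration.
TextNodeSketch = Struct.new(:text, :line)

# Hypothetical subclass: one node per non-blank line, with 1-based line numbers.
class LineDocumentSketch < AbstractDocumentSketch
  def text_nodes
    content.each_line.with_index(1).map do |line, number|
      stripped = line.strip
      TextNodeSketch.new(stripped, number) unless stripped.empty?
    end.compact
  end
end

nodes = LineDocumentSketch.new("first line\n\nthird line\n").text_nodes

# NotImplementedError descends from ScriptError, not StandardError,
# so it is rescued by name here; a bare `rescue` would not catch it.
base_error =
  begin
    AbstractDocumentSketch.new('x').text_nodes
  rescue NotImplementedError => e
    e.message
  end
```

Callers that want to probe for an unimplemented method should therefore rescue `NotImplementedError` explicitly.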
#word_count ⇒ Integer
Get word count.
# File 'lib/kotoshu/documents/document.rb', line 156
def word_count
  @content.split(/\s+/).size
end
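Both counters are one-line expressions on the content string, so their edge cases follow directly from `String#lines` and `String#split`. A short sketch using plain strings; the comparison with argument-less `split` is included only for contrast and is not what the listing uses.

```ruby
content = "one two\nthree\n"

line_count = content.lines.size        # same expression as #line_count
word_count = content.split(/\s+/).size # same expression as #word_count

# With leading whitespace, the regex form keeps an empty first field,
# while split without arguments discards it.
padded = '  one two'
padded_regex_count = padded.split(/\s+/).size
padded_plain_count = padded.split.size

# Empty content has no lines and no words.
empty_lines = ''.lines.size
empty_words = ''.split(/\s+/).size
```

For content that begins with whitespace, the expression used by `#word_count` therefore counts one more than the number of visible words.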