Class: TopSecret::Text
- Inherits: Object
  - Object
  - TopSecret::Text
- Defined in:
  - lib/top_secret/text.rb
  - lib/top_secret/text/result.rb
  - lib/top_secret/text/scan_result.rb
  - lib/top_secret/text/batch_result.rb
  - lib/top_secret/text/global_mapping.rb
  - lib/top_secret/text/label_sequence.rb
Overview
Processes text to identify and redact sensitive information using configured filters.
Defined Under Namespace
Classes: BatchResult, GlobalMapping, LabelSequence, Result, ScanResult
Class Method Summary
- .clear_model_cache! ⇒ void
  Clears the cached model, forcing reinitialization on next access.
- .filter(input, custom_filters: [], **filters) ⇒ Result
  Convenience method to create an instance and filter input.
- .filter_all(messages, custom_filters: [], **filters) ⇒ BatchResult
  Filters multiple messages with globally consistent redaction labels.
- .scan(input, custom_filters: [], **filters) ⇒ ScanResult
  Convenience method to scan input text for sensitive information without redacting it.
- .shared_model ⇒ Mitie::NER, NullModel
  Returns a cached MITIE model instance to avoid expensive reinitialization.
Instance Method Summary
- #filter ⇒ Result
  Applies configured filters to the input, redacting matches and building a mapping.
- #initialize(input, custom_filters: [], filters: {}, model: nil) ⇒ Text (constructor)
  A new instance of Text.
- #scan ⇒ ScanResult
  Scans the input text for sensitive information using configured filters.
Constructor Details
#initialize(input, custom_filters: [], filters: {}, model: nil) ⇒ Text
Returns a new instance of Text.
# File 'lib/top_secret/text.rb', line 49

    def initialize(input, custom_filters: [], filters: {}, model: nil)
      @input = input
      @output = input.dup
      @mapping = {}

      @model = model || default_model
      @filters = filters
      @custom_filters = custom_filters
    end
Class Method Details
.clear_model_cache! ⇒ void
This method returns an undefined value.
Clears the cached model, forcing reinitialization on next access.
# File 'lib/top_secret/text.rb', line 38

    def clear_model_cache!
      @mutex.synchronize do
        @shared_model = nil
      end
    end
.filter(input, custom_filters: [], **filters) ⇒ Result
Convenience method to create an instance and filter input.
# File 'lib/top_secret/text.rb', line 67

    def self.filter(input, custom_filters: [], **filters)
      new(input, filters:, custom_filters:).filter
    end
.filter_all(messages, custom_filters: [], **filters) ⇒ BatchResult
Filters multiple messages with globally consistent redaction labels.
Processes a collection of messages and ensures that identical sensitive values receive the same redaction labels across all messages. This is useful when processing conversation threads or document collections where consistency matters.
# File 'lib/top_secret/text.rb', line 94

    def self.filter_all(messages, custom_filters: [], **filters)
      Text::BatchResult.…(messages, custom_filters:, **filters)
    end
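To illustrate what "globally consistent" labels mean in practice, here is a minimal, self-contained sketch. It is not the gem's implementation: `EMAIL_PATTERN`, `redact_all`, and the `[EMAIL_n]` label format are hypothetical names chosen for the example, and only a single email pattern is handled. The point it shows is that one mapping shared across every message makes a repeated value resolve to the same label.

```ruby
# Hypothetical sketch: one shared mapping keeps labels stable across messages.
# EMAIL_PATTERN and redact_all are illustrative names, not part of TopSecret.
EMAIL_PATTERN = /\b[\w.+-]+@[\w-]+\.[\w.-]+\b/

def redact_all(messages)
  labels = {} # sensitive value => label, shared by every message

  outputs = messages.map do |message|
    message.gsub(EMAIL_PATTERN) do |value|
      # Reuse the existing label if this value was seen in any earlier message.
      labels[value] ||= "[EMAIL_#{labels.size + 1}]"
    end
  end

  # Invert so the mapping reads label => original value.
  { outputs: outputs, mapping: labels.invert }
end
```

Because the label table outlives each individual message, a value that appears in the first message and again in the last is replaced by the same label in both.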
.scan(input, custom_filters: [], **filters) ⇒ ScanResult
Convenience method to scan input text for sensitive information without redacting it.
This method detects sensitive information using configured filters but does not modify the original text. Use this when you only need to check whether sensitive data exists or to get a mapping of what was found.
# File 'lib/top_secret/text.rb', line 124

    def self.scan(input, custom_filters: [], **filters)
      new(input, filters:, custom_filters:).scan
    end
.shared_model ⇒ Mitie::NER, NullModel
Returns a cached MITIE model instance to avoid expensive reinitialization.
# File 'lib/top_secret/text.rb', line 21

    def shared_model
      return @shared_model if @shared_model

      @mutex.synchronize do
        return @shared_model if @shared_model

        @shared_model = if TopSecret.model_path
          Mitie::NER.new(TopSecret.model_path)
        else
          NullModel.new
        end
      end
    end
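The listing above checks the cached value once before taking the lock and once again inside it, a pattern usually called double-checked locking. The following standalone sketch shows the same idea with stand-in names: `ExpensiveModel` and `ModelCache` are hypothetical and exist only for this example, and the build counter is there purely to make the behavior observable.

```ruby
# Hypothetical stand-in for a model that is costly to construct.
class ExpensiveModel
  @builds = 0

  class << self
    attr_accessor :builds
  end

  def initialize
    self.class.builds += 1 # only ever called while the cache lock is held
  end
end

# Hypothetical cache illustrating the check / lock / re-check sequence.
module ModelCache
  @mutex = Mutex.new
  @shared = nil

  class << self
    def shared
      return @shared if @shared # fast path: no locking once built

      @mutex.synchronize do
        # Re-check inside the lock: another thread may have built it meanwhile.
        @shared ||= ExpensiveModel.new
      end
    end

    def clear!
      @mutex.synchronize { @shared = nil }
    end
  end
end
```

Calling `ModelCache.shared` from several threads should construct the model once; after `ModelCache.clear!`, the next call rebuilds it.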
Instance Method Details
#filter ⇒ Result
Applies configured filters to the input, redacting matches and building a mapping.
# File 'lib/top_secret/text.rb', line 165

    def filter
      scan_result = scan

      substitute_text if scan_result.sensitive?

      Text::Result.new(input, output, scan_result.mapping)
    end
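As the listing shows, `#filter` is a two-phase operation: it scans first, then substitutes only when the scan found something. A rough standalone sketch of that shape follows. Everything in it is an assumption made for illustration: `ScanOutcome`, `SSN_PATTERN`, `scan_text`, `filter_text`, and the string-keyed mapping are hypothetical and do not reflect the gem's internal `substitute_text` or its actual mapping format.

```ruby
# Hypothetical two-phase redaction: detect first, substitute second.
ScanOutcome = Struct.new(:mapping) do
  def sensitive?
    !mapping.empty?
  end
end

SSN_PATTERN = /\b\d{3}-\d{2}-\d{4}\b/ # illustrative pattern only

# Phase 1: build a label => value mapping without touching the text.
def scan_text(input)
  mapping = {}
  input.scan(SSN_PATTERN).uniq.each_with_index do |value, index|
    mapping["SSN_#{index + 1}"] = value
  end
  ScanOutcome.new(mapping)
end

# Phase 2: substitute labels into a copy, and only if something was found.
def filter_text(input)
  outcome = scan_text(input)
  output = input.dup # work on a copy so the caller's string is never mutated

  if outcome.sensitive?
    outcome.mapping.each { |label, value| output.gsub!(value, "[#{label}]") }
  end

  { input: input, output: output, mapping: outcome.mapping }
end
```

Keeping detection separate from substitution is what lets a scan-only entry point reuse the first phase unchanged.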
#scan ⇒ ScanResult
Scans the input text for sensitive information using configured filters.
This method applies all active filters to detect sensitive information but does not redact the original text. It builds a mapping of found values and returns a ScanResult that reports whether any sensitive information was detected.
# File 'lib/top_secret/text.rb', line 137

    def scan
      @doc ||= model.doc(@output) if model
      @entities ||= doc.entities if model

      validate_filters!

      all_filters.each do |filter|
        next if filter.nil?

        values = case filter
        when TopSecret::Filters::Regex
          filter.call(input)
        when TopSecret::Filters::NER
          filter.call(entities)
        else
          raise Error, "Unsupported filter. Expected TopSecret::Filters::Regex or TopSecret::Filters::NER, but got #{filter.class}"
        end

        build_mapping(values, label: filter.label)
      end

      ScanResult.new(mapping)
    end
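The core of `#scan` is a `case` dispatch on the filter's class: regex filters receive the raw text, NER filters receive the extracted entities, and anything else raises. The sketch below reproduces that dispatch in isolation. `PatternFilter`, `EntityFilter`, `collect_matches`, and the hash-shaped entities are hypothetical stand-ins invented for this example, not the gem's `TopSecret::Filters` classes or MITIE's entity format.

```ruby
# Hypothetical filter types; names and shapes are invented for this sketch.
PatternFilter = Struct.new(:label, :regex) do
  def call(text)
    text.scan(regex).flatten
  end
end

EntityFilter = Struct.new(:label, :tag) do
  # entities: array of hashes such as { tag: "PERSON", text: "Ralph" }
  def call(entities)
    entities.select { |entity| entity[:tag] == tag }.map { |entity| entity[:text] }
  end
end

def collect_matches(text, entities, filters)
  # compact skips disabled (nil) filters, like the `next if filter.nil?` guard.
  filters.compact.each_with_object({}) do |filter, found|
    values =
      case filter
      when PatternFilter then filter.call(text)     # regex filters see raw text
      when EntityFilter then filter.call(entities)  # entity filters see NER output
      else raise ArgumentError, "Unsupported filter: #{filter.class}"
      end

    (found[filter.label] ||= []).concat(values)
  end
end
```

Failing loudly on an unknown filter type, rather than skipping it, means a misconfigured filter cannot silently let sensitive text through.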