Class: Google::Cloud::DocumentAI::V1beta3::OcrConfig
- Inherits:
- Object
- Extended by:
- Protobuf::MessageExts::ClassMethods
- Includes:
- Protobuf::MessageExts
- Defined in:
- proto_docs/google/cloud/documentai/v1beta3/document_io.rb
Overview
Config for Document OCR.
Defined Under Namespace
Classes: Hints, PremiumFeatures
Instance Attribute Summary
-
#advanced_ocr_options ⇒ ::Array<::String>
A list of advanced OCR options to further fine-tune OCR behavior.
-
#compute_style_info ⇒ ::Boolean
Deprecated. This field may be removed in the next major version update.
-
#disable_character_boxes_detection ⇒ ::Boolean
Turn off character box detector in OCR engine.
-
#enable_image_quality_scores ⇒ ::Boolean
Enables intelligent document quality scores after OCR.
-
#enable_native_pdf_parsing ⇒ ::Boolean
Enables special handling for PDFs with existing text information.
-
#enable_symbol ⇒ ::Boolean
Includes symbol level OCR information if set to true.
-
#hints ⇒ ::Google::Cloud::DocumentAI::V1beta3::OcrConfig::Hints
Hints for the OCR model.
-
#premium_features ⇒ ::Google::Cloud::DocumentAI::V1beta3::OcrConfig::PremiumFeatures
Configurations for premium OCR features.
Instance Attribute Details
#advanced_ocr_options ⇒ ::Array<::String>
Returns (::Array<::String>): A list of advanced OCR options to further fine-tune OCR behavior. Current valid values are:
legacy_layout: a heuristic layout detection algorithm that serves as an alternative to the current ML-based layout detection algorithm. Choose whichever layout algorithm best suits your documents.
# File 'proto_docs/google/cloud/documentai/v1beta3/document_io.rb', line 186

class OcrConfig
  include ::Google::Protobuf::MessageExts
  extend ::Google::Protobuf::MessageExts::ClassMethods

  # Hints for OCR Engine
  # @!attribute [rw] language_hints
  #   @return [::Array<::String>]
  #     List of BCP-47 language codes to use for OCR. In most cases, not
  #     specifying it yields the best results since it enables automatic language
  #     detection. For languages based on the Latin alphabet, setting hints is
  #     not needed. In rare cases, when the language of the text in the
  #     image is known, setting a hint will help get better results (although it
  #     will be a significant hindrance if the hint is wrong).
  class Hints
    include ::Google::Protobuf::MessageExts
    extend ::Google::Protobuf::MessageExts::ClassMethods
  end

  # Configurations for premium OCR features.
  # @!attribute [rw] enable_selection_mark_detection
  #   @return [::Boolean]
  #     Turn on selection mark detector in OCR engine. Only available in OCR 2.0
  #     (and later) processors.
  # @!attribute [rw] compute_style_info
  #   @return [::Boolean]
  #     Turn on font identification model and return font style information.
  # @!attribute [rw] enable_math_ocr
  #   @return [::Boolean]
  #     Turn on the model that can extract LaTeX math formulas.
  class PremiumFeatures
    include ::Google::Protobuf::MessageExts
    extend ::Google::Protobuf::MessageExts::ClassMethods
  end
end
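As a rough sketch of how the advanced_ocr_options list might be prepared, the snippet below builds a plain Hash shaped like OcrConfig and does not call the API. It assumes the generated client accepts a Hash wherever this message is expected. The helper name ocr_config_with_options and its allowed: parameter are hypothetical, and the default allow-list contains only the value documented for this attribute.

```ruby
# Hypothetical helper (not part of the library): returns an OcrConfig-shaped
# Hash and rejects options outside the documented list of valid values.
def ocr_config_with_options(options, allowed: ["legacy_layout"])
  unknown = options - allowed
  unless unknown.empty?
    raise ArgumentError, "unknown advanced OCR options: #{unknown.join(', ')}"
  end

  # Copy the array so later changes by the caller do not leak into the config.
  { advanced_ocr_options: options.dup }
end

config = ocr_config_with_options(["legacy_layout"])
```

Checking the values locally is a design choice for this sketch only; the service remains the authority on which options are valid.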
#compute_style_info ⇒ ::Boolean
This field is deprecated and may be removed in the next major version update.
Returns (::Boolean): Turn on the font identification model and return font style information. Deprecated; use PremiumFeatures.compute_style_info instead.
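One possible way to follow this deprecation note is sketched below on plain Hashes shaped like OcrConfig, assuming the generated client accepts a Hash in place of the message. The helper migrate_compute_style_info is hypothetical and only moves the flag; it does not call the API.

```ruby
# Hypothetical helper (not part of the library): moves the deprecated
# top-level :compute_style_info flag under :premium_features.
def migrate_compute_style_info(config)
  migrated = config.dup
  return migrated unless migrated.key?(:compute_style_info)

  flag = migrated.delete(:compute_style_info)
  premium = (migrated[:premium_features] || {}).dup
  # Keep an explicit premium_features value if the caller already set one.
  premium[:compute_style_info] = flag unless premium.key?(:compute_style_info)
  migrated[:premium_features] = premium
  migrated
end

old_config = { enable_symbol: true, compute_style_info: true }
new_config = migrate_compute_style_info(old_config)
```

The input Hash is left unchanged, so the old and new shapes can be compared side by side during a migration.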
#disable_character_boxes_detection ⇒ ::Boolean
Returns (::Boolean): Turn off the character box detector in the OCR engine. Character box detection is enabled by default in OCR 2.0 (and later) processors.
#enable_image_quality_scores ⇒ ::Boolean
Returns (::Boolean): Enables intelligent document quality scores after OCR. Can help with diagnosing why OCR responses are of poor quality for a given input. Adds latency comparable to regular OCR to the process call.
#enable_native_pdf_parsing ⇒ ::Boolean
Returns (::Boolean): Enables special handling for PDFs with existing text information. Results in better text extraction quality for such PDF inputs.
#enable_symbol ⇒ ::Boolean
Returns (::Boolean): Includes symbol-level OCR information if set to true.
#hints ⇒ ::Google::Cloud::DocumentAI::V1beta3::OcrConfig::Hints
Returns (::Google::Cloud::DocumentAI::V1beta3::OcrConfig::Hints): Hints for the OCR model.
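The Hints message carries a single documented field, language_hints, a list of BCP-47 language codes. A minimal sketch of setting it follows, again using a plain Hash shaped like OcrConfig on the assumption that the generated client accepts one. The helper ocr_config_for_languages is hypothetical.

```ruby
# Hypothetical helper (not part of the library): builds an OcrConfig-shaped Hash.
# With no codes, :hints is omitted, which the field documentation describes as
# the best choice in most cases because it enables automatic language detection.
def ocr_config_for_languages(language_codes = [])
  config = {}
  config[:hints] = { language_hints: language_codes.dup } unless language_codes.empty?
  config
end

auto_detect = ocr_config_for_languages
japanese    = ocr_config_for_languages(["ja"])
```

Because a wrong hint is documented as a significant hindrance, this sketch sets hints only when the caller passes codes explicitly.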
#premium_features ⇒ ::Google::Cloud::DocumentAI::V1beta3::OcrConfig::PremiumFeatures
Returns (::Google::Cloud::DocumentAI::V1beta3::OcrConfig::PremiumFeatures): Configurations for premium OCR features.
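The sketch below shows one way a premium_features value could be assembled as a plain Hash, assuming the generated client accepts a Hash in place of the PremiumFeatures message. The helper premium_features_config is hypothetical; its list of field names is taken from the attributes documented for OcrConfig::PremiumFeatures.

```ruby
# Hypothetical helper (not part of the library): wraps premium feature flags
# in an OcrConfig-shaped Hash after checking the field names.
def premium_features_config(flags)
  # Attributes documented for OcrConfig::PremiumFeatures.
  documented = %i[enable_selection_mark_detection compute_style_info enable_math_ocr]
  unknown = flags.keys - documented
  unless unknown.empty?
    raise ArgumentError, "unknown premium feature: #{unknown.join(', ')}"
  end

  { premium_features: flags.dup }
end

# Selection mark detection is documented as available only in OCR 2.0
# (and later) processors.
config = premium_features_config({ enable_selection_mark_detection: true, enable_math_ocr: true })
```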