Tesseract OCR Wrapper (v1.0)
About this version
- Submitter: keighrim
- Submission Time: 2023-07-26T00:03:43+00:00
- Prebuilt Container Image: ghcr.io/clamsproject/app-tesseractocr-wrapper:v1.0
Release Notes
(no notes provided by the developer)
About this app (See raw metadata.json)
This tool applies Tesseract OCR to a video or image and generates text boxes and OCR results.
- App ID: http://apps.clams.ai/tesseractocr-wrapper/v1.0
- App License: MIT
- Source Repository: https://github.com/clamsproject/app-tesseractocr-wrapper (source tree of the submitted version)
- Analyzer Version: tesseract4
- Analyzer License: apache
Inputs
(Note: “*” as a property value means that the property is required but can be any value.)
- http://mmif.clams.ai/vocabulary/VideoDocument/v1 (required) (of any properties)
- http://mmif.clams.ai/vocabulary/BoundingBox/v1 (required)
  - boxType = “text”
- http://mmif.clams.ai/vocabulary/TimeFrame/v1 (of any properties)
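The input requirements above can be checked before a file is sent to the app. The following is a minimal sketch, not part of the app itself. It assumes the MMIF JSON keeps documents under a top-level `documents` list and annotations under `views[i]["annotations"]`, each with `@type` and `properties` keys. The helper name `check_inputs` and the sample data are hypothetical.

```python
VIDEO = "http://mmif.clams.ai/vocabulary/VideoDocument/v1"
BBOX = "http://mmif.clams.ai/vocabulary/BoundingBox/v1"
TIMEFRAME = "http://mmif.clams.ai/vocabulary/TimeFrame/v1"


def check_inputs(mmif: dict) -> dict:
    """Report which of the declared input types are present.

    Assumption: documents live under mmif["documents"] and annotations
    under mmif["views"][i]["annotations"].
    """
    docs = mmif.get("documents", [])
    annotations = [
        ann
        for view in mmif.get("views", [])
        for ann in view.get("annotations", [])
    ]
    has_video = any(d.get("@type") == VIDEO for d in docs)
    # The app requires BoundingBox annotations whose boxType is "text".
    has_text_box = any(
        a.get("@type") == BBOX
        and a.get("properties", {}).get("boxType") == "text"
        for a in annotations
    )
    # TimeFrame annotations are listed as an input but not marked required.
    has_timeframe = any(a.get("@type") == TIMEFRAME for a in annotations)
    return {
        "video_document": has_video,
        "text_bounding_box": has_text_box,
        "time_frame": has_timeframe,
        "ready": has_video and has_text_box,
    }


# Hypothetical sample data, for illustration only.
sample = {
    "documents": [{"@type": VIDEO, "properties": {"id": "d1"}}],
    "views": [
        {
            "id": "v1",
            "annotations": [
                {"@type": BBOX, "properties": {"id": "bb1", "boxType": "text"}}
            ],
        }
    ],
}
report = check_inputs(sample)
```

Here `report["ready"]` is true because both required types are present, while `report["time_frame"]` is false, which only matters when `frameType` filtering is wanted.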
Configurable Parameters
(Note: Multivalued means the parameter can have one or more values.)
- frameType: optional, defaults to ""
  - Type: string
  - Multivalued: True
  Specifies which TimeFrame types to use for filtering “text”-typed BoundingBox annotations. Can be “slate”, “chyron”, “speech”, etc. If not set, the app won’t use TimeFrames for filtering.
- threshold: optional, defaults to 0.9
  - Type: number
  - Multivalued: False
  A value between 0 and 1 used to filter out low-confidence text boxes.
- psm: optional, defaults to 0
  - Type: integer
  - Multivalued: False
  - Choices: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
  Tesseract Page Segmentation Modes. See https://tesseract-ocr.github.io/tessdoc/ImproveQuality.html#page-segmentation-method
- pretty: optional, defaults to false
  - Type: boolean
  - Multivalued: False
  - Choices: false, true
  If true, the JSON body of the HTTP response will be re-formatted with 2-space indentation.
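As an illustration of how these parameters fit together, the sketch below validates values against the ranges listed above and encodes them as a URL query string. It assumes the parameters are passed as query parameters on the HTTP request, and that a multivalued parameter is sent by repeating its key. The function name `build_query` and the `localhost:5000` address are hypothetical.

```python
from urllib.parse import urlencode

PSM_CHOICES = range(14)  # 0 through 13, as listed under "Choices"


def build_query(frame_types=(), threshold=0.9, psm=0, pretty=False) -> str:
    """Validate parameter values and encode them as a query string."""
    if not 0 <= threshold <= 1:
        raise ValueError("threshold must be between 0 and 1")
    if not isinstance(psm, int) or psm not in PSM_CHOICES:
        raise ValueError("psm must be an integer from 0 to 13")
    # Assumption: a multivalued parameter is sent by repeating its key.
    params = [("frameType", ft) for ft in frame_types]
    params += [
        ("threshold", threshold),
        ("psm", psm),
        ("pretty", str(pretty).lower()),
    ]
    return urlencode(params)


query = build_query(
    frame_types=["slate", "chyron"], threshold=0.8, psm=6, pretty=True
)
# Hypothetical address of a locally running instance of the app.
url = f"http://localhost:5000/?{query}"
```

Validating on the client side is optional, but it surfaces an out-of-range `psm` or `threshold` before any request is made.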
Outputs
(Note: “*” as a property value means that the property is required but can be any value.)
(Note: Not all output annotations are always generated.)
- http://mmif.clams.ai/vocabulary/TextDocument/v1 (of any properties)
- http://mmif.clams.ai/vocabulary/Alignment/v1 (of any properties)
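To show how the two output types might be consumed together, here is a minimal sketch that pairs each TextDocument with the annotation it is aligned to. It rests on several assumptions that this page does not confirm: that the recognized text sits under `properties["text"]["@value"]`, that each Alignment has `source` and `target` properties, and that `target` refers to the TextDocument. The function name `collect_ocr_text` and the sample data are hypothetical.

```python
TEXT_DOC = "http://mmif.clams.ai/vocabulary/TextDocument/v1"
ALIGNMENT = "http://mmif.clams.ai/vocabulary/Alignment/v1"


def collect_ocr_text(mmif: dict) -> list:
    """Return (source_id, text) pairs found in the output views."""
    results = []
    for view in mmif.get("views", []):
        annotations = view.get("annotations", [])
        # Index TextDocument annotations by id (assumed property layout).
        texts = {
            a["properties"]["id"]: a["properties"].get("text", {}).get("@value", "")
            for a in annotations
            if a.get("@type") == TEXT_DOC
        }
        for a in annotations:
            if a.get("@type") != ALIGNMENT:
                continue
            props = a.get("properties", {})
            # Assumption: "target" points at the TextDocument.
            target = props.get("target")
            if target in texts:
                results.append((props.get("source"), texts[target]))
    return results


# Hypothetical output fragment, for illustration only.
sample_output = {
    "views": [
        {
            "id": "v2",
            "annotations": [
                {"@type": TEXT_DOC,
                 "properties": {"id": "td1", "text": {"@value": "HELLO"}}},
                {"@type": ALIGNMENT,
                 "properties": {"id": "al1", "source": "v1:bb1", "target": "td1"}},
            ],
        }
    ]
}
pairs = collect_ocr_text(sample_output)
```

Because not all output annotations are always generated, the function returns an empty list rather than failing when a view has no TextDocument or Alignment annotations.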