Scenes-with-text Detection (v7.1)
About this version
- Submitter: keighrim
- Submission Time: 2024-12-02T17:48:12+00:00
- Prebuilt Container Image: ghcr.io/clamsproject/app-swt-detection:v7.1
Release Notes
Release with newly trained models:
- training data is expanded with new annotations: https://github.com/clamsproject/aapb-annotations/pull/98 and https://github.com/clamsproject/aapb-annotations/pull/104
- label `U` is added; the total number of "raw" labels is now 18
- in addition to `convnext_lg` and `convnext_tiny`, `convnext_small`-based models are added; the default is now the `convnext_small` model
About this app (See raw metadata.json)
Detects scenes with text, like slates, chyrons and credits. This app can run in three modes, depending on the `useClassifier` and `useStitcher` parameters. When `useClassifier=true`, it runs in the "TimePoint mode" and generates TimePoint annotations. When `useStitcher=true`, it runs in the "TimeFrame mode" and generates TimeFrame annotations based on existing TimePoint annotations; if no TimePoint is found, it produces an error. By default, it runs in the "both" mode: it first generates TimePoint annotations and then TimeFrame annotations on top of them.
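As a hedged illustration of the three modes, here is a minimal sketch assuming the standard CLAMS HTTP interface (the app started from the prebuilt container and listening on localhost:5000; the port mapping and the `input.mmif` file name are assumptions, not part of this metadata):

```python
# A minimal sketch, assuming the app was started from the prebuilt image
# (e.g. `docker run -p 5000:5000 ghcr.io/clamsproject/app-swt-detection:v7.1`)
# and that "input.mmif" is a hypothetical MMIF file with a VideoDocument.
import requests

with open("input.mmif") as f:
    mmif_data = f.read()

# "TimePoint mode": classify sampled frames, skip stitching.
tp_only = requests.post(
    "http://localhost:5000/",
    params={"useClassifier": "true", "useStitcher": "false"},
    data=mmif_data,
)

# "TimeFrame mode": stitch TimePoints already present in the input MMIF
# (errors out if the input contains no TimePoint annotations).
tf_only = requests.post(
    "http://localhost:5000/",
    params={"useClassifier": "false", "useStitcher": "true"},
    data=mmif_data,
)

# Default "both" mode: classify, then stitch.
both = requests.post("http://localhost:5000/", data=mmif_data)
print(both.text)  # MMIF with new TimePoint and TimeFrame views
```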
- App ID: http://apps.clams.ai/swt-detection/v7.1
- App License: Apache 2.0
- Source Repository: https://github.com/clamsproject/app-swt-detection (source tree of the submitted version)
Inputs
(Note: “*” as a property value means that the property is required but can be any value.)
- http://mmif.clams.ai/vocabulary/VideoDocument/v1 (required) (with any properties)
Configurable Parameters
(Note: Multivalued means the parameter can have one or more values.)
- `useClassifier`: optional, defaults to `true`
  - Type: boolean
  - Multivalued: False
  - Choices: `false`, `true`

  Use the image classifier model to generate TimePoint annotations.
- `tpModelName`: optional, defaults to `convnext_small`
  - Type: string
  - Multivalued: False
  - Choices: `convnext_lg`, `convnext_tiny`, `convnext_small`

  Model name to use for classification, only applies when `useClassifier=true`.
- `tpUsePosModel`: optional, defaults to `true`
  - Type: boolean
  - Multivalued: False
  - Choices: `false`, `true`

  Use the model trained with positional features, only applies when `useClassifier=true`.
- `tpStartAt`: optional, defaults to `0`
  - Type: integer
  - Multivalued: False

  Number of milliseconds into the video to start processing, only applies when `useClassifier=true`.
- `tpStopAt`: optional, defaults to `9223372036854775807`
  - Type: integer
  - Multivalued: False

  Number of milliseconds into the video to stop processing, only applies when `useClassifier=true`.
- `tpSampleRate`: optional, defaults to `1000`
  - Type: integer
  - Multivalued: False

  Milliseconds between sampled frames, only applies when `useClassifier=true` (see the sampling sketch after this list).
- `useStitcher`: optional, defaults to `true`
  - Type: boolean
  - Multivalued: False
  - Choices: `false`, `true`

  Use the stitcher after classifying the TimePoints.
- `tfMinTPScore`: optional, defaults to `0.5`
  - Type: number
  - Multivalued: False

  Minimum score for a TimePoint to be included in a TimeFrame. A lower value will include more TimePoints in the TimeFrame (increasing recall at the cost of precision). Only applies when `useStitcher=true` (see the stitching sketch after this list).
- `tfMinTFScore`: optional, defaults to `0.9`
  - Type: number
  - Multivalued: False

  Minimum score for a TimeFrame. A lower value will include more TimeFrames in the output (increasing recall at the cost of precision). Only applies when `useStitcher=true`.
- `tfMinTFDuration`: optional, defaults to `5000`
  - Type: integer
  - Multivalued: False

  Minimum duration of a TimeFrame in milliseconds, only applies when `useStitcher=true`.
- `tfAllowOverlap`: optional, defaults to `false`
  - Type: boolean
  - Multivalued: False
  - Choices: `false`, `true`

  Allow overlapping time frames, only applies when `useStitcher=true`.
- `tfDynamicSceneLabels`: optional, defaults to `['credit', 'credits']`
  - Type: string
  - Multivalued: True

  Labels that are considered dynamic scenes. For dynamic scenes, TimeFrame annotations contain multiple representative points to follow any changes in the scene. Only applies when `useStitcher=true`.
- `tfLabelMap`: optional, defaults to `[]`
  - Type: map
  - Multivalued: True

  (See also `tfLabelMapPreset`; set `tfLabelMapPreset=nopreset` to make sure that a preset does not override `tfLabelMap` when using this.) Mapping from a label in the input TimePoint annotations to a new label in the stitched TimeFrame annotations. Must be formatted as `IN_LABEL:OUT_LABEL` (with a colon). To pass multiple mappings, use this parameter multiple times. When two or more TP labels are mapped to the same TF label, this effectively works as a "binning" operation. If no mapping is used, all input labels are passed through, meaning no change to either the TP or the TF labelset. However, when at least one label is mapped, all other "unset" labels are mapped to the negative label (`-`), and if `-` does not exist in the TF labelset, it is added automatically. Only applies when `useStitcher=true` (see the label-mapping sketch after this list).
- `tfLabelMapPreset`: optional, defaults to `relaxed`
  - Type: string
  - Multivalued: False
  - Choices: `noprebin`, `nomap`, `strict`, `simpler`, `simple`, `relaxed`, `binary-bars`, `binary-slate`, `binary-chyron-strict`, `binary-chyron-relaxed`, `binary-credits`

  (See also `tfLabelMap`.) Preset alias of a label mapping. If not `nopreset`, this parameter will override the `tfLabelMap` parameter. Available presets are:
  - `noprebin`: []
  - `nomap`: []
  - `strict`: ['B:Bars', 'S:Slate', 'S:H:Slate', 'S:C:Slate', 'S:D:Slate', 'S:B:Slate', 'S:G:Slate', 'I:Chyron-person', 'N:Chyron-person', 'C:Credits', 'R:Credits', 'M:Main', 'O:Opening', 'W:Opening', 'Y:Chyron-other', 'U:Chyron-other', 'K:Chyron-other', 'L:Other-text', 'G:Other-text', 'F:Other-text', 'E:Other-text', 'T:Other-text']
  - `simpler`: ['B:Bars', 'S:Slate', 'S:H:Slate', 'S:C:Slate', 'S:D:Slate', 'S:B:Slate', 'S:G:Slate', 'I:Chyron', 'N:Chyron', 'C:Credits', 'R:Credits']
  - `simple`: ['B:Bars', 'S:Slate', 'S:H:Slate', 'S:C:Slate', 'S:D:Slate', 'S:B:Slate', 'S:G:Slate', 'I:Chyron-person', 'N:Chyron-person', 'C:Credits', 'R:Credits', 'M:Other-text', 'O:Other-text', 'W:Other-text', 'Y:Other-text', 'U:Other-text', 'K:Other-text', 'L:Other-text', 'G:Other-text', 'F:Other-text', 'E:Other-text', 'T:Other-text']
  - `relaxed`: ['B:Bars', 'S:Slate', 'S:H:Slate', 'S:C:Slate', 'S:D:Slate', 'S:B:Slate', 'S:G:Slate', 'Y:Chyron', 'U:Chyron', 'K:Chyron', 'I:Chyron', 'N:Chyron', 'C:Credits', 'R:Credits', 'M:Other-text', 'O:Other-text', 'W:Other-text', 'L:Other-text', 'G:Other-text', 'F:Other-text', 'E:Other-text', 'T:Other-text']
  - `binary-bars`: ['B:Bars']
  - `binary-slate`: ['S:Slate', 'S:H:Slate', 'S:C:Slate', 'S:D:Slate', 'S:B:Slate', 'S:G:Slate']
  - `binary-chyron-strict`: ['I:Chyron-person', 'N:Chyron-person']
  - `binary-chyron-relaxed`: ['Y:Chyron', 'U:Chyron', 'K:Chyron', 'I:Chyron', 'N:Chyron']
  - `binary-credits`: ['C:Credits', 'R:Credits']

  Only applies when `useStitcher=true`.
- `pretty`: optional, defaults to `false`
  - Type: boolean
  - Multivalued: False
  - Choices: `false`, `true`

  The JSON body of the HTTP response will be re-formatted with 2-space indentation.
- `runningTime`: optional, defaults to `false`
  - Type: boolean
  - Multivalued: False
  - Choices: `false`, `true`

  The running time of the app will be recorded in the view metadata.
- `hwFetch`: optional, defaults to `false`
  - Type: boolean
  - Multivalued: False
  - Choices: `false`, `true`

  The hardware information (architecture, GPU, and vRAM) will be recorded in the view metadata.
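The `tpStartAt`/`tpStopAt`/`tpSampleRate` interaction referenced above can be pictured with a small sketch. This illustrates the documented semantics only, not the app's actual code, and the 10-second duration is a made-up example:

```python
# Illustration only (not the app's code): which timepoints get classified
# given tpStartAt, tpStopAt, and tpSampleRate (all in milliseconds).
def sample_timepoints(duration_ms: int,
                      tp_start_at: int = 0,
                      tp_stop_at: int = 9223372036854775807,
                      tp_sample_rate: int = 1000) -> list[int]:
    stop = min(duration_ms, tp_stop_at)
    # One sampled frame every tpSampleRate ms, from tpStartAt up to stop.
    return list(range(tp_start_at, stop, tp_sample_rate))

# With the defaults, a hypothetical 10-second video is sampled once per second.
print(sample_timepoints(10_000))  # [0, 1000, 2000, ..., 9000]
```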
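Similarly, the stitching thresholds (`tfMinTPScore`, `tfMinTFScore`, `tfMinTFDuration`) can be sketched as a filter over candidate TimeFrames. How the stitcher aggregates TimePoint scores into a TimeFrame score is not specified in this metadata, so the mean used below is an assumption:

```python
# Illustration only: the filtering semantics of the tf* thresholds.
from dataclasses import dataclass

@dataclass
class TimePoint:
    at_ms: int      # position in the video, in milliseconds
    score: float    # classifier confidence for the frame's label

def keep_timeframe(tps: list[TimePoint],
                   tf_min_tp_score: float = 0.5,
                   tf_min_tf_score: float = 0.9,
                   tf_min_tf_duration: int = 5000) -> bool:
    # TimePoints scoring below tfMinTPScore are excluded from the TimeFrame.
    kept = [tp for tp in tps if tp.score >= tf_min_tp_score]
    if not kept:
        return False
    # Assumption: the TimeFrame score is the mean of its TimePoint scores.
    tf_score = sum(tp.score for tp in kept) / len(kept)
    duration = kept[-1].at_ms - kept[0].at_ms
    # Frames that score too low or are too short are dropped.
    return tf_score >= tf_min_tf_score and duration >= tf_min_tf_duration
```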
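Finally, the `tfLabelMap` pass-through, binning, and negative-label behavior described above can be sketched as follows. Note that subtype labels such as `S:H` contain a colon themselves, so this sketch assumes the last colon separates IN_LABEL from OUT_LABEL:

```python
# Illustration only: the tfLabelMap semantics described above.
def build_label_map(mappings: list[str], raw_labels: list[str]) -> dict[str, str]:
    if not mappings:
        # No mapping given: every label passes through unchanged.
        return {label: label for label in raw_labels}
    # Assumption: the last colon splits IN_LABEL from OUT_LABEL, so that
    # a mapping like "S:H:Slate" maps the input label "S:H" to "Slate".
    label_map = dict(m.rsplit(":", 1) for m in mappings)
    # Unmapped labels fall through to the negative label "-".
    return {label: label_map.get(label, "-") for label in raw_labels}

raw = ["B", "S", "I", "C", "R"]  # a subset of the 18 raw labels, for brevity
# "Binning": two TP labels ("C" and "R") mapped to one TF label.
print(build_label_map(["C:Credits", "R:Credits"], raw))
# -> {'B': '-', 'S': '-', 'I': '-', 'C': 'Credits', 'R': 'Credits'}
```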
Outputs
(Note: “*” as a property value means that the property is required but can be any value.)
(Note: Not all output annotations are always generated.)
- http://mmif.clams.ai/vocabulary/TimeFrame/v5
  - timeUnit = "milliseconds"
- http://mmif.clams.ai/vocabulary/TimePoint/v4
  - timeUnit = "milliseconds"
  - labelset = ["B", "S", "I", "C", "R", "M", "O", "W", "N", "Y", "U", "K", "L", "G", "F", "E", "T", "P"]
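To consume the output, something like the following could work; this is a sketch assuming the mmif-python package, and `output.mmif` is a hypothetical file name:

```python
# A minimal sketch, assuming the mmif-python package (pip install mmif-python).
from mmif import Mmif, AnnotationTypes

with open("output.mmif") as f:
    mmif_obj = Mmif(f.read())

# Walk every view that contains TimeFrame annotations and print the
# stitched scene label of each frame (e.g. "Slate", "Chyron", "Credits").
for view in mmif_obj.get_views_contain(AnnotationTypes.TimeFrame):
    for tf in view.get_annotations(AnnotationTypes.TimeFrame):
        print(tf.get_property("label"))
```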