Scenes-with-text Detection (v7.5)
About this version
- Submitter: keighrim
- Submission Time: 2025-02-24T10:41:50+00:00
- Prebuilt Container Image: ghcr.io/clamsproject/app-swt-detection:v7.5
Release Notes
Re-release of v7.4 with support for a prebuilt image for the arm64 architecture.
About this app (See raw metadata.json)
Detects scenes with text, such as slates, chyrons, and credits. This app can run in three modes, depending on the `useClassifier` and `useStitcher` parameters. When `useClassifier=true`, it runs in "TimePoint mode" and generates TimePoint annotations. When `useStitcher=true`, it runs in "TimeFrame mode" and generates TimeFrame annotations based on existing TimePoint annotations; if no TimePoint annotations are found, it produces an error. By default, it runs in "both" mode: it first generates TimePoint annotations and then TimeFrame annotations on top of them (see the invocation sketch below).
- App ID: http://apps.clams.ai/swt-detection/v7.5
- App License: Apache 2.0
- Source Repository: https://github.com/clamsproject/app-swt-detection (source tree of the submitted version)
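To make the three modes concrete, here is a minimal invocation sketch using Python's `requests`. It assumes the prebuilt container is running locally and that runtime parameters are passed as query-string arguments; the port mapping and the `input.mmif` filename are illustrative assumptions, not part of this app's spec:

```python
# Start the container first, e.g.:
#   docker run --rm -p 5000:5000 ghcr.io/clamsproject/app-swt-detection:v7.5

import requests

# Any MMIF file containing a VideoDocument (filename is hypothetical).
with open("input.mmif") as f:
    mmif_in = f.read()

# Explicitly request the default "both" mode: classify TimePoints first,
# then stitch them into TimeFrames; also ask for pretty-printed JSON.
resp = requests.post(
    "http://localhost:5000",
    data=mmif_in,
    params={"useClassifier": "true", "useStitcher": "true", "pretty": "true"},
)
resp.raise_for_status()
mmif_out = resp.text  # MMIF JSON with new TimePoint/TimeFrame views
```

Setting `useClassifier=true&useStitcher=false` would yield TimePoint annotations only, while `useClassifier=false&useStitcher=true` requires TimePoint annotations to already be present in the input.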
Inputs
(Note: “*” as a property value means that the property is required but can be any value.)
- http://mmif.clams.ai/vocabulary/VideoDocument/v1 (required) (of any properties)
Configurable Parameters
(Note: Multivalued means the parameter can have one or more values.)
- `useClassifier`: optional, defaults to `true`
  - Type: boolean
  - Multivalued: False
  - Choices: `false`, `true`

  Use the image classifier model to generate TimePoint annotations.
- `tpModelName`: optional, defaults to `convnext_small`
  - Type: string
  - Multivalued: False
  - Choices: `convnext_small`, `convnext_tiny`, `convnext_lg`

  Model name to use for classification; only applies when `useClassifier=true`.
- `tpUsePosModel`: optional, defaults to `true`
  - Type: boolean
  - Multivalued: False
  - Choices: `false`, `true`

  Use the model trained with positional features; only applies when `useClassifier=true`.
- `tpStartAt`: optional, defaults to `0`
  - Type: integer
  - Multivalued: False

  Number of milliseconds into the video to start processing; only applies when `useClassifier=true`.
- `tpStopAt`: optional, defaults to `9223372036854775807` (effectively no stop point)
  - Type: integer
  - Multivalued: False

  Number of milliseconds into the video to stop processing; only applies when `useClassifier=true`.
- `tpSampleRate`: optional, defaults to `1000`
  - Type: integer
  - Multivalued: False

  Milliseconds between sampled frames; only applies when `useClassifier=true`.
- `useStitcher`: optional, defaults to `true`
  - Type: boolean
  - Multivalued: False
  - Choices: `false`, `true`

  Use the stitcher after classifying the TimePoints.
- `tfMinTPScore`: optional, defaults to `0.5`
  - Type: number
  - Multivalued: False

  Minimum score for a TimePoint to be included in a TimeFrame. A lower value will include more TimePoints in the TimeFrame (increasing recall in exchange for precision). Only applies when `useStitcher=true`.
- `tfMinTFScore`: optional, defaults to `0.9`
  - Type: number
  - Multivalued: False

  Minimum score for a TimeFrame. A lower value will include more TimeFrames in the output (increasing recall in exchange for precision). Only applies when `useStitcher=true`.
- `tfMinTFDuration`: optional, defaults to `5000`
  - Type: integer
  - Multivalued: False

  Minimum duration of a TimeFrame in milliseconds; only applies when `useStitcher=true`.
- `tfAllowOverlap`: optional, defaults to `false`
  - Type: boolean
  - Multivalued: False
  - Choices: `false`, `true`

  Allow overlapping time frames; only applies when `useStitcher=true`.
- `tfDynamicSceneLabels`: optional, defaults to `['credit', 'credits']`
  - Type: string
  - Multivalued: True

  Labels that are considered dynamic scenes. For dynamic scenes, TimeFrame annotations contain multiple representative points to follow any changes in the scene. Only applies when `useStitcher=true`.
- `tfLabelMap`: optional, defaults to `[]`
  - Type: map
  - Multivalued: True

  (See also `tfLabelMapPreset`; set `tfLabelMapPreset=nopreset` to make sure that a preset does not override `tfLabelMap` when using this.) Mapping of a label in the input TimePoint annotations to a new label in the stitched TimeFrame annotations. Must be formatted as `IN_LABEL:OUT_LABEL` (with a colon). To pass multiple mappings, use this parameter multiple times (see the sketch after this parameter list). When two or more TP labels are mapped to one TF label, this effectively works as a "binning" operation. If no mapping is used, all input labels are passed through, with no change to either the TP or TF labelset. However, when at least one label is mapped, all other "unset" labels are mapped to the negative label (`-`), and if `-` does not exist in the TF labelset, it is added automatically. Only applies when `useStitcher=true`.
- `tfLabelMapPreset`: optional, defaults to `relaxed`
  - Type: string
  - Multivalued: False
  - Choices: `noprebin`, `nomap`, `strict`, `simpler`, `simple`, `relaxed`, `binary-bars`, `binary-slate`, `binary-chyron-strict`, `binary-chyron-relaxed`, `binary-credits`, `nopreset`

  (See also `tfLabelMap`.) Preset alias of a label mapping. If not `nopreset`, this parameter will override the `tfLabelMap` parameter. Available presets are:
  - `noprebin`: []
  - `nomap`: []
  - `strict`: [`B:Bars`, `S:Slate`, `S:H:Slate`, `S:C:Slate`, `S:D:Slate`, `S:B:Slate`, `S:G:Slate`, `I:Chyron-person`, `N:Chyron-person`, `C:Credits`, `R:Credits`, `M:Main`, `O:Opening`, `W:Opening`, `Y:Chyron-other`, `U:Chyron-other`, `K:Chyron-other`, `L:Other-text`, `G:Other-text`, `F:Other-text`, `E:Other-text`, `T:Other-text`]
  - `simpler`: [`B:Bars`, `S:Slate`, `S:H:Slate`, `S:C:Slate`, `S:D:Slate`, `S:B:Slate`, `S:G:Slate`, `I:Chyron`, `N:Chyron`, `C:Credits`, `R:Credits`]
  - `simple`: [`B:Bars`, `S:Slate`, `S:H:Slate`, `S:C:Slate`, `S:D:Slate`, `S:B:Slate`, `S:G:Slate`, `I:Chyron-person`, `N:Chyron-person`, `C:Credits`, `R:Credits`, `M:Other-text`, `O:Other-text`, `W:Other-text`, `Y:Other-text`, `U:Other-text`, `K:Other-text`, `L:Other-text`, `G:Other-text`, `F:Other-text`, `E:Other-text`, `T:Other-text`]
  - `relaxed`: [`B:Bars`, `S:Slate`, `S:H:Slate`, `S:C:Slate`, `S:D:Slate`, `S:B:Slate`, `S:G:Slate`, `Y:Chyron`, `U:Chyron`, `K:Chyron`, `I:Chyron`, `N:Chyron`, `C:Credits`, `R:Credits`, `M:Other-text`, `O:Other-text`, `W:Other-text`, `L:Other-text`, `G:Other-text`, `F:Other-text`, `E:Other-text`, `T:Other-text`]
  - `binary-bars`: [`B:Bars`]
  - `binary-slate`: [`S:Slate`, `S:H:Slate`, `S:C:Slate`, `S:D:Slate`, `S:B:Slate`, `S:G:Slate`]
  - `binary-chyron-strict`: [`I:Chyron-person`, `N:Chyron-person`]
  - `binary-chyron-relaxed`: [`Y:Chyron`, `U:Chyron`, `K:Chyron`, `I:Chyron`, `N:Chyron`]
  - `binary-credits`: [`C:Credits`, `R:Credits`]

  Only applies when `useStitcher=true`.
- `pretty`: optional, defaults to `false`
  - Type: boolean
  - Multivalued: False
  - Choices: `false`, `true`

  The JSON body of the HTTP response will be re-formatted with 2-space indentation.
- `runningTime`: optional, defaults to `false`
  - Type: boolean
  - Multivalued: False
  - Choices: `false`, `true`

  The running time of the app will be recorded in the view metadata.
- `hwFetch`: optional, defaults to `false`
  - Type: boolean
  - Multivalued: False
  - Choices: `false`, `true`

  The hardware information (architecture, GPU, and vRAM) will be recorded in the view metadata.
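As noted in the `tfLabelMap` entry above, multivalued parameters are supplied by repeating the query key. A sketch under the same assumptions as the earlier invocation example (hypothetical local endpoint and input file):

```python
import requests

with open("input.mmif") as f:  # hypothetical input file
    mmif_in = f.read()

# A list of (key, value) pairs makes `requests` repeat the tfLabelMap key.
# The preset is disabled so it cannot override the explicit mapping; with
# this mapping in place, any unmapped TP label falls into the negative
# label ("-") as described above.
params = [
    ("tfLabelMapPreset", "nopreset"),
    ("tfLabelMap", "B:Bars"),
    ("tfLabelMap", "S:Slate"),
    ("tfLabelMap", "I:Chyron"),
    ("tfLabelMap", "N:Chyron"),  # I and N binned into a single Chyron label
]
resp = requests.post("http://localhost:5000", data=mmif_in, params=params)
```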
Outputs
(Note: “*” as a property value means that the property is required but can be any value.)
(Note: Not all output annotations are always generated.)
- http://mmif.clams.ai/vocabulary/TimeFrame/v5
  - timeUnit = "milliseconds"
- http://mmif.clams.ai/vocabulary/TimePoint/v4
  - timeUnit = "milliseconds"
  - labelset = a list of ["B", "S", "I", "C", "R", "M", "O", "W", "N", "Y", "U", "K", "L", "G", "F", "E", "T", "P"]
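For downstream use, the output can be read with nothing more than the generic MMIF JSON layout (a `views` array whose annotations carry an `@type` and a `properties` map). A minimal sketch; the `output.mmif` filename is hypothetical, and which properties appear beyond `label` depends on the run configuration:

```python
import json

with open("output.mmif") as f:  # hypothetical file holding the app's response
    mmif = json.load(f)

TF_PREFIX = "http://mmif.clams.ai/vocabulary/TimeFrame/"

for view in mmif.get("views", []):
    for ann in view.get("annotations", []):
        if ann.get("@type", "").startswith(TF_PREFIX):
            props = ann.get("properties", {})
            # TimeFrame labels come from the stitcher's label mapping,
            # e.g. "Slate" or "Chyron" under the presets listed above.
            print(props.get("label"), props)
```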