Spoken Language Identification (v0.1)
About this version
- Submitter: keighrim
- Submission Time: 2025-11-05T19:58:01+00:00
- Prebuilt Container Image: ghcr.io/clamsproject/app-spoken-lid:v0.1
Release Notes
Experimental release of a spoken LID (language identification) app prototype based on Whisper.
About this app (See raw metadata.json)
Chunk-level language ID over audio based on OpenAI Whisper
- App ID: http://apps.clams.ai/spoken-lid/v0.1
- App License: Apache 2.0
- Source Repository: https://github.com/clamsproject/app-spoken-lid (source tree of the submitted version)
- Analyzer Version: v20240930
- Analyzer License: MIT
Inputs
(Note: “*” as a property value means that the property is required but can be any value.)
One of the following is required: [
- http://mmif.clams.ai/vocabulary/AudioDocument/v1 (required) (of any properties)
- http://mmif.clams.ai/vocabulary/VideoDocument/v1 (required) (of any properties)
]
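For illustration, an input carrying a single AudioDocument could be assembled as below. This is a minimal sketch, not an authoritative template: the helper name `make_input_mmif`, the MMIF version `1.0.0`, the document id `d1`, the MIME type, and the file location are all assumptions chosen for the example.

```python
import json

AUDIO_DOCUMENT = "http://mmif.clams.ai/vocabulary/AudioDocument/v1"


def make_input_mmif(location, mime="audio/wav", doc_id="d1", mmif_version="1.0.0"):
    """Build a bare MMIF dict with one AudioDocument and no views.

    All argument defaults are illustrative assumptions, not values
    prescribed by this app.
    """
    return {
        "metadata": {"mmif": f"http://mmif.clams.ai/{mmif_version}"},
        "documents": [
            {
                "@type": AUDIO_DOCUMENT,
                "properties": {"id": doc_id, "mime": mime, "location": location},
            }
        ],
        "views": [],
    }


if __name__ == "__main__":
    # Hypothetical file location, used only to show the resulting JSON.
    print(json.dumps(make_input_mmif("file:///data/example.wav"), indent=2))
```

A VideoDocument input would look the same apart from the `@type` URI and the MIME type.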
Configurable Parameters
(Note: Multivalued means the parameter can have one or more values.)
- model: optional, defaults to tiny
  - Type: string
  - Multivalued: False
  - Choices: tiny, base, small, medium, large, turbo
  Whisper model size
- chunk: optional, defaults to 30
  - Type: number
  - Multivalued: False
  Chunk/window length in seconds
- top: optional, defaults to 3
  - Type: integer
  - Multivalued: False
  Top-k language scores
- batchSize: optional, defaults to 1
  - Type: integer
  - Multivalued: False
  Number of windows processed in a batch
- pretty: optional, defaults to false
  - Type: boolean
  - Multivalued: False
  - Choices: false, true
  The JSON body of the HTTP response will be re-formatted with 2-space indentation
- runningTime: optional, defaults to false
  - Type: boolean
  - Multivalued: False
  - Choices: false, true
  The running time of the app will be recorded in the view metadata
- hwFetch: optional, defaults to false
  - Type: boolean
  - Multivalued: False
  - Choices: false, true
  The hardware information (architecture, GPU and vRAM) will be recorded in the view metadata
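The parameters above could be passed to a running instance as URL query parameters. The sketch below only builds such a URL and checks names and model choices against the list on this page. The helper `build_app_url` and the address `http://localhost:5000` are assumptions for illustration; the sketch also assumes the app accepts parameters in the query string of the request that carries the MMIF body.

```python
from urllib.parse import urlencode

# Parameter names and defaults as listed on this page.
DEFAULTS = {
    "model": "tiny",
    "chunk": 30,
    "top": 3,
    "batchSize": 1,
    "pretty": False,
    "runningTime": False,
    "hwFetch": False,
}
MODEL_CHOICES = ("tiny", "base", "small", "medium", "large", "turbo")


def build_app_url(base_url, **params):
    """Return base_url with the given parameters encoded as a query string.

    Hypothetical helper: rejects unknown parameter names and model values
    that are not among the documented choices.
    """
    unknown = set(params) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    if params.get("model", DEFAULTS["model"]) not in MODEL_CHOICES:
        raise ValueError(f"model must be one of {MODEL_CHOICES}")
    # Booleans are written as lowercase true/false, matching the listed choices.
    query = {k: (str(v).lower() if isinstance(v, bool) else v) for k, v in params.items()}
    return f"{base_url}?{urlencode(query)}" if query else base_url


if __name__ == "__main__":
    # Assumed local address; adjust to wherever the container is exposed.
    print(build_app_url("http://localhost:5000", model="base", chunk=15, pretty=True))
```

Omitted parameters are left out of the URL, so the app falls back to its own defaults.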
Outputs
(Note: “*” as a property value means that the property is required but can be any value.)
(Note: Not all output annotations are always generated.)
- http://mmif.clams.ai/vocabulary/TimeFrame/v6
- timeUnit = “seconds”
- labelSet = “https://raw.githubusercontent.com/openai/whisper/refs/tags/v20240930/whisper/tokenizer.py”
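As a rough sketch of consuming the output, the TimeFrame annotations can be collected by walking the views of the returned MMIF. The helper `iter_time_frames` is hypothetical, the sample data is made up by hand, and the property names `start`, `end`, and `label` are assumptions about what each annotation carries; check them against real output before relying on them.

```python
TIMEFRAME = "http://mmif.clams.ai/vocabulary/TimeFrame/v6"


def iter_time_frames(mmif):
    """Yield the properties dict of every TimeFrame annotation in any view."""
    for view in mmif.get("views", []):
        for annotation in view.get("annotations", []):
            if annotation.get("@type") == TIMEFRAME:
                yield annotation.get("properties", {})


# Hand-written sample, NOT real app output. Property names are assumed.
SAMPLE_OUTPUT = {
    "views": [
        {
            "id": "v_0",
            "annotations": [
                {"@type": TIMEFRAME,
                 "properties": {"id": "tf_1", "start": 0, "end": 30, "label": "en"}},
                {"@type": TIMEFRAME,
                 "properties": {"id": "tf_2", "start": 30, "end": 60, "label": "es"}},
            ],
        }
    ]
}

if __name__ == "__main__":
    for props in iter_time_frames(SAMPLE_OUTPUT):
        print(props.get("start"), props.get("end"), props.get("label"))
```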