Spoken Language Identification (v0.2)


Chunk-level language ID over audio based on OpenAI Whisper

Inputs

(Note: “*” as a property value means that the property is required but can be any value.)

One of the following is required: [

]

Configurable Parameters

(Note: Multivalued means the parameter can have one or more values.)

  • model: optional, defaults to tiny

    • Type: string
    • Multivalued: False
    • Choices: tiny, base, small, medium, large, turbo

    The Whisper model size to use.

  • chunk: optional, defaults to 30

    • Type: number
    • Multivalued: False

    Chunk/window length, in seconds.

  • top: optional, defaults to 3

    • Type: integer
    • Multivalued: False

    Number of top language scores to report per chunk (top-k).

  • batchSize: optional, defaults to 1

    • Type: integer
    • Multivalued: False

    Number of windows processed in a batch.

  • pretty: optional, defaults to false

    • Type: boolean
    • Multivalued: False
    • Choices: false, true

    The JSON body of the HTTP response will be re-formatted with 2-space indentation.

  • runningTime: optional, defaults to false

    • Type: boolean
    • Multivalued: False
    • Choices: false, true

    The running time of the app will be recorded in the view metadata.

  • hwFetch: optional, defaults to false

    • Type: boolean
    • Multivalued: False
    • Choices: false, true

    The hardware information (architecture, GPU, and vRAM) will be recorded in the view metadata.
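To illustrate how the chunk and top parameters interact, the sketch below splits an audio timeline into fixed-length windows and selects the top-k languages per window. This is a minimal illustration, not the app's implementation: the `score_window` function and its per-language probabilities are hypothetical stand-ins for the scores a Whisper model would produce for each window.

```python
from typing import Dict, List, Tuple


def make_windows(duration: float, chunk: float = 30.0) -> List[Tuple[float, float]]:
    """Split a timeline of `duration` seconds into (start, end) windows of `chunk` seconds."""
    windows = []
    start = 0.0
    while start < duration:
        windows.append((start, min(start + chunk, duration)))
        start += chunk
    return windows


def top_k(scores: Dict[str, float], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (language, score) pairs, best first."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]


def score_window(start: float, end: float) -> Dict[str, float]:
    """Hypothetical stand-in for a Whisper language-ID pass over one window."""
    return {"en": 0.91, "fr": 0.05, "de": 0.02, "es": 0.02}


# A 75-second file with chunk=30 yields windows of 30, 30, and 15 seconds;
# each window gets its own top-k language ranking.
for start, end in make_windows(75.0, chunk=30.0):
    print((start, end), top_k(score_window(start, end), k=3))
```

With the defaults (chunk=30, top=3), each 30-second window would carry the three best-scoring languages, which is what "chunk-level language ID" refers to in the description above.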

Outputs

(Note: “*” as a property value means that the property is required but can be any value.)

(Note: Not all output annotations are always generated.)