CLAMS App Directory

How to use CLAMS apps

Want to know how to use CLAMS apps? Check out the CLAMS App user manual.
Want a human-friendly view of MMIF JSON files? Visit the MMIF visualizer repository.

App Directory

parakeet-wrapper

A CLAMS wrapper for NVIDIA NeMo Parakeet ASR models available on huggingface-hub with support for punctuation, capitalization, and word-level timestamping.

v1.0 (@shel-ho)

swt-detection

Detects scenes with text, such as slates, chyrons, and credits. This app can run in three modes, depending on the useClassifier and useStitcher parameters. When useClassifier=True, it runs in the “TimePoint mode” and generates TimePoint annotations. When useStitcher=True, it runs in the “TimeFrame mode” and generates TimeFrame annotations based on existing TimePoint annotations; if no TimePoint is found, it produces an error. By default, it runs in the “both” mode: it first generates TimePoint annotations and then builds TimeFrame annotations on them.
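
The three modes can be pictured as a small decision on the two flags. The following is a minimal sketch, not the app's own code: resolve_mode is a hypothetical helper, it assumes that a single enabled flag selects the corresponding single mode, and it assumes that disabling both flags is an error (the description does not cover that case).

```python
def resolve_mode(use_classifier: bool = True, use_stitcher: bool = True) -> str:
    """Map the two flags to the mode names used in the description (hypothetical helper)."""
    if use_classifier and use_stitcher:
        # Default: TimePoint annotations first, then TimeFrame annotations on top.
        return "both"
    if use_classifier:
        return "TimePoint"
    if use_stitcher:
        # Relies on TimePoint annotations already present in the input.
        return "TimeFrame"
    # Assumption: the description does not say what happens when both are off.
    raise ValueError("at least one of useClassifier or useStitcher must be True")


print(resolve_mode())                      # both
print(resolve_mode(use_stitcher=False))    # TimePoint
print(resolve_mode(use_classifier=False))  # TimeFrame
```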

spacy-wrapper

Apply spaCy NLP to all text documents in a MMIF file.

doctr-wrapper

A CLAMS app that wraps docTR, an end-to-end OCR model. The model detects text regions in the input image and recognizes the text in those regions (via the parseq OCR model; only English is supported at the moment). The model organizes the text-localized regions hierarchically into “pages” > “blocks” > “lines” > “words”, and this CLAMS app translates them into TextDocument, Paragraph, Sentence, and Token annotations to represent the recognized text. See the descriptions for I/O types below for details on how annotations are aligned to each other.
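
To illustrate the hierarchy-to-annotation mapping described above, here is a rough sketch. The nested dictionary is a made-up stand-in for OCR output (it is not the docTR result format), and to_annotations is a hypothetical helper; only the four-level mapping itself comes from the description.

```python
# Mapping stated in the description: hierarchy level -> annotation type.
LEVEL_TO_ANNOTATION = {
    "page": "TextDocument",
    "block": "Paragraph",
    "line": "Sentence",
    "word": "Token",
}
# For each level: (key holding its children, level name of those children).
NEXT_LEVEL = {
    "page": ("blocks", "block"),
    "block": ("lines", "line"),
    "line": ("words", "word"),
}


def collect_text(node, level):
    """Join the words found under a node into one space-separated string."""
    if level == "word":
        return node  # in this toy structure a word is a plain string
    key, child_level = NEXT_LEVEL[level]
    return " ".join(collect_text(child, child_level) for child in node[key])


def to_annotations(node, level="page"):
    """Return (annotation_type, text) pairs for a node and its descendants, depth first."""
    out = [(LEVEL_TO_ANNOTATION[level], collect_text(node, level))]
    if level != "word":
        key, child_level = NEXT_LEVEL[level]
        for child in node[key]:
            out.extend(to_annotations(child, child_level))
    return out


# Hypothetical page with one block holding two lines.
page = {"blocks": [{"lines": [{"words": ["EVENING", "NEWS"]}, {"words": ["1975"]}]}]}
for annotation_type, text in to_annotations(page):
    print(annotation_type, repr(text))
```

How the real annotations reference and align to each other is defined by the app's I/O type descriptions, which this sketch does not attempt to reproduce.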

llava-captioner

Applies LLaVA v1.6 Mistral-7B to video frames for image captioning.

whisper-wrapper

A CLAMS wrapper for Whisper-based ASR software originally developed by OpenAI.

distil-whisper-wrapper

A wrapper for Distil-Whisper. Available models: distil-large-v3, distil-large-v2, distil-medium.en, distil-small.en. The default model is distil-small.en.

simple-timepoints-stitcher

Stitches a sequence of TimePoint annotations into a sequence of TimeFrame annotations, performing simple smoothing of short peaks of positive labels.
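
The idea of stitching can be sketched as follows. This is an illustrative simplification, not the app's algorithm: stitch and min_points are hypothetical names, and "smoothing" is assumed here to mean discarding runs of positive TimePoints that are shorter than a threshold.

```python
def stitch(timepoints, min_points=2):
    """Group consecutive positive time points into (start_ms, end_ms) frames.

    timepoints: list of (time_ms, is_positive) tuples sorted by time.
    Runs with fewer than min_points positives are treated as short peaks and dropped.
    """
    frames = []
    run = []  # times of the current run of consecutive positives
    for time_ms, positive in timepoints:
        if positive:
            run.append(time_ms)
            continue
        if len(run) >= min_points:
            frames.append((run[0], run[-1]))
        run = []
    # Close a run that reaches the end of the sequence.
    if len(run) >= min_points:
        frames.append((run[0], run[-1]))
    return frames


points = [
    (0, False), (1000, True), (2000, False),
    (3000, True), (4000, True), (5000, True), (6000, False),
]
print(stitch(points))  # [(3000, 5000)]
```

The isolated positive at 1000 ms is dropped as a short peak, while the three consecutive positives become one frame.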

tfidf-keywordextractor

Extracts keywords from a text document according to TF-IDF values. IDF values and all features come from related pickle files in the current directory. The app can take either a simple text document or a MMIF file generated by the text-slicer app.

v1.0 (@selenasong)
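
As a rough illustration of TF-IDF keyword ranking (the technique named in the tfidf-keywordextractor description), the sketch below scores terms in one document against a tiny corpus. It is an assumption-laden toy: the app loads precomputed IDF values from pickle files, whereas here idf_table computes them on the fly, and both function names are hypothetical.

```python
import math
from collections import Counter


def idf_table(corpus):
    """Compute log(N / document frequency) for every term in a tokenized corpus."""
    n = len(corpus)
    df = Counter(term for doc in corpus for term in set(doc))
    return {term: math.log(n / count) for term, count in df.items()}


def top_keywords(doc, idf, k=3):
    """Rank the terms of one tokenized document by term frequency times IDF."""
    tf = Counter(doc)
    scores = {
        term: (count / len(doc)) * idf.get(term, 0.0)
        for term, count in tf.items()
    }
    # Sort by descending score, breaking ties alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


corpus = [
    "the senate passed the budget".split(),
    "the mayor opened the bridge".split(),
    "the budget funds the bridge repair".split(),
]
idf = idf_table(corpus)
print(top_keywords(corpus[2], idf, k=2))  # ['funds', 'repair']
```

Terms that appear in every document, such as "the", get an IDF of zero and never rank as keywords.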

text-slicer

Slices text snippets from a provided text document, given time frames.

v1.0 (@bohJiang12)

east-textdetection

OpenCV-based text localization app that uses the EAST text detection model. Please visit the source code repository for full documentation.

inaspeechsegmenter-wrapper

inaSpeechSegmenter is a CNN-based audio segmentation toolkit. The original software can be found at https://github.com/ina-foss/inaSpeechSegmenter .

pyscenedetect-wrapper

A CLAMS app that wraps PySceneDetect and performs shot boundary detection on input videos.

easyocr-wrapper

Uses EasyOCR to extract text from time frames.

v1.1 (@snewman-aa)
v1.0 (@snewman-aa)

dbpedia-spotlight-wrapper

Apply named entity linking to all text documents in a MMIF file.

slatedetection

This tool detects slates.

fewshotclassifier

This tool uses a vision model to classify video segments. Currently supports the “chyron” frame type.

v1.0 (@keighrim)

barsdetection

This tool detects SMPTE color bars.

v1.1 (@keighrim)
v1.0 (@keighrim)

parseqocr-wrapper

This tool applies Parseq OCR to a video or image and generates text boxes and OCR results.

v1.0 (@keighrim)

tesseractocr-wrapper

This tool applies Tesseract OCR to a video or image and generates text boxes and OCR results.

v1.0 (@keighrim)

chyron-detection

This tool detects chyrons and generates time segments.

v1.0 (@keighrim)

gentle-forced-aligner-wrapper

This CLAMS app aligns a transcript with an audio track using Gentle. Gentle is a robust yet lenient forced aligner built on Kaldi. This app only works when Gentle is already installed locally. Unfortunately, Gentle is not distributed as a Python package. For Gentle installation instructions, see https://lowerquality.com/gentle/ . Make sure to install Gentle from the git commit specified in analyzer_version in this metadata.

v1.0 (@keighrim)

tonedetection

Detects spans of monotonic audio within an audio file.

v1.0 (@MrSqually)

brandeis-acs-wrapper

Brandeis Acoustic Classification & Segmentation (ACS) is an audio segmentation tool developed at Brandeis Lab for Linguistics and Computation. The original software can be found at https://github.com/brandeis-llc/acoustic-classification-segmentation .

v2 (@keighrim)
v1 (@keighrim)

aapb-pua-kaldi-wrapper

A CLAMS wrapper for Kaldi-based ASR software originally developed by PopUpArchive and hipstas, and later updated by Kyeongmin Rim at Brandeis University. Wrapped software can be found at https://github.com/brandeis-llc/aapb-pua-kaldi-docker .

v2 (@keighrim)
v1 (@keighrim)

CLAMS Team