The whisperX API is a tool for enhancing and analyzing audio content. It provides a suite of services for processing audio and video files, including transcription, alignment, diarization, and combining transcripts with diarization results.
Swagger UI is available at /docs for all the services; a dump of the OpenAPI definition is also available in the folder app/docs. You can explore it directly in the Swagger Editor.
See the WhisperX Documentation for details on whisperX functions.
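As a quick illustration, the services can be called like any other FastAPI endpoints, e.g. by uploading an audio file in a multipart POST. The sketch below uses only the Python standard library; the `/speech-to-text` route and the `file` field name are assumptions for illustration — check the Swagger UI at /docs for the actual paths and parameters.

```python
import io
import mimetypes
import urllib.request
import uuid


def build_multipart(field: str, filename: str, data: bytes):
    """Build a multipart/form-data body for a single file field."""
    boundary = uuid.uuid4().hex
    ctype = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    body = io.BytesIO()
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        f'Content-Disposition: form-data; name="{field}"; '
        f'filename="{filename}"\r\n'.encode()
    )
    body.write(f"Content-Type: {ctype}\r\n\r\n".encode())
    body.write(data)
    body.write(f"\r\n--{boundary}--\r\n".encode())
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return body.getvalue(), headers


def transcribe(path: str, url: str = "http://127.0.0.1:8000/speech-to-text"):
    """POST an audio file to the service (endpoint path is an assumption)."""
    with open(path, "rb") as f:
        body, headers = build_multipart("file", path, f.read())
    req = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```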
- In `.env` you can define the default language with `DEFAULT_LANG`; if not defined, `en` is used (you can also set it in the request).
- `.env` contains the definition of the Whisper model via `WHISPER_MODEL` (you can also set it in the request).
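For example, a minimal `.env` combining these settings might look like this (values are illustrative; `large-v2` is just one of the available Whisper model sizes):

```env
DEFAULT_LANG=en
WHISPER_MODEL=large-v2
HF_TOKEN=<<YOUR HUGGINGFACE TOKEN>>
```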
The status and result of each task are stored in a database using the SQLAlchemy ORM. The database connection is defined by the environment variable `DB_URL`; if no value is specified, `db.py` defaults to `sqlite:///records.db`.
See the SQLAlchemy Engine Configuration documentation for driver definitions if you want to connect to a database other than SQLite.
The structure of the database is described in DB Schema.
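The environment-variable fallbacks described above can be sketched as plain lookups. The variable names and defaults below come from this README; the resolution function itself is an illustrative assumption, not the app's actual code:

```python
import os


def resolve_settings(env=os.environ):
    """Resolve runtime settings with the fallbacks documented in the README."""
    return {
        # "en" is used when DEFAULT_LANG is not defined
        "lang": env.get("DEFAULT_LANG", "en"),
        # db.py falls back to a local SQLite file when DB_URL is unset
        "db_url": env.get("DB_URL", "sqlite:///records.db"),
    }
```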
To get started with the API, follow these steps:
- Create a virtual environment
- Install PyTorch; see the PyTorch documentation for more details
- Install whisperX:

```sh
pip install git+https://github.com/m-bain/whisperx.git
```
- Install the required dependencies:

```sh
pip install -r requirements.txt
```
- Create a `.env` file and define your Whisper model and your Hugging Face token:

```env
HF_TOKEN=<<YOUR HUGGINGFACE TOKEN>>
WHISPER_MODEL=<<WHISPER MODEL SIZE>>
```
- Run the FastAPI application:

```sh
uvicorn app.main:app --reload
```

The API will be accessible at http://127.0.0.1:8000.
- Create a `.env` file:

```env
HF_TOKEN=<<YOUR HUGGINGFACE TOKEN>>
WHISPER_MODEL=<<WHISPER MODEL SIZE>>
```
- Build the image using `docker-compose.yaml`:

```sh
# build and start the image using compose file
docker-compose up
```

Alternative approach:

```sh
# build image
docker build -t whisperx-service .

# run container
docker run -d --gpus all -p 8000:8000 --env-file .env whisperx-service
```

The API will be accessible at http://127.0.0.1:8000.
The models used by whisperX are stored in `/root/.cache`. If you want to avoid downloading the models each time the container starts, you can store this cache in persistent storage; `docker-compose.yaml` defines a volume `whisperx-models-cache` for this purpose.
- faster-whisper cache: `/root/.cache/huggingface/hub`
- pyannote and other models cache: `/root/.cache/torch`
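A minimal sketch of how such a volume mapping might look in a compose file (the service name and build settings are assumptions for illustration; `whisperx-models-cache` is the volume name mentioned above):

```yaml
services:
  whisperx-service:
    build: .
    env_file: .env
    ports:
      - "8000:8000"
    volumes:
      # persist downloaded models across container restarts
      - whisperx-models-cache:/root/.cache

volumes:
  whisperx-models-cache:
```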