NeMoASR

Automatic speech recognition with speaker diarisation.

Based on:

Requirements

Python 3.12+

Setup

Linux:

sudo apt install ffmpeg
conda create -n nemoasr python=3.12
conda activate nemoasr
pip install git+https://github.com/HanBnrd/NeMoASR.git

macOS:

brew install ffmpeg
conda create -n nemoasr python=3.12
conda activate nemoasr
pip install git+https://github.com/HanBnrd/NeMoASR.git
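On either platform, it can help to confirm that the prerequisites are in place before the first transcription. The following is a minimal sketch, not part of NeMoASR: the function name `check_prerequisites` is hypothetical, and it only assumes that `ffmpeg` and the `nemoasr` command should be discoverable on the PATH and that Python is at least 3.12, as stated above.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 12), tools=("ffmpeg", "nemoasr")):
    """Return a list of human-readable problems (empty if everything looks fine).

    Hypothetical helper for illustration only; it is not provided by NeMoASR.
    """
    problems = []

    # Compare only (major, minor) against the required minimum version.
    if sys.version_info[:2] < tuple(min_python):
        required = ".".join(str(part) for part in min_python)
        found = ".".join(str(part) for part in sys.version_info[:2])
        problems.append(f"Python {required}+ required, found {found}")

    # shutil.which returns None when an executable is not on the PATH.
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")

    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    if issues:
        for issue in issues:
            print(issue)
    else:
        print("All prerequisites found.")
```

Run it inside the activated `nemoasr` conda environment so that the check sees the same Python interpreter and PATH that the `nemoasr` command will use.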

Update NeMoASR

pip install --upgrade git+https://github.com/HanBnrd/NeMoASR.git

Usage

To transcribe a WAV or MPEG audio file (for example, an MP3):

nemoasr myfile.mp3

Note: the first run may take a while, as the models need to be downloaded.

By default, long audio files are cut into 7-minute chunks, which should work well on machines with limited RAM or VRAM. The chunk duration can be adjusted if needed, for example on a machine with more RAM or VRAM:

nemoasr myfile.mp3 --max-duration=12

This cuts a long audio file into chunks of at most 12 minutes.
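To get a feel for how the chunk duration affects the number of chunks, the sketch below computes chunk boundaries for a file of a given length. It is an illustration under stated assumptions, not NeMoASR's actual implementation: the function name `chunk_boundaries` is hypothetical, and it assumes fixed-length chunks with a shorter final chunk, which may differ from how the tool really splits audio.

```python
def chunk_boundaries(total_seconds, max_minutes=7):
    """Return (start, end) pairs in seconds, each at most max_minutes long.

    Hypothetical helper for illustration only; the real tool may choose
    its cut points differently (for example, at silences).
    """
    if max_minutes <= 0:
        raise ValueError("max_minutes must be positive")

    max_seconds = max_minutes * 60
    boundaries = []
    start = 0
    while start < total_seconds:
        # The last chunk is simply whatever remains of the file.
        end = min(start + max_seconds, total_seconds)
        boundaries.append((start, end))
        start = end
    return boundaries


# A 30-minute file with the default 7-minute limit versus a 12-minute limit.
default_chunks = chunk_boundaries(30 * 60)
larger_chunks = chunk_boundaries(30 * 60, max_minutes=12)
print(len(default_chunks), len(larger_chunks))
```

Under these assumptions, a larger chunk duration means fewer chunks to process, at the cost of more memory needed per chunk.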
