jatinkrmalik/vocalinux

Vocalinux

Voice-to-text for Linux, finally done right!

A seamless, free, open-source, private voice dictation system for Linux, comparable to the built-in solutions on macOS and Windows.

🎉 Alpha Release!

We're excited to share Vocalinux with the community. Try it out and let us know what you think!


✨ Features

  • 🎤 Double-tap Ctrl to start/stop voice dictation
  • ⚡ Real-time transcription with minimal latency
  • 🌎 Universal compatibility across all Linux applications
  • 🔒 Offline operation for privacy and reliability (with VOSK)
  • 🤖 Optional Whisper AI support for enhanced accuracy
  • 🎨 System tray integration with visual status indicators
  • 🔊 Audio feedback for recording status
  • ⚙️ Graphical settings dialog for easy configuration

🚀 Quick Install

One-liner Installation (Recommended)

curl -fsSL https://raw.githubusercontent.com/jatinkrmalik/vocalinux/main/install.sh | bash -s -- --tag=v0.4.0-alpha

Note: Installs v0.4.0-alpha. For the most recent version, check GitHub Releases.

This will:

  • Clone the repository to ~/.local/share/vocalinux-install
  • Install all system dependencies
  • Set up a virtual environment in ~/.local/share/vocalinux/venv
  • Install both VOSK and Whisper AI speech engines:
    • VOSK: installs the vosk Python package from PyPI
    • Whisper: installs the openai-whisper package from PyPI, which also pulls in PyTorch (the ML framework Whisper requires)
  • Create a symlink at ~/.local/bin/vocalinux
  • Download the default Whisper tiny speech model (~75MB)

⏱️ Note: Installation takes ~5-10 minutes due to Whisper AI dependencies (PyTorch with CUDA support, ~2.3GB).

Whisper with CPU-only PyTorch (no NVIDIA GPU needed):

curl -fsSL https://raw.githubusercontent.com/jatinkrmalik/vocalinux/main/install.sh | bash -s -- --tag=v0.4.0-alpha --whisper-cpu

This installs Whisper with CPU-only PyTorch (~200MB instead of ~2.3GB), which suits systems without an NVIDIA GPU.

For low-RAM systems (8GB or less) - VOSK only:

curl -fsSL https://raw.githubusercontent.com/jatinkrmalik/vocalinux/main/install.sh | bash -s -- --tag=v0.4.0-alpha --no-whisper

This skips Whisper installation entirely and configures VOSK as the default engine.

Alternative: Install from Source

# Clone the repository
git clone https://github.com/jatinkrmalik/vocalinux.git
cd vocalinux

# Run the installer (will prompt for Whisper)
./install.sh

# Or with Whisper support
./install.sh --with-whisper

The installer handles everything: system dependencies, Python environment, speech models, and desktop integration.

After Installation

# If ~/.local/bin is in your PATH (recommended):
vocalinux

# Or activate the virtual environment first:
source ~/.local/bin/activate-vocalinux.sh
vocalinux

# Or run directly:
~/.local/share/vocalinux/venv/bin/vocalinux

Or launch it from your application menu!
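
If the `vocalinux` command is not found after installation, checking the launch paths listed above can help narrow down why. The following is a minimal sketch, not part of Vocalinux itself: `launcher_report` is a hypothetical helper name, and it assumes the default install locations described in this README.

```python
import os
from pathlib import Path


def launcher_report(home, path_env):
    """Report which of the documented launch paths exist under `home`.

    `home` and `path_env` are passed in explicitly so the check can be
    pointed at any directory; use Path.home() and os.environ["PATH"]
    for the real system.
    """
    home = Path(home)
    local_bin = home / ".local" / "bin"
    path_dirs = [Path(p) for p in path_env.split(os.pathsep) if p]
    venv_binary = home / ".local" / "share" / "vocalinux" / "venv" / "bin" / "vocalinux"
    return {
        # Is ~/.local/bin on PATH, so that plain `vocalinux` resolves?
        "local_bin_on_path": local_bin in path_dirs,
        # Launcher created by the installer; exists() follows symlinks,
        # so a dangling symlink is reported as missing.
        "symlink_exists": (local_bin / "vocalinux").exists(),
        # Entry point inside the virtual environment.
        "venv_binary_exists": venv_binary.exists(),
    }


if __name__ == "__main__":
    report = launcher_report(Path.home(), os.environ.get("PATH", ""))
    for check, ok in report.items():
        print(f"{check}: {'yes' if ok else 'no'}")
```

If `local_bin_on_path` is "no" while the other two checks pass, running the venv binary directly (the third option above) should still work.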

Uninstall

# If installed via curl:
curl -fsSL https://raw.githubusercontent.com/jatinkrmalik/vocalinux/main/uninstall.sh | bash

# If installed from source:
./uninstall.sh

📋 Requirements

  • OS: Ubuntu 22.04+ (other Linux distros may work)
  • Python: 3.8 or newer
  • Display: X11 or Wayland
  • Hardware: Microphone for voice input
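
Two of the requirements above (Python version and display server) can be checked from a script before installing. This is a rough sketch under stated assumptions: `check_requirements` is a hypothetical helper, and it relies on the `XDG_SESSION_TYPE` environment variable, which most desktop login sessions set but which is not guaranteed to be present. It does not check the OS version or the microphone.

```python
import os
import sys


def check_requirements(version_info, env):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if tuple(version_info[:2]) < (3, 8):
        problems.append("Python 3.8 or newer is required")
    # XDG_SESSION_TYPE is commonly "x11" or "wayland" on desktop sessions.
    session = env.get("XDG_SESSION_TYPE", "").lower()
    if session not in ("x11", "wayland"):
        problems.append(f"Unrecognized display session type: {session or 'unset'}")
    return problems


if __name__ == "__main__":
    issues = check_requirements(sys.version_info, os.environ)
    print("\n".join(issues) if issues else "Basic requirements look OK")
```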

🎙️ Usage

Voice Dictation

  1. Double-tap Ctrl to start recording
  2. Speak clearly into your microphone
  3. Double-tap Ctrl again (or pause speaking) to stop
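
To illustrate the double-tap toggle described above, here is a small sketch of how such a gesture could be detected from key-press timestamps. It is an illustration only, not Vocalinux's actual implementation: the `DoubleTapDetector` class and the 0.3-second window are assumptions chosen for the example.

```python
class DoubleTapDetector:
    """Detect two taps of the same key within `max_interval` seconds."""

    def __init__(self, max_interval=0.3):
        self.max_interval = max_interval
        self._last_tap = None

    def on_tap(self, timestamp):
        """Record a tap; return True if it completes a double-tap."""
        is_double = (
            self._last_tap is not None
            and timestamp - self._last_tap <= self.max_interval
        )
        # After a double-tap, reset so a third quick tap starts a new pair.
        self._last_tap = None if is_double else timestamp
        return is_double


# Simulated Ctrl taps (in seconds): two quick pairs and two isolated taps.
detector = DoubleTapDetector()
recording = False
for t in (0.00, 0.20, 5.00, 6.00, 6.25):
    if detector.on_tap(t):
        recording = not recording
        print(f"t={t:.2f}s: recording {'started' if recording else 'stopped'}")
```

With these timestamps, the first pair starts recording and the last pair stops it; the taps at 5.00 and 6.00 are too far apart to count as a double-tap.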

Voice Commands

Command                   Action
"new line"                Inserts a line break
"period" / "full stop"    Types a period (.)
"comma"                   Types a comma (,)
"question mark"           Types a question mark (?)
"exclamation mark"        Types an exclamation mark (!)
"delete that"             Deletes the last sentence
"capitalize"              Capitalizes the next word
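
The punctuation commands listed above amount to a phrase-to-symbol substitution on the transcript. The sketch below shows one way this could work; it is not taken from the Vocalinux source, and the names `PUNCTUATION` and `apply_commands` are hypothetical. It leaves out "delete that" and "capitalize", which need editing state rather than plain substitution, and it would also replace these words when they are meant literally.

```python
import re

# Spoken phrase -> text to insert, taken from the command list above.
PUNCTUATION = {
    "new line": "\n",
    "period": ".",
    "full stop": ".",
    "comma": ",",
    "question mark": "?",
    "exclamation mark": "!",
}

# Try longer phrases first so multi-word commands win over shorter overlaps.
_PHRASES = sorted(map(re.escape, PUNCTUATION), key=len, reverse=True)
_PATTERN = re.compile(r"\s*\b(" + "|".join(_PHRASES) + r")\b", re.IGNORECASE)


def apply_commands(transcript):
    """Replace spoken punctuation commands with their symbols.

    Whitespace before a command is consumed so the symbol attaches
    to the preceding word.
    """
    return _PATTERN.sub(lambda m: PUNCTUATION[m.group(1).lower()], transcript)


print(apply_commands("hello comma how are you question mark"))
# prints: hello, how are you?
```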

Command Line Options

vocalinux --help              # Show all options
vocalinux --debug             # Enable debug logging
vocalinux --engine whisper    # Use Whisper AI engine
vocalinux --model medium      # Use medium-sized model
vocalinux --wayland           # Force Wayland mode

⚙️ Configuration

Configuration is stored in ~/.config/vocalinux/config.json:

{
  "speech_recognition": {
    "engine": "vosk",
    "model_size": "small",
    "vad_sensitivity": 3,
    "silence_timeout": 2.0
  }
}

You can also configure settings through the graphical Settings dialog (right-click the tray icon).
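
For scripted setups, the same file can be edited programmatically. The following is a minimal sketch, assuming the file layout shown above; `update_speech_settings` is a hypothetical helper, not a Vocalinux API, and whether a running instance picks up changes without a restart is not covered here.

```python
import json
from pathlib import Path

CONFIG_PATH = Path.home() / ".config" / "vocalinux" / "config.json"


def update_speech_settings(path, **changes):
    """Merge `changes` into the "speech_recognition" section and save."""
    path = Path(path)
    # Start from the existing file if there is one, otherwise from empty.
    config = json.loads(path.read_text()) if path.exists() else {}
    config.setdefault("speech_recognition", {}).update(changes)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Example call, commented out so running this file leaves a real config untouched:
# update_speech_settings(CONFIG_PATH, engine="whisper", silence_timeout=1.5)
```

Other keys in the file are preserved, since only the named settings inside `speech_recognition` are overwritten.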

🔧 Development Setup

# Clone and install in dev mode
git clone https://github.com/jatinkrmalik/vocalinux.git
cd vocalinux
./install.sh --dev

# Activate environment
source venv/bin/activate

# Run tests
pytest

# Run from source with debug
python -m vocalinux.main --debug

📁 Project Structure

vocalinux/
├── src/vocalinux/           # Main application code
│   ├── speech_recognition/  # Speech recognition engines
│   ├── text_injection/      # Text injection (X11/Wayland)
│   ├── ui/                  # GTK UI components
│   └── utils/               # Utility functions
├── tests/                   # Test suite
├── resources/               # Icons and sounds
├── docs/                    # Documentation
└── web/                     # Website source

📖 Documentation

🗺️ Roadmap

  • Custom icon design ✅
  • Graphical settings dialog ✅
  • Whisper AI support ✅
  • Multi-language support (FR, DE, RU) ✅
  • In-app update mechanism
  • Application-specific commands
  • Debian/Ubuntu package (.deb)
  • Improved Wayland support
  • Voice command customization

🤝 Contributing

We welcome contributions! Whether it's bug reports, feature requests, or code contributions, please check out our Contributing Guide.

⭐ Support

If you find Vocalinux useful, please consider:

  • ⭐ Starring this repository
  • 🐛 Reporting bugs you encounter
  • 📖 Improving documentation
  • 🔀 Contributing code

📜 License

This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.


Made with ❤️ for the Linux community