STT (Speech-to-Text)

293 repos

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

★ 159,926 · Python · updated 2026-04-20 · audio, deep-learning, deepseek, gemma, glm

openai/whisper

Robust Speech Recognition via Large-Scale Weak Supervision

★ 98,410 · Python · updated 2026-04-15

ggml-org/whisper.cpp

Port of OpenAI's Whisper model in C/C++

★ 49,040 · C++ · updated 2026-04-20 · inference, openai, speech-recognition, speech-to-text, transformer

khoj-ai/khoj

Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.

★ 34,251 · Python · updated 2026-03-26 · agent, ai, assistant, chat, chatgpt

mozilla-ai/llamafile

Distribute and run LLMs with a single file.

★ 24,274 · C++ · updated 2026-04-17 · cross-platform, gguf, llama-cpp, local-ai, local-inference

jamiepine/voicebox

The open-source AI voice studio. Clone, dictate, create.

★ 23,403 · TypeScript · updated 2026-04-20 · ai, cuda, mlx, qwen3-tts, qwen3-tts-ui

SYSTRAN/faster-whisper

Faster Whisper transcription with CTranslate2

★ 22,404 · Python · updated 2025-11-19 · deep-learning, inference, openai, quantization, speech-recognition

m-bain/whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

★ 21,502 · Python · updated 2026-04-04 · asr, speech, speech-recognition, speech-to-text, whisper

cjpais/Handy

A free, open source, and extensible speech-to-text application that works completely offline.

★ 20,572 · Rust · updated 2026-04-19 · accessibility, cross-platform, speech-to-text, tauri-v2

chidiwilliams/buzz

Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.

★ 18,900 · Python · updated 2026-04-19 · whisper

Panniantong/Agent-Reach

Give your AI agent eyes to see the entire internet. Read & search Twitter, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu — one CLI, zero API fees.

★ 18,240 · Python · updated 2026-04-13 · agent-infrastructure, ai-agent, ai-search, automation, bilibili

leon-ai/leon

🧠 Leon is your open-source personal assistant.

★ 17,188 · TypeScript · updated 2026-04-19 · ai, ai-assistant, artificial-intelligence, assistant, automation

NVIDIA-NeMo/NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

★ 17,126 · Python · updated 2026-04-20 · asr, deeplearning, generative-ai, machine-translation, neural-networks

modelscope/FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

★ 15,852 · Python · updated 2026-03-17 · audio-visual-speech-recognition, conformer, dfsmn, paraformer, pretrained-model

kaldi-asr/kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

★ 15,378 · Shell · updated 2025-09-22 · c-plus-plus, cuda, kaldi, shell, speaker-id

PaddlePaddle/PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

★ 12,594 · Python · updated 2026-04-15 · asr, code-switch, conformer, kws, punctuation-restoration

Vaibhavs10/insanely-fast-whisper

★ 12,575 · Jupyter Notebook · updated 2025-10-25

BasedHardware/omi

AI that sees your screen, listens to your conversations and tells you what to do

★ 12,196 · Dart · updated 2026-04-20 · ai, app, bci, c, flutter

k2-fsa/sherpa-onnx

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, and a websocket server/client, with support for 12 programming languages.

★ 11,826 · C++ · updated 2026-04-20 · aarch64, android, arm32, asr, cpp

speechbrain/speechbrain

A PyTorch-based Speech Toolkit

★ 11,475 · Python · updated 2026-04-03 · asr, audio, audio-processing, deep-learning, huggingface

Zackriya-Solutions/meetily

Privacy-first AI meeting assistant with 4x faster Parakeet/Whisper live transcription, speaker diarization, and Ollama summarization, built on Rust. 100% local processing. No cloud required. Meetily (Meetly Ai - https://meetily.ai) is the #1 self-hosted, open-source AI meeting note taker for macOS & Windows.

★ 11,335 · Rust · updated 2026-03-16 · ai, ai-meeting-assistant, llm, local-ai, mac

Const-me/Whisper

High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model

★ 10,356 · C++ · updated 2024-08-03

QuentinFuxa/WhisperLiveKit

Simultaneous speech-to-text models

★ 10,187 · Python · updated 2026-03-31

openvinotoolkit/openvino

OpenVINO™ is an open source toolkit for optimizing and deploying AI inference

★ 10,142 · C++ · updated 2026-04-20 · ai, computer-vision, deep-learning, deploy-ai, diffusion-models

KoljaB/RealtimeSTT

A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.

★ 9,743 · Python · updated 2026-03-14 · python, realtime, speech-to-text

xorbitsai/inference

Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source, speech, and multimodal models on cloud, on-prem, or your laptop — all through one unified, production-ready inference API.

★ 9,261 · Python · updated 2026-04-20 · artificial-intelligence, chatglm, deployment, flan-t5, gemma

FunAudioLLM/SenseVoice

Multilingual Voice Understanding Model

★ 8,041 · Python · updated 2025-12-30 · ai, aigc, asr, audio-event-classification, cross-lingual

moonshine-ai/moonshine

Very low latency speech to text, intent recognition, and text to speech, for building voice agents and interfaces

★ 7,857 · C · updated 2026-04-20 · intent-recognition, stt, tts, voice, voice-recognition

GetStream/Vision-Agents

Open Vision Agents by Stream. Build voice and vision agents quickly with any model or video provider. Uses Stream's edge network for ultra-low latency.

★ 7,688 · Python · updated 2026-04-17 · agentic-ai, agents, ai, ai-agents, realtime

abus-aikorea/voice-pro

Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.

★ 6,708 · Python · updated 2025-12-05 · audiobook, faster-whisper, gradio, karaoke, podcasts

snakers4/silero-models

Silero Models: pre-trained text-to-speech models made embarrassingly simple

★ 5,888 · Jupyter Notebook · updated 2026-04-16 · armenian, azerbaijani, belarus, colab, georgian

thewh1teagle/vibe

Transcribe on your own!

★ 5,849 · TypeScript · updated 2026-04-19 · ai, cross-platform, desktop, openai, rust

MahmoudAshraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

★ 5,502 · Jupyter Notebook · updated 2026-02-23 · asr, speaker-diarization, speech, speech-recognition, speech-to-text

Picovoice/porcupine

On-device wake word detection powered by deep learning

★ 4,798 · Python · updated 2026-04-17 · handsfree, hotword, hotword-detection, hotword-detector, keyword-spotter

cactus-compute/cactus

Low-latency AI engine for mobile devices & wearables

★ 4,689 · C · updated 2026-04-20 · ai, android, arm, edge, edge-ai

huggingface/speech-to-speech

Build local voice agents with open-source models

★ 4,686 · Python · updated 2026-04-20 · ai, assistant, language-model, machine-learning, python

WhisperSpeech/WhisperSpeech

An Open Source text-to-speech system built by inverting Whisper.

★ 4,595 · Jupyter Notebook · updated 2025-12-14 · pytorch, speech-synthesis, tts

leetcode-mafia/cheetah

Mac app for crushing tech interviews with AI

★ 4,263 · Swift · updated 2025-01-14 · ai, chatgpt, gpt, gpt-4, openai

huggingface/distil-whisper

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.

★ 4,073 · Python · updated 2025-01-08 · audio, speech-recognition, whisper

collabora/WhisperLive

A nearly-live implementation of OpenAI's Whisper.

★ 3,984 · Python · updated 2026-04-16 · dictation, obs, openai, openvino, openvino-intel

umlx5h/LLPlayer

The media player for language learning, with dual subtitles, AI-generated subtitles, real-time translation, and more!

★ 3,657 · C# · updated 2026-04-19 · asr, csharp, flyleaf, language-learning, llm

ufal/whisper_streaming

Whisper realtime streaming for long speech-to-text transcription and translation

★ 3,603 · Python · updated 2025-11-12

n3d1117/chatgpt-telegram-bot

🤖 A Telegram bot that integrates with OpenAI's official ChatGPT APIs to provide answers, written in Python

★ 3,451 · Python · updated 2025-06-03 · chatgpt, dall-e, openai, python, telegram-bot

adamcohenhillel/ADeus

An open source AI wearable device that captures what you say and hear in the real world and then transcribes and stores it on your own server. You can then chat with Adeus using the app, and it will have all the right context about what you want to talk about - a truly personalized, personal AI.

★ 3,401 · TypeScript · updated 2024-04-22 · ai, open, open-source-ai, wear, wearable

tmoroney/auto-subs

Instantly generate AI-powered subtitles on your device. Works standalone or connects to DaVinci Resolve.

★ 3,269 · TypeScript · updated 2026-04-19 · ai, davinci, davinci-resolve, diarize, linux

HeyWillow/willow

Open source, local, and self-hosted Amazon Echo/Google Home competitive Voice Assistant alternative

★ 3,017 · C · updated 2026-02-13 · alexa, deep-learning, echo, esp-adf, esp-idf

pluja/whishper

Transcribe any audio to text, translate and edit subtitles 100% locally with a web UI. Powered by whisper models!

★ 2,979 · Svelte · updated 2025-08-15 · ai, audio-to-text, golang, speech-recognition, speech-to-text

davabase/whisper_real_time

Real time transcription with OpenAI Whisper.

★ 2,926 · Python · updated 2025-04-15

linto-ai/whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

★ 2,803 · Python · updated 2025-09-09 · asr, attention-is-all-you-need, attention-mechanism, attention-model, attention-network

facebookresearch/omnilingual-asr

Omnilingual ASR: open-source multilingual speech recognition for 1600+ languages

★ 2,780 · Python · updated 2025-12-30

jhj0517/Whisper-WebUI

A web UI for easy subtitling with the Whisper model.

★ 2,764 · Python · updated 2025-12-29 · ai, gradio, open-source, python, pytorch

rishikanthc/Scriberr

Self-hosted AI audio transcription

★ 2,605 · Go · updated 2026-03-22 · ai, audio, transcript, transcription

wendy7756/AI-Video-Transcriber

Transcribe and summarize videos and podcasts using AI. Open-source, multi-platform, and supports multiple languages.

★ 2,548 · Python · updated 2026-03-07 · aitool, tiktok, transcribe, videototext, youtube

sindresorhus/awesome-whisper

🔊 Awesome list for Whisper — an open-source AI-powered speech recognition system developed by OpenAI

★ 2,275 · updated 2026-03-17 · ai, artificial-intelligence, awesome, awesome-list, gpt

m1guelpf/auto-subtitle

Automatically generate and overlay subtitles for any video.

★ 2,203 · Python · updated 2024-07-12 · ffmpeg, openai-whisper, subtitle-generator, subtitles, subtitles-generator

react-native-voice/voice

🎤 React Native Voice Recognition library for iOS and Android (Online and Offline Support)

★ 2,156 · TypeScript · updated 2026-01-31 · android, ios, react-native, speech-recognition, voice-recognition

Nutlope/notesGPT

Record voice notes & transcribe, summarize, and get tasks

★ 2,126 · TypeScript · updated 2026-02-27

TEN-framework/ten-vad

Voice Activity Detector (VAD) : low-latency, high-performance and lightweight

★ 2,091 · C · updated 2026-02-02 · audio, automatic-speech-recognition, conversational-ai, real-time, silero-vad

schibsted/WAAS

Whisper as a Service (GUI and API with queuing for OpenAI Whisper)

★ 2,066 · JavaScript · updated 2026-04-17

kadirnar/whisper-plus

WhisperPlus: Faster, Smarter, and More Capable 🚀

★ 1,947 · Python · updated 2026-03-02

ricky0123/vad

Voice activity detector (VAD) for the browser with a simple API

★ 1,945 · TypeScript · updated 2026-01-30 · onnxruntime, silero-vad, speech-to-text, typescript, voice-activity-detection

kaixxx/noScribe

Cutting-edge AI technology for automated audio transcription. A nice GUI for OpenAI's Whisper and pyannote (speaker identification)

★ 1,876 · Python · updated 2026-04-17 · audio-transcription, faster-whisper, interview, pyannote, qualitative-research

ideasman42/nerd-dictation

Simple, hackable offline speech to text - using the VOSK-API.

★ 1,818 · Python · updated 2025-10-10

FL33TW00D/whisper-turbo

Cross-Platform, GPU Accelerated Whisper 🏎️

★ 1,800 · TypeScript · updated 2024-02-27 · audio, machine-learning, rust, speech-recognition, webgpu

antirez/voxtral.c

Pure C inference of Mistral Voxtral Realtime 4B speech to text model

★ 1,627 · C · updated 2026-02-15

semperai/amica

Amica is an open source interface for interactive communication with 3D characters with voice synthesis and speech recognition.

★ 1,492 · TypeScript · updated 2025-07-23 · ai, assistant-chat-bots, computer-vision, llm, speech-recognition

royshil/obs-localvocal

OBS plugin for local speech recognition and captioning using AI

★ 1,458 · C++ · updated 2026-04-09 · ai, live-streaming, livestream, obs, obs-studio

mkiol/dsnote

Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.

★ 1,442 · C++ · updated 2026-04-15 · asr, flatpak-applications, linux-desktop, machine-translation, nmt

bytedance/SALMONN

SALMONN family: A suite of advanced multi-modal LLMs

★ 1,412 · updated 2026-04-20 · audio, audio-processing, audio-visual-understanding, bytedance, iclr2024

Capsize-Games/airunner

Offline inference engine for art, real-time voice conversations, LLM powered chatbots and automated workflows

★ 1,314 · Python · updated 2026-02-10 · ai, ai-art, art, asset-generator, chatbot

yeyupiaoling/Whisper-Finetune

Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deployment, Windows desktop deployment, and Android deployment

★ 1,211 · C · updated 2025-12-17 · android, asr, chinese, ctranslate2, huggingface

oTranscribe/oTranscribe

A free & open tool for transcribing audio interviews

★ 1,192 · JavaScript · updated 2026-04-16

amicalhq/amical

🎙️ AI Dictation App - Open Source and Local-first ⚡ Type 3x faster, no keyboard needed. 🆓 Powered by open source models, works offline, fast and accurate.

★ 1,176 · TypeScript · updated 2026-04-18 · ai, ai-note-taking-app, asr, dictate, dictation

mbailey/voicemode

Natural (2-way) voice conversations with Claude Code

★ 1,125 · Python · updated 2026-04-19 · anthropic, asr, claude, claudecode, kokoro

PromtEngineer/Verbi

A modular voice assistant application for experimenting with state-of-the-art transcription, response generation, and text-to-speech models. Supports OpenAI, Groq, ElevenLabs, CartesiaAI, and Deepgram APIs, plus local models via Ollama. Ideal for research and development in voice technology.

★ 1,119 · Python · updated 2025-11-22

JuergenFleiss/aTrain

A GUI tool for offline transcription of speech recordings, including speaker diarization, utilizing state-of-the-art machine learning models.

★ 1,113 · Python · updated 2026-04-16

goodroot/hyprwhspr

Native speech-to-text for Linux - Fast, accurate and private system-wide dictation

★ 986 · Python · updated 2026-04-19 · ai, archlinux, cachyos, cohere-ai, debian

alesaccoia/VoiceStreamAI

Near-Realtime audio transcription using self-hosted Whisper and WebSocket in Python/JS

★ 957 · Python · updated 2024-10-02 · ai, speech-recognition, speech-to-text, websocket

octimot/StoryToolkitAI

An editing tool that uses AI to transcribe, understand content and search for anything in your footage, integrated with ChatGPT and other AI models

★ 936 · Python · updated 2025-02-26 · ai, chatgpt, davinci-resolve, editing, film-editing

sandrohanea/whisper.net

Whisper.net. Speech to text made simple using Whisper Models

★ 906 · C# · updated 2026-03-16 · cross-platform, dotnet, dotnetcore, hacktoberfest, speech-recognition

jitsi/jiwer

Evaluate your speech-to-text system with similarity measures such as word error rate (WER)

★ 881 · Python · updated 2026-04-16 · automatic-speech-recognition, evaluation-metrics, python3, speech-to-text, wer
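
For readers unfamiliar with the metric named in the entry above, here is a minimal sketch of word error rate (WER): the word-level edit distance (substitutions + deletions + insertions) divided by the number of reference words. It assumes plain whitespace tokenization and no text normalization. The function name `word_error_rate` is hypothetical; this is not jiwer's implementation or API.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Illustrative WER: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (rolling one-row dynamic programming).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

In the example call, the hypothesis has one substitution ("sat" to "sit") and one deletion ("the"), so the result is 2 errors over 6 reference words. WER can exceed 1.0 when the hypothesis contains many inserted words.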

voquill/voquill

Open source voice dictation technology

★ 880 · TypeScript · updated 2026-04-19 · agent, agentic-ai, dictation, free, local-ai

TheStageAI/TheWhisper

Optimized Whisper models for streaming and on-device use

★ 829 · Python · updated 2026-04-09 · apple-silicon, coreml, mlx, nvidia-gpu, on-device-ai

mayeaux/generate-subtitles

Generate transcripts for audio and video content with a user-friendly UI, powered by OpenAI's Whisper, with automatic translations and automatic video downloads via yt-dlp integration

★ 809 · JavaScript · updated 2023-03-16 · expressjs, gpu, libretranslate, machine-learning, nodejs

zai-org/GLM-ASR

GLM-ASR-Nano: A robust, open-source speech recognition model with 1.5B parameters

★ 797 · Python · updated 2026-03-06 · asr, edge, llm, voice

Kieirra/murmure

Fully local, private and cross platform Speech-to-Text with LLM Post-processing

★ 730 · TypeScript · updated 2026-04-19 · asr, asr-model, debian-packages, linux, macos

Notely-Voice/NotelyVoice

A 100% private AI voice transcription app that converts speech to text in 100+ languages. Built with Compose Multiplatform for Android & iOS using Whisper AI - no cloud uploads, all processing happens on-device for complete privacy.

★ 684 · C++ · updated 2026-04-07 · android, audio-player, compose-ios, compose-multiplatform, compose-multiplatform-sample

vilassn/whisper_android

Offline Speech Recognition with OpenAI Whisper and TensorFlow Lite for Android

★ 650 · C++ · updated 2026-03-18 · android, asr, automatic-speech-recognition, embedded, mobile

bolna-ai/bolna

Conversational voice AI agents

★ 630 · Python · updated 2026-04-20 · agentic-ai, agents, ai-agents, cartesia, conversational-ai

reriiasu/speech-to-text

Real-time transcription using faster-whisper

★ 615 · HTML · updated 2024-07-23 · faster-whisper, openai, speech-recognition, speech-to-text, voice-recognition

bbc/react-transcript-editor

A React component to make correcting automated transcriptions of audio and video easier and faster. By BBC News Labs. - Work in progress

★ 614 · JavaScript · updated 2024-02-12 · bbc-news-labs, kaldi, news-labs, react, stt

woheller69/whisperIME

Android Input Method Editor (IME) based on Whisper

★ 565 · Java · updated 2026-02-07 · android, on-device-ai, privacy-protection, speech-recognition, transcription

Bklieger/ScribeWizard

ScribeWizard: Generate organized notes from audio using Groq, Whisper, and Llama3

★ 504 · Python · updated 2025-08-12 · ai, groq, groq-api, llama3, replit

Nutlope/whisper

Take notes with your voice & transform them with AI

★ 496 · TypeScript · updated 2026-03-03

inclusionAI/Ming-UniAudio

Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing with Unified Representation

★ 443 · Python · updated 2025-11-27 · speech, speech-recognition, speech-synthesis, unified-model

URUWorks/TeroSubtitler

Tero Subtitler is an open source, cross-platform, and free subtitle editing software.

★ 443 · Pascal · updated 2026-04-18 · ai, audio-to-text, blu-ray, captions, editor

voxos-ai/bolna

End-to-end platform for building voice first multimodal agents

★ 431 · Python · updated 2024-10-28 · anyscale, chatgpt-api, claude-3-sonnet, deepgram, elevenlabs

asanchezyali/talking-avatar-with-ai

This project is a digital human that can talk and listen to you. It uses OpenAI's GPT to generate responses, OpenAI's Whisper to transcribe the audio, Eleven Labs to generate voice and Rhubarb Lip Sync to generate the lip sync.

★ 430 · JavaScript · updated 2026-01-18 · ai-avatars, digital-human, elevenlabs, lip-sync, lipsync

bigsk1/voice-chat-ai

🎙️ Speak with AI - Run locally using Ollama, OpenAI, Anthropic or xAI - Speech uses SparkTTS, OpenAI, ElevenLabs, Kokoro, Typecast or xAI

★ 427 · Python · updated 2026-04-14 · ai-speech, ai-voice, ai-voice-agent, anthropic-claude, conversational-ai

CodeUpdaterBot/ClickUi

The best way to use AI is on your own computer. Use local or paid API models, and ctrl+k to show/hide the chat UI. Experience the future of AI, and help build it too!

★ 424 · Python · updated 2025-04-16 · ai, chatgpt, claude, deepseek, hotkeys

Pikurrot/whisper-gui

A simple GUI to use Whisper.

★ 424 · Python · updated 2025-07-18 · gradio, gui, huggingface, interface, speech-recognition

homelab-00/TranscriptionSuite

A fully local and private Speech-To-Text app with cross-platform support, speaker diarization, Audio Notebook mode, LM Studio integration, and both longform and live transcription.

★ 396 · TypeScript · updated 2026-04-19 · diarization, dictation, docker, faster-whisper, linux

Siddhesh2377/ToolNeuron

On-device AI for Android — LLM chat (GGUF/llama.cpp), vision models (VLM), image generation (Stable Diffusion), tool calling, AI personas, RAG knowledge packs, TTS/STT. Fully offline, zero subscriptions, open-source.

★ 379 · Kotlin · updated 2026-03-15 · ai-personas, android, gguf-models, jetpack-compose, kotlin

dmtrKovalenko/subtitler

Free on-device web app for audio transcribing and rendering subtitles

★ 363 · ReScript · updated 2026-02-01 · ai, rescript, subtitles, webcodecs, whisper

vasistalodagala/whisper-finetune

Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from huggingface.

★ 361 · Python · updated 2023-05-23 · asr, jax, pytorch, speech-recognition, transformers

chrischoy/WhisperChain

Speech-to-text with all the bells and whistles and, most importantly, AI! AI will clean up your filler words, then edit and refine what you said!

★ 332 · Python · updated 2025-02-09

rhasspy/wyoming-faster-whisper

Wyoming protocol server for the faster-whisper speech-to-text system

★ 319 · Python · updated 2026-03-25

istupakov/onnx-asr

A lightweight Python package for Automatic Speech Recognition using ONNX models

★ 305 · Python · updated 2026-04-19 · asr, lightweight, onnx, parakeet, python

AudioLLMs/AudioBench

AudioBench: A Universal Benchmark for Audio Large Language Models

★ 303 · Python · updated 2025-06-17 · audio-scene-understanding, speech, speech-question-answering, speech-recognition

woheller69/whisperIMEplus

Android Input Method Editor (IME) based on RTranslators Whisper implementation

★ 297 · Java · updated 2026-04-04

NaomiProject/Naomi

The Naomi Project is an open source, technology agnostic platform for developing always-on, voice-controlled applications!

★ 295 · Python · updated 2025-07-09 · assistant, hacktoberfest, home-automation, iot, jarvis

OpenBMB/UltraEval-Audio

Your faithful, impartial partner for audio evaluation: know yourself, know your rivals. (Chinese tagline: "Honest evaluation; know yourself, know your rivals.")

★ 290 · Python · updated 2026-04-08 · evaluation, speech-recognition, speech-to-speech, speech-to-text

themanyone/whisper_dictation

Private voice keyboard, AI chat, images, webcam, recordings, voice control with >= 4 GiB of VRAM.

★ 289 · Python · updated 2025-12-30 · ai, assistant-chat-bots, assistive-technology, client, client-server

jatinkrmalik/vocalinux

Free, open-source, 100% offline voice dictation for Linux. Speak and type anywhere via whisper.cpp, Whisper & VOSK engines, GPU-accelerated, works on X11 + Wayland!

★ 281 · Python · updated 2026-04-20 · accessibility, dictation, gpu-acceleration, linux, offline-first

nateherkai/token-dashboard

See where Claude Code is burning tokens - turn raw JSONL transcripts into local cost analytics, hotspot views, and session-level usage insight.

★ 271 · Python · updated 2026-04-20

deepgram-devs/nextjs-live-transcription

Get started using Deepgram's Live Transcription with this Next.js demo app

★ 270 · TypeScript · updated 2026-04-10 · deepgram, live, real-time, speech-to-text, stt

jakovius/voxd

VOXD is speech-to-text, voice-typing, and dictation software for Linux distributions. It is open-source, free-of-charge, user-friendly software for as many Linux distros as possible.

★ 243 · Python · updated 2025-10-22 · ai-transcription, dictation, linux, open-source, user-friendly

yuvraj108c/ComfyUI-Whisper

Transcribe audio and add subtitles to videos using Whisper in ComfyUI

★ 229 · Python · updated 2026-01-02 · comfyui, stable-diffusion, whisper-ai

nganlinh4/oneclick-subtitles-generator

🎬 Auto-subtitle videos with AI transcription, translation, voice cloning, professional rendering, background image and music generator

★ 215 · JavaScript · updated 2026-03-13

argmaxinc/WhisperKitAndroid

On-device Speech Recognition for Android

★ 208 · Kotlin · updated 2026-01-24

phongthanhbuiit/whisper-realtime-gui

A modern, real-time speech recognition application built with OpenAI's Whisper and PySide6. This application provides a beautiful, native-looking interface for transcribing audio in real-time with support for multiple languages.

★ 200 · Python · updated 2025-09-13

DevEmperor/Dictate

A powerful Whisper AI keyboard for reliable speech transcription

★ 199 · Java · updated 2026-01-06 · ai, android, gpt, keyboard, openai

kantv-ai/kantv

Workbench for learning and practicing on-device AI technology in real scenarios with online TV on Android phones, powered by ggml (llama.cpp, whisper.cpp...), FFmpeg, and opencv-mobile

★ 192 · C++ · updated 2025-06-12 · ggml-hexagon, llamacpp-android-port, online-tv, qualcomm-npu, stablediffusioncpp-android-port

botbahlul/PyAutoSRT

PySimpleGUI based DESKTOP APP to AUTO GENERATE SUBTITLE FILE (using free Google Speech Recognition API) and TRANSLATED SUBTITLE FILE (using unofficial online Google Translate API) for any video or audio file

★ 189 · Python · updated 2024-05-05 · auto-caption, auto-subtitle, captions, ffmpeg, google-translate

futo-org/whisper-acft

★ 186 · Jupyter Notebook · updated 2024-06-26

charliermarsh/whisper.cpp-cli

Packages whisper.cpp into pre-built, pip-installable wheels, for macOS and Linux.

★ 178 · Python · updated 2024-06-10

QuantiusBenignus/BlahST

Input text from speech in any Linux window, the lean, fast and accurate way, using whisper.cpp OFFLINE. Speak with local LLMs via llama.cpp.

★ 172 · Shell · updated 2025-07-25 · accessibility, ai, bloat-free, chatbot, cli

rudymohammadbali/OpenAI-Whisper-GUI

Modern GUI application that transcribes and translates audio files using OpenAI Whisper.

★ 168 · Python · updated 2024-08-12

lee-b/kobold_assistant

Like ChatGPT's voice conversations with an AI, but entirely offline/private/trade-secret-friendly, using local AI models such as Llama 2 and Whisper

★ 161 · Python · updated 2024-08-20 · ai-assistant, ai-assistants, android, desktop, koboldai

coqui-ai/STT-models

Open models for Coqui STT

★ 155 · updated 2023-05-09 · deep-learning, models, speech-to-text

fgnt/meeteval

MeetEval - A meeting transcription evaluation toolkit

★ 154 · Python · updated 2026-01-27 · asr, der, wer

themanyone/voice_typing

State-of-the-art offline (or networked) voice typing everywhere + text terminals (Linux or WFL session on Windows.) with a simple bash script. Usable with X. Does not require X.

★ 154 · Shell · updated 2026-03-03 · bash, command-line, free, lightweight, open-source

danielmiessler/ExtractWisdom

An AI prompt project that uses AI to extract wisdom from all sorts of text, from podcast transcripts, conversations, talks, lectures, papers, articles, blog posts, essays, presentations, or whatever you can get into text form.

★ 139 · updated 2023-10-26

timmo001/home-assistant-assist-desktop

Use Home Assistant Assist on the desktop. Compatible with Windows, MacOS, and Linux

★ 132 · Svelte · updated 2026-01-15 · assist, cross-platform, desktop, home-assistant, home-assistant-assist

nomadkaraoke/karaoke-gen

Generate karaoke videos, by downloading audio and lyrics, separating instrumentals, synchronising lyrics using transcription models, rendering CDG and uploading videos to YouTube / Dropbox / Google Drive

★ 128 · HTML · updated 2026-04-27 · karaoke, karaoke-maker, lyrics, music, video

tsmdt/whisply

💬 Fast, cross-platform CLI and GUI for batch transcription, translation, speaker annotation and subtitle generation using OpenAI’s Whisper on CPU, Nvidia GPU and Apple MLX.

★ 121 · Python · updated 2026-04-06 · asr, automatic-speech-recognition, mlx, mlx-audio, speech-recognition

sevos/waystt

Wayland Speech-to-Text Tool - A minimal signal-driven speech-to-text tool for Wayland environments with PipeWire audio

★ 120 · Rust · updated 2026-03-21

j3soon/whisper-to-input

An Android keyboard that performs speech-to-text (STT/ASR) with OpenAI Whisper and inputs the recognized text; supports English, Chinese, Japanese, etc., and even mixed languages.

★ 119 · Kotlin · updated 2025-12-21 · android, android-ime, automatic-speech-recognition, chinese-speech-recognition, ime

alex-vt/WhisperInput

Offline voice input panel & keyboard with punctuation for Android.

★ 112 · Java · updated 2024-06-01 · android-app, offline, speech-recognition, speech-to-text, whisper-ai

QuantiusBenignus/blurt

Gnome shell extension for accurate OFFLINE speech to text input in Linux using whisper.cpp. Input text from speech anywhere.

★ 105 · JavaScript · updated 2025-04-09 · ai, asr, bloat-free, dictate, dictation

primaprashant/awesome-voice-typing

Curated list of open-source speech-to-text and voice typing tools for Linux, macOS, Windows, Android, and iOS. Offline, local, and cloud.

★ 99 · updated 2026-04-12 · ai, automatic-speech-recognition, awesome-list, dictation, dictation-tool

GhostNaN/whisper-subs

WhisperSubs is a mpv lua script to generate subtitles at runtime with whisper.cpp on Linux

★ 99 · Lua · updated 2025-02-09

AshBuk/speak-to-ai

Speak to AI • Native Linux Speech-to-Text (STT) • Offline, Privacy-Focused

★ 97 · Go · updated 2026-04-19 · appimage, dictation, go, linux, voice-input

build-with-groq/groq-subtitle-generator

Create subtitles in various languages in mere minutes using Whisper and Qwen3-32b via Groq's lightning-fast inference.

★ 94 · Python · updated 2025-12-17 · managed-by-terraform

autoshow/autoshow

End-to-end workflow to automatically generate show notes from audio/video transcripts

★ 94 · TypeScript · updated 2026-02-25 · assembly-ai, chatgpt, claude, deepgram, gemini

zugaldia/speedofsound

Voice typing for the Linux desktop.

★ 92 · Kotlin · updated 2026-04-19 · accessibility, asr, dictation, linux, portals

pariajm/awesome-disfluency-detection

A curated list of awesome disfluency detection publications along with the released code and bibliographical information

★ 83 · updated 2021-05-02 · awesome-list, conversational-speech-recognition, conversational-speech-translation, deep-disfluency-code, deep-disfluency-detection

hoodini/blitzai

קול — Professional Transcription Studio. Hebrew-first, 4 engines, YouTube support, correction studio.

★ 81 · Python · updated 2026-04-12

solaoi/lycoris

Real-time speech recognition & AI-powered note-taking app for macOS with offline/online modes, multilingual transcription, and Japanese translation support.

★ 75 · TypeScript · updated 2026-02-03 · macos, mcp, mcp-client, openai, screenshot

ShmuelRonen/hebrew_whisper

Hebrew Whisper: a powerful transcription and translation tool

★ 74 · Python · updated 2024-05-15

heimoshuiyu/whisper-fastapi

A very simple Whisper Python FastAPI for OpenAI API, Android voice-typing (konele), Home Assistant (wyoming), and a voice-typing script on Linux and macOS!

★ 74 · Python · updated 2025-12-28 · fastapi, konele, openai, whisper

Mikodin/obsidian-scribe

Record, transcribe, and transform voice notes into structured insights. Leverage Whisper or AssemblyAI and ChatGPT to fill in gaps, generate summaries, and visualize ideas — all seamlessly integrated within Obsidian.

★ 72 · TypeScript · updated 2026-02-24 · assemblyai, obsidian, openai, voice-assistant

dynamiccreator/whisper-typer-tool

A Python script that uses Whisper to type with your voice

★ 67 · Python · updated 2025-07-18

cjams/whispertux

Simple GUI around whisper.cpp for voice-to-text on Linux

★ 67 · Python · updated 2026-04-08 · dictation, linux, openai, prompt, voice

neonwatty/bleep-that-shit

Free in-browser audio & video censorship tool. AI-powered transcription with Whisper, 100% private client-side processing. Bleep profanity, custom words, or any phrase.

★ 63 · TypeScript · updated 2026-04-16 · ffmpeg, podcast, profanity-filter, speech-to-text, transformersjs

AIFSH/ComfyUI-WhisperX

A ComfyUI custom node for audio subtitling based on whisperX and translators

★ 62 · Python · updated 2025-04-01 · srt-subtitles, sutitles, translation, whisper

mikeesto/gemini-transcribe

Transcribe audio and video files with speaker diarization and logically grouped timestamps using Gemini Flash

★ 61 · Svelte · updated 2026-04-07 · gemini, gemini-flash, speaker-diarization, speech-to-text, sveltekit

ser/wyoming-whisper-api-client

Wyoming protocol server for the Whisper API speech to text system

★ 60 · Python · updated 2026-04-17

Mohamad-Hussein/speech-assistant

Desktop application for Linux and Windows that utilizes distil-whisper models from HuggingFace, to enable real-time offline speech-to-text dictation.

★ 60 · Python · updated 2025-10-16 · desktop-app, dictation, distil-whisper, huggingface, offline

pipecat-ai/stt-benchmark

Benchmarking STT service TTFB and semantic WER for real-time AI applications

★ 57Pythonupdated 2026-03-20

hyperaudio/hyperaudio-lite-editor

A lightweight transcript editor for editing and correcting STT generated timed transcripts

★ 57JavaScriptupdated 2026-01-03

jorge-menjivar/super-stt

Super STT enables effortless voice-to-text in any application, using the most advanced speech models.

★ 54Rustupdated 2026-04-14cosmiccosmic-desktoplinuxspeech-recognitionspeech-to-text

zhattention/ticktick-ai

🎯 AI-powered voice assistant for TickTick, enabling natural language task management through speech. Built with OpenAI's speech recognition and TickTick's API integration, this assistant helps you manage your todos hands-free - create tasks, set reminders, and organize your schedule using just your voice.

★ 54Pythonupdated 2025-02-24

Eyevinn/auto-subtitles

Automatically generate subtitles from an input audio or video file using OpenAI Whisper

★ 53TypeScriptupdated 2026-03-17ffmpegopenaiopenai-whispersubtitle-generatorsubtitles

bbc/subtitles-generator

A node module to generate subtitles by segmenting a list of time-coded text - BBC News Labs

★ 53JavaScriptupdated 2023-07-08captionsdigital-paper-editittjsonnews-labs

arcaputo3/mcp-server-whisper

An MCP Server for audio transcription using OpenAI

★ 52Pythonupdated 2025-10-16

xuegao-tzx/whisper_flutter_new

A Flutter library for offline speech-to-text conversion that uses the whisper.cpp model implementation for Android, iOS, and macOS.

★ 51C++updated 2026-04-15androidflutterioswhisperwhisper-cpp

TahaBakhtari/SubtitleGenerator

This project is a video processing application that extracts audio from videos, performs automatic speech recognition (ASR), and generates subtitles. It allows users to enhance audio quality, correct transcription errors, and convert subtitles into various dialects, all through a user-friendly command-line and web interface.

★ 50Pythonupdated 2025-03-30

beecave-homelab/insanely-fast-whisper-rocm

insanely-fast-whisper with support for AMD GPUs with ROCm 6.1 - 7.1

★ 48Pythonupdated 2026-04-18

perrypixel/VoiceTyper-Pro

VoiceTyper-Pro is an advanced speech-to-text dictation tool built with Python and powered by the Deepgram API. Alternative to Mac Whisper, Voice Access, and other voice typing tools.

★ 40Pythonupdated 2025-02-22deepgrammacvoice-recognitionwhisperwhisper-ai

KevKibe/African-Whisper

🚀 Framework for seamless fine-tuning of Whisper model on a multi-lingual dataset and deployment to prod.

★ 38Pythonupdated 2025-02-27asrspeechspeech-recognitionspeech-to-textspeech-transcription

daaku/whispy

Speech-to-text typing for Linux/Wayland using Whisper.

★ 38Goupdated 2025-12-06

aydinnyunus/LinuxVoiceAssistant

Linux Voice Assistant to Make Your Work Easier

★ 37Pythonupdated 2020-01-30assistantassistant-chat-botsgooglegoogle-assistantgoogle-assistant-apps

shashank2122/Local-Voice

A real-time, offline voice assistant for Linux and Raspberry Pi. Uses local LLMs (via Ollama), speech-to-text (Vosk), and text-to-speech (Piper) for fast, wake-free voice interaction. No cloud. No APIs. Just Python, a mic, and your voice.

★ 37Pythonupdated 2026-04-20androidchatbotdeep-learningechoesp-idf

WhiteSmoogy/audiov

audiov is a speech-to-text, voice-typing, dictation software for linux distributions.

★ 36Rustupdated 2026-03-14

Donkey545/wyoming-faster-whisper-rocm

Faster whisper Running on AMD GPUs with modified CTranslate 2 Libraries served up with Wyoming protocol

★ 36Pythonupdated 2024-08-17

cesp99/sussurro

A fully local, open-source voice-to-text tool that acts as a system-wide AI dictation layer, converting speech into clean, formatted text.

★ 35Goupdated 2026-04-13clidarwingolanglinuxllm-tools

CyR1en/ElevenLabsS4TS

Speech-to-text, text-to-speech with ElevenLabs

★ 35Pythonupdated 2023-12-21elevenlabspyside6pytorchspeech-to-texttext-to-speeh

knowall-ai/turbo-whisper

SuperWhisper-like voice dictation for Linux with waveform UI

★ 32Pythonupdated 2026-01-22

luoyuweidu1/podcastcut-skills

Claude Code Skills for podcast/video editing: transcription, content editing, rough/fine cut, final polish

★ 30HTMLupdated 2026-03-17

sushant-t/tts-trainer

Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using Whisper.

★ 30Pythonupdated 2023-05-27

andrewstack-maker/agenticSeek

AgenticSeek is a fully local, voice-enabled AI assistant designed to autonomously browse the web, write code, and plan tasks while ensuring complete privacy by keeping all data on your device. Tailored for local reasoning models, it runs entirely on your hardware, eliminating any cloud dependency.

★ 30Pythonupdated 2025-08-27ai-agentsai-assistantautonomous-web-browsingchromedrivercoding-assistance

ErcinDedeoglu/WhisperDock

Dockerized Whisper C++ speech-to-text API for easy deployment and rapid integration. Offering the latest stable and nightly builds for efficient audio transcription.

★ 28C++updated 2026-02-28apiaudio-transcriptiondockermachine-learningspeech-to-text

blakkd/faster-whisper-hotkey

Effortless Push-to-Talk Transcription, Anywhere.

★ 27Pythonupdated 2026-04-05canarycanary-1b-v2coherecohere-transcribe-03-2026faster-whisper

V0v1kkk/WhisperVoiceInput

A cross-platform desktop application that records audio and transcribes it to text using OpenAI's Whisper API or compatible services. Perfect for dictation, note-taking, and accessibility.

★ 27C#updated 2026-03-21

k2-fsa/sherpa-onnx-go

sherpa-onnx Go package for speech recognition without network access, supporting Linux, macOS, Windows

★ 26Goupdated 2026-04-16

Aaronontheweb/witticism

WhisperX-powered voice transcription tool that types text directly at your cursor position. Hold F9 to record, release to transcribe.

★ 26Pythonupdated 2026-03-04pytorchtranscriptionvoice-commandswhisperwhisperx

analyticsinmotion/werpy

🐍📦 Ultra-fast Python package for calculating and analyzing the Word Error Rate (WER). Built for the scalable evaluation of speech and transcription accuracy.

★ 26Pythonupdated 2026-03-30asrasr-evaluationautomatic-speech-recognitionlevenshtein-distancemetrics

yzfly/awesome-voice-agents

A curated list of voice AI agent frameworks, tools, resources, and best practices

★ 25updated 2026-04-06agentsrealtime-chatsttttsvad

ssciwr/vink

A stand-alone application with GUI for OpenAI's Whisper

★ 24Pythonupdated 2024-04-04guihacktoberfestiwr-hacktoberfestopenaipyinstaller

kejne/soundvibes

Speech-to-text for Linux that just works

★ 23Rustupdated 2026-04-09dictationspeech-to-textwhisper-cpp

i4Ds/whisper-finetune

This repository contains code for fine-tuning the Whisper speech-to-text model.

★ 23Jupyter Notebookupdated 2026-04-16fine-tuningnlpspeech-to-textwhisper

tubsn/gpt-buddy

Prompt Management System for Interaction with the ChatGPT API

★ 23JavaScriptupdated 2026-04-07aiaudio-transcribingimage-generationprompt-databaseprompts

filyp/whisper-simple-dictation

Handy voice dictation using whisper.

★ 23Pythonupdated 2026-03-30

hate/keyless

Privacy‑first, real‑time speech‑to‑text dictation. 100% local inference in Rust; hotkey to dictate anywhere (macOS, Linux, Windows).

★ 22Rustupdated 2025-12-15cpalcross-platformdesktop-appdictationrust

HackerLion123/subtitles_generator

Automatically create subtitles for any video using google speech to text cloud api.

★ 22Pythonupdated 2018-08-23

yfge/TalkReplay

The only tool that replays Claude, Codex, Cursor, AND Gemini AI coding sessions in one unified UI. Vibe coding companion for reviewing, searching, and sharing your AI pair programming transcripts.

★ 21TypeScriptupdated 2026-02-03ai-pair-programmingai-transcriptclaudeclaude-codecodex

theaifutureguy/DeepSeek-R1-Voice-Agent

An interactive AI voice agent that can capture and transcribe speech in real-time, generate intelligent responses using the DeepSeek R1 (7B model) AI, and convert the responses back to natural speech for immediate playback. The agent maintains conversation context and supports cross-platform usage on macOS, Linux, and Windows.

★ 20Pythonupdated 2025-06-20assemblyaideepseekdeepseek-r1elevenlabsportaudio

kaisoapbox/kaiboard

The best Android keyboard for offline speech recognition, using OpenAI's whisper model through whisper.cpp for fast and accurate output.

★ 19Kotlinupdated 2025-02-22

mbailey/push2type

Turn CAPSLOCK key into Dictation Key

★ 19Shellupdated 2025-06-09asrdictationspeech-recognitionspeech-to-textvoice

kodxana/whisperx-worker

RunPod Serverless worker for WhisperX

★ 19Pythonupdated 2026-03-26

cxyfer/GeminiASR

A Python tool that uses Google Gemini API to transcribe video or audio files into SRT subtitle files.

★ 19Pythonupdated 2026-01-02asrgeminigemini-apitranscribe

tempo-riz/deepgram_speech_to_text

A Deepgram client for Dart and Flutter, supporting all Speech-to-Text and Text-to-Speech features on every platform.

★ 19Dartupdated 2025-09-12dartdeepgram-apiflutteropen-sourcesdk

gladiaio/normalization

A lightweight library for normalizing speech transcripts before computing WER

★ 18Pythonupdated 2026-04-17aiasrbenchmarknormalizationspeech-to-text

0xPD33/sonori

Sonori is a fully local STT app for Linux (Wayland).

★ 18Rustupdated 2026-03-14asrautomatic-speech-recognitionctranslate2linuxonnxruntime

yadokani389/whisper-typing

Real-time voice input software using the Whisper model.

★ 17Pythonupdated 2025-07-19

BigUncle/Fast-Whisper-MCP-Server

A high-performance speech recognition MCP server based on Faster Whisper, providing efficient audio transcription capabilities.

★ 17Pythonupdated 2025-03-22

schnoddelbotz/whisper-ui

Transcribe audio/video to text, locally on macOS, Linux and Windows. A simple whisper.cpp wrapper/UI built with Go/Fyne.

★ 17Goupdated 2026-01-08ffmpegffmpeg-wrapperfyneguilocal

spartanhaden/whisper-typer

A small script that types what you say using whisper while holding a hotkey

★ 15Pythonupdated 2025-11-19openaipython3speach-recognitionspeach-to-texttyping-assistant

wheeler01/Linux-Dictation-Project

Expands the boundaries of speech recognition technology for documentation productivity on the Linux PC, with dictation and transcription capabilities as well as control over your system. Written in Python using Whisper.

★ 14Pythonupdated 2025-08-12

trebormc/linux-voice-to-text-ai

Linux-based voice-to-text tool using AI (Whisper/DeepGram) for real-time speech transcription. Command-line interface for easy recording, processing, and text output. Ideal for accessibility, dictation, and hands-free text input in Linux environments.

★ 14Pythonupdated 2025-04-20

SmartLittleApps/local-stt-mcp

A high-performance Model Context Protocol (MCP) server providing local speech-to-text transcription using whisper.cpp, optimized for Apple Silicon.

★ 13TypeScriptupdated 2025-06-03appleapple-siliconm1m2m3

peteonrails/hushnote

Privacy-first meeting transcription and voice-to-text tool for Linux. 100% local AI processing with faster-whisper and Ollama.

★ 13Shellupdated 2026-04-19

nerveband/whatsapp_voice_transcription

Node.js app that transcribes WhatsApp voice notes to text using OpenAI's Whisper API. The text can also be translated to the user's preferred language and sent back to their WhatsApp account.

★ 13HTMLupdated 2026-04-17

deepgram/voice-keyboard-linux

Linux virtual keyboard driver which types what you say using Deepgram Flux STT API

★ 13Rustupdated 2026-02-03

redocrepus/Whisper-Paste

Chrome extension that allows dictating anywhere using OpenAI Whisper

★ 13JavaScriptupdated 2023-09-29chrome-extensiondictationopenaiopenai-apitext-to-speech

ronb1964/TalkType

Privacy-first voice dictation for Linux Wayland — press a key to talk, release to type. Powered by Whisper AI, 100% offline, no subscription required.

★ 11Pythonupdated 2026-04-03accessibilityappimagedictationgnomegtk

robertelee78/swictation

Cross-platform voice-to-text dictation for Linux and macOS. Local/private STT using Parakeet-TDT 1.1B with NVIDIA CUDA or Apple CoreML acceleration.

★ 11Makefileupdated 2026-03-24

gurjar1/OmniDictate-CLI

Real-time AI dictation using faster-whisper—type anywhere with instant, accurate speech-to-text conversion.

★ 11Pythonupdated 2025-03-25

jpzk/wl-voice

A voice recording and transcription tool for Hyprland, using Whisper for speech-to-text and copying results to clipboard. It's using Faster Whisper (optimized for CPU) and runs fully locally.

★ 11Pythonupdated 2026-01-03hyprlandhyprland-dotfileshyprland-ricewaylandwhisper

TheNewC0der-24/Textonus

Voice to Text Online Notepad Professional, Accurate & Free Speech Recognition Text Editor Distraction-Free, Fast, Easy to Use Web App for Dictation & Typing

★ 11JavaScriptupdated 2026-02-14ant-designbootstraplocalstoragenpmpwa-app

samoylenkodmitry/Linux-AI-Assistant-scripts

These are my custom scripts for using Whisper / OpenAI via keyboard shortcuts and voice input.

★ 11Pythonupdated 2023-10-29

azkadev/whisper_flutter

Whisper Flutter Example Speech To Text Offline Android Linux Without Api Key Without FFMPEG

★ 10C++updated 2025-08-02aiazkadevdartflutterggml

gsu-library/whisper-scribe

An audio/video transcriber with diarization and transcription editing.

★ 10JavaScriptupdated 2026-03-17

GitJuhb/voice-typing-linux

Fast, accurate voice typing for Linux — IBus input method engine with streaming STT, Whisper refinement, and CUDA acceleration

★ 10Pythonupdated 2026-02-10accessibilityfaster-whisperlinuxnixosspeech-to-text

BharatKalluri/speechshift

A fully local, offline first speech-to-text application made for Linux!

★ 9Pythonupdated 2025-09-24

Ichigo3766/audio-transcriber-mcp

An MCP server that provides audio transcription capabilities using OpenAI's Whisper API

★ 9JavaScriptupdated 2025-03-25

LumenYoung/Whisper-Dictation

A dictation application on linux using openai's whisper. Currently only used on KDE wayland.

★ 9Pythonupdated 2023-04-11

openresearchtools/transcribeoffline

Transcribe Offline by openresearchtools.com is an open source desktop application that allows you to transcribe audio and video fully offline, with optional speaker diarisation and word-level alignment. It can also generate subtitles and integrate with local large language models (LLMs) for summarisation and editing

★ 9Rustupdated 2026-03-21ailocalaimacosopen-sourcetranscribe

imAbdelhadi/audio2srt

Convert audio files (flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, and webm) to SRT subtitles with OpenAI Whisper. Easy script for fast, accurate transcription.

★ 9Pythonupdated 2024-06-11

AsyncFuncAI/whisper-notes

Transform your voice into text effortlessly with Whisper Notes

★ 9Rustupdated 2025-02-16

kivy-school/kivywhisper

Kivywhisper is a cross platform Python GUI for OpenAI's Whisper.

★ 9Pythonupdated 2024-07-26

taraksh01/wisper

Wisper - Voice dictation app for Linux. Type directly at cursor with AI-powered transcription.

★ 8TypeScriptupdated 2026-02-14assistanthelpervoice-recognition

jdpsc/claude-code-voice

Add voice-to-text capabilities to Claude Code using OpenAI Whisper for speech recognition.

★ 8Pythonupdated 2025-08-09

RemiFabre/voice2clipboard

One-key voice-to-transcription tool: record speech, transcribe locally with Whisper, then paste. Never lose your audio files anymore!

★ 8Pythonupdated 2026-03-23chatgptlinuxllmollamaopen-source

eriknovikov/voice-type

A Linux-first, system-wide dictation tool for speech-to-text (STT). Super accurate, fast, and free.

★ 7TypeScriptupdated 2026-04-15

thewh1teagle/whisper-heb-ipa

Fine-tuned Whisper that transcribes Hebrew audio into IPA

★ 7Pythonupdated 2026-04-08g2phebrewipawhisper

RhinoDevel/mt_stt

Pure C wrapper library to use Whisper.cpp with Linux and Windows as simple as possible.

★ 7C++updated 2025-09-17speech-to-textsttwhisperwhisper-cpp

hyun-yang/MyChatGPT

The ultimate PyQt6 application that integrates the power of OpenAI, Google Gemini, Claude, and other open-source AI models

★ 7Pythonupdated 2025-12-07agentchatgptclaudedalleevaluator-optimizer-workflow

shinglyu/WhisperNow

A voice transcription tool using faster-whisper that records audio and converts speech to text on Linux systems.

★ 7Pythonupdated 2025-02-20

AgenticToaster/whisprd

A powerful, real-time dictation system for Linux

★ 6Pythonupdated 2025-06-22dictationlinuxspeech-to-textwhisper

johnyquest7/whisper_chrome_ai

Browser-Based AI Assistant: Speech-to-Text with Whisper and Local AI Answers

★ 6HTMLupdated 2024-07-04

jphme/supervoxtral

Local realtime transcription tool powered by Voxtral Mini

★ 5Swiftupdated 2026-02-23

zyk42/TalkType

TalkType is a cross-platform application built with Electron, supporting Windows, macOS, and Linux. By combining Automatic Speech Recognition (ASR) with Large Language Models (LLM), it goes beyond simple dictation to offer "Understanding", "Polishing", and "Q&A" capabilities — your all-in-one voice writing assistant.

★ 5JavaScriptupdated 2026-03-01

thunderpoot/audio-censor

Rudimentary program for speech transcription, manipulation, and redaction.

★ 5Pythonupdated 2024-07-17audiocensorcensorshippydubredaction

hoomanick/AudioWrite

AudioWrite: Effortless voice dictation powered by Google's Gemini API. Record, transcribe, and transform rambling audio into polished, multi-language notes. PWA ready.

★ 5TypeScriptupdated 2026-01-05aiaudio-recorderdictationfrontendgemini-api

mzazakeith/whisper-notes-pro

A modern, lightweight note-taking app powered by Whisper

★ 5Rustupdated 2025-05-06aiwhisperwhisper-ai

lliWcWill/maVoice-Linux

🎙️ Lightning-fast voice dictation Desktop Web App powered by Groq's Whisper Turbo - Open-source, privacy-first, with real-time audio visualization and intuitive click controls

★ 4Rustupdated 2026-03-14desktop-appdesktop-web-based-appgroqlinuxreal-time

chris17453/dictator

A Linux / GNOME dictation app which uses fast whisper to do speech to text.

★ 4Pythonupdated 2026-01-09

sevos/niri-transcribe

Real-Time Transcription System for Niri - MacOS-like dictation for Linux Wayland environments

★ 4JavaScriptupdated 2026-02-14

twidi/twistt

Twidi Speech To Text (openai, push to talk, linux, wayland, deepgram)

★ 4Pythonupdated 2026-02-22deepgramopenaiopenai-apipush-to-talkspeech-to-text

Andrewske/whisper-wayland

Voice dictation for Linux/Wayland (like wisprflow). 100% offline, GPU-accelerated, and actually works with Wayland compositors.

★ 4Pythonupdated 2025-10-09dictationlinuxopenaiwaylandwhisper

VaishakhVipin/whispers-final

🗣️ Whispers Talk. Recall. Repeat. A blazing-fast voice journal that remembers everything you say — searchable with AI. ✨ What is Whispers? Whispers is a voice-first journaling app powered by: 🧠 <300ms Latency Streaming Transcription (AssemblyAI) 🔍 Algolia MCP for instant search of your thoughts

★ 4TypeScriptupdated 2025-07-28

mateusz-kow/auto-subs-legacy

An offline-first desktop app to automatically transcribe and edit video subtitles using OpenAI Whisper. Full control over text, timing, and advanced styling in a powerful, intuitive editor.

★ 4Pythonupdated 2025-08-16

SarwadnyaMahajan/WhisperVoice

WhisperVoice: Covert voice notes. Encrypts text and hides it via LLM-generated acrostic sentences. Murf.ai creates natural audio. Browser extension decrypts with passcode, revealing hidden message or playing decoy for unauthorized listeners. Uses LLM, Murf.ai, STT APIs

★ 4JavaScriptupdated 2025-06-29murf-aimurf-ai-hackathon

atkvishnu/whisper-hotkey-linux

Press F9. Speak. Paste. A blazing-fast, offline voice transcription tool for Linux using Whisper.cpp, bound to a global hotkey.

★ 4Shellupdated 2025-06-24clipboardlinuxspeech-to-textubuntuvoice-to-text

AdrianScott/froshine

Speech-to-Text/Code using a fast local LLM, for Linux, uses Whisper

★ 4Pythonupdated 2025-11-15linuxttswhisperwhisper-ai

errogaht/ubuntu-wayland-voice-input

Voice input tool for Ubuntu 25.04 with Wayland. Record speech with hotkey, transcribe via Nexara API, and copy to clipboard.

★ 3JavaScriptupdated 2026-02-02

Wiecek-K/local-dictation-assistant

A fully offline, high-performance, streaming speech-to-text tool for developers on Linux.

★ 3Pythonupdated 2025-10-24

sanastasiou/dictation-service

GPU-accelerated speech-to-text service that types what you say, powered by OpenAI's Whisper AI

★ 3Pythonupdated 2025-10-09accessibilitycudadictationgpu-accelerationlinux

VanshShah1/hyperwhisper

type 10x faster with ai assisted voice typing

★ 3Pythonupdated 2025-03-17

R3DK3LL/VocalFLow

Your voice - VocalFlow dictation, harnessing Whisper and faster-whisper for real-time transcription, adaptive learning, and NLP. Built with Python, it spans Linux, Windows, and macOS, boosting productivity through voice-assisted workflows.

★ 3Pythonupdated 2025-09-08cross-platformdesktop-appdictationfaster-whisperlinux

makelinux/multi-dictate

A user-friendly voice dictation application for Linux that supports multiple languages.

★ 3Pythonupdated 2025-11-03accessibilitydictationlinuxmultilingualproductivity

CGAlei/FasterWhisper

Real-time desktop audio transcription using OpenAI Whisper for Arch Linux with CUDA acceleration

★ 3Pythonupdated 2025-08-05

97k/mcp-audio-server

A powerful audio transcription server that seamlessly transcribes meeting recordings, generates notes, and intelligently splits audio files for efficient management. Open-source and built with FastMCP and Groq/OpenAI Whisper

★ 3Pythonupdated 2025-06-13

pmerwin/audio-transcription-mcp

MCP server for real-time audio transcription using OpenAI Whisper

★ 3TypeScriptupdated 2025-10-08

eddiedunn/pywhisper-dictation

Simple Python Tkinter GUI App for linux that uses whisper from openai for transcription.

★ 3Pythonupdated 2023-06-22

fengwk/linux-stt-input

A local, real-time speech-to-text (STT) input tool for Linux, powered by RealtimeSTT and Faster-Whisper. Press a hotkey to dictate directly into any application.

★ 3Pythonupdated 2025-08-23

stanvx/NotelyCapture

A 100% private AI voice transcription app that converts speech to text in 50+ languages. Built with Compose Multiplatform for Android using Whisper AI - no cloud uploads, all processing happens on-device for complete privacy.

★ 2C++updated 2025-08-05

arturo-jc/ptt-dictate

Push-to-talk voice dictation for Linux. Record with PipeWire, transcribe locally via whisper.cpp, and type text into any app using ydotool. Fast, private, and works system-wide with a single hotkey.

★ 2Shellupdated 2025-10-20

build-ai-applications/Eval-STT

Open-Source Speech-to-Text Evaluation Framework

★ 2Jupyter Notebookupdated 2025-02-12llm-evaluationllm-inferencespeech-to-text

Anewryzm/transcript-generator-mcp-server

A powerful MCP (Model Context Protocol) server that transcribes audio and video files into text using Groq's Whisper model.

★ 2Pythonupdated 2025-06-10

rolandtritsch/whisper-wayland

A push-to-talk wisper-flow service to support voice-based vibe coding with Claude

★ 2Pythonupdated 2026-04-15

afif-malghani/stt-linux

Speech to text for linux using whisper

★ 2Pythonupdated 2025-05-17

vanviegen/speech2keys

A fast, lightweight Linux tool that converts speech to text and types it into any window using OpenAI's Whisper API.

★ 1Rustupdated 2025-11-07

alexandrehsantos/whisper-voice-typing

Voice-to-text input daemon for Linux using OpenAI Whisper

★ 1Pythonupdated 2025-10-05

emonasterios/dictado-llm

Offline Voice Dictation & Text Enhancement: a lightweight, 100% local Linux tool for real-time voice‑to‑text transcription and LLM‑powered writing improvements.

★ 1Shellupdated 2025-05-15

rpodgorny/rpdictation

from microphone directly to your app

★ 1Rustupdated 2026-04-15aidictationlinuxllmwhisper

dev-git-unix/whisper-dictation-linux

macOS-style dictation for Ubuntu using Whisper. Press double-Ctrl, speak, and your words are transcribed to text locally with faster-whisper. Supports clipboard output, customizable hotkeys, and offline models for speed and privacy.

★ 1Pythonupdated 2025-09-24

tksimson/dicti

Linux Live Dictation - Real-time speech-to-text with Whisper

★ 1Pythonupdated 2025-09-15

Kieldro/whisper-hotkey

Linux voice transcription with hotkey using faster-whisper (local) with optional GPT-4o mini polishing

★ 1Pythonupdated 2026-04-16dictationlinuxparakeetpush-to-talkspeech-to-text

maglu/log_whisperer

Linux log interpreter using AI

★ 1Pythonupdated 2025-05-10

gorodulin/last_summer_agent

Langflow-based LLM agent that keeps track of my personal projects. Based on integration with WhatsApp voice messages, Whisper, OpenAI/Mistral models and local MCP.

★ 1Pythonupdated 2025-06-22

vidau-ai/asr_mcp_server

A Model Context Protocol (MCP) server that provides ASR (Automatic Speech Recognition) capabilities using the whisper engine. This server exposes TTS functionality through MCP tools, making it easy to integrate speech synthesis into your applications.

★ 1Pythonupdated 2025-03-31

Interpause/my-own-assistant

Whisper + TTS + As many MCP servers as I can stuff in

★ 1Pythonupdated 2025-05-01

samihalawa/insanely-fast-whisper-mcp

Blazingly fast audio transcription MCP server using Whisper with Flash Attention 2

★ 1Pythonupdated 2025-12-04

yhsung/whisper-cli-mcp

mcp server for whisper-cli

★ 1Pythonupdated 2025-07-18

zeglicz/subtitles-generator

App for transcribing audio/video to editable SRT subtitles using Whisper. Supports mp3/mp4/wav inputs, audio extraction, and local download.