STT (Speech-to-Text)

293 repos
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
★ 159,926Pythonupdated 2026-04-20audiodeep-learningdeepseekgemmaglm
Robust Speech Recognition via Large-Scale Weak Supervision
★ 98,410Pythonupdated 2026-04-15
Port of OpenAI's Whisper model in C/C++
★ 49,040C++updated 2026-04-20inferenceopenaispeech-recognitionspeech-to-texttransformer
Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.
★ 34,251Pythonupdated 2026-03-26agentaiassistantchatchatgpt
Distribute and run LLMs with a single file.
★ 24,274C++updated 2026-04-17cross-platformggufllama-cpplocal-ailocal-inference
The open-source AI voice studio. Clone, dictate, create.
★ 23,403TypeScriptupdated 2026-04-20aicudamlxqwen3-ttsqwen3-tts-ui
Faster Whisper transcription with CTranslate2
★ 22,404Pythonupdated 2025-11-19deep-learninginferenceopenaiquantizationspeech-recognition
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
★ 21,502Pythonupdated 2026-04-04asrspeechspeech-recognitionspeech-to-textwhisper
A free, open source, and extensible speech-to-text application that works completely offline.
★ 20,572Rustupdated 2026-04-19accessibilitycross-platformspeech-to-texttauri-v2
Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.
★ 18,900Pythonupdated 2026-04-19whisper
Give your AI agent eyes to see the entire internet. Read & search Twitter, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu — one CLI, zero API fees.
★ 18,240Pythonupdated 2026-04-13agent-infrastructureai-agentai-searchautomationbilibili
🧠 Leon is your open-source personal assistant.
★ 17,188TypeScriptupdated 2026-04-19aiai-assistantartificial-intelligenceassistantautomation
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
★ 17,126Pythonupdated 2026-04-20asrdeeplearninggenerative-aimachine-translationneural-networks
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
★ 15,852Pythonupdated 2026-03-17audio-visual-speech-recognitionconformerdfsmnparaformerpretrained-model
kaldi-asr/kaldi is the official location of the Kaldi project.
★ 15,378Shellupdated 2025-09-22c-plus-pluscudakaldishellspeaker-id
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
★ 12,594Pythonupdated 2026-04-15asrcode-switchconformerkwspunctuation-restoration
★ 12,575Jupyter Notebookupdated 2025-10-25
AI that sees your screen, listens to your conversations and tells you what to do
★ 12,196Dartupdated 2026-04-20aiappbcicflutter
Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, and a websocket server/client, with support for 12 programming languages.
★ 11,826C++updated 2026-04-20aarch64androidarm32asrcpp
A PyTorch-based Speech Toolkit
★ 11,475Pythonupdated 2026-04-03asraudioaudio-processingdeep-learninghuggingface
Privacy-first AI meeting assistant with 4x faster Parakeet/Whisper live transcription, speaker diarization, and Ollama summarization, built on Rust. 100% local processing. No cloud required. Meetily (Meetly Ai - https://meetily.ai) is the #1 self-hosted, open-source AI meeting note taker for macOS & Windows.
★ 11,335Rustupdated 2026-03-16aiai-meeting-assistantllmlocal-aimac
High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model
★ 10,356C++updated 2024-08-03
Simultaneous speech-to-text models
★ 10,187Pythonupdated 2026-03-31
OpenVINO™ is an open source toolkit for optimizing and deploying AI inference
★ 10,142C++updated 2026-04-20aicomputer-visiondeep-learningdeploy-aidiffusion-models
A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.
★ 9,743Pythonupdated 2026-03-14pythonrealtimespeech-to-text
Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source, speech, and multimodal models on cloud, on-prem, or your laptop — all through one unified, production-ready inference API.
★ 9,261Pythonupdated 2026-04-20artificial-intelligencechatglmdeploymentflan-t5gemma
Multilingual Voice Understanding Model
★ 8,041Pythonupdated 2025-12-30aiaigcasraudio-event-classificationcross-lingual
Very low latency speech to text, intent recognition, and text to speech, for building voice agents and interfaces
★ 7,857Cupdated 2026-04-20intent-recognitionsttttsvoicevoice-recognition
Open Vision Agents by Stream. Build voice and vision agents quickly with any model or video provider. Uses Stream's edge network for ultra-low latency.
★ 7,688Pythonupdated 2026-04-17agentic-aiagentsaiai-agentsrealtime
Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.
★ 6,708Pythonupdated 2025-12-05audiobookfaster-whispergradiokaraokepodcasts
Silero Models: pre-trained text-to-speech models made embarrassingly simple
★ 5,888Jupyter Notebookupdated 2026-04-16armenianazerbaijanibelaruscolabgeorgian
Transcribe on your own!
★ 5,849TypeScriptupdated 2026-04-19aicross-platformdesktopopenairust
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
★ 5,502Jupyter Notebookupdated 2026-02-23asrspeaker-diarizationspeechspeech-recognitionspeech-to-text
On-device wake word detection powered by deep learning
★ 4,798Pythonupdated 2026-04-17handsfreehotwordhotword-detectionhotword-detectorkeyword-spotter
Low-latency AI engine for mobile devices & wearables
★ 4,689Cupdated 2026-04-20aiandroidarmedgeedge-ai
Build local voice agents with open-source models
★ 4,686Pythonupdated 2026-04-20aiassistantlanguage-modelmachine-learningpython
An Open Source text-to-speech system built by inverting Whisper.
★ 4,595Jupyter Notebookupdated 2025-12-14pytorchspeech-synthesistts
Mac app for crushing tech interviews with AI
★ 4,263Swiftupdated 2025-01-14aichatgptgptgpt-4openai
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
★ 4,073Pythonupdated 2025-01-08audiospeech-recognitionwhisper
A nearly-live implementation of OpenAI's Whisper.
★ 3,984Pythonupdated 2026-04-16dictationobsopenaiopenvinoopenvino-intel
The media player for language learning, with dual subtitles, AI-generated subtitles, real-time translation, and more!
★ 3,657C#updated 2026-04-19asrcsharpflyleaflanguage-learningllm
Whisper realtime streaming for long speech-to-text transcription and translation
★ 3,603Pythonupdated 2025-11-12
🤖 A Telegram bot that integrates with OpenAI's official ChatGPT APIs to provide answers, written in Python
★ 3,451Pythonupdated 2025-06-03chatgptdall-eopenaipythontelegram-bot
An open source AI wearable device that captures what you say and hear in the real world and then transcribes and stores it on your own server. You can then chat with Adeus using the app, and it will have all the right context about what you want to talk about - a truly personalized, personal AI.
★ 3,401TypeScriptupdated 2024-04-22aiopenopen-source-aiwearwearable
Instantly generate AI-powered subtitles on your device. Works standalone or connects to DaVinci Resolve.
★ 3,269TypeScriptupdated 2026-04-19aidavincidavinci-resolvediarizelinux
Open source, local, and self-hosted Amazon Echo/Google Home competitive Voice Assistant alternative
★ 3,017Cupdated 2026-02-13alexadeep-learningechoesp-adfesp-idf
Transcribe any audio to text, translate and edit subtitles 100% locally with a web UI. Powered by whisper models!
★ 2,979Svelteupdated 2025-08-15aiaudio-to-textgolangspeech-recognitionspeech-to-text
Real time transcription with OpenAI Whisper.
★ 2,926Pythonupdated 2025-04-15
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
★ 2,803Pythonupdated 2025-09-09asrattention-is-all-you-needattention-mechanismattention-modelattention-network
Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages
★ 2,780Pythonupdated 2025-12-30
A Web UI for easy subtitle generation using the Whisper model.
★ 2,764Pythonupdated 2025-12-29aigradioopen-sourcepythonpytorch
Self-hosted AI audio transcription
★ 2,605Goupdated 2026-03-22aiaudiotranscripttranscription
Transcribe and summarize videos and podcasts using AI. Open-source, multi-platform, and supports multiple languages.
★ 2,548Pythonupdated 2026-03-07aitooltiktoktranscribevideototextyoutube
🔊 Awesome list for Whisper — an open-source AI-powered speech recognition system developed by OpenAI
★ 2,275updated 2026-03-17aiartificial-intelligenceawesomeawesome-listgpt
Automatically generate and overlay subtitles for any video.
★ 2,203Pythonupdated 2024-07-12ffmpegopenai-whispersubtitle-generatorsubtitlessubtitles-generator
🎤 React Native Voice Recognition library for iOS and Android (Online and Offline Support)
★ 2,156TypeScriptupdated 2026-01-31androidiosreact-nativespeech-recognitionvoice-recognition
Record voice notes & transcribe, summarize, and get tasks
★ 2,126TypeScriptupdated 2026-02-27
Voice Activity Detector (VAD) : low-latency, high-performance and lightweight
★ 2,091Cupdated 2026-02-02audioautomatic-speech-recognitionconversational-aireal-timesilero-vad
Whisper as a Service (GUI and API with queuing for OpenAI Whisper)
★ 2,066JavaScriptupdated 2026-04-17
WhisperPlus: Faster, Smarter, and More Capable 🚀
★ 1,947Pythonupdated 2026-03-02
Voice activity detector (VAD) for the browser with a simple API
★ 1,945TypeScriptupdated 2026-01-30onnxruntimesilero-vadspeech-to-texttypescriptvoice-activity-detection
Cutting-edge AI technology for automated audio transcription. A nice GUI for OpenAI's Whisper and pyannote (speaker identification).
★ 1,876Pythonupdated 2026-04-17audio-transcriptionfaster-whisperinterviewpyannotequalitative-research
Simple, hackable offline speech to text - using the VOSK-API.
★ 1,818Pythonupdated 2025-10-10
Cross-Platform, GPU Accelerated Whisper 🏎️
★ 1,800TypeScriptupdated 2024-02-27audiomachine-learningrustspeech-recognitionwebgpu
Pure C inference of Mistral Voxtral Realtime 4B speech to text model
★ 1,627Cupdated 2026-02-15
Amica is an open source interface for interactive communication with 3D characters with voice synthesis and speech recognition.
★ 1,492TypeScriptupdated 2025-07-23aiassistant-chat-botscomputer-visionllmspeech-recognition
OBS plugin for local speech recognition and captioning using AI
★ 1,458C++updated 2026-04-09ailive-streaminglivestreamobsobs-studio
Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.
★ 1,442C++updated 2026-04-15asrflatpak-applicationslinux-desktopmachine-translationnmt
SALMONN family: A suite of advanced multi-modal LLMs
★ 1,412updated 2026-04-20audioaudio-processingaudio-visual-understandingbytedanceiclr2024
Offline inference engine for art, real-time voice conversations, LLM powered chatbots and automated workflows
★ 1,314Pythonupdated 2026-02-10aiai-artartasset-generatorchatbot
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deployment, Windows desktop deployment, and Android deployment
★ 1,211Cupdated 2025-12-17androidasrchinesectranslate2huggingface
A free & open tool for transcribing audio interviews
★ 1,192JavaScriptupdated 2026-04-16
🎙️ AI Dictation App - Open Source and Local-first ⚡ Type 3x faster, no keyboard needed. 🆓 Powered by open source models, works offline, fast and accurate.
★ 1,176TypeScriptupdated 2026-04-18aiai-note-taking-appasrdictatedictation
Natural (2-way) voice conversations with Claude Code
★ 1,125Pythonupdated 2026-04-19anthropicasrclaudeclaudecodekokoro
A modular voice assistant application for experimenting with state-of-the-art transcription, response generation, and text-to-speech models. Supports OpenAI, Groq, ElevenLabs, CartesiaAI, and Deepgram APIs, plus local models via Ollama. Ideal for research and development in voice technology.
★ 1,119Pythonupdated 2025-11-22
A GUI tool for offline transcription of speech recordings, including speaker diarization, utilizing state-of-the-art machine learning models.
★ 1,113Pythonupdated 2026-04-16
Native speech-to-text for Linux - Fast, accurate and private system-wide dictation
★ 986Pythonupdated 2026-04-19aiarchlinuxcachyoscohere-aidebian
Near-Realtime audio transcription using self-hosted Whisper and WebSocket in Python/JS
★ 957Pythonupdated 2024-10-02aispeech-recognitionspeech-to-textwebsocket
An editing tool that uses AI to transcribe, understand content and search for anything in your footage, integrated with ChatGPT and other AI models
★ 936Pythonupdated 2025-02-26aichatgptdavinci-resolveeditingfilm-editing
Whisper.net. Speech to text made simple using Whisper Models
★ 906C#updated 2026-03-16cross-platformdotnetdotnetcorehacktoberfestspeech-recognition
Evaluate your speech-to-text system with similarity measures such as word error rate (WER)
★ 881Pythonupdated 2026-04-16automatic-speech-recognitionevaluation-metricspython3speech-to-textwer
Open source voice dictation technology
★ 880TypeScriptupdated 2026-04-19agentagentic-aidictationfreelocal-ai
Optimized Whisper models for streaming and on-device use
★ 829Pythonupdated 2026-04-09apple-siliconcoremlmlxnvidia-gpuon-device-ai
Generate transcripts for audio and video content with a user-friendly UI, powered by OpenAI's Whisper, with automatic translations and automatic video downloads through yt-dlp integration.
★ 809JavaScriptupdated 2023-03-16expressjsgpulibretranslatemachine-learningnodejs
GLM-ASR-Nano: A robust, open-source speech recognition model with 1.5B parameters
★ 797Pythonupdated 2026-03-06asredgellmvoice
Fully local, private and cross platform Speech-to-Text with LLM Post-processing
★ 730TypeScriptupdated 2026-04-19asrasr-modeldebian-packageslinuxmacos
A 100% private AI voice transcription app that converts speech to text in 100+ languages. Built with Compose Multiplatform for Android & iOS using Whisper AI - no cloud uploads, all processing happens on-device for complete privacy.
★ 684C++updated 2026-04-07androidaudio-playercompose-ioscompose-multiplatformcompose-multiplatform-sample
Offline Speech Recognition with OpenAI Whisper and TensorFlow Lite for Android
★ 650C++updated 2026-03-18androidasrautomatic-speech-recognitionembeddedmobile
Conversational voice AI agents
★ 630Pythonupdated 2026-04-20agentic-aiagentsai-agentscartesiaconversational-ai
Real-time transcription using faster-whisper
★ 615HTMLupdated 2024-07-23faster-whisperopenaispeech-recognitionspeech-to-textvoice-recognition
A React component to make correcting automated transcriptions of audio and video easier and faster. By BBC News Labs. - Work in progress
★ 614JavaScriptupdated 2024-02-12bbc-news-labskaldinews-labsreactstt
Android Input Method Editor (IME) based on Whisper
★ 565Javaupdated 2026-02-07androidon-device-aiprivacy-protectionspeech-recognitiontranscription
ScribeWizard: Generate organized notes from audio using Groq, Whisper, and Llama3
★ 504Pythonupdated 2025-08-12aigroqgroq-apillama3replit
Take notes with your voice & transform them with AI
★ 496TypeScriptupdated 2026-03-03
Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing with Unified Representation
★ 443Pythonupdated 2025-11-27speechspeech-recognitionspeech-synthesisunified-model
Tero Subtitler is an open source, cross-platform, and free subtitle editing software.
★ 443Pascalupdated 2026-04-18aiaudio-to-textblu-raycaptionseditor
End-to-end platform for building voice first multimodal agents
★ 431Pythonupdated 2024-10-28anyscalechatgpt-apiclaude-3-sonnetdeepgramelevenlabs
This project is a digital human that can talk and listen to you. It uses OpenAI's GPT to generate responses, OpenAI's Whisper to transcribe the audio, Eleven Labs to generate the voice, and Rhubarb Lip Sync to generate the lip sync.
★ 430JavaScriptupdated 2026-01-18ai-avatarsdigital-humanelevenlabslip-synclipsync
🎙️ Speak with AI - Run locally using Ollama, OpenAI, Anthropic or xAI - Speech uses SparkTTS, OpenAI, ElevenLabs, Kokoro, Typecast or xAI
★ 427Pythonupdated 2026-04-14ai-speechai-voiceai-voice-agentanthropic-claudeconversational-ai
The best way to use AI is on your own computer. Use local or paid API models, and ctrl+k to show/hide the chat UI. Experience the future of AI, and help build it too!
★ 424Pythonupdated 2025-04-16aichatgptclaudedeepseekhotkeys
A simple GUI to use Whisper.
★ 424Pythonupdated 2025-07-18gradioguihuggingfaceinterfacespeech-recognition
A fully local and private Speech-To-Text app with cross-platform support, speaker diarization, Audio Notebook mode, LM Studio integration, and both longform and live transcription.
★ 396TypeScriptupdated 2026-04-19diarizationdictationdockerfaster-whisperlinux
On-device AI for Android — LLM chat (GGUF/llama.cpp), vision models (VLM), image generation (Stable Diffusion), tool calling, AI personas, RAG knowledge packs, TTS/STT. Fully offline, zero subscriptions, open-source.
★ 379Kotlinupdated 2026-03-15ai-personasandroidgguf-modelsjetpack-composekotlin
Free on-device web app for audio transcribing and rendering subtitles
★ 363ReScriptupdated 2026-02-01airescriptsubtitleswebcodecswhisper
Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from huggingface.
★ 361Pythonupdated 2023-05-23asrjaxpytorchspeech-recognitiontransformers
Speech to Text, but with all the bells and whistles and, most importantly, AI! AI will clean up your filler words, then edit and refine what you said!
★ 332Pythonupdated 2025-02-09
Wyoming protocol server for faster whisper speech to text system
★ 319Pythonupdated 2026-03-25
A lightweight Python package for Automatic Speech Recognition using ONNX models
★ 305Pythonupdated 2026-04-19asrlightweightonnxparakeetpython
AudioBench: A Universal Benchmark for Audio Large Language Models
★ 303Pythonupdated 2025-06-17audio-scene-understandingspeechspeech-question-answeringspeech-recognition
Android Input Method Editor (IME) based on RTranslator's Whisper implementation
★ 297Javaupdated 2026-04-04
The Naomi Project is an open source, technology agnostic platform for developing always-on, voice-controlled applications!
★ 295Pythonupdated 2025-07-09assistanthacktoberfesthome-automationiotjarvis
Your faithful, impartial partner for honest audio evaluation: know yourself, know your rivals.
★ 290Pythonupdated 2026-04-08evaluationspeech-recognitionspeech-to-speechspeech-to-text
Private voice keyboard, AI chat, images, webcam, recordings, voice control with >= 4 GiB of VRAM.
★ 289Pythonupdated 2025-12-30aiassistant-chat-botsassistive-technologyclientclient-server
Free, open-source, 100% offline voice dictation for Linux. Speak and type anywhere via whisper.cpp, Whisper & VOSK engines, GPU-accelerated, works on X11 + Wayland!
★ 281Pythonupdated 2026-04-20accessibilitydictationgpu-accelerationlinuxoffline-first
See where Claude Code is burning tokens - turn raw JSONL transcripts into local cost analytics, hotspot views, and session-level usage insight.
★ 271Pythonupdated 2026-04-20
Get started using Deepgram's Live Transcription with this Next.js demo app
★ 270TypeScriptupdated 2026-04-10deepgramlivereal-timespeech-to-textstt
VOXD is speech-to-text, voice-typing, and dictation software for Linux distributions. It is open-source, free-of-charge, USER-FRIENDLY software for as many Linux distros as possible.
★ 243Pythonupdated 2025-10-22ai-transcriptiondictationlinuxopen-sourceuser-friendly
Transcribe audio and add subtitles to videos using Whisper in ComfyUI
★ 229Pythonupdated 2026-01-02comfyuistable-diffusionwhisper-ai
🎬 Auto-subtitle videos with AI transcription, translation, voice cloning, professional rendering, background image and music generator
★ 215JavaScriptupdated 2026-03-13
On-device Speech Recognition for Android
★ 208Kotlinupdated 2026-01-24
A modern, real-time speech recognition application built with OpenAI's Whisper and PySide6. This application provides a beautiful, native-looking interface for transcribing audio in real-time with support for multiple languages.
★ 200Pythonupdated 2025-09-13
A powerful Whisper AI keyboard for reliable speech transcription
★ 199Javaupdated 2026-01-06aiandroidgptkeyboardopenai
Workbench for learning and practicing on-device AI technology in a real scenario with online TV on an Android phone, powered by ggml (llama.cpp, whisper.cpp, ...), FFmpeg, and opencv-mobile.
★ 192C++updated 2025-06-12ggml-hexagonllamacpp-android-portonline-tvqualcomm-npustablediffusioncpp-android-port
PySimpleGUI based DESKTOP APP to AUTO GENERATE SUBTITLE FILE (using free Google Speech Recognition API) and TRANSLATED SUBTITLE FILE (using unofficial online Google Translate API) for any video or audio file
★ 189Pythonupdated 2024-05-05auto-captionauto-subtitlecaptionsffmpeggoogle-translate
★ 186Jupyter Notebookupdated 2024-06-26
Packages whisper.cpp into pre-built, pip-installable wheels, for macOS and Linux.
★ 178Pythonupdated 2024-06-10
Input text from speech in any Linux window, the lean, fast and accurate way, using whisper.cpp OFFLINE. Speak with local LLMs via llama.cpp.
★ 172Shellupdated 2025-07-25accessibilityaibloat-freechatbotcli
Modern GUI application that transcribes and translates audio files using OpenAI Whisper.
★ 168Pythonupdated 2024-08-12
Like ChatGPT's voice conversations with an AI, but entirely offline/private/trade-secret-friendly, using local AI models such as Llama 2 and Whisper
★ 161Pythonupdated 2024-08-20ai-assistantai-assistantsandroiddesktopkoboldai
Open models for Coqui STT
★ 155updated 2023-05-09deep-learningmodelsspeech-to-text
MeetEval - A meeting transcription evaluation toolkit
★ 154Pythonupdated 2026-01-27asrderwer
State-of-the-art offline (or networked) voice typing everywhere + text terminals (Linux or WFL session on Windows.) with a simple bash script. Usable with X. Does not require X.
★ 154Shellupdated 2026-03-03bashcommand-linefreelightweightopen-source
An AI prompt project that uses AI to extract wisdom from all sorts of text, from podcast transcripts, conversations, talks, lectures, papers, articles, blog posts, essays, presentations, or whatever you can get into text form.
★ 139updated 2023-10-26
Use Home Assistant Assist on the desktop. Compatible with Windows, MacOS, and Linux
★ 132Svelteupdated 2026-01-15assistcross-platformdesktophome-assistanthome-assistant-assist
Generate karaoke videos, by downloading audio and lyrics, separating instrumentals, synchronising lyrics using transcription models, rendering CDG and uploading videos to YouTube / Dropbox / Google Drive
★ 128HTMLupdated 2026-04-27karaokekaraoke-makerlyricsmusicvideo
💬 Fast, cross-platform CLI and GUI for batch transcription, translation, speaker annotation and subtitle generation using OpenAI’s Whisper on CPU, Nvidia GPU and Apple MLX.
★ 121Pythonupdated 2026-04-06asrautomatic-speech-recognitionmlxmlx-audiospeech-recognition
Wayland Speech-to-Text Tool - A minimal signal-driven speech-to-text tool for Wayland environments with PipeWire audio
★ 120Rustupdated 2026-03-21
An Android keyboard that performs speech-to-text (STT/ASR) with OpenAI Whisper and inputs the recognized text. Supports English, Chinese, Japanese, etc., and even mixed languages.
★ 119Kotlinupdated 2025-12-21androidandroid-imeautomatic-speech-recognitionchinese-speech-recognitionime
Offline voice input panel & keyboard with punctuation for Android.
★ 112Javaupdated 2024-06-01android-appofflinespeech-recognitionspeech-to-textwhisper-ai
Gnome shell extension for accurate OFFLINE speech to text input in Linux using whisper.cpp. Input text from speech anywhere.
★ 105JavaScriptupdated 2025-04-09aiasrbloat-freedictatedictation
Curated list of open-source speech-to-text and voice typing tools for Linux, macOS, Windows, Android, and iOS. Offline, local, and cloud.
★ 99updated 2026-04-12aiautomatic-speech-recognitionawesome-listdictationdictation-tool
WhisperSubs is an mpv Lua script to generate subtitles at runtime with whisper.cpp on Linux
★ 99Luaupdated 2025-02-09
Speak to AI • Native Linux Speech-to-Text (STT) • Offline, Privacy-Focused
★ 97Goupdated 2026-04-19appimagedictationgolinuxvoice-input
Create subtitles in various languages in mere minutes using Whisper and Qwen3-32b via Groq's lightning-fast inference.
★ 94Pythonupdated 2025-12-17managed-by-terraform
End-to-end workflow to automatically generate show notes from audio/video transcripts
★ 94TypeScriptupdated 2026-02-25assembly-aichatgptclaudedeepgramgemini
Voice typing for the Linux desktop.
★ 92Kotlinupdated 2026-04-19accessibilityasrdictationlinuxportals
A curated list of awesome disfluency detection publications along with the released code and bibliographical information
★ 83updated 2021-05-02awesome-listconversational-speech-recognitionconversational-speech-translationdeep-disfluency-codedeep-disfluency-detection
קול — Professional Transcription Studio. Hebrew-first, 4 engines, YouTube support, correction studio.
★ 81Pythonupdated 2026-04-12
Real-time speech recognition & AI-powered note-taking app for macOS with offline/online modes, multilingual transcription, and Japanese translation support.
★ 75TypeScriptupdated 2026-02-03macosmcpmcp-clientopenaiscreenshot
Powerful Hebrew Whisper transcription and translation tool
★ 74Pythonupdated 2024-05-15
A very simple Whisper Python FastAPI for OpenAI API, Android voice-typing (konele), Home Assistant (wyoming), and a voice-typing script on Linux and macOS!
★ 74Pythonupdated 2025-12-28fastapikoneleopenaiwhisper
Record, transcribe, and transform voice notes into structured insights. Leverage Whisper or AssemblyAI and ChatGPT to fill in gaps, generate summaries, and visualize ideas — all seamlessly integrated within Obsidian.
★ 72TypeScriptupdated 2026-02-24assemblyaiobsidianopenaivoice-assistant
This is a Python script using Whisper to type with your voice
★ 67Pythonupdated 2025-07-18
Simple GUI around whisper.cpp for voice-to-text on Linux
★ 67Pythonupdated 2026-04-08dictationlinuxopenaipromptvoice
Free in-browser audio & video censorship tool. AI-powered transcription with Whisper, 100% private client-side processing. Bleep profanity, custom words, or any phrase.
★ 63TypeScriptupdated 2026-04-16ffmpegpodcastprofanity-filterspeech-to-texttransformersjs
A ComfyUI custom node for audio subtitling based on WhisperX and translators
★ 62Pythonupdated 2025-04-01srt-subtitlessutitlestranslationwhisper
Transcribe audio and video files with speaker diarization and logically grouped timestamps using Gemini Flash
★ 61Svelteupdated 2026-04-07geminigemini-flashspeaker-diarizationspeech-to-textsveltekit
Wyoming protocol server for the Whisper API speech to text system
★ 60Pythonupdated 2026-04-17
Desktop application for Linux and Windows that utilizes distil-whisper models from HuggingFace, to enable real-time offline speech-to-text dictation.
★ 60Pythonupdated 2025-10-16desktop-appdictationdistil-whisperhuggingfaceoffline
Benchmarking STT service TTFB and semantic WER for real-time AI applications
★ 57Pythonupdated 2026-03-20
A lightweight transcript editor for editing and correcting STT generated timed transcripts
★ 57JavaScriptupdated 2026-01-03
Super STT enables effortless voice-to-text in any application, using the most advanced speech models.
★ 54Rustupdated 2026-04-14cosmiccosmic-desktoplinuxspeech-recognitionspeech-to-text
🎯 AI-powered voice assistant for TickTick, enabling natural language task management through speech. Built with OpenAI's speech recognition and TickTick's API integration, this assistant helps you manage your todos hands-free - create tasks, set reminders, and organize your schedule using just your voice.
★ 54Pythonupdated 2025-02-24
Automatically generate subtitles from an input audio or video file using OpenAI Whisper
★ 53TypeScriptupdated 2026-03-17ffmpegopenaiopenai-whispersubtitle-generatorsubtitles
A node module to generate subtitles by segmenting a list of time-coded text - BBC News Labs
★ 53JavaScriptupdated 2023-07-08captionsdigital-paper-editittjsonnews-labs
An MCP Server for audio transcription using OpenAI
★ 52Pythonupdated 2025-10-16
A Flutter library for offline speech-to-text conversion that uses the whisper.cpp model implementation for Android, iOS, and macOS.
★ 51C++updated 2026-04-15androidflutterioswhisperwhisper-cpp
This project is a video processing application that extracts audio from videos, performs automatic speech recognition (ASR), and generates subtitles. It allows users to enhance audio quality, correct transcription errors, and convert subtitles into various dialects, all through a user-friendly command-line and web interface.
★ 50Pythonupdated 2025-03-30
insanely-fast-whisper with support for AMD GPUs with ROCm 6.1 - 7.1
★ 48Pythonupdated 2026-04-18
VoiceTyper-Pro is an advanced speech-to-text dictation tool built with Python and powered by the Deepgram API. Alternative to Mac Whisper, Voice Access, and other voice typing tools.
★ 40Pythonupdated 2025-02-22deepgrammacvoice-recognitionwhisperwhisper-ai
🚀 Framework for seamless fine-tuning of Whisper model on a multi-lingual dataset and deployment to prod.
★ 38Pythonupdated 2025-02-27asrspeechspeech-recognitionspeech-to-textspeech-transcription
Speech-to-text typing for Linux/Wayland using Whisper.
★ 38Goupdated 2025-12-06
Linux Voice Assistant to Make Your Work Easier
★ 37Pythonupdated 2020-01-30assistantassistant-chat-botsgooglegoogle-assistantgoogle-assistant-apps
A real-time, offline voice assistant for Linux and Raspberry Pi. Uses local LLMs (via Ollama), speech-to-text (Vosk), and text-to-speech (Piper) for fast, wake-free voice interaction. No cloud. No APIs. Just Python, a mic, and your voice.
★ 37Pythonupdated 2026-04-20androidchatbotdeep-learningechoesp-idf
audiov is speech-to-text, voice-typing, and dictation software for Linux distributions.
★ 36Rustupdated 2026-03-14
Faster Whisper running on AMD GPUs with modified CTranslate2 libraries, served over the Wyoming protocol
★ 36Pythonupdated 2024-08-17
A fully local, open-source voice-to-text tool that acts as a system-wide AI dictation layer, converting speech into clean, formatted text.
★ 35Goupdated 2026-04-13clidarwingolanglinuxllm-tools
Speech-to-text, text-to-speech with ElevenLabs
★ 35Pythonupdated 2023-12-21elevenlabspyside6pytorchspeech-to-texttext-to-speeh
SuperWhisper-like voice dictation for Linux with waveform UI
★ 32Pythonupdated 2026-01-22
Claude Code Skills for podcast/video editing: transcription, content editing, rough/fine cut, final polish
★ 30HTMLupdated 2026-03-17
Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using Whisper.
★ 30Pythonupdated 2023-05-27
AgenticSeek is a fully local, voice-enabled AI assistant designed to autonomously browse the web, write code, and plan tasks while ensuring complete privacy by keeping all data on your device. Tailored for local reasoning models, it runs entirely on your hardware, eliminating any cloud dependency.
★ 30Pythonupdated 2025-08-27ai-agentsai-assistantautonomous-web-browsingchromedrivercoding-assistance
Dockerized Whisper C++ speech-to-text API for easy deployment and rapid integration. Offering the latest stable and nightly builds for efficient audio transcription.
★ 28C++updated 2026-02-28apiaudio-transcriptiondockermachine-learningspeech-to-text
Effortless Push-to-Talk Transcription, Anywhere.
★ 27Pythonupdated 2026-04-05canarycanary-1b-v2coherecohere-transcribe-03-2026faster-whisper
A cross-platform desktop application that records audio and transcribes it to text using OpenAI's Whisper API or compatible services. Perfect for dictation, note-taking, and accessibility.
★ 27C#updated 2026-03-21
sherpa-onnx Go package for speech recognition without network access, supporting Linux, macOS, Windows
★ 26Goupdated 2026-04-16
WhisperX-powered voice transcription tool that types text directly at your cursor position. Hold F9 to record, release to transcribe.
★ 26Pythonupdated 2026-03-04pytorchtranscriptionvoice-commandswhisperwhisperx
🐍📦 Ultra-fast Python package for calculating and analyzing the Word Error Rate (WER). Built for the scalable evaluation of speech and transcription accuracy.
★ 26Pythonupdated 2026-03-30asrasr-evaluationautomatic-speech-recognitionlevenshtein-distancemetrics
A curated list of voice AI agent frameworks, tools, resources, and best practices
★ 25updated 2026-04-06agentsrealtime-chatsttttsvad
A stand-alone application with GUI for OpenAI's Whisper
★ 24Pythonupdated 2024-04-04guihacktoberfestiwr-hacktoberfestopenaipyinstaller
Speech-to-text for Linux that just works
★ 23Rustupdated 2026-04-09dictationspeech-to-textwhisper-cpp
This repository contains code for fine-tuning the Whisper speech-to-text model.
★ 23Jupyter Notebookupdated 2026-04-16fine-tuningnlpspeech-to-textwhisper
Prompt Management System for Interaction with the ChatGPT API
★ 23JavaScriptupdated 2026-04-07aiaudio-transcribingimage-generationprompt-databaseprompts
Handy voice dictation using whisper.
★ 23Pythonupdated 2026-03-30
Privacy‑first, real‑time speech‑to‑text dictation. 100% local inference in Rust; hotkey to dictate anywhere (macOS, Linux, Windows).
★ 22Rustupdated 2025-12-15cpalcross-platformdesktop-appdictationrust
Automatically create subtitles for any video using the Google Cloud Speech-to-Text API.
★ 22Pythonupdated 2018-08-23
The only tool that replays Claude, Codex, Cursor, AND Gemini AI coding sessions in one unified UI. Vibe coding companion for reviewing, searching, and sharing your AI pair programming transcripts.
★ 21TypeScriptupdated 2026-02-03ai-pair-programmingai-transcriptclaudeclaude-codecodex
An interactive AI voice agent that can capture and transcribe speech in real-time, generate intelligent responses using the DeepSeek R1 (7B model) AI, and convert the responses back to natural speech for immediate playback. The agent maintains conversation context and supports cross-platform usage on macOS, Linux, and Windows.
★ 20Pythonupdated 2025-06-20assemblyaideepseekdeepseek-r1elevenlabsportaudio
The best Android keyboard for offline speech recognition, using OpenAI's whisper model through whisper.cpp for fast and accurate output.
★ 19Kotlinupdated 2025-02-22
Turn CAPSLOCK key into Dictation Key
★ 19Shellupdated 2025-06-09asrdictationspeech-recognitionspeech-to-textvoice
RunPod Serverless worker for WhisperX
★ 19Pythonupdated 2026-03-26
A Python tool that uses Google Gemini API to transcribe video or audio files into SRT subtitle files.
★ 19Pythonupdated 2026-01-02asrgeminigemini-apitranscribe
A Deepgram client for Dart and Flutter, supporting all Speech-to-Text and Text-to-Speech features on every platform.
★ 19Dartupdated 2025-09-12dartdeepgram-apiflutteropen-sourcesdk
A lightweight library for normalizing speech transcripts before computing WER
★ 18Pythonupdated 2026-04-17aiasrbenchmarknormalizationspeech-to-text
Sonori is a fully local STT app for Linux (Wayland).
★ 18Rustupdated 2026-03-14asrautomatic-speech-recognitionctranslate2linuxonnxruntime
Real-time voice input software using the Whisper model.
★ 17Pythonupdated 2025-07-19
A high-performance speech recognition MCP server based on Faster Whisper, providing efficient audio transcription capabilities.
★ 17Pythonupdated 2025-03-22
Transcribe audio/video to text, locally on macOS, Linux and Windows. A simple whisper.cpp wrapper/UI built with Go/Fyne.
★ 17Goupdated 2026-01-08ffmpegffmpeg-wrapperfyneguilocal
A small script that types what you say using Whisper while you hold a hotkey
★ 15Pythonupdated 2025-11-19openaipython3speach-recognitionspeach-to-texttyping-assistant
Expands the boundaries of speech recognition technology for documentation productivity on the Linux PC, with dictation and transcription capabilities as well as control over your system. Written in Python using Whisper.
★ 14Pythonupdated 2025-08-12
Linux-based voice-to-text tool using AI (Whisper/DeepGram) for real-time speech transcription. Command-line interface for easy recording, processing, and text output. Ideal for accessibility, dictation, and hands-free text input in Linux environments.
★ 14Pythonupdated 2025-04-20
A high-performance Model Context Protocol (MCP) server providing local speech-to-text transcription using whisper.cpp, optimized for Apple Silicon.
★ 13TypeScriptupdated 2025-06-03appleapple-siliconm1m2m3
Privacy-first meeting transcription and voice-to-text tool for Linux. 100% local AI processing with faster-whisper and Ollama.
★ 13Shellupdated 2026-04-19
Node.js app that transcribes WhatsApp voice notes to text using OpenAI's Whisper API. The text can also be translated to the user's preferred language and sent back to their WhatsApp account.
★ 13HTMLupdated 2026-04-17
Linux virtual keyboard driver which types what you say using Deepgram Flux STT API
★ 13Rustupdated 2026-02-03
Chrome extension that allows dictating anywhere using OpenAI Whisper
★ 13JavaScriptupdated 2023-09-29chrome-extensiondictationopenaiopenai-apitext-to-speech
Privacy-first voice dictation for Linux Wayland — press a key to talk, release to type. Powered by Whisper AI, 100% offline, no subscription required.
★ 11Pythonupdated 2026-04-03accessibilityappimagedictationgnomegtk
Cross-platform voice-to-text dictation for Linux and macOS. Local/private STT using Parakeet-TDT 1.1B with NVIDIA CUDA or Apple CoreML acceleration.
★ 11Makefileupdated 2026-03-24
Real-time AI dictation using faster-whisper—type anywhere with instant, accurate speech-to-text conversion.
★ 11Pythonupdated 2025-03-25
A voice recording and transcription tool for Hyprland, using Whisper for speech-to-text and copying results to clipboard. It's using Faster Whisper (optimized for CPU) and runs fully locally.
★ 11Pythonupdated 2026-01-03hyprlandhyprland-dotfileshyprland-ricewaylandwhisper
Voice to Text Online Notepad Professional, Accurate & Free Speech Recognition Text Editor Distraction-Free, Fast, Easy to Use Web App for Dictation & Typing
★ 11JavaScriptupdated 2026-02-14ant-designbootstraplocalstoragenpmpwa-app
These are my custom scripts for using Whisper / OpenAI via keyboard shortcuts and voice input.
★ 11Pythonupdated 2023-10-29
Whisper Flutter Example Speech To Text Offline Android Linux Without Api Key Without FFMPEG
★ 10C++updated 2025-08-02aiazkadevdartflutterggml
An audio/video transcriber with diarization and transcription editing.
★ 10JavaScriptupdated 2026-03-17
Fast, accurate voice typing for Linux — IBus input method engine with streaming STT, Whisper refinement, and CUDA acceleration
★ 10Pythonupdated 2026-02-10accessibilityfaster-whisperlinuxnixosspeech-to-text
A fully local, offline first speech-to-text application made for Linux!
★ 9Pythonupdated 2025-09-24
An MCP server that provides audio transcription capabilities using OpenAI's Whisper API
★ 9JavaScriptupdated 2025-03-25
A dictation application for Linux using OpenAI's Whisper. Currently only used on KDE Wayland.
★ 9Pythonupdated 2023-04-11
Transcribe Offline by openresearchtools.com is an open source desktop application that allows you to transcribe audio and video fully offline, with optional speaker diarisation and word-level alignment. It can also generate subtitles and integrate with local large language models (LLMs) for summarisation and editing.
★ 9Rustupdated 2026-03-21ailocalaimacosopen-sourcetranscribe
Convert audio files (flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, and webm) to SRT subtitles with OpenAI Whisper. Easy script for fast, accurate transcription.
★ 9Pythonupdated 2024-06-11
Transform your voice into text effortlessly with Whisper Notes
★ 9Rustupdated 2025-02-16
Kivywhisper is a cross platform Python GUI for OpenAI's Whisper.
★ 9Pythonupdated 2024-07-26
Wisper - Voice dictation app for Linux. Type directly at cursor with AI-powered transcription.
★ 8TypeScriptupdated 2026-02-14assistanthelpervoice-recognition
Add voice-to-text capabilities to Claude Code using OpenAI Whisper for speech recognition.
★ 8Pythonupdated 2025-08-09
One-key voice-to-transcription tool: record speech, transcribe locally with Whisper, then paste. Never lose your audio files anymore!
★ 8Pythonupdated 2026-03-23chatgptlinuxllmollamaopen-source
A Linux-first, system-wide dictation tool to transcribe Speech To Text (STT). Super accurate, fast, and free.
★ 7TypeScriptupdated 2026-04-15
Fine-tuned Whisper that transcribes Hebrew audio into IPA
★ 7Pythonupdated 2026-04-08g2phebrewipawhisper
Pure C wrapper library to use Whisper.cpp with Linux and Windows as simple as possible.
★ 7C++updated 2025-09-17speech-to-textsttwhisperwhisper-cpp
The ultimate PyQt6 application that integrates the power of OpenAI, Google Gemini, Claude, and other open-source AI models
★ 7Pythonupdated 2025-12-07agentchatgptclaudedalleevaluator-optimizer-workflow
A voice transcription tool using faster-whisper that records audio and converts speech to text on Linux systems.
★ 7Pythonupdated 2025-02-20
A powerful, real-time dictation system for Linux
★ 6Pythonupdated 2025-06-22dictationlinuxspeech-to-textwhisper
Browser-Based AI Assistant: Speech-to-Text with Whisper and Local AI Answers
★ 6HTMLupdated 2024-07-04
Local realtime transcription tool powered by Voxtral Mini
★ 5Swiftupdated 2026-02-23
TalkType is a cross-platform application built with Electron, supporting Windows, macOS, and Linux. By combining Automatic Speech Recognition (ASR) with Large Language Models (LLM), it goes beyond simple dictation to offer "Understanding", "Polishing", and "Q&A" capabilities — your all-in-one voice writing assistant.
★ 5JavaScriptupdated 2026-03-01
Rudimentary program for speech transcription, manipulation, and redaction.
★ 5Pythonupdated 2024-07-17audiocensorcensorshippydubredaction
AudioWrite: Effortless voice dictation powered by Google's Gemini API. Record, transcribe, and transform rambling audio into polished, multi-language notes. PWA ready.
★ 5TypeScriptupdated 2026-01-05aiaudio-recorderdictationfrontendgemini-api
A modern, lightweight note-taking app powered by Whisper
★ 5Rustupdated 2025-05-06aiwhisperwhisper-ai
🎙️ Lightning-fast voice dictation Desktop Web App powered by Groq's Whisper Turbo - Open-source, privacy-first, with real-time audio visualization and intuitive click controls
★ 4Rustupdated 2026-03-14desktop-appdesktop-web-based-appgroqlinuxreal-time
A Linux / GNOME dictation app which uses fast whisper to do speech to text.
★ 4Pythonupdated 2026-01-09
Real-Time Transcription System for Niri - MacOS-like dictation for Linux Wayland environments
★ 4JavaScriptupdated 2026-02-14
Twidi Speech To Text (openai, push to talk, linux, wayland, deepgram)
★ 4Pythonupdated 2026-02-22deepgramopenaiopenai-apipush-to-talkspeech-to-text
Voice dictation for Linux/Wayland (like wisprflow). 100% offline, GPU-accelerated, and actually works with Wayland compositors.
★ 4Pythonupdated 2025-10-09dictationlinuxopenaiwaylandwhisper
🗣️ Whispers Talk. Recall. Repeat. A blazing-fast voice journal that remembers everything you say — searchable with AI. ✨ What is Whispers? Whispers is a voice-first journaling app powered by: 🧠 <300ms Latency Streaming Transcription (AssemblyAI) 🔍 Algolia MCP for instant search of your thoughts
★ 4TypeScriptupdated 2025-07-28
An offline-first desktop app to automatically transcribe and edit video subtitles using OpenAI Whisper. Full control over text, timing, and advanced styling in a powerful, intuitive editor.
★ 4Pythonupdated 2025-08-16
WhisperVoice: Covert voice notes. Encrypts text and hides it via LLM-generated acrostic sentences. Murf.ai creates natural audio. Browser extension decrypts with passcode, revealing hidden message or playing decoy for unauthorized listeners. Uses LLM, Murf.ai, STT APIs
★ 4JavaScriptupdated 2025-06-29murf-aimurf-ai-hackathon
Press F9. Speak. Paste. A blazing-fast, offline voice transcription tool for Linux using Whisper.cpp, bound to a global hotkey.
★ 4Shellupdated 2025-06-24clipboardlinuxspeech-to-textubuntuvoice-to-text
Speech-to-Text/Code using a fast local LLM, for Linux, uses Whisper
★ 4Pythonupdated 2025-11-15linuxttswhisperwhisper-ai
Voice input tool for Ubuntu 25.04 with Wayland. Record speech with hotkey, transcribe via Nexara API, and copy to clipboard.
★ 3JavaScriptupdated 2026-02-02
A fully offline, high-performance, streaming speech-to-text tool for developers on Linux.
★ 3Pythonupdated 2025-10-24
GPU-accelerated speech-to-text service that types what you say, powered by OpenAI's Whisper AI
★ 3Pythonupdated 2025-10-09accessibilitycudadictationgpu-accelerationlinux
Type 10x faster with AI-assisted voice typing
★ 3Pythonupdated 2025-03-17
Your voice - VocalFlow dictation, harnessing Whisper and faster-whisper for real-time transcription, adaptive learning, and NLP. Built with Python, it spans Linux, Windows, and macOS, boosting productivity through voice-assisted workflows.
★ 3Pythonupdated 2025-09-08cross-platformdesktop-appdictationfaster-whisperlinux
A user-friendly voice dictation application for Linux that supports multiple languages.
★ 3Pythonupdated 2025-11-03accessibilitydictationlinuxmultilingualproductivity
Real-time desktop audio transcription using OpenAI Whisper for Arch Linux with CUDA acceleration
★ 3Pythonupdated 2025-08-05
A powerful audio transcription server that seamlessly transcribes meeting recordings, generates notes, and intelligently splits audio files for efficient management. Open-source and built with FastMCP and Groq/OpenAI Whisper
★ 3Pythonupdated 2025-06-13
MCP server for real-time audio transcription using OpenAI Whisper
★ 3TypeScriptupdated 2025-10-08
Simple Python Tkinter GUI app for Linux that uses OpenAI's Whisper for transcription.
★ 3Pythonupdated 2023-06-22
A local, real-time speech-to-text (STT) input tool for Linux, powered by RealtimeSTT and Faster-Whisper. Press a hotkey to dictate directly into any application.
★ 3Pythonupdated 2025-08-23
A 100% private AI voice transcription app that converts speech to text in 50+ languages. Built with Compose Multiplatform for Android using Whisper AI - no cloud uploads, all processing happens on-device for complete privacy.
★ 2C++updated 2025-08-05
Push-to-talk voice dictation for Linux. Record with PipeWire, transcribe locally via whisper.cpp, and type text into any app using ydotool. Fast, private, and works system-wide with a single hotkey.
★ 2Shellupdated 2025-10-20
Open-Source Speech-to-Text Evaluation Framework
★ 2Jupyter Notebookupdated 2025-02-12llm-evaluationllm-inferencespeech-to-text
A powerful MCP (Model Context Protocol) server that transcribes audio and video files into text using Groq's Whisper model.
★ 2Pythonupdated 2025-06-10
A push-to-talk wisper-flow service to support voice-based vibe-coding with Claude
★ 2Pythonupdated 2026-04-15
Speech-to-text for Linux using Whisper
★ 2Pythonupdated 2025-05-17
A fast, lightweight Linux tool that converts speech to text and types it into any window using OpenAI's Whisper API.
★ 1Rustupdated 2025-11-07
Voice-to-text input daemon for Linux using OpenAI Whisper
★ 1Pythonupdated 2025-10-05
Offline Voice Dictation & Text Enhancement: a lightweight, 100% local Linux tool for real-time voice‑to‑text transcription and LLM‑powered writing improvements.
★ 1Shellupdated 2025-05-15
from microphone directly to your app
★ 1Rustupdated 2026-04-15aidictationlinuxllmwhisper
macOS-style dictation for Ubuntu using Whisper. Press double-Ctrl, speak, and your words are transcribed to text locally with faster-whisper. Supports clipboard output, customizable hotkeys, and offline models for speed and privacy.
★ 1Pythonupdated 2025-09-24
Linux Live Dictation - Real-time speech-to-text with Whisper
★ 1Pythonupdated 2025-09-15
Linux voice transcription with hotkey using faster-whisper (local) with optional GPT-4o mini polishing
★ 1Pythonupdated 2026-04-16dictationlinuxparakeetpush-to-talkspeech-to-text
Linux log interpreter using AI
★ 1Pythonupdated 2025-05-10
Langflow-based LLM agent that keeps track of my personal projects. Based on integration with WhatsApp voice messages, Whisper, OpenAI/Mistral models and local MCP.
★ 1Pythonupdated 2025-06-22
A Model Context Protocol (MCP) server that provides ASR (Automatic Speech Recognition) capabilities using the whisper engine. This server exposes TTS functionality through MCP tools, making it easy to integrate speech synthesis into your applications.
★ 1Pythonupdated 2025-03-31
Whisper + TTS + As many MCP servers as I can stuff in
★ 1Pythonupdated 2025-05-01
Blazingly fast audio transcription MCP server using Whisper with Flash Attention 2
★ 1Pythonupdated 2025-12-04
MCP server for whisper-cli
★ 1Pythonupdated 2025-07-18
App for transcribing audio/video to editable SRT subtitles using Whisper. Supports mp3/mp4/wav inputs, audio extraction, and local download.
★ 1Pythonupdated 2025-05-26openai-apistreamlit
Automation of Whisper fine tuning using ClearML
★ 1Pythonupdated 2025-09-09clearmlclearml-servermlopsnlps3-storage
Desktop version of Whisper API called program, meant for quick, decent ASR for Linux and Windows.
★ 1Rustupdated 2025-08-26
Transcribe text using whisper.cpp on linux with a key combo & auto-type it
★ 1updated 2025-06-01
Sample implementation of whisperai for Linux with real-time transcription
★ 1Pythonupdated 2026-04-14
A Linux utility that provides system-wide speech-to-text functionality by connecting to a remote Whisper API server.
★ 1Shellupdated 2025-06-04