Machine Learning & Deep Learning
326 repos
An Open Source Machine Learning Framework for Everyone
f.k.a. Awesome ChatGPT Prompts. Share, discover, and collect prompts from the community. Free and open source — self-host for your organization with complete privacy.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
A high-throughput and memory-efficient inference and serving engine for LLMs
🐙 Guides, papers, lessons, notebooks and resources for prompt engineering, context engineering, RAG, and AI Agents.
Tesseract Open Source OCR Engine (main repository)
A curated list of awesome Machine Learning frameworks, libraries and software.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Financial data platform for analysts, quants and AI agents.
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Ultralytics YOLO 🚀
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Streamlit — A faster way to build and share data apps.
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Making large AI models cheaper, faster and more accessible
Qlib is an AI-oriented Quant investment platform that aims to use AI tech to empower Quant Research, from exploring ideas to implementing productions. Qlib supports diverse ML modeling paradigms, including supervised learning, market dynamics modeling, and RL, and is now equipped with https://github.com/microsoft/RD-Agent to automate R&D process.
AI-Powered Photos App for the Decentralized Web 🌈💎✨
Google Research
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
In-depth tutorials on LLMs, RAGs and real-world AI agent applications.
Vane is an AI-powered answering engine.
💫 Industrial-strength Natural Language Processing (NLP) in Python
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
Visualizer for neural network, deep learning and machine learning models
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
NVR with realtime local object detection for IP cameras
Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
:memo: An awesome Data Science repository to learn and apply for real world problems.
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
Label Studio is a multi-type data labeling and annotation tool with standardized output format
This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. Each technique has a detailed notebook tutorial.
Improve your resumes with Resume Matcher. Get insights, keyword suggestions and tune your resumes to job descriptions.
The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while controlling costs and managing access to models and data.
Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and agent workflows with explicit control over retrieval, routing, memory, and generation. Built for scalable agents, RAG, multimodal applications, semantic search, and conversational systems.
GUI for a Vocal Remover that uses Deep Neural Networks.
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
Learn OpenCV : C++ and Python Examples
A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python
Universal LLM Deployment Engine with ML Compilation
Faster Whisper transcription with CTranslate2
Best Practices on Recommendation Systems
🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools
FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.
A machine learning-based video super resolution and frame interpolation framework. Est. Hack the Valley II, 2018.
The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Run agents that work for you based on what you do. AI finally knows what you are doing
Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using the Llama model family and how to use the models on various provider services.
The free and privacy-friendly screen recorder with no limits 🎥
Private AI platform for agents, assistants and enterprise search. Built-in Agent Builder, Deep research, Document analysis, Multi-model support, and API connectivity for agents.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org
Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a cloud-native database.
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
🦉 Data Versioning and ML Experiments
FinceptTerminal is a modern finance application offering advanced market analytics, investment research, and economic data tools, designed for interactive exploration and data-driven decision-making in a user-friendly environment.
This repository contains the source code for the paper First Order Motion Model for Image Animation
FinRL®: Financial Reinforcement Learning. 🔥
The open-source hub to build & deploy GPT/LLM Agents ⚡️
Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking and embedding.
Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
A curated list of references for MLOps
A curated list of Artificial Intelligence (AI) courses, books, video lectures and papers.
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
Node-based Visual Programming Toolbox
💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows
Low-code framework for building custom LLMs, neural networks, and other AI models
COLMAP - Structure-from-Motion and Multi-View Stereo
A PyTorch-based Speech Toolkit
TensorZero is an open-source LLMOps platform that unifies an LLM gateway, observability, evaluation, optimization, and experimentation.
The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.
A fast, local neural text to speech system
Large Language Model Text Generation Inference
Open source annotation tool for machine learning practitioners.
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
OpenVINO™ is an open source toolkit for optimizing and deploying AI inference
Techniques for deep learning with satellite & aerial imagery
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
Build, Manage and Deploy AI/ML Systems
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
AI-powered open source recommender system engine that supports classical/LLM rankers and multimodal content via embeddings
A collection of pre-trained, state-of-the-art models in the ONNX format
Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
Containers for machine learning
Anomaly detection related books, papers, videos, and toolboxes. Last update late 2025 for LLM and VLM works!
Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source, speech, and multimodal models on cloud, on-prem, or your laptop — all through one unified, production-ready inference API.
Deeplake is AI Data Runtime for Agents. It provides serverless postgres with a multimodal datalake, enabling scalable retrieval and training.
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
ModelScope: bring the notion of Model-as-a-Service to life.
🧙 Build, run, and manage data pipelines for integrating and transforming data.
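A GUI tool for extracting hard-coded subtitles (hardsub) from videos and generating srt files. No third-party API is required; text recognition runs locally. A deep-learning-based video subtitle extraction framework that includes subtitle region detection and subtitle content extraction.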
A curated list of practical financial machine learning tools and applications.
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
A curated list of awesome embedded programming.
This repo is meant to serve as a guide for Machine Learning/AI technical interviews.
Multilingual Voice Understanding Model
A self-hosted open source photo management service.
Background Remover lets you Remove Background from images and video using AI with a simple command line interface that is free and open source.
An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.
Flexible and powerful framework for managing multiple AI agents and handling complex conversations
Awesome Object Detection based on handong1587 github: https://handong1587.github.io/deep_learning/2015/10/09/object-detection.html
A customisable 3D platform for agent-based AI research
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
SD.Next: All-in-one WebUI for AI generative image and video creation, captioning and processing
An open source library for deep learning end-to-end dialog systems and chatbots.
AI + Data, online. https://vespa.ai
Flower: A Friendly Federated AI Framework
FinRobot: An Open-Source AI Agent Platform for Financial Analysis using LLMs 🚀 🚀 🚀
Mycroft Core, the Mycroft Artificial Intelligence platform.
The Enterprise-Grade Production-Ready Multi-Agent Orchestration Framework. Website: https://swarms.ai
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open
Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.
Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
Silero Models: pre-trained text-to-speech models made embarrassingly simple
This repository contains hand-curated resources for Prompt Engineering with a focus on Generative Pre-trained Transformer (GPT), ChatGPT, PaLM, etc.
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
🔥 Awesome list of resources on Web Development.
🍊 :bar_chart: :bulb: Orange: Interactive data analysis
High-performance data engine for AI and multimodal workloads. Process images, audio, video, and structured data at any scale
ZenML 🙏: One AI Platform from Pipelines to Agents. https://zenml.io.
Superduper: End-to-end framework for building custom AI applications and agents.
An AI-powered data science team of agents to help you perform common data science tasks 10X faster.
Pretrained PyTorch face detection (MTCNN) and facial recognition (InceptionResnet) models
Easily train or fine-tune SOTA computer vision models with one open source training library. The home of Yolo-NAS.
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
A curated list of Artificial Intelligence Top Tools
Long list of geospatial tools and resources
Kodezi Chronos is a debugging-first language model that achieves state-of-the-art results on SWE-bench Lite (80.33%) and 67% real-world fix accuracy, over six times better than GPT-4. Built with Adaptive Graph-Guided Retrieval and Persistent Debug Memory. Model available Q1 2026 via Kodezi OS.
Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
An Open-Source Framework for Prompt-Learning.
On-device wake word detection powered by deep learning
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model w/CPU ONNX and NVIDIA GPU PyTorch support, handling, and auto-stitching
Prompt Engineering, Generative AI, and LLM Guide by Learn Prompting | Join our discord for the largest Prompt Engineering learning community
Build local voice agents with open-source models
Next generation of automated data exploratory analysis and visualization platform.
ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to bring automation to complex causal inference problems. To date, the ALICE Python SDK (econml) implements orthogonal machine learning algorithms such as the double machine learning work of Chernozhukov et al. This toolkit is designed to measure the causal effect of some treatment variable(s) t on an outcome variable y, controlling for a set of features x.
An Open Source text-to-speech system built by inverting Whisper.
Prompt Engineering | Prompt Versioning | Use GPT or other prompt based models to get structured output. Join our discord for Prompt-Engineering, LLMs and other latest research
🤗 AutoTrain Advanced
Fast inference engine for Transformer models
Visual Object Tagging Tool: An electron app for building end to end Object Detection Models from Images and Videos.
🏝️ OASIS: Open Agent Social Interaction Simulations with One Million Agents.
RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry
A curated list of awesome data labeling tools
modular quant framework.
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
Noise suppression using deep filtering
Open-source evaluation toolkit of large multi-modality models (LMMs), support 220+ LMMs, 80+ benchmarks
SwarmUI (formerly StableSwarmUI), A Modular Stable Diffusion Web-User-Interface, with an emphasis on making powertools easily accessible, high performance, and extensibility.
TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data
A Python package for segmenting geospatial data with the Segment Anything Model (SAM)
🛰️ List of satellite image training datasets with annotations for computer vision and deep learning
ChatGPT Jailbreaks, GPT Assistants Prompt Leaks, GPTs Prompt Injection, LLM Prompt Security, Super Prompts, Prompt Hack, Prompt Security, Ai Prompt Engineering, Adversarial Machine Learning.
TimeGPT-1: production ready pre-trained Time Series Foundation Model for forecasting and anomaly detection. Generative pretrained transformer for time series trained on over 100B data points. It's capable of accurately predicting various domains such as retail, electricity, finance, and IoT with just a few lines of code 🚀.
Java dataframe and visualization library
An extensive node suite that enables ComfyUI to process 3D inputs (Mesh & UV Texture, etc) using cutting edge algorithms (3DGS, NeRF, etc.)
A deep learning library for video understanding research.
🚀 Curated collection of amazing Python scripts from basic to advanced, with automation task scripts.
[TMLR] A curated list of language modeling research for code (and other software engineering activities), plus related datasets.
A library for audio and music analysis, feature extraction.
Algorithmic Trading in Python with Machine Learning
Evaluation and Tracking for LLM Experiments and AI Agents
QualityScaler - image/video AI upscaler app
Open source, local, and self-hosted Amazon Echo/Google Home competitive Voice Assistant alternative
Self-hosted, local only NVR and AI Computer Vision software. With features such as object detection, motion detection, face recognition and more, it gives you the power to keep an eye on your home, office or any other place you want to monitor.
Toolkit for creating, sharing and using natural language prompts.
TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.
Must-read Papers on LLM Agents.
A modular high-level library to train embodied AI agents across a variety of tasks and environments.
GeoAI: Artificial Intelligence for Geospatial Data
Data manipulation and transformation for audio signal processing, powered by PyTorch
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
A Web UI for easy subtitle generation using the Whisper model.
Control Any Computer Using LLMs.
✨ Build a machine learning model from a prompt
Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Flax.
A toolkit to run Ray applications on Kubernetes
🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
news-please - an integrated web crawler and information extractor for news that just works
(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.
the terminal client for Ollama
UpTrain is an open-source unified platform to evaluate and improve Generative AI applications. We provide grades for 20+ preconfigured checks (covering language, code, embedding use-cases), perform root cause analysis on failure cases and give insights on how to resolve them.
Objectron is a dataset of short, object-centric video clips. In addition, the videos also contain AR session metadata including camera poses, sparse point-clouds and planes. In each video, the camera moves around and above the object and captures it from different views. Each object is annotated with a 3D bounding box. The 3D bounding box describes the object’s position, orientation, and dimensions. The dataset contains about 15K annotated video clips and 4M annotated images in the following categories: bikes, books, bottles, cameras, cereal boxes, chairs, cups, laptops, and shoes
A powerful tool that translates ComfyUI workflows into executable Python code.
An open source library and framework for deep learning on satellite and aerial imagery.
A comprehensive and up-to-date compilation of datasets, tools, methods, review papers, and competitions for remote sensing change detection.
🧠 Make your agents learn from experience. Now available as a hosted solution at kayba.ai
Open-source AI-driven quantitative trading platform for crypto, stocks, and forex with backtesting, live trading, market data, and multi-agent research.
Collaborate & label any type of data, images, text, or documents, in an easy web interface or desktop app.
:cloud: :rocket: :bar_chart: :chart_with_upwards_trend: Evaluating state of the art in AI
🤖 A Python library for learning and evaluating knowledge graph embeddings
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Ui.Vision Open-Source RPA Software with Computer Vision, OCR, Anthropic Computer Use/LLM. Selenium IDE import/export.
ContextGem: Effortless LLM extraction from documents
Cross-Platform, GPU Accelerated Whisper 🏎️
🚀🚀🚀 A collection of some awesome public YOLO object detection series projects and the related object detection datasets.
A community-driven collection of RAG (Retrieval-Augmented Generation) frameworks, projects, and resources. Contribute and explore the evolving RAG ecosystem.
Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration
AI for GNU Image Manipulation Program
Amica is an open source interface for interactive communication with 3D characters with voice synthesis and speech recognition.
Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
A Python/Pytorch app for easily synthesising human voices
This repository aims to map the ecosystem of artificial intelligence guidelines, principles, codes of ethics, standards, regulation and beyond.
Offline inference engine for art, real-time voice conversations, LLM powered chatbots and automated workflows
The python library for research and development in NLP, multimodal LLMs, Agents, ML, Knowledge Graphs, and more.
Parallax is a distributed model serving framework that lets you build your own AI cluster anywhere
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deployment, Windows desktop deployment, and Android deployment
Unofficial PyTorch implementation of Google AI's VoiceFilter system
Self-host the powerful Chatterbox TTS model. This server offers a user-friendly Web UI, flexible API endpoints (incl. OpenAI compatible), predefined voices, voice cloning, and large audiobook-scale text processing. Runs accelerated on NVIDIA (CUDA), AMD (ROCm), and CPU.
Xtreme1 is an all-in-one data labeling and annotation platform for multimodal data training and supports 3D LiDAR point cloud, image, and LLM.
Examples of programs built using Modal
😎 Awesome list of Retrieval-Augmented Generation (RAG) applications in Generative AI.
Open source audio annotation tool for humans
Datasets for deep learning with satellite & aerial imagery
A GUI tool for offline transcription of speech recordings, including speaker diarization, utilizing state-of-the-art machine learning models.
PyTorch implementation of Audio Flamingo: Series of Advanced Audio Understanding Language Models
GRASS - free and open-source geospatial processing engine
Sentiment Analysis, Text Classification, Text Augmentation, Text Adversarial defense, etc.;
Build agents which are controlled by LLMs
TorchBench is a collection of open source benchmarks used to evaluate PyTorch performance.
Web UI for AutoGen (A Framework for Multi-Agent LLM Applications)
🔍 LangKit: An open-source toolkit for monitoring Large Language Models (LLMs). 📚 Extracts signals from prompts & responses, ensuring safety & security. 🛡️ Features include text quality, relevance metrics, & sentiment analysis. 📊 A comprehensive tool for LLM observability. 👀
Rasa UI is a frontend for the Rasa Framework
CodeProject.AI Server is a self contained service that software developers can include in, and distribute with, their applications in order to augment their apps with the power of AI.
🔧 A curated list of awesome dataset tools
AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.
Enhance Images with Javascript and AI. Increase resolution, retouch, denoise, and more. Open Source, Browser & Node Compatible, MIT License.
Deep research agent to help you find the best GitHub repositories 🕵️!
Backend that powers the dataset viewer on Hugging Face dataset pages through a public API.
A TensorFlow based wake word detection training framework using synthetic sample generation suitable for certain microcontrollers.
Generate transcripts for audio and video content with a user-friendly UI, powered by OpenAI's Whisper, with automatic translations and automatic video downloads via yt-dlp integration
🍰 PromptLayer - Maintain a log of your prompts and OpenAI API requests. Track, debug, and replay old completions.
Visualise your Kedro data and machine-learning pipelines and track your experiments.
Orchestra is a human-in-the-loop AI system for orchestrating project teams of experts and machines.
VectorFlow is a high volume vector embedding pipeline that ingests raw data, transforms it into vectors and writes it to a vector DB of your choice.
Papers, code and datasets about deep learning for 3D Object Detection.
OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language.
A Python-based toolbox of various methods in decision making, uncertainty quantification and statistical emulation: multi-fidelity, experimental design, Bayesian optimisation, Bayesian quadrature, etc.
Offline Speech Recognition with OpenAI Whisper and TensorFlow Lite for Android
🐝 Multi-agent swarm coordination for OpenCode with learning capabilities, agent issue tracking, and management
The Hailo Model Zoo includes pre-trained models and a full building and evaluation environment
Well tested & Multi-language evaluation framework for text summarization.
Computer vision based ML training data generation tool :rocket:
Tock, the open source conversational AI toolkit.
ChatGPT PROMPTs Splitter. Tool for safely processing chunks of up to 15,000 characters per request
MAD: The first work to explore Multi-Agent Debate with Large Language Models :D
Extract hardcoded subtitles from videos using machine learning
🤖 AI browser extensions & userscripts to augment your web experience
🤖 Awesome list of AGI Agents. A curated collection of agent resources.
A library for easily merging multiple LLM experts and efficiently training the merged LLM.
Android Voice Activity Detection (VAD) library. Supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models.
Python scripts performing object detection using the YOLOv8 model in ONNX.
SDK libraries for Modal
No code AI agents
A Community Open-Source SaaS for Crafting/Building/Creating Chatbots with OpenAI's Assistant API that you can add to your website.
Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from Hugging Face.
Autocomplete your Obsidian notes with AI, including ChatGPT, through a copilot-like interface.
A curated list of awesome open source healthcare tools, algorithms, datasets and research papers.
Home of the AI workforce - Multi-agent system, AI agents & tools
Agentic AI platform that harnesses Visual LLM Chaining to build proactive digital assistants
Microsoft Finance Time Series Forecasting Framework (FinnTS) is a forecasting package that utilizes cutting-edge time series forecasting and parallelization on the cloud to produce accurate forecasts for financial data.
This is a list of awesome articles about object detection from video.
🐸 - A general purpose model trainer, as flexible as it gets
The AI Podcast Studio: generate podcasts scripts and their audio version with a team of AI workers in a Podcast Studio 🎙️📜
AlphaSuite is an open-source quantitative analysis platform that gives you the power to build, test, and deploy professional-grade trading strategies. It's designed for traders and analysts who want to move beyond simple backtests and develop a genuine, data-driven edge in the financial markets.
Verify the authenticity of handwritten signatures through digital image processing and neural networks. ✍️
A toolkit for building computer use AI agents
DeepSeek CLI, a command-line AI coding assistant that leverages the powerful DeepSeek Coder models
A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings.
A modular framework built to simplify Computer Vision inference workloads.
Deep neural network (DNN) for noise reduction, removal of background music, and speech separation
Open models for Coqui STT
The BEST music separation model with help of A.I. ... to my ears ! 👂👂
This repo aims to record resources on role-playing abilities in LLMs, including datasets, papers, applications, etc.
Simplifies the process of creating and managing LLM workflows.
A comprehensive evaluation framework for AI agents and LLM applications.
Gnome shell extension for accurate OFFLINE speech to text input in Linux using whisper.cpp. Input text from speech anywhere.
List of PostgreSQL® AI projects and resources
A wargaming platform compatible with reinforcement learning agents
☕ GPT-2 chatbot for daily conversation
The project uses a computer vision model to extract structured features from floor plan images for a fire risk assessment.
Multi-Agent Blog Generator based on Agno framework. Supports leading LLM providers like OpenAI, Gemini, Claude, and Grok.
🤖🔎 STREAM: Search with Top Result Extraction & Answer Model 🔤📊 SEEKTOPIC 🚜📜 Tractor the Text Extractor 📈📝 REASON Docs Writing Agent
A utility to inspect, validate, sign and verify machine learning model files.
A curated list of all things awesome about OpenAI
Comprehensive guide to AI applications in OSINT workflows and intelligence analysis
GitHub repository of the Introduction to Machine Learning course in the Hebrew University of Jerusalem. Includes code examples, labs, and exercise templates
Hebrew Diacritizer
Mission to create a Hebrew TTS model as powerful and user-friendly as WaveNet
A dynamic NewsAI dashboard that uses NLP to analyze news articles, visualize sentiment trends, and extract insights through interactive data visualizations.
A real-time, offline voice assistant for Linux and Raspberry Pi. Uses local LLMs (via Ollama), speech-to-text (Vosk), and text-to-speech (Piper) for fast, wake-free voice interaction. No cloud. No APIs. Just Python, a mic, and your voice.
Speech-to-text, text-to-speech with ElevenLabs
AgenticSeek is a fully local, voice-enabled AI assistant designed to autonomously browse the web, write code, and plan tasks while ensuring complete privacy by keeping all data on your device. Tailored for local reasoning models, it runs entirely on your hardware, eliminating any cloud dependency.
Dockerized Whisper C++ speech-to-text API for easy deployment and rapid integration. Offering the latest stable and nightly builds for efficient audio transcription.
A dataset of global salaries in AI/ML and Big Data.
WhisperX-powered voice transcription tool that types text directly at your cursor position. Hold F9 to record, release to transcribe.
🐍📦 Ultra-fast Python package for calculating and analyzing the Word Error Rate (WER). Built for the scalable evaluation of speech and transcription accuracy.
💹 StockSim: Multi-Agent LLM Financial Market Simulator — A realistic trading simulation platform for evaluating large language models in dynamic financial environments.
This repository contains code for fine-tuning the Whisper speech-to-text model.
(NVIDIA) FramePack is a next-frame (next-frame-section) prediction neural network structure that generates videos progressively.
[AAAI 2024] Official pytorch implementation of “Learning Real-World Image De-Weathering with Imperfect Supervision”
MultimodalHugs is an extension of Hugging Face that offers a generalized framework for training, evaluating, and using multimodal AI models with minimal code differences, ensuring seamless compatibility with Hugging Face pipelines.
🚀 Unleash AMD GPU Performance: Fix PyTorch ROCm detection for 4x AI/ML speedup on RX 6000/7000 series for Pinokio and developers / custom setups
PyTorch docker images for use in GPU cloud and local environments. Includes AI-Dock base for authentication and improved user experience.
A modern FastAPI-based web app for real-time object detection using YOLO models, supporting image and video uploads, model selection, live streaming, and interactive UI.
The AI Assistant uses OpenAI's GPT models and Langchain for agent management and memory handling. With a Streamlit interface, it offers interactive responses and supports efficient document search with FAISS. Users can upload and search pdf, docx, and txt files, making it a versatile tool for answering questions and retrieving content.
This app allows users to take notes by recording and analyze the content using machine learning technology.
About AI-Powered Medical Assistant 🏥🤖 The AI-Powered Medical Assistant is an intelligent healthcare platform that utilizes AI to assist users in symptom analysis, treatment recommendations, medical research, and patient management. By integrating advanced AI models and multiple innovative features, this project enhances healthcare accessibility,
Modern NVR with object/motion/audio detection, push notifications, multi-location, and encrypted local and cloud-based storage support built in.
GPU-accelerated speech-to-text service that types what you say, powered by OpenAI's Whisper AI
Your voice - VocalFlow dictation, harnessing Whisper and faster-whisper for real-time transcription, adaptive learning, and NLP. Built with Python, it spans Linux, Windows, and macOS, boosting productivity through voice-assisted workflows.
A deep learning application that classifies the reason for a baby's cry (hunger, pain, etc.) from live or uploaded audio. Built with a TensorFlow/Keras CNN, Librosa for audio processing, and a responsive Flask web UI with real-time recording and visualization. Helps caregivers understand an infant's needs instantly.
Self-hosted, local only NVR and AI Computer Vision software. With features such as object detection, motion detection, face recognition and more, it gives you the power to keep an eye on your home,…
🏡 Transform real estate searches with natural language queries; find contextually relevant listings effortlessly using ML embeddings and vector search.
Product deduplication pipeline for Israeli price-comparison — Hebrew/English normalization, FAISS embeddings, LLM cluster refinement. Pair F1: 0.955
deep-learning-research-sub-agents for claude code
Automation of Whisper fine tuning using ClearML
A repository to support the Leeds Data Science presentation