Analytics & Data Science
145 repos
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many more.
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
ClickHouse® is a real-time analytics database management system
The HTML5 Creation Engine: Create beautiful digital content with the fastest, most flexible 2D WebGL renderer.
AdminLTE - Free admin dashboard template based on Bootstrap 5
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Streamlit — A faster way to build and share data apps.
Apache Spark - A unified analytics engine for large-scale data processing
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
This is a repo with links to everything you'd ever want to learn about data engineering
AI Data Vault - A query engine for AI Agents to securely query data from any datasource
DuckDB is an analytical in-process SQL database management system
The flexible backend for all your projects 🐰 Turn your DB into a headless CMS, admin panels, or apps with a custom UI, instant APIs, auth & more.
💫 Industrial-strength Natural Language Processing (NLP) in Python
Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
📝 An awesome Data Science repository to learn from and apply to real-world problems.
🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
Marketing skills for Claude Code and AI agents. CRO, copywriting, SEO, analytics, and growth engineering.
Data Apps & Dashboards for Python. No JavaScript Required.
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
Best Practices on Recommendation Systems
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
🐯 visx | visualization components
Go language library for reading and writing Microsoft Excel™ (XLAM / XLSM / XLSX / XLTM / XLTX) spreadsheets
Turns Data and AI algorithms into production-ready web applications in no time.
Open-source JavaScript charting library behind Plotly and Dash
Python SDK for an agent AI observability, monitoring, and evaluation framework. Features include agent, LLM, and tool tracing, debugging of multi-agent systems, a self-hosted dashboard, and advanced analytics with timeline and execution graph views.
🦉 Data Versioning and ML Experiments
An orchestration platform for the development, production, and observation of data assets.
📊 Interactive JavaScript Charts built on SVG
FinceptTerminal is a modern finance application offering advanced market analytics, investment research, and economic data tools, designed for interactive exploration and data-driven decision-making in a user-friendly environment.
Open-source text-to-SQL and text-to-chart GenBI agent with a semantic layer. Ask your database questions in natural language — get accurate SQL, charts, and BI insights. Supports 12+ data sources (PostgreSQL, BigQuery, Snowflake, etc.) and any LLM (OpenAI, Claude, Gemini, Ollama).
Creative Coding: Generative Art, Data visualization, Interaction Design, Resources.
Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking and embedding.
A curated list of awesome big data frameworks, resources and other awesomeness.
WebGL2 powered visualization framework
Apache Druid: a high performance real-time analytics database.
A curated list of references for MLOps
Statistical data visualization in Python
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
♾ A Graph Visualization Framework in JavaScript.
A JavaScript library aimed at visualizing graphs of thousands of nodes and edges
Kepler.gl is a powerful open source geospatial analysis tool for large-scale data sets.
Low-code framework for building custom LLMs, neural networks, and other AI models
A collection of composable React components for building interactive data visualizations
The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.
A data visualization and analytics component, especially well-suited for large and/or streaming datasets.
📈 A small, fast chart for time series, lines, areas, ohlc & bars
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
📊 A D3-based reusable chart library
Data Visualization Components
🧙 Build, run, and manage data pipelines for integrating and transforming data.
A curated list of data engineering tools for software developers
React friendly API wrapper around MapboxGL JS
JavaScript diagramming library for interactive flowcharts, org charts, design tools, planning tools, visual languages.
📱📈 An elegant, interactive and flexible charting library for mobile.
Real-time Claude Code usage monitor with predictions and warnings
Python Data. Leaflet.js Maps.
Visually explore, understand, and present your data.
Portfolio analytics for quants, written in Python
Flower: A Friendly Federated AI Framework
Open source CSS framework for data visualization.
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
🔥 🔥 🔥 The Open Source Retool Alternative
Business intelligence as code: build fast, interactive data visualizations in SQL and markdown
Powerful data visualization library based on G2 and React.
The first "code-first" agent framework for seamlessly planning and executing data analytics tasks.
Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.
🍊 📊 💡 Orange: Interactive data analysis
High-performance data engine for AI and multimodal workloads. Process images, audio, video, and structured data at any scale
ZenML 🙏: One AI Platform from Pipelines to Agents. https://zenml.io.
data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
A concise API for exploratory data visualization implementing a layered grammar of graphics
An AI-powered data science team of agents to help you perform common data science tasks 10X faster.
Visualizer for pandas data structures
Automatically generates beautiful and easy-to-read ER diagrams from your database.
A lightweight next-gen data explorer - Postgres, MySQL, SQLite, MongoDB, Redis, MariaDB, Elasticsearch, and ClickHouse with a chat interface
Next generation of automated data exploratory analysis and visualization platform.
A curated list of awesome Jupyter projects, libraries and resources
📈 A curated list of awesome data visualization libraries and resources.
Create web apps from Python notebooks
Dashboards and notebooks in a single place. Create powerful and flexible dashboards using code, or build beautiful Notion-like notebooks and share them with your team.
Preswald is a WASM packager for Python-based interactive data apps: bundle full complex data workflows, particularly visualizations, into single files, runnable completely in-browser, using Pyodide, DuckDB, Pandas, Plotly, Matplotlib, etc. Build dashboards, reports, and notebooks that run offline, load fast, and share like a document.
DeepAnalyze is the first agentic LLM for autonomous data science. 🎈 Your AI data analyst: automatically analyzes large volumes of data and generates professional analysis reports in one click!
🌎 Large-scale WebGL-powered Geospatial Data Visualization analysis engine.
A Python package for interactive geospatial analysis and visualization with Google Earth Engine.
Java dataframe and visualization library
A charting and data visualization library and chart plugin for Unity.
Vizro is a low-code toolkit for building high-quality data visualization apps.
A curated list of awesome ETL frameworks, libraries, and software.
UI components and hooks for building video/audio players on the web. Robust, customizable, and accessible. Modern alternative to JW Player and Video.js.
A desktop application for viewing and analyzing tabular data
The exceptionally handsome dashboard framework in Ruby and CoffeeScript.
Algorithmic Trading in Python with Machine Learning
The platform for LLM evaluations and AI agent testing
An Awesome List of Open-Source Data Engineering Projects
Build apps that AI can generate, humans can review, and teams can maintain. Config that works between code and natural language.
GeoAI: Artificial Intelligence for Geospatial Data
Laminar - open-source observability platform purpose-built for AI agents. YC S24.
Learn to build your Second Brain AI assistant with LLMs, agents, RAG, fine-tuning, LLMOps and AI systems techniques.
All-in-one platform for search, recommendations, RAG, and analytics offered via API
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
A list of publicly available datasets with real-time data maintained by the team at bytewax.io
Modern Confluence alternative designed for internal & external docs, built with Go + EmberJS
An open source user-empowering data visualization Vue 3 components library for eloquent data storytelling
DATAGEN: AI-driven multi-agent research assistant automating hypothesis generation, data analysis, and report writing.
Chat with your data - AI data analysis and visualization on CSV, Postgres, MySQL, Snowflake, SQLite...
📊 Data visualization library for React. Maintained by @goodcodeus.
Claude Code dashboard with usage stats, error analysis, and sharable feature
GRASS - free and open-source geospatial processing engine
📊 llm.report is an open-source logging and analytics platform for OpenAI: Log your ChatGPT API requests, analyze costs, and improve your prompts.
A collection of the most important GitHub repos for ML, AI & Data Science practitioners
Interactive interface for browsing global, full-resolution satellite imagery
Visualise your Kedro data and machine-learning pipelines and track your experiments.
VectorFlow is a high volume vector embedding pipeline that ingests raw data, transforms it into vectors and writes it to a vector DB of your choice.
dbt + Metabase integration
Universal AI Agent Frontend. Build your backend; we handle the rest.
JobSync is a self-hosted, open-source job application tracker and AI-powered career assistant. Built with Next.js and Shadcn UI, it helps job seekers manage their search journey with AI resume review, job matching, task logging, and application analytics, all while keeping their data private.
ClickHouse database driver for the Metabase business intelligence front-end
Huge AI models catalog. A curated list of AI tools, platforms, and resources across various domains.
SDK libraries for Modal
A CLI tool for logging and analyzing Claude Code and Cursor AI-driven coding sessions.
A simple, fully responsive Dashboard to forward to the services of your choice!
See where Claude Code is burning tokens - turn raw JSONL transcripts into local cost analytics, hotspot views, and session-level usage insight.
Highly Flexible Lovelace Card - arbitrary contents/columns/rows, regex matched, perfect for showing AppDaemon-created content and anything that breaks out of the entity_id + attributes concept
🐸 - A general purpose model trainer, as flexible as it gets
Automated system for LLM evaluation via agents.
🌟 DataTonic: A Data-Capable AGI-style Agent Builder of Agents that creates swarms, runs commands, and securely processes and creates datasets, databases, visualisations, and analyses.
MCP OAuth Proxy including dynamic client registration (DCR), MCP prompt analytics, and an MCP firewall to build enterprise-grade MCP servers.
List of publicly available, free/open source and open access resources for learning and doing data journalism.
A dynamic NewsAI dashboard that uses NLP to analyze news articles, visualize sentiment trends, and extract insights through interactive data visualizations.
Awesome list for data pipelines
A dataset of global salaries in AI/ML and Big Data.
🐍📦 Ultra-fast Python package for calculating and analyzing the Word Error Rate (WER). Built for the scalable evaluation of speech and transcription accuracy.
Data-Verse is an end-to-end AI data analysis agent that automates data ingestion, cleaning, pattern extraction, and predictive modeling, culminating in interactive visualizations—providing a comprehensive alternative to traditional data analysts.
🐲 Ultra-fast Claude Code Max usage dashboard with real-time analytics, multi-language support, and dragon-inspired design. Built with Electron + React + TypeScript for cross-platform desktop experience.
The AI Assistant uses OpenAI's GPT models and LangChain for agent management and memory handling. With a Streamlit interface, it offers interactive responses and supports efficient document search with FAISS. Users can upload and search PDF, DOCX, and TXT files, making it a versatile tool for answering questions and retrieving content.
Open-source modular stack for AI-enhanced policy making. Includes reusable microservices for data ingestion, simulation, analytics, dashboards, strategy agents, and decision-support tools.
Quantitative geopolitical risk dashboard tracking Iran-Israel conflict escalation via market signals, GDELT news analytics, and probabilistic portfolio regime guidance.
This repository contains a multi-layer analysis of news articles, editorial opinions, and public comments about the ongoing Iran-Israel war. It synthesizes the dominant themes by perspective across global media channels and shows where editors' opinions and public reactions to news articles converge or diverge.
A repository to support the Leeds Data Science presentation