AI Toolkit

This page lists projects that can significantly help when developing, deploying, and managing AI services.

Machine Learning

Framework
Ray	Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
ZenML	Develop ML pipelines locally that run on any MLOps stack.
Prefect	Modern workflow orchestration for data and ML engineers.
Platform
Kubeflow	Machine Learning Toolkit for Kubernetes.
Weights & Biases	Weights & Biases helps AI developers build better models faster. Quickly track experiments, version and iterate on datasets, evaluate model performance, reproduce models, and manage your ML workflows end-to-end.
MLflow	Open source platform for the machine learning lifecycle.
Library
scikit-learn	Machine learning in Python.
XGBoost	Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on a single machine, Hadoop, Spark, Dask, Flink and DataFlow.
Darts	A Python library for user-friendly forecasting and anomaly detection on time series.
OpenCV	Open Source Computer Vision Library.

Model

Format & Interface
ONNX	Open standard for machine learning interoperability.
Workflow
Airflow	A platform to programmatically author, schedule, and monitor workflows.
NiFi	NiFi automates cybersecurity, observability, event streams, and generative AI data pipelines and distribution for thousands of companies worldwide across every industry.

Deep Learning

Framework
TensorFlow	An open source machine learning framework for everyone.
PyTorch	Tensors and dynamic neural networks in Python with strong GPU acceleration.
Library
Keras	Deep Learning for humans.
PyTorch Lightning	Deep learning framework to train, deploy, and ship AI products Lightning fast.
RAPIDS	RAPIDS provides unmatched speed with familiar APIs that match the most popular PyData libraries. Built on state-of-the-art foundations like NVIDIA CUDA and Apache Arrow, it unlocks the speed of GPUs with code you already know.
OpenMMLab	Covers a wide range of research topics of computer vision, e.g., classification, detection, segmentation and super-resolution.

Programming

Language
Python	The Python programming language.
Library
Dask	Parallel computing with task scheduling.
NumPy	The fundamental package for scientific computing with Python.
Hydra	Hydra is a framework for elegantly configuring complex applications.
SciPy	SciPy library main repository.

Notebook Environment

Notebook Environment
Jupyter	Jupyter Interactive Notebook.
Colab	Python libraries for Google Colaboratory.

Distributed Computing

Computing & Management
Docker	A platform for building, shipping, and running applications in containers.
Podman	A tool for managing OCI containers and pods.
Kubernetes	An open-source system for automating deployment, scaling, and management of containerized applications.
Spark	A unified analytics engine for large-scale data processing.
Portainer	Portainer is the most versatile container management software that simplifies your secure adoption of containers with remarkable speed.
OpenShift	Unified platform to build, modernize, and deploy applications at scale. Work smarter and faster with a complete set of services for bringing apps to market on your choice of infrastructure.
ArgoCD	Argo CD is a declarative, GitOps continuous delivery tool for Kubernetes.

Data

Relational DB
MySQL	MySQL Server, the world's most popular open source database, and MySQL Cluster, a real-time, open source transactional database.
PostgreSQL	A powerful, open source object-relational database system.
Storage & Format
Delta Lake	An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive, and with APIs for several programming languages.
InfluxDB	Scalable datastore for metrics, events, and real-time analytics.
pandas	Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more.
Versioning
DVC	ML Experiments Management with Git.
Operations
Whylogs	An open-source data logging library for machine learning models and data pipelines. Provides visibility into data quality & model performance over time. Supports privacy-preserving data collection, ensuring safety & robustness.
AI system Logging & Monitor	AI system Logging & Monitor (RECICLAI)
Hive	Apache Hive is a distributed, fault-tolerant data warehouse system that enables analytics at a massive scale and facilitates reading, writing, and managing petabytes of data residing in distributed storage using SQL.
ETL
Airbyte	The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
Feature Engineering
tsfresh	tsfresh is a Python package that automatically calculates a large number of time series characteristics, the so-called features. The package also contains methods to evaluate the explanatory power and importance of such characteristics for regression or classification tasks.
Stream Processing
Kafka	Apache Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.
Flink	Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. Flink has been designed to run in all common cluster environments, perform computations at in-memory speed and at any scale.
Visualization
D3	Bring data to life with SVG, Canvas and HTML.
Plotly-Dash	Data Apps & Dashboards for Python. No JavaScript Required.
Grafana	The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many more.
Prometheus	The Prometheus monitoring system and time series database.
Streamlit	A faster way to build and share data apps.
Kibana	Run data analytics at speed and scale for observability, security, and search with Kibana. Powerful analysis on any data from any source, from threat intelligence to search analytics, logs to application monitoring, and much more.
Gradio	Gradio is the fastest way to demo your machine learning model with a friendly web interface so that anyone can use it, anywhere!
Pipeline Management
TPOT	TPOT is a Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
Labeling & Annotation
Label Studio	Label Studio is a multi-type data labeling and annotation tool with standardized output format.
CVAT	Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
Supervisely	Develop AI faster and better with on-premise, enterprise-grade end-to-end solution for every task: from labeling to building production models.

Validation

Validation
Evidently AI	Evidently helps analyze and monitor the quality of machine learning models in production. It generates detailed reports on data drift and model performance, using visualizations to identify significant changes in input data or model performance.
Whylogs	Whylogs is a lightweight and scalable library for logging and monitoring ML data in production. It provides statistical profiles of input and output data, facilitating the detection of data drift and anomalies in real-time or batch data.
Prometheus & Grafana	Although not specific to ML, they can be adapted to monitor ML model metrics, including production accuracy. By defining custom metrics that reflect model performance, they can be used to capture and visualize data drift or model drift, though this requires manual configuration and clear metric definitions.
Alibi Detect	Specialized in anomaly and data drift detection, Alibi Detect offers a series of techniques and algorithms designed specifically to identify changes in input data and model behavior, which may indicate the need for retraining.
MLPerf (and MLCommons)	MLPerf is a suite of benchmarks that evaluates the performance of hardware, software, and machine learning models. It provides standardized metrics that allow comparing different implementations and configurations of ML, helping to identify best practices and optimizations in the field of machine learning.
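
Several of the validation tools above detect data drift by comparing the distribution of a reference sample with a recent production sample. As a rough illustration of that idea only, and not the API or algorithm of any tool listed here, the sketch below computes a two-sample Kolmogorov-Smirnov statistic in plain Python. The names `ks_statistic`, `has_drifted`, and `DRIFT_THRESHOLD`, as well as the threshold value, are assumptions made for this example.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Return the two-sample Kolmogorov-Smirnov statistic.

    The statistic is the largest absolute gap between the empirical
    cumulative distribution functions (CDFs) of the two samples.
    """
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    ref = sorted(reference)
    cur = sorted(current)
    max_gap = 0.0
    # Both CDFs are step functions, so the gap can only peak at an observed value.
    for x in set(ref) | set(cur):
        ref_cdf = bisect_right(ref, x) / len(ref)
        cur_cdf = bisect_right(cur, x) / len(cur)
        max_gap = max(max_gap, abs(ref_cdf - cur_cdf))
    return max_gap


# Hypothetical cut-off chosen for illustration; a real check would
# typically derive a p-value or tune this per feature.
DRIFT_THRESHOLD = 0.5


def has_drifted(reference, current, threshold=DRIFT_THRESHOLD):
    """Flag drift when the KS statistic exceeds the (assumed) threshold."""
    return ks_statistic(reference, current) > threshold
```

A check like this would be run per feature, for example `has_drifted(training_values, last_week_values)`, and the result could then be exported as a custom metric to a monitoring stack. A fixed threshold is the simplest possible design; dedicated drift-detection libraries generally offer statistically grounded alternatives.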