Purpose
This is the place where I store everything I know, learn, and explore about AI, ML, and DA, organized into consistent collections.
General Awesome Repositories
Repository
- awesome-local-ai : An awesome repository of local AI tools
- awesome-deep-learning : A curated list of awesome Deep Learning tutorials, projects and communities.
- awesome-datascience: An awesome Data Science repository to learn and apply for real world problems.
- applied-ml: Papers & tech blogs by companies sharing their work on data science & machine learning in production.
- awesome-production-machine-learning: A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning. Website
- awesome-mlops: A curated list of references for MLOps. Website
- awesome-bigdata: A curated list of awesome big data frameworks, resources and other awesomeness.
- data-engineering-roadmap: A comprehensive roadmap tailored for data engineering professionals at all levels
Page
- Hugging Face : The open-source AI community
- LF AI & DATA Projects: Linux Foundation project about Open Source Innovation in Artificial Intelligence and Data
- Made With ML: Learning how to responsibly deliver value with ML!
- TensorFlow Hub: A repository of trained machine learning models.
- Papers With Code: The latest in Machine Learning
- Kaggle: Your Machine Learning and Data Science Community
Topic
- Data Engineering: A collection of data engineering topics
- Machine learning: Closely related to artificial intelligence and computational statistics.
- Stream Processing: A collection of stream processing topics
Organization
- OpenAI: OpenAI GitHub community
- Triton Inference Server: Provides a cloud and edge inferencing solution optimized for both CPUs and GPUs.
Landscape
- LF AI & Data Foundation Interactive Landscape
- The 2024 MAD (Machine Learning, AI and Data) Landscape: Get it in PDF
Artificial intelligence & Machine Learning
Toolkits
- openvino: OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
- ray: An AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
- pachyderm: Data-Centric Pipelines and Data Versioning
VectorDB
- Milvus: A high-performance, highly scalable vector database that runs efficiently across a wide range of environments, from a laptop to large-scale distributed systems
- Chroma: The AI-native open-source vector database
LLM
- private-gpt: Interact with your documents using the power of GPT, 100% privately, no data leaks. Website
Computer Vision
- opencv: Open Source Computer Vision Library. Website Version
Text To Speech
- VALL-E-X: An open source implementation of Microsoft's VALL-E X zero-shot TTS model
Models
- ColossalAI: Making large AI models cheaper, faster and more accessible
- trl: Train transformer language models with reinforcement learning. Hugging Face
Blogs
- Medium - Marvelous MLOps
- DigitalOcean - AI/ML Topics: Articles and Community about AI/ML
- Neptune.ai - MLOps Learning Hub: Strategies, tools, practical insights, and example projects on MLOps
- Neptune.ai: Learn from AI/ML engineers, researchers, and folks building foundation models: best practices, tool reviews, and real-world examples.
- PyImageSearch: Computer Vision, Deep Learning, and OpenCV resources
- MarkTechPost: ML and AI Tech News
- Mì AI - Học AI theo cách mì ăn liền!: A Vietnamese-language community for learning about AI
- Machine Learning Mastery: The best resources for approaching ML
- Machine Learning cơ bản: A Vietnamese forum and community about ML
- Machine Learning Blog: ML@CMU | Carnegie Mellon University
Articles
- Nanonets - Tesseract OCR in Python with Pytesseract & OpenCV
- Neptune.ai - MLOps Landscape in 2024: Top Tools and Platforms
- Milvus - Deploy a Milvus Cluster on EKS
- Medium - Understanding Milvus: Key Concepts and Potential Applications
- CNCF - CNCF Cloud Native AI White Paper
- Medium - Why You Shouldn't Invest In Vector Databases?
- Datacamp - The Top 7 Vector Databases in 2025
Youtube Channel
- NeuralNine : Educational brand focusing on programming, machine learning and computer science
- sentdex : A funny guy who teaches you how to build cool stuff with Python, such as AI
- MLOps.community : The MLOps Community fills the swiftly growing need to share real-world Machine Learning Operations best practices from engineers in the field
Data Analysis
Awesome Repositories
Tools
- airbyte: The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
- airflow: A platform to programmatically author, schedule, and monitor workflows
- active_workflow: Polyglot workflows without leaving the comfort of your technology stack.
- datahub: The Metadata Platform for your Data Stack
Blogs
Articles
- LakeFS - The State of Data Engineering 2024
- Practical Data Engineering - Open Source Data Engineering Landscape 2024
- Medium - Data Pipeline Development with MinIO, Iceberg, Nessie, Polars, StarRocks, Mage, and Docker
- Medium - ETL and ELT
- Medium - ELT with Fabric, Azure and Databricks
- Medium - Apache Airflow Overview