🚀 Startup Data Stack Roadmap

    Build a scalable, cost-effective data stack using modern open-source tools and serverless architecture.

    ✓ Expert-Designed Learning Path • Industry-Validated Curriculum • Real-World Application Focus

    Created by data engineering professionals, this roadmap contains 31 hands-on tasks covering production-ready skills used by companies like Netflix, Airbnb, and Spotify. Master DuckDB, Polars, Metabase, and 3 more technologies.

    Beginner to Intermediate
    8 sections • 31 tasks

    Skills You'll Learn

    • SQL
    • Data modeling
    • Python
    • ETL/ELT
    • Serverless
    • Cloud

    Tools You'll Use

    • DuckDB
    • Polars
    • Metabase
    • AWS Lambda/GCP Cloud Functions
    • GitHub Actions/AWS EventBridge
    • GitHub

    Projects to Build

    Step 0: Prerequisites and fundamentals

    -Learn the fundamentals of data engineering
    -Master Python basics and SQL
    -Understand cloud computing concepts

    Step 1: Local Development Environment

    -Set up Python virtual environment
    -Install Jupyter Notebooks
    -Configure DuckDB and Polars
    -Create your first data processing notebook
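
A first notebook cell can confirm that the environment from this step is wired up before any data work starts. The sketch below uses only the standard library. The package list is an assumption (`notebook` is the import name of the classic Jupyter Notebook package; adjust it if you install JupyterLab instead).

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def missing_packages(names: list[str]) -> list[str]:
    """Return the subset of `names` that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv())
    # Assumed package list for this roadmap step.
    print("missing packages:", missing_packages(["duckdb", "polars", "notebook"]))
```

An empty "missing packages" list means the notebook kernel can see everything installed in this step.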

    Step 2: Data Processing with Polars

    -Learn Polars DataFrame operations
    -Practice data transformations in notebooks
    -Implement data quality checks
    -Optimize performance with Polars

    Step 3: Analytics with DuckDB

    -Learn DuckDB SQL syntax
    -Query public datasets
    -Create analytical views
    -Optimize query performance

    Step 4: Version control and CI/CD

    -Learn Git basics
    -Create a GitHub repository for your project
    -Set up GitHub Actions for data pipeline orchestration
    -Implement CI/CD for data quality checks
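
One possible shape for a CI data quality check is a small script whose exit status reflects the result, since a GitHub Actions step fails when its command exits with a non-zero code. The schema and rules below are hypothetical. A real pipeline would read the file produced by an earlier step and pass the result to `sys.exit`.

```python
import csv
import io

REQUIRED_COLUMNS = ("order_id", "amount")  # hypothetical schema


def find_problems(csv_text: str) -> list[str]:
    """Return one human-readable message per rule violation."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]

    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            amount = float(row["amount"])
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: amount is not numeric")
            continue
        if amount < 0:
            problems.append(f"line {line_no}: negative amount")
    return problems


def exit_code(csv_text: str) -> int:
    """Map the check result to a process exit status (0 means success)."""
    return 1 if find_problems(csv_text) else 0


sample = "order_id,amount\n1,10.5\n2,-4\n"
print(find_problems(sample), exit_code(sample))
```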

    Step 5: Serverless data processing

    -Set up AWS Lambda or GCP Cloud Functions
    -Create serverless data processing functions
    -Implement error handling and retries
    -Set up monitoring and logging
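
A minimal sketch of what such a function might look like on AWS Lambda, whose Python runtime calls a handler with the signature `handler(event, context)`. The event shape (`records` with an `amount` field), the `TransientError` class, and the retry settings are assumptions made for illustration. Output from the standard `logging` module ends up in the function's logs.

```python
import json
import logging
import time

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def with_retries(func, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call `func`, retrying TransientError with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except TransientError as exc:
            logger.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


def process(records):
    """Placeholder transformation: total the 'amount' field."""
    return sum(float(record["amount"]) for record in records)


def handler(event, context):
    """Entry point with the (event, context) signature Lambda expects."""
    records = event.get("records", [])
    try:
        total = with_retries(lambda: process(records))
    except Exception:
        # logger.exception records the traceback for later debugging.
        logger.exception("processing failed")
        return {"statusCode": 500, "body": json.dumps({"error": "processing failed"})}
    logger.info("processed %d records", len(records))
    return {"statusCode": 200, "body": json.dumps({"total": total, "count": len(records)})}
```

Passing `sleep` as a parameter keeps the backoff testable locally without waiting.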

    Step 6: Data visualization with Metabase

    -Install and configure Metabase
    -Connect Metabase to DuckDB (via the community DuckDB driver)
    -Create dashboards and visualizations
    -Set up automated reporting

    Step 7: Production orchestration

    -Set up AWS EventBridge or GCP Cloud Scheduler
    -Create orchestration workflows
    -Implement monitoring and alerting
    -Set up data pipeline observability
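
Whichever scheduler triggers the pipeline, the entry point it invokes could look roughly like the sketch below. It runs steps in order, records status and duration for each, and reports the first failure through an alert hook. The `run_pipeline` helper and the step names are hypothetical; in practice the hook would post to whatever alerting channel you use.

```python
import json
import time


def run_pipeline(steps, alert=print, clock=time.perf_counter):
    """Run (name, callable) steps in order and stop at the first failure."""
    results = []
    for name, step in steps:
        started = clock()
        try:
            step()
            status = "ok"
        except Exception as exc:
            status = "failed"
            alert(f"step {name!r} failed: {exc}")
        results.append(
            {"step": name, "status": status, "seconds": round(clock() - started, 3)}
        )
        if status == "failed":
            break  # later steps depend on earlier ones, so do not continue
    return results


def extract():
    pass  # placeholder for real work


def transform():
    pass  # placeholder for real work


for record in run_pipeline([("extract", extract), ("transform", transform)]):
    # One JSON object per line is easy to search in most log tools.
    print(json.dumps(record))
```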
