🏗️ Data Warehousing Concepts

    Understand data warehousing principles including OLAP, dimensional modeling, and modern cloud warehouse platforms.

    Level: Intermediate
    Tools: Snowflake, BigQuery, Redshift, PostgreSQL, dbt

    Skills You'll Learn:

    Data warehouse design
    OLTP vs OLAP
    Star and Snowflake schema
    ETL/ELT patterns
    Partitioning and clustering
    Slowly Changing Dimensions

    Step 1: OLTP vs OLAP

    1. Understand transactional (OLTP) database characteristics and use cases
    2. Learn analytical (OLAP) database characteristics and how they differ from OLTP
    3. Compare row-oriented vs columnar storage formats and their performance trade-offs
    4. Identify when to use OLTP vs OLAP systems in a data architecture
    5. Explore how modern cloud warehouses blend OLTP and OLAP capabilities
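
    The row-oriented vs columnar comparison in this step can be sketched with plain Python data structures. This is only an illustration of the two layouts, not how any real engine stores data; the sample orders and the function names are made up for the example.

```python
# Row-oriented layout: all fields of a record sit together.
# This suits OLTP work such as fetching or updating one order.
rows = [
    {"order_id": 1, "customer": "ana", "amount": 20.0},
    {"order_id": 2, "customer": "ben", "amount": 35.5},
    {"order_id": 3, "customer": "ana", "amount": 12.5},
]

# Column-oriented layout: all values of a field sit together.
# This suits OLAP work such as aggregating one column over many rows.
columns = {
    "order_id": [1, 2, 3],
    "customer": ["ana", "ben", "ana"],
    "amount": [20.0, 35.5, 12.5],
}


def fetch_order(order_id):
    """Point lookup: the row layout returns the whole record in one step."""
    for record in rows:
        if record["order_id"] == order_id:
            return record
    return None


def total_amount_row_layout():
    """Aggregate over rows: every full record is visited to read one field."""
    return sum(record["amount"] for record in rows)


def total_amount_column_layout():
    """Aggregate over columns: only the 'amount' values are read."""
    return sum(columns["amount"])


row_total = total_amount_row_layout()
column_total = total_amount_column_layout()
```

    Both functions return the same total; the difference is how much unrelated data each layout has to touch to get there, which is the trade-off behind choosing OLTP or OLAP storage.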

    Step 2: Data Warehouse Architecture

    1. Learn the core layers of a data warehouse: staging, integration, and presentation
    2. Understand the role of the ETL/ELT process in populating a warehouse
    3. Compare Kimball (bottom-up) vs Inmon (top-down) architecture approaches
    4. Explore the data lakehouse pattern and how it extends traditional warehousing
    5. Design a simple three-layer warehouse architecture for a sample business domain
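
    A minimal sketch of the three layers, using Python's built-in sqlite3 module as a stand-in for a warehouse. The table names (stg_orders, int_orders, rpt_daily_sales) and the sample rows are assumptions chosen for illustration; the transformations run inside the database after loading, in the ELT style.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Staging layer: raw data loaded as-is (everything text, duplicates included).
CREATE TABLE stg_orders (order_id TEXT, customer TEXT, amount TEXT, order_date TEXT);
INSERT INTO stg_orders VALUES
    ('1', ' Ana ', '20.00', '2024-01-05'),
    ('2', 'ben',   '35.50', '2024-01-05'),
    ('2', 'ben',   '35.50', '2024-01-05'),
    ('3', 'ANA',   '12.50', '2024-01-06');

-- Integration layer: typed, cleaned, and de-duplicated records.
CREATE TABLE int_orders AS
SELECT DISTINCT
    CAST(order_id AS INTEGER) AS order_id,
    LOWER(TRIM(customer))     AS customer,
    CAST(amount AS REAL)      AS amount,
    order_date
FROM stg_orders;

-- Presentation layer: a business-facing aggregate for reporting.
CREATE VIEW rpt_daily_sales AS
SELECT order_date, COUNT(*) AS orders, SUM(amount) AS revenue
FROM int_orders
GROUP BY order_date;
""")

daily_sales = conn.execute(
    "SELECT order_date, orders, revenue FROM rpt_daily_sales ORDER BY order_date"
).fetchall()
```

    Keeping the raw staging table untouched means the cleaning rules in the integration layer can be changed and re-run without reloading the source data.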

    Step 3: Star vs Snowflake Schema

    1. Define fact tables and understand the main fact table types (transaction, periodic snapshot, accumulating snapshot)
    2. Define dimension tables and learn about conformed dimensions
    3. Build a star schema with one fact table and multiple dimensions
    4. Convert a star schema to a snowflake schema by normalizing dimensions
    5. Evaluate the trade-offs between star and snowflake schemas for query performance and storage
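
    The star-to-snowflake conversion can be sketched with sqlite3. The schema below (a coffee-shop sales fact with date and product dimensions) is a hypothetical example; the "_sf" suffix only marks the normalized variant so both versions can live side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star schema: one fact table, denormalized dimensions.
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, full_date TEXT, month TEXT);
CREATE TABLE dim_product (
    product_key   INTEGER PRIMARY KEY,
    product_name  TEXT,
    category_name TEXT              -- category stored inline on every product
);
CREATE TABLE fact_sales (
    date_key    INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    quantity    INTEGER,
    amount      REAL
);
INSERT INTO dim_date VALUES (20240105, '2024-01-05', '2024-01'),
                            (20240206, '2024-02-06', '2024-02');
INSERT INTO dim_product VALUES (1, 'Espresso', 'Coffee'),
                               (2, 'Latte', 'Coffee'),
                               (3, 'Green Tea', 'Tea');
INSERT INTO fact_sales VALUES (20240105, 1, 2, 6.0),
                              (20240105, 3, 1, 2.5),
                              (20240206, 2, 3, 12.0);

-- Snowflake variant: the product dimension is normalized into two tables.
CREATE TABLE dim_category (category_key INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE dim_product_sf (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category_key INTEGER REFERENCES dim_category(category_key)
);
INSERT INTO dim_category VALUES (1, 'Coffee'), (2, 'Tea');
INSERT INTO dim_product_sf VALUES (1, 'Espresso', 1), (2, 'Latte', 1), (3, 'Green Tea', 2);
""")

# Star: one join from fact to dimension.
star_result = conn.execute("""
    SELECT p.category_name, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    GROUP BY p.category_name
    ORDER BY p.category_name
""").fetchall()

# Snowflake: the same question needs an extra join through dim_category.
snowflake_result = conn.execute("""
    SELECT c.category_name, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product_sf p ON p.product_key = f.product_key
    JOIN dim_category c   ON c.category_key = p.category_key
    GROUP BY c.category_name
    ORDER BY c.category_name
""").fetchall()
```

    Both queries give the same answer. The star version repeats the category name on every product row but joins once; the snowflake version stores each category once at the cost of an additional join.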

    Step 4: Slowly Changing Dimensions (SCD)

    1. Understand why dimensions change over time and why tracking history matters
    2. Implement SCD Type 1 (overwrite) for dimensions where history is not needed
    3. Implement SCD Type 2 (add new row) with effective dates and current flags
    4. Learn SCD Type 3 (add new column) and when it is appropriate
    5. Practice implementing SCD Type 2 using dbt snapshots
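
    The Type 2 pattern (add a new row, with effective dates and a current flag) can be sketched by hand in sqlite3. This is not how dbt snapshots are implemented; it only illustrates the row-versioning idea. The table layout, the column names, and the apply_scd2 helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE dim_customer (
        customer_sk INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key, one per version
        customer_id INTEGER,                            -- natural key from the source system
        city        TEXT,
        valid_from  TEXT,
        valid_to    TEXT,                               -- NULL while the version is current
        is_current  INTEGER
    )
""")


def apply_scd2(customer_id, city, change_date):
    """Hypothetical helper: record a city value using SCD Type 2 semantics."""
    current = conn.execute(
        "SELECT customer_sk, city FROM dim_customer "
        "WHERE customer_id = ? AND is_current = 1",
        (customer_id,),
    ).fetchone()

    if current is not None and current[1] == city:
        return  # attribute unchanged, so no new version is needed

    if current is not None:
        # Close the old version instead of overwriting it (that would be Type 1).
        conn.execute(
            "UPDATE dim_customer SET valid_to = ?, is_current = 0 WHERE customer_sk = ?",
            (change_date, current[0]),
        )

    # Open a new current version.
    conn.execute(
        "INSERT INTO dim_customer (customer_id, city, valid_from, valid_to, is_current) "
        "VALUES (?, ?, ?, NULL, 1)",
        (customer_id, city, change_date),
    )


apply_scd2(42, "Lisbon", "2024-01-01")  # first version
apply_scd2(42, "Lisbon", "2024-02-01")  # no change, nothing is written
apply_scd2(42, "Porto", "2024-03-01")   # change: old row closed, new row added

history = conn.execute(
    "SELECT city, valid_from, valid_to, is_current FROM dim_customer "
    "WHERE customer_id = 42 ORDER BY customer_sk"
).fetchall()
```

    Facts loaded before the move can keep pointing at the Lisbon version through its surrogate key, which is what lets historical reports stay stable after the dimension changes.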

    Step 5: Partitioning and Clustering

    1. Understand table partitioning and how it reduces the amount of data a query scans
    2. Choose effective partition keys based on common query patterns
    3. Learn how clustering (called sort keys in Redshift) orders data within a table and complements partitioning
    4. Practice partitioning a large table in BigQuery, or defining a clustering key in Snowflake, which micro-partitions tables automatically
    5. Monitor and optimize partition pruning and clustering effectiveness
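
    Partition pruning can be illustrated without a warehouse at all. The sketch below groups a few made-up events by date and counts how many rows each approach has to read; real engines prune using stored partition metadata rather than a Python dictionary, so treat this as a model of the idea only.

```python
from collections import defaultdict

events = [
    {"event_date": "2024-01-01", "user_id": 1, "amount": 5.0},
    {"event_date": "2024-01-01", "user_id": 2, "amount": 7.0},
    {"event_date": "2024-01-02", "user_id": 1, "amount": 3.0},
    {"event_date": "2024-01-03", "user_id": 3, "amount": 9.0},
]

# Partition the table by event_date: each key holds only that day's rows.
partitions = defaultdict(list)
for event in events:
    partitions[event["event_date"]].append(event)


def revenue_full_scan(day):
    """Unpartitioned table: every row is read, then filtered."""
    rows_scanned = 0
    total = 0.0
    for event in events:
        rows_scanned += 1
        if event["event_date"] == day:
            total += event["amount"]
    return total, rows_scanned


def revenue_pruned(day):
    """Partitioned table: only the matching partition is read."""
    rows = partitions.get(day, [])
    return sum(event["amount"] for event in rows), len(rows)


full_total, full_scanned = revenue_full_scan("2024-01-01")
pruned_total, pruned_scanned = revenue_pruned("2024-01-01")
```

    The totals match, but the pruned query reads two rows instead of four. Pruning only helps when the query filters on the partition key, which is why the key should follow the most common filter in the workload.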

    Step 6: Modern Cloud Warehouses

    1. Explore Snowflake architecture: virtual warehouses, storage separation, and Time Travel
    2. Explore BigQuery architecture: serverless compute, slots, and nested/repeated fields
    3. Explore Redshift architecture: node types, distribution keys, and sort keys
    4. Compare pricing models across Snowflake, BigQuery, and Redshift
    5. Set up a free-tier or trial cloud warehouse account and run analytical queries on a sample dataset
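
    When comparing pricing models, it helps to separate the shape of each model from the actual rates. The sketch below models three common shapes: compute time billed in credits (Snowflake virtual warehouses), data scanned per query (BigQuery on-demand), and provisioned node-hours (Redshift provisioned clusters). Every number in it is a placeholder, not a real price; check each vendor's current pricing page before estimating anything.

```python
def compute_time_cost(hours_running, credits_per_hour, price_per_credit):
    """Cost driven by how long compute runs, billed in credits."""
    return hours_running * credits_per_hour * price_per_credit


def scan_based_cost(tib_scanned, price_per_tib):
    """Cost driven by how much data queries read."""
    return tib_scanned * price_per_tib


def node_hour_cost(nodes, hours, price_per_node_hour):
    """Cost driven by provisioned capacity, whether or not queries run."""
    return nodes * hours * price_per_node_hour


# Placeholder workload and rates, for illustration only.
credit_model = compute_time_cost(hours_running=10, credits_per_hour=2, price_per_credit=3.0)
scan_model = scan_based_cost(tib_scanned=8, price_per_tib=5.0)
node_model = node_hour_cost(nodes=2, hours=720, price_per_node_hour=0.25)
```

    The useful comparison is which input dominates for a given workload: hours of running compute, volume of data scanned, or capacity kept provisioned all month.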

    Recommended Resources

    Snowflake Documentation

    Google BigQuery Documentation

    Kimball Group Design Tips
