<evan.rosa/>

The Inner Join
Build Scalable Data Pipelines.

Less NULLs. More value.

Apache Spark 4.0: New Features for Data Engineers

Apache Spark 4 introduces a modernized developer experience, SQL language upgrades, observability improvements, and Spark Connect, a lightweight client-server model that makes Spark more flexible than ever.

More Stories

Building a Real ETL Pipeline with Kafka, SQLMesh, BigQuery, and Superset — Step by Step

This detailed tutorial shows you how to build a full batch ETL pipeline using Kafka, SQLMesh, BigQuery, and Superset. Learn how to set it up, how the tools work, and why this stack gives you scalable, testable data workflows.

Getting Started with SQLMesh: Versioned, Incremental Data Transformations Made Simple

SQLMesh brings version control, incremental builds, and environment isolation to your SQL pipelines. Here's how it works—and how to get started.

Choosing the Right Tools for Your Data Stack: A Practical Guide for Modern Teams

Not all data tools are created equal. Here's how to confidently pick the right tools for ingesting, storing, orchestrating, transforming, and serving your data.

Deep Dive: Enterprise-Grade Architecture for Fresh, Fast, Scalable Data

Beyond Bronze/Silver/Gold—go under the hood on design patterns, infrastructure, and battle-tested architecture for enterprise data at scale.

Top-Tier Guide to Designing Enterprise-Grade Data Architecture with Bronze, Silver, and Gold Layers

Master enterprise data pipelines with the Bronze, Silver, and Gold data layer model. Learn how to build scalable, modular, and reliable data architecture for analytics success.

Why Streaming ETL Is the Future — and How to Get Started

Streaming ETL is no longer a niche; it's the foundation of real-time, event-driven systems. In this post, I break down when to use streaming pipelines, explain how Kafka and Flink fit together, and walk through a real-world example.

What is ETL in 2025? Moving Beyond Extract, Transform, Load

ETL has evolved—fast. Here's a clear, thoughtful guide on modern ETL vs. ELT, highlighting real-world use cases, tooling insights, and best practices for data engineers.

From Pipelines to Purpose: Why I’m Sharing My Journey in Data Engineering

A senior data engineer's story of building real-time and batch pipelines—and why I'm sharing my journey to land my dream role.