ManningBooks

Architecting an Apache Iceberg Data Lakehouse (Manning)

The book focuses on designing a complete, modular lakehouse architecture using Apache Iceberg—leveraging open source tools instead of relying on closed, vendor-specific platforms.

Highlights:

  • End-to-end guidance on building an Iceberg-based lakehouse from storage to BI
  • Integrates tools like Spark, Flink, Dremio, and Polaris
  • Covers ingestion pipelines (batch & streaming), schema evolution, governance, and security
  • Hands-on examples using PostgreSQL, Apache Superset, and more
  • Focus on real-world tradeoffs and implementation decisions at scale

The “lakehouse” data architecture is a powerful way to combine the flexibility of data lakes with the management features of data warehouses. The open source Apache Iceberg framework delivers the scalability, reliability, and performance you want from a lakehouse without the expense and vendor lock-in of platforms like Snowflake, BigQuery, and Redshift.

In Architecting an Apache Iceberg Data Lakehouse, data guru Alex Merced shows you:

  • How to create a modular, scalable Iceberg lakehouse architecture
  • Where Spark, Flink, Dremio, and Polaris fit into your design
  • Reliable batch and streaming ingestion pipelines
  • Strategies for governance, security, and performance at scale

Apache Iceberg is an open source table format perfect for massive analytic datasets. Iceberg enables ACID transactions, schema evolution, and high-performance queries on data lakes using multiple compute engines like Spark, Trino, Flink, Presto, and Hive. An Iceberg data lakehouse enables fast, reliable analytics at scale while retaining the observability you need for compliance audits, governance, and provable data security.


If you’re exploring Iceberg as an alternative to platforms like Snowflake or BigQuery—or already using it and want to deepen your understanding—this could be a useful resource. The Early Access format also means readers can give feedback as the book evolves.

Full details: Architecting an Apache Iceberg Lakehouse - Alex Merced

Don’t forget you can get 45% off with your Devtalk discount! Just use the coupon code “devtalk.com” at checkout.
