Understanding the internal Apache Iceberg architecture is key to:
- Knowing what Iceberg OTF tables actually are,
- Accessing and using them,
- Troubleshooting workloads,
- Identifying errors in the data and maintaining its quality, and
- Integrating a Data Fabric based on Iceberg tables with the rest of the analytical ecosystem.
For all these reasons, this post explains the basics of how Apache Iceberg works. The references at the bottom, which I used to write this article, can help you dive deeper into other topics.
Apache Iceberg’s Origin
Netflix engineers Ryan Blue and Daniel Weeks created the Iceberg open table format in 2017. They designed Iceberg to work around Apache Hive's issues, such as performance and consistency problems.
As you probably remember, Apache Hive was developed to simplify coding workloads on Hadoop by writing SQL instead of MapReduce jobs directly. In other words, the Hive framework converts SQL statements into MapReduce jobs that Hadoop can execute. To that end, the Hive table format identifies which data in Hadoop storage makes up a given table, and the Hive Metastore tracks these tables.
So, the Hive table format defines a table as all files within a directory (or under a prefix, for object storage), and its partitions as the subdirectories, as sketched below.
Separately, the Hive Metastore tracks the directory paths that define the tables. Compute engines can then query the Metastore to learn where to find the data applicable to their query.
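For illustration, here is a minimal sketch of that layout for a hypothetical `sales` table partitioned by `country`; the Metastore stores the root path plus one entry per partition directory:

```
/warehouse/sales/           <- the table = all files under this directory
├── country=US/             <- a partition = a subdirectory
│   ├── 000000_0
│   └── 000001_0
└── country=DE/
    └── 000000_0
```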
Hive brought more efficient query patterns than full table scans when accessing files for analytics. Furthermore, it is format-agnostic, and it allows for all-or-nothing (atomic) changes to an individual partition in the table through the Metastore.
However, Hive's approach to accessing files is not without weaknesses, such as:
- File-level changes are inefficient.
- There is no mechanism for atomically updating multiple partitions in one transaction.
- It can't safely perform simultaneous, concurrent updates.
- Listing files and directories slows down the engines' performance.
- Partitioning helps only if the query filters by the partition column. If the query uses a derived value (such as the month of a timestamp), Hive performs a full table scan.
- Hive calculates statistics through asynchronous jobs, if it calculates them at all.
- Queries on tables with large numbers of files in a single partition (i.e., all the files under one prefix) have performance issues.
So, Iceberg fixed many of Hive's issues in Netflix's Data Fabric. Eventually, Netflix open-sourced the Iceberg project in 2018 and donated it to the Apache Software Foundation, where many other organisations got involved, contributed to it, and implemented it within their ecosystems.
Apache Iceberg Architecture: Tree of Metadata
Apache Iceberg aims for existing tools to embrace its open table format as a standard. With this intention, it was designed to leverage existing, popular storage solutions and compute engines. The secret sauce that achieves this goal is Iceberg's architecture, which relies on a tree of metadata: a catalogue pointer, metadata files, manifest lists, manifests, and finally the data files. The infographic below describes this architecture.

As a side note, the Iceberg catalogue maintains only part of the metadata needed to support Iceberg's capabilities. Organisations still need an enterprise-level data catalogue in their ecosystem to ingest metadata (business, technical, operational) from workloads running on OTFs in general, and on Iceberg in particular.
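To make the tree concrete, here is a minimal sketch of the storage layout behind a single Iceberg table (the bucket, paths, and file names are hypothetical; the catalogue stores only a pointer to the current metadata file):

```
s3://warehouse/db/events/
├── metadata/
│   ├── v1.metadata.json      <- table metadata: schema, partition spec, snapshot list
│   ├── v2.metadata.json      <- a new metadata file is written on every commit
│   ├── snap-….avro           <- manifest list: one per snapshot
│   └── …-m0.avro             <- manifest: a group of data files with their stats
└── data/
    └── event_day=2025-08-01/
        └── 00000-0-….parquet <- data files
```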
Apache Iceberg Main Features
As I have already said, the Apache Iceberg project is an open table format: a specification (or standard) for how the metadata that defines a data lakehouse table should be written across several files.
Many libraries support the adoption of this standard. They help tools work with the format and help compute engines implement support for it.
Note that Apache Iceberg also has implementations for open-source compute engines such as Apache Spark and Apache Flink, as well as the ones that different vendors provide, such as Teradata or Snowflake, among others.
Moreover, Iceberg works with many existing tools, and it is designed to use existing storage solutions, such as the object storage provided by the main Cloud Service Providers (AWS S3, Azure Blob Storage, Google Cloud Storage) or Dell ECS on-prem.
As we can deduce from the previous section, Iceberg defines a table as a canonical list of files, instead of tracking it as a list of directories and subdirectories.
Separately, Apache Iceberg uses optimistic concurrency control to enable ACID (Atomicity, Consistency, Isolation, and Durability) guarantees. The catalogue manages these concurrency guarantees.
Keep in mind that optimistic concurrency assumes transactions won't conflict and checks for conflicts only when necessary, at commit time. Consequently, transactions either commit or fail, but Iceberg doesn't lock the tables for read or write. The objective is to improve performance.
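As a minimal sketch of what this looks like for a writer, assuming PyIceberg, a configured catalogue named "default", and a hypothetical existing table db.clicks with a single long column id (note that recent PyIceberg versions also retry conflicting commits internally):

```python
import pyarrow as pa
from pyiceberg.catalog import load_catalog
from pyiceberg.exceptions import CommitFailedException

catalog = load_catalog("default")        # assumes a configured catalogue
table = catalog.load_table("db.clicks")  # hypothetical existing table

batch = pa.table({"id": pa.array([1, 2], pa.int64())})

# Optimistic concurrency: no table locks. If another writer commits first,
# our commit fails; we refresh to the latest snapshot and try again.
for _ in range(3):
    try:
        table.append(batch)  # writes data files, then attempts an atomic commit
        break
    except CommitFailedException:
        table.refresh()      # load the other writer's metadata, then retry
```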
On another front, in Iceberg, partitioning has two parts (see the sketch after this list):
- The column on which a user partitions the table, which drives the physical partitioning.
- An optional transform function applied to the partitioning column. These transform functions include bucket, truncate, year, month, day, and hour. The transform eliminates the need to create extra columns just for partitioning.
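Here is a sketch of this "hidden partitioning" with PyIceberg; the schema, field names, and catalogue are assumptions, not from the article. The table is partitioned by day(event_ts) without a dedicated date column:

```python
from pyiceberg.catalog import load_catalog
from pyiceberg.partitioning import PartitionField, PartitionSpec
from pyiceberg.schema import Schema
from pyiceberg.transforms import DayTransform
from pyiceberg.types import LongType, NestedField, StringType, TimestampType

# Hypothetical schema: every Iceberg field carries an explicit, stable ID.
schema = Schema(
    NestedField(field_id=1, name="event_id", field_type=LongType(), required=False),
    NestedField(field_id=2, name="payload", field_type=StringType(), required=False),
    NestedField(field_id=3, name="event_ts", field_type=TimestampType(), required=False),
)

# Partition by day(event_ts): the day value is derived by the transform,
# so no extra "event_date" column is needed in the data.
spec = PartitionSpec(
    PartitionField(source_id=3, field_id=1000, transform=DayTransform(), name="event_day")
)

catalog = load_catalog("default")  # assumes a configured catalogue
table = catalog.create_table("db.events", schema=schema, partition_spec=spec)
```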
As for performance, Iceberg allows optimising the table's row-level update (and delete) patterns to take one of two forms (a configuration sketch follows this list):
- Copy-On-Write (COW): When a user changes any row in a data file, the entire file is rewritten with the change applied to the new file, even if only a single record in it is updated.
- Merge-On-Read (MOR): Iceberg instead writes only a new file that records the changes to the affected rows, and readers reconcile those files with the original data at query time.
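These modes are regular table properties. A minimal sketch of switching the hypothetical db.events table above to merge-on-read with PyIceberg:

```python
# Switch row-level operations to merge-on-read (copy-on-write is the default).
with table.transaction() as tx:
    tx.set_properties({
        "write.update.mode": "merge-on-read",
        "write.delete.mode": "merge-on-read",
        "write.merge.mode": "merge-on-read",
    })
```

Engines such as Spark honour these properties when they update, delete, or merge rows in the table.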
Regarding resilience, each commit creates an isolated snapshot. The Time Travel feature lets you query the table as of any of those previous snapshots and revert the table's current state to one of them.
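A minimal time-travel sketch with PyIceberg, reusing the hypothetical table from the snippets above:

```python
# Each commit adds an entry to the table's snapshot log.
for entry in table.history():
    print(entry.snapshot_id, entry.timestamp_ms)

# Read the table as of an earlier snapshot (ID taken from the log above).
old_snapshot_id = table.history()[0].snapshot_id
df = table.scan(snapshot_id=old_snapshot_id).to_arrow()
```

Rolling the table back to a snapshot is typically done through an engine procedure, such as Spark's rollback_to_snapshot.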
Other important features are the so-called Evolution capabilities, which allow seamless changes to schemas (table definitions), partition specs, and the sort order of data files. This group of features improves performance and is of paramount importance for integrating OTF data within an analytical ecosystem.
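A sketch of schema evolution with PyIceberg on the same hypothetical table; the column names are illustrative:

```python
from pyiceberg.types import StringType

# Evolve the schema in place; existing data files are untouched because
# Iceberg tracks columns by field ID, not by name or position.
with table.update_schema() as update:
    update.add_column("country", StringType())
    update.rename_column("payload", "body")
```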
Finally, to connect the compute engines, Iceberg offers APIs in Java. It also provides some functionality in Python, Rust, and Go. There is also an unofficial C++ connector, whose development started in early 2025.
Iceberg Write Process
So far, I have analysed Iceberg's architecture, why it was created this way, and the main characteristics it provides to data and workloads. Now, I can bring all this together to explain how Iceberg writes: the engine writes the new data files, records them in manifests, rolls those manifests into a manifest list for a new snapshot, writes a new metadata file, and atomically updates the catalogue pointer. The infographic below explains this write process at a high level.
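As a rough code-level sketch of what a single write leaves behind, reusing the hypothetical db.events table from the previous snippets:

```python
import datetime as dt
import pyarrow as pa

# Batch matching the hypothetical schema (event_id, payload, event_ts).
batch = pa.table({
    "event_id": pa.array([1, 2], pa.int64()),
    "payload": pa.array(["a", "b"], pa.string()),
    "event_ts": pa.array([dt.datetime(2025, 8, 1, 12, 0)] * 2, pa.timestamp("us")),
})

table.append(batch)  # data files -> manifests -> manifest list -> atomic commit

snap = table.current_snapshot()
print(snap.snapshot_id)         # the new snapshot created by this commit
print(snap.manifest_list)       # manifest-list file written for the snapshot
print(table.metadata_location)  # new metadata file the catalogue now points to
```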

Iceberg Read Process
To complete the circle, the diagram below shows the Iceberg read process: the engine asks the catalogue for the current metadata file, selects a snapshot, prunes manifests and data files using partition values and column statistics, and reads only the matching files.
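And a matching read sketch with PyIceberg on the hypothetical db.events table: the scan plans which files it needs from manifest metadata before any data is fetched (the filter and columns are illustrative):

```python
# Plan the scan: partition values and column stats prune manifests and files.
scan = table.scan(
    row_filter="event_ts >= '2025-08-01T00:00:00'",
    selected_fields=("event_id", "event_ts"),
)

for task in scan.plan_files():  # only files that may match the filter
    print(task.file.file_path)

df = scan.to_arrow()            # fetch and decode just those files
```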

References
Apache Iceberg. (n.d.). Apache Iceberg documentation. Retrieved August 2025, from https://iceberg.apache.org/docs/nightly/
Shiran, T., Hughes, J., & Merced, A. (2024). Apache Iceberg: The Definitive Guide: Data Lakehouse Functionality, Performance, and Scalability on the Data Lake. O'Reilly Media, Inc.

