Apache Iceberg

Master Apache Iceberg to manage large-scale data lakes with reliability, performance, and open table formats.
Course Duration: 10 Hours

Apache Iceberg is an open table format for huge analytic datasets, designed to bring reliability and performance to data lakes. Unlike traditional Hive tables, Iceberg supports in-place schema evolution, time travel, hidden partitioning, and ACID transactions, which has made it a common foundation for modern data lakehouse architectures.
 
This course introduces learners to Iceberg fundamentals, architecture, integrations with Spark, Flink, and Trino, and best practices for building scalable analytics platforms. By the end, you’ll be able to design, query, and manage Iceberg tables in production-grade data lakes.

What You Will Gain
  • Understand the core principles of Apache Iceberg.

  • Learn how Iceberg solves limitations of Hive and Parquet-only datasets.

  • Use schema evolution, partitioning, and time travel.

  • Run queries with Spark, Flink, Trino, and Presto.

  • Manage ACID transactions in a data lakehouse.

  • Integrate Iceberg with cloud storage (S3, GCS, ADLS).

  • Apply best practices for scaling and performance.


Who This Course Is For
  • Data engineers managing data lakes and lakehouses.

  • Analytics engineers working with Spark, Flink, and Trino.

  • Data scientists needing reliable and consistent data for ML.

  • DevOps engineers deploying scalable storage systems.

  • Students & professionals learning modern big data architectures.

  • Enterprises migrating from Hive tables to Iceberg.


How to Use This Course Effectively
 
  • Start with Iceberg basics – architecture and motivation.

  • Experiment with small datasets and Iceberg tables in Spark.

  • Progress to schema evolution, time travel, and partitioning.

  • Integrate with query engines like Flink and Trino.

  • Deploy Iceberg with cloud-native storage.

  • Revisit modules for optimization and best practices.

Course Objectives

By completing this course, learners will:

  • Deploy and configure Apache Iceberg.

  • Create and manage Iceberg tables.

  • Implement schema evolution and partitioning.

  • Use time travel and rollback features.

  • Integrate Iceberg with modern query engines.

  • Operate Iceberg at scale in a data lakehouse environment.

Course Syllabus

Module 1: Introduction to Apache Iceberg

  • What is Apache Iceberg?

  • Iceberg vs Hive/Delta/Parquet tables

  • Installing and setting up Iceberg

Module 2: Core Architecture

  • Table format design

  • Metadata layers (snapshots, manifests)

  • ACID transactions in Iceberg

  • Partitioning strategies

Module 3: Table Operations

  • Creating Iceberg tables

  • Inserting and updating data

  • Deletes and incremental updates

  • Time travel and rollback

Module 4: Schema Evolution

  • Adding, renaming, and deleting columns

  • Partition evolution

  • Managing table versions

  • Handling large-scale schema changes

Module 5: Query Engines Integration

  • Iceberg with Apache Spark

  • Iceberg with Apache Flink

  • Iceberg with Trino and Presto

  • SQL queries and analytics

Module 6: Cloud & Storage Integration

  • Iceberg with Amazon S3

  • Iceberg with Google Cloud Storage (GCS)

  • Iceberg with Azure Data Lake (ADLS)

  • On-premises Hadoop compatibility

Module 7: Deployment & Scaling

  • Deploying Iceberg in production

  • Optimizing query performance

  • Compaction and garbage collection

  • Monitoring and observability

Module 8: Real-World Projects

  • Building a data lakehouse with Iceberg and Spark

  • Streaming ingestion with Flink + Iceberg

  • Time travel analytics for business intelligence

  • Multi-engine queries with Trino + Iceberg

Module 9: Best Practices & Future Trends

  • Iceberg vs Delta Lake vs Apache Hudi

  • Cost optimization strategies

  • Data governance and compliance

  • The future of open table formats

Career & Jobs

Apache Iceberg skills prepare learners for roles such as:

  • Data Engineer (big data pipelines)

  • Analytics Engineer (BI + data lakes)

  • Cloud Data Engineer (AWS/GCP/Azure)

  • Big Data Architect (lakehouse systems)

  • Machine Learning Engineer (data preparation at scale)

Iceberg originated at Netflix and has been adopted by companies such as Apple and Adobe to power modern data platforms, making it a valuable and in-demand skill.

Certification

Learners will receive a Certificate of Completion from Uplatz on finishing the course in Apache Iceberg and modern data lakehouse technologies. The certificate demonstrates readiness for roles in data engineering, analytics, and big data platform development.

Interview Questions

1. What is Apache Iceberg?
An open table format for large-scale analytics datasets, enabling schema evolution, time travel, and ACID transactions in data lakes.

2. How does Iceberg differ from Hive tables?
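Iceberg tracks individual data files in table metadata instead of relying on directory listings, which allows safe schema evolution, hidden partitioning, and atomic commits. Traditional Hive tables tie partitions to the directory layout, make schema and partition changes risky, and offer limited transactional guarantees.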
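
One reason schema changes are safe in Iceberg is that columns are identified by stable field IDs rather than by name or position. The following is a toy model in plain Python, not the Iceberg API; `Schema` and `read_row` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Schema:
    """Toy schema: maps a stable field ID to the column's current name."""
    columns: dict = field(default_factory=dict)
    next_id: int = 1

    def add_column(self, name):
        # IDs are never reused, even after a column is dropped.
        field_id = self.next_id
        self.columns[field_id] = name
        self.next_id += 1
        return field_id

    def rename_column(self, old, new):
        for field_id, name in self.columns.items():
            if name == old:
                self.columns[field_id] = new  # metadata-only change
                return
        raise KeyError(old)

    def drop_column(self, name):
        for field_id, col in list(self.columns.items()):
            if col == name:
                del self.columns[field_id]
                return
        raise KeyError(name)


def read_row(stored, schema):
    """Project a stored row (keyed by field ID) through the current schema."""
    return {name: stored.get(fid) for fid, name in schema.columns.items()}


schema = Schema()
user_id = schema.add_column("user_id")
email = schema.add_column("email")
old_row = {user_id: 42, email: "a@example.com"}  # written before any change

schema.rename_column("email", "contact_email")
schema.add_column("country")
print(read_row(old_row, schema))
```

In this sketch the old row is never rewritten: the rename shows up under the new name, and the added column reads as `None` for data written before it existed.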

3. What are Iceberg’s key features?

  • Schema evolution

  • Time travel queries

  • Hidden partitioning

  • ACID transactions
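
Hidden partitioning means the partition value is derived from a column by a transform, so queries filter on the column itself and never name a partition column. A minimal sketch in plain Python of a day-style transform (days since 1970-01-01) with file pruning follows. The helpers `day_transform`, `write_rows`, and `plan_scan` are hypothetical and only model the idea; they are not Iceberg functions.

```python
from datetime import date, datetime, timezone

EPOCH = date(1970, 1, 1)


def day_transform(ts):
    """Days since 1970-01-01 in UTC; expects a timezone-aware datetime."""
    return (ts.astimezone(timezone.utc).date() - EPOCH).days


def write_rows(rows):
    """Group rows into one toy 'data file' per derived partition value."""
    files = {}
    for row in rows:
        day = day_transform(row["event_ts"])
        files.setdefault(day, {"partition_day": day, "rows": []})["rows"].append(row)
    return list(files.values())


def plan_scan(files, start, end):
    """Keep only files whose partition value can hold rows in [start, end]."""
    lo, hi = day_transform(start), day_transform(end)
    return [f for f in files if lo <= f["partition_day"] <= hi]


utc = timezone.utc
rows = [
    {"id": 1, "event_ts": datetime(2024, 1, 1, 10, 0, tzinfo=utc)},
    {"id": 2, "event_ts": datetime(2024, 1, 1, 23, 0, tzinfo=utc)},
    {"id": 3, "event_ts": datetime(2024, 1, 2, 1, 0, tzinfo=utc)},
    {"id": 4, "event_ts": datetime(2024, 1, 5, 9, 0, tzinfo=utc)},
]
files = write_rows(rows)

# The query filters on event_ts only; pruning uses the derived day value.
hits = plan_scan(files,
                 datetime(2024, 1, 2, 0, 0, tzinfo=utc),
                 datetime(2024, 1, 2, 23, 59, tzinfo=utc))
print(len(files), "files written,", len(hits), "file scanned")
```

The writer and the reader share one transform, so the user never has to maintain a separate date column or remember to filter on it.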

4. What query engines support Iceberg?
Spark, Flink, Trino, Presto, and Hive.

5. What is time travel in Iceberg?
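The ability to query a table as it existed at an earlier point, selected by snapshot ID or by timestamp. Every commit creates a new snapshot, and older snapshots stay readable until they are expired.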
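
The snapshot idea can be sketched with a toy table in plain Python. This is an illustration under simplifying assumptions, not Iceberg code: `ToyTable` and `Snapshot` are hypothetical, snapshot IDs are sequential here (real Iceberg snapshot IDs are not), and snapshots are assumed to be appended in timestamp order.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    timestamp_ms: int
    data_files: tuple  # full list of files visible in this snapshot


@dataclass
class ToyTable:
    snapshots: list = field(default_factory=list)

    def append(self, new_files, timestamp_ms):
        """Commit new files as a new snapshot; earlier snapshots are untouched."""
        current = self.snapshots[-1].data_files if self.snapshots else ()
        snap = Snapshot(len(self.snapshots) + 1, timestamp_ms,
                        current + tuple(new_files))
        self.snapshots.append(snap)
        return snap

    def scan(self, snapshot_id=None, as_of_ms=None):
        """Files visible at a snapshot ID, as of a timestamp, or right now."""
        if snapshot_id is not None:
            match = [s for s in self.snapshots if s.snapshot_id == snapshot_id]
        elif as_of_ms is not None:
            match = [s for s in self.snapshots if s.timestamp_ms <= as_of_ms]
        else:
            match = self.snapshots
        if not match:
            raise LookupError("no snapshot matches the request")
        return match[-1].data_files


table = ToyTable()
table.append(["a.parquet"], timestamp_ms=1000)
table.append(["b.parquet"], timestamp_ms=2000)

print(table.scan())                # current state
print(table.scan(snapshot_id=1))   # by snapshot ID
print(table.scan(as_of_ms=1500))   # by timestamp
```

Because each snapshot carries its own complete file list, reading an old version needs no undo log: the reader simply starts from a different snapshot.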

6. How does Iceberg achieve ACID transactions?
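Every write produces a new snapshot and a new table metadata file, and the commit atomically swaps the catalog's pointer from the old metadata to the new. Writers use optimistic concurrency: if another commit lands first, the writer retries against the latest table state. Readers always see a complete snapshot, never a partial write.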
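
The commit protocol can be illustrated with a compare-and-swap on a single pointer. The sketch below is a toy model in plain Python, assuming an in-memory catalog guarded by a lock. `ToyCatalog` and `commit_append` are hypothetical names, and a real catalog provides the atomic swap through its own backing store.

```python
import threading


class ToyCatalog:
    """Holds one pointer to the table's current metadata version."""

    def __init__(self):
        self._lock = threading.Lock()
        self.current = {"version": 0, "files": ()}

    def compare_and_swap(self, expected, new):
        """Replace the pointer only if nobody committed in between."""
        with self._lock:
            if self.current is not expected:
                return False
            self.current = new
            return True


def commit_append(catalog, new_files, max_retries=10):
    """Optimistic commit: build new metadata, swap, retry on conflict."""
    for _ in range(max_retries):
        base = catalog.current
        candidate = {"version": base["version"] + 1,
                     "files": base["files"] + tuple(new_files)}
        if catalog.compare_and_swap(base, candidate):
            return candidate
    raise RuntimeError("commit failed after retries")


catalog = ToyCatalog()
writers = [threading.Thread(target=commit_append,
                            args=(catalog, [f"file-{i}.parquet"]))
           for i in range(8)]
for w in writers:
    w.start()
for w in writers:
    w.join()

print(catalog.current["version"], len(catalog.current["files"]))
```

Each metadata dictionary is treated as immutable once built, so a reader holding an older pointer keeps seeing a consistent table while writers race to publish the next version.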

7. What storage systems work with Iceberg?
Amazon S3, Google Cloud Storage, Azure Data Lake, and HDFS.

8. What are the benefits of Iceberg?

  • Reliability at scale

  • Performance with big data

  • Compatibility with multiple engines

  • Open-source, vendor-neutral

9. What are challenges with Iceberg?

  • Complex setup compared to Hive

  • Relatively new ecosystem

  • Requires expertise in Spark/Flink integration

10. Where is Apache Iceberg being adopted?
By enterprises and tech leaders like Netflix, Apple, Adobe, and others modernizing their data lakehouse architectures.
