Apache Iceberg™ in Yandex Data Processing
Apache Iceberg™:
- Supports high-performance tables that you use the same way as regular SQL tables.
- Provides a schema evolution mechanism that eliminates side effects when you update schemas.
- Provides automatic hidden partitioning, which prevents errors caused by manual partitioning.
- Supports queries against historical data through the time travel mechanism. You can use it to run reproducible queries based on table snapshots or to compare changes.
  Note
  This mechanism requires Apache Spark™ 3.3.x or higher.
- Allows rolling tables back to previous versions (version rollback) so that you can respond to issues quickly.
- Provides advanced filtering that relies on column-level and partition-level statistics as well as table metadata. This speeds up query processing even for very large tables, because data files that are irrelevant to the query are not processed.
- Provides the serializable isolation level, the strictest level of transaction isolation. All table changes are atomic, and readers see only committed changes.
- Supports concurrent writes based on an optimistic strategy: a writer retries an operation if its changes conflict with those of another writer.
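The time travel and version rollback features listed above can be illustrated with a minimal sketch that only builds Spark SQL statements as strings. It assumes the `TIMESTAMP AS OF` and `VERSION AS OF` clauses and the `rollback_to_snapshot` procedure described in the Apache Iceberg™ documentation for Spark. The catalog name, table name, timestamp, and snapshot ID are hypothetical placeholders.

```python
# A sketch, not a definitive implementation: these helpers only format SQL text.
# The table, catalog, and snapshot values passed to them are hypothetical.

def time_travel_by_timestamp(table: str, timestamp: str) -> str:
    """Query the table as it was at the given point in time."""
    return f"SELECT * FROM {table} TIMESTAMP AS OF '{timestamp}'"


def time_travel_by_snapshot(table: str, snapshot_id: int) -> str:
    """Query the table as it was in the given snapshot."""
    return f"SELECT * FROM {table} VERSION AS OF {snapshot_id}"


def rollback_to_snapshot(catalog: str, table: str, snapshot_id: int) -> str:
    """Roll the table back to the given snapshot using an Iceberg procedure."""
    return f"CALL {catalog}.system.rollback_to_snapshot('{table}', {snapshot_id})"
```

In a Spark session where Apache Iceberg™ is already configured, each returned string could be passed to `spark.sql(...)`. The rollback call additionally assumes that the Iceberg SQL extensions are enabled.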
You can configure Apache Iceberg™ in a Yandex Data Processing cluster of version 2.0 or higher.
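As a rough illustration of what such a configuration involves on the Apache Spark™ side, the sketch below collects the Spark properties that the Apache Iceberg™ documentation uses to enable its SQL extensions and session catalog. The choice of a Hive-backed catalog and the helper `to_spark_submit_args` are assumptions made for this example. How the properties are passed to a Yandex Data Processing cluster is not covered here, and the Iceberg library JAR files must also be available to Spark.

```python
from typing import Dict, List

# Spark properties taken from the Apache Iceberg documentation for Spark.
# Using a Hive-backed session catalog is an assumption for this sketch.
ICEBERG_SPARK_PROPERTIES: Dict[str, str] = {
    "spark.sql.extensions":
        "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
    "spark.sql.catalog.spark_catalog":
        "org.apache.iceberg.spark.SparkSessionCatalog",
    "spark.sql.catalog.spark_catalog.type": "hive",
}


def to_spark_submit_args(properties: Dict[str, str]) -> List[str]:
    """Render properties as `--conf key=value` arguments for spark-submit."""
    args: List[str] = []
    for key, value in properties.items():
        args += ["--conf", f"{key}={value}"]
    return args
```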
Note
Apache Iceberg™ is not part of Yandex Data Processing. It is not covered by Yandex Cloud support, and its usage is not governed by the Yandex Data Processing Terms of Use.
For more information about Apache Iceberg™, see the official documentation.
Compatibility between Apache Iceberg™ versions and Yandex Data Processing images
Apache Iceberg™ versions and Yandex Data Processing images are only compatible if the Apache Iceberg™ version is compatible with the Apache Spark™ version used in the cluster. The table below lists compatible versions and links to the library files you will need to configure Apache Iceberg™ in your cluster.
Yandex Data Processing image | Apache Spark™ version | Apache Iceberg™ version | JAR files
2.0.x | 3.0.3 | |
2.1.x (2.1.0–2.1.3) | 3.2.1 | |
2.1.x (2.1.4 and higher) | 3.3.2 | |
2.2.x | 3.5.0 | |
Note
Access to image 2.2 is provided on request. Contact technical support.
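The image-to-Spark mapping in the table can also be expressed as a small lookup, for example to check in a script whether a cluster image meets the Apache Spark™ 3.3.x requirement for time travel. This is a minimal sketch: the function names are hypothetical, it expects a full three-part image version such as `2.1.4`, and it covers only the image versions listed in the table.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '2.1.4' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def spark_version_for_image(image_version: str) -> str:
    """Return the Spark version for an image, following the table above."""
    major, minor, patch = parse_version(image_version)
    if (major, minor) == (2, 0):
        return "3.0.3"
    if (major, minor) == (2, 1):
        # Images 2.1.0 to 2.1.3 use Spark 3.2.1; 2.1.4 and higher use 3.3.2.
        return "3.2.1" if patch <= 3 else "3.3.2"
    if (major, minor) == (2, 2):
        return "3.5.0"
    raise ValueError(f"Image version {image_version} is not in the table")


def supports_time_travel(image_version: str) -> bool:
    """Time travel requires Spark 3.3.x or higher."""
    spark = parse_version(spark_version_for_image(image_version))
    return spark[:2] >= (3, 3)
```

For instance, `supports_time_travel("2.1.3")` evaluates to `False` because that image uses Spark 3.2.1, while `supports_time_travel("2.1.4")` evaluates to `True`.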