Summary
Iceberg-format tables and datasets for which Dremio maintains Iceberg metadata keep their metadata directories in different locations. You can use the final physical plan of a SELECT query against one of these datasets to quickly locate the metadata for further inspection.
Reported Issue
Datasets in Dremio can be grouped into three categories based on how their metadata is collected and stored:
Iceberg format - metadata is stored in files alongside the table's data files, in a metadata subdirectory under the table root directory. Any change made to the table via Dremio or another application automatically updates the Iceberg metadata so that it is current for subsequent read queries.
Iceberg metadata - Dremio maintains metadata that conforms to the Iceberg spec for these tables, but they are not true Iceberg tables. This category includes:
- Parquet and ORC datasets in filesystem/object-store sources (S3, Azure Storage, HDFS)
- Parquet, ORC (non-transactional), and Avro datasets in Hive/Glue sources
Data is typically added to or removed from these tables by processes outside of Dremio. To collect and update metadata, Dremio regularly runs a job on one of its executor engines, much as a user would issue a SELECT query, using the SQL command `REFRESH DATASET <dataset name>`. Importantly, the metadata for these tables is not stored alongside the data in the table root directory; instead, it is kept in a separate metadata directory within Dremio's distributed storage. Each such table has a Dremio-generated UID subdirectory for its metadata under that metadata directory.
These tables are sometimes referred to as "unlimited splits" tables because there is, in principle, no limit to the number of data files that can be part of the dataset. This is possible because of the Iceberg metadata Dremio maintains for these tables.
Coordinator-bound metadata - for this class of tables, metadata refresh is performed by the coordinator rather than as a job submitted to an engine, and metadata is stored in Dremio's catalog database (the KV store) rather than in distributed storage. This class of datasets includes:
- Tables in RDBMS sources such as Redshift, Postgres, and Oracle
- Collections in document stores such as MongoDB and Elasticsearch
- JSON, delimited text (CSV), and Excel datasets in filesystem and object-store sources
- Delta Lake tables*
The metadata for tables in this last category is not accessible for inspection because it is stored in Dremio's internal database and can only be read by Dremio code. In contrast, for Iceberg-format tables and "unlimited splits" tables with Iceberg metadata, if you can locate and access the metadata directory, the contents are easily viewable JSON and Avro files.
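Once you can reach a metadata directory (the steps below show how to locate it), the files can be inspected with ordinary JSON and Avro tooling. The following is a minimal sketch in Python, assuming the table's metadata.json and one of its Avro manifest lists have already been copied to the local machine and that the third-party fastavro package is installed; the file names are illustrative.

import json
from fastavro import reader  # third-party package: pip install fastavro

# Illustrative local copies of files taken from a table's metadata directory.
with open("00001-42c71c8f-f7b1-4bc2-ba3d-55df36c4af61.metadata.json") as f:
    table_metadata = json.load(f)

# Top-level table metadata: schema, partition specs, and the list of snapshots,
# each of which points at an Avro manifest list.
print(table_metadata.get("current-snapshot-id"))
for snapshot in table_metadata.get("snapshots", []):
    print(snapshot.get("snapshot-id"), snapshot.get("manifest-list"))

# Manifest lists (snap-*.avro) and manifests are Avro files; each manifest-list
# record points at one manifest file.
with open("snap-example.avro", "rb") as f:
    for record in reader(f):
        print(record["manifest_path"])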
Relevant Versions
Dremio 18+ for Iceberg tables or tables for which Dremio maintains Iceberg metadata.
Steps to Resolve
To find the metadata location for a table, use the following steps.
1. Run EXPLAIN PLAN for a simple SELECT query against the physical dataset whose metadata you want to locate:
EXPLAIN PLAN FOR SELECT * FROM <table-of-interest>;
The result set is a single column that shows the final physical plan of the query. This closely resembles the graph of query operations that actually executes on the engines at runtime. Each line in the plan represents an operator, and the numbering and level of indentation indicate where in the graph the operator is located.
2. Search the output for "metadataFileLocation". It appears in the operator called "IcebergManifestList" towards the bottom of the results, and the path to the metadata file is enclosed in square brackets (a sketch for extracting it programmatically follows below).
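The plan output can be long, so it may be easier to pull the value out programmatically. Below is a minimal sketch in Python, assuming the single-column EXPLAIN PLAN result has been saved to a local file; "plan.txt" is an illustrative name.

import re

# "plan.txt" is an illustrative name for a file holding the saved EXPLAIN PLAN output.
with open("plan.txt") as f:
    plan = f.read()

# The metadata file path appears as metadataFileLocation=[...] in the
# IcebergManifestList operator.
match = re.search(r"metadataFileLocation=\[([^\]]+)\]", plan)
if match:
    metadata_file = match.group(1)
    metadata_dir = metadata_file.rsplit("/", 1)[0]  # parent directory of the metadata file
    print("metadata file:", metadata_file)
    print("metadata dir: ", metadata_dir)
else:
    print("no metadataFileLocation found in the plan")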
As an example, "tpch-orders" is an "unlimited splits" Parquet dataset in an S3 source (called "S3-source"). It is not an Iceberg table, but Dremio maintains Iceberg metadata for its data files. Below is the final physical plan obtained from step 1.
00-00 Screen : rowType = RecordType(VARCHAR(65536) Fragment, BIGINT Records, VARCHAR(65536) Path, VARBINARY(65536) Metadata, INTEGER Partition, BIGINT FileSize, VARBINARY(65536) IcebergMetadata, VARBINARY(65536) fileschema, VARBINARY(65536) ARRAY PartitionData, INTEGER OperationType, VARCHAR(65536) PartitionValue, BIGINT RejectedRecords, VARBINARY(65536) ReferencedDataFiles): rowcount = 373852.0, cumulative cost = {2654374.2 rows, 4149982.0943 cpu, 0.0 io, 1.99069696E10 network, 0.0 memory}, id = 3210
00-01 Project(Fragment=[$0], Records=[$1], Path=[$2], Metadata=[$3], Partition=[$4], FileSize=[$5], IcebergMetadata=[$6], fileschema=[$7], PartitionData=[$8], OperationType=[$9], PartitionValue=[$10], RejectedRecords=[$11], ReferencedDataFiles=[$12]) : rowType = RecordType(VARCHAR(65536) Fragment, BIGINT Records, VARCHAR(65536) Path, VARBINARY(65536) Metadata, INTEGER Partition, BIGINT FileSize, VARBINARY(65536) IcebergMetadata, VARBINARY(65536) fileschema, VARBINARY(65536) ARRAY PartitionData, INTEGER OperationType, VARCHAR(65536) PartitionValue, BIGINT RejectedRecords, VARBINARY(65536) ReferencedDataFiles): rowcount = 373852.0, cumulative cost = {2616989.0 rows, 4112596.8943 cpu, 0.0 io, 1.99069696E10 network, 0.0 memory}, id = 3209
00-02 WriterCommitter(final=[/lakehouse-bucket-42/dremio-dist/results/1866d122-7ebe-e8d6-1c06-187b95496e00]) : rowType = RecordType(VARCHAR(65536) Fragment, BIGINT Records, VARCHAR(65536) Path, VARBINARY(65536) Metadata, INTEGER Partition, BIGINT FileSize, VARBINARY(65536) IcebergMetadata, VARBINARY(65536) fileschema, VARBINARY(65536) ARRAY PartitionData, INTEGER OperationType, VARCHAR(65536) PartitionValue, BIGINT RejectedRecords, VARBINARY(65536) ReferencedDataFiles): rowcount = 373852.0, cumulative cost = {2243137.0 rows, 4112548.29354 cpu, 0.0 io, 1.99069696E10 network, 0.0 memory}, id = 3208
00-03 UnionExchange : rowType = RecordType(VARCHAR(65536) Fragment, BIGINT Records, VARCHAR(65536) Path, VARBINARY(65536) Metadata, INTEGER Partition, BIGINT FileSize, VARBINARY(65536) IcebergMetadata, VARBINARY(65536) fileschema, VARBINARY(65536) ARRAY PartitionData, INTEGER OperationType, VARCHAR(65536) PartitionValue, BIGINT RejectedRecords, VARBINARY(65536) ReferencedDataFiles): rowcount = 373852.0, cumulative cost = {1869285.0 rows, 3738696.29354 cpu, 0.0 io, 1.99069696E10 network, 0.0 memory}, id = 3207
01-01 Writer : rowType = RecordType(VARCHAR(65536) Fragment, BIGINT Records, VARCHAR(65536) Path, VARBINARY(65536) Metadata, INTEGER Partition, BIGINT FileSize, VARBINARY(65536) IcebergMetadata, VARBINARY(65536) fileschema, VARBINARY(65536) ARRAY PartitionData, INTEGER OperationType, VARCHAR(65536) PartitionValue, BIGINT RejectedRecords, VARBINARY(65536) ReferencedDataFiles): rowcount = 373852.0, cumulative cost = {1495433.0 rows, 747880.29354 cpu, 0.0 io, 98304.0 network, 0.0 memory}, id = 3206
01-02 Project(O_ORDERKEY=[$0], O_CUSTKEY=[$1], O_ORDERSTATUS=[$2], O_TOTALPRICE=[$3], O_ORDERDATE=[$4], O_ORDERPRIORITY=[$5], O_CLERK=[$6], O_SHIPPRIORITY=[$7], O_COMMENT=[$8]) : rowType = RecordType(BIGINT O_ORDERKEY, BIGINT O_CUSTKEY, VARCHAR(65536) O_ORDERSTATUS, DOUBLE O_TOTALPRICE, DATE O_ORDERDATE, VARCHAR(65536) O_ORDERPRIORITY, VARCHAR(65536) O_CLERK, BIGINT O_SHIPPRIORITY, VARCHAR(65536) O_COMMENT): rowcount = 373852.0, cumulative cost = {1121581.0 rows, 374028.29354 cpu, 0.0 io, 98304.0 network, 0.0 memory}, id = 3205
01-03 Project(O_ORDERKEY=[$0], O_CUSTKEY=[$1], O_ORDERSTATUS=[$2], O_TOTALPRICE=[$3], O_ORDERDATE=[$4], O_ORDERPRIORITY=[$5], O_CLERK=[$6], O_SHIPPRIORITY=[$7], O_COMMENT=[$8]) : rowType = RecordType(BIGINT O_ORDERKEY, BIGINT O_CUSTKEY, VARCHAR(65536) O_ORDERSTATUS, DOUBLE O_TOTALPRICE, DATE O_ORDERDATE, VARCHAR(65536) O_ORDERPRIORITY, VARCHAR(65536) O_CLERK, BIGINT O_SHIPPRIORITY, VARCHAR(65536) O_COMMENT): rowcount = 373852.0, cumulative cost = {747729.0 rows, 373994.64686 cpu, 0.0 io, 98304.0 network, 0.0 memory}, id = 3204
01-04 TableFunction(columns=[`O_ORDERKEY`, `O_CUSTKEY`, `O_ORDERSTATUS`, `O_TOTALPRICE`, `O_ORDERDATE`, `O_ORDERPRIORITY`, `O_CLERK`, `O_SHIPPRIORITY`, `O_COMMENT`], Table Function Type=[DATA_FILE_SCAN], table=["S3-source"."tpch-orders"]) : rowType = RecordType(BIGINT O_ORDERKEY, BIGINT O_CUSTKEY, VARCHAR(65536) O_ORDERSTATUS, DOUBLE O_TOTALPRICE, DATE O_ORDERDATE, VARCHAR(65536) O_ORDERPRIORITY, VARCHAR(65536) O_CLERK, BIGINT O_SHIPPRIORITY, VARCHAR(65536) O_COMMENT): rowcount = 373852.0, cumulative cost = {373877.0 rows, 373961.00018 cpu, 0.0 io, 98304.0 network, 0.0 memory}, id = 3203
01-05 Project(splitsIdentity=[$0], splits=[$1], colIds=[$2]) : rowType = RecordType(RecordType(VARCHAR(65536) path, BIGINT offset, BIGINT length, BIGINT fileLength) splitsIdentity, VARBINARY(65536) splits, VARBINARY(65536) colIds): rowcount = 6.0, cumulative cost = {25.0 rows, 109.00018 cpu, 0.0 io, 98304.0 network, 0.0 memory}, id = 3202
01-06 HashToRandomExchange(dist0=[[$0]]) : rowType = RecordType(RecordType(VARCHAR(65536) path, BIGINT offset, BIGINT length, BIGINT fileLength) splitsIdentity, VARBINARY(65536) splits, VARBINARY(65536) colIds, INTEGER E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 6.0, cumulative cost = {19.0 rows, 109.0 cpu, 0.0 io, 98304.0 network, 0.0 memory}, id = 3201
02-01 TableFunction(columns=[`splitsIdentity`, `splits`, `colIds`, `E_X_P_R_H_A_S_H_F_I_E_L_D`], Table Function Type=[SPLIT_ASSIGNMENT]) : rowType = RecordType(RecordType(VARCHAR(65536) path, BIGINT offset, BIGINT length, BIGINT fileLength) splitsIdentity, VARBINARY(65536) splits, VARBINARY(65536) colIds, INTEGER E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 6.0, cumulative cost = {13.0 rows, 13.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 3200
02-02 TableFunction(columns=[`splitsIdentity`, `splits`, `colIds`], Table Function Type=[SPLIT_GEN_MANIFEST_SCAN]) : rowType = RecordType(RecordType(VARCHAR(65536) path, BIGINT offset, BIGINT length, BIGINT fileLength) splitsIdentity, VARBINARY(65536) splits, VARBINARY(65536) colIds): rowcount = 6.0, cumulative cost = {7.0 rows, 7.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 3199
02-03 IcebergManifestList(table=["S3-source"."tpch-orders"], snapshot=[3735119223638640377], columns=[`splitsIdentity`, `splits`, `colIds`], splits=[1], metadataFileLocation=[s3://lakehouse-bucket-42/dremio-dist/metadata/017a614f-1050-4409-885c-a166f6ce265d/metadata/00001-42c71c8f-f7b1-4bc2-ba3d-55df36c4af61.metadata.json], manifestContent=[DATA]) : rowType = RecordType(RecordType(VARCHAR(65536) path, BIGINT offset, BIGINT length, BIGINT fileLength) splitsIdentity, VARBINARY(65536) splits, VARBINARY(65536) colIds): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 3198
The "IcebergManifestList" operator reveals that the Iceberg metadata directory is in the S3 distributed storage at "s3://lakehouse-bucket-42/dremio-dist/metadata/017a614f-1050-4409-885c-a166f6ce265d/metadata"
As another example, "S3-source".iceberg."tpch-orders" is the same dataset in the same source, but in true Iceberg format. The "IcebergManifestList" operator from the final physical plan looks like this:
02-03 IcebergManifestList(table=["S3-source".iceberg."tpch-orders"], snapshot=[6115902619512097962], columns=[`splitsIdentity`, `splits`, `colIds`], splits=[1], metadataFileLocation=[/lakehouse-bucket-42/data/iceberg/tpch-orders/metadata/v2.metadata.json], manifestContent=[DATA]) : rowType = RecordType(RecordType(VARCHAR(65536) path, BIGINT offset, BIGINT length, BIGINT fileLength) splitsIdentity, VARBINARY(65536) splits, VARBINARY(65536) colIds): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 26516
The metadata directory is located in the table root directory, "alongside" the data files, at "/lakehouse-bucket-42/data/iceberg/tpch-orders/metadata". Unlike the previous example, the s3:// scheme is not included in the path, so you have to infer from the source type (S3 in this case) where that path lives, as in the sketch below.
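One way to resolve the path is to treat its first component as the bucket for an S3 source. The sketch below fetches and inspects the metadata file from the plan above, again using boto3 under the same access assumptions.

import json
import boto3  # third-party AWS SDK: pip install boto3

s3 = boto3.client("s3")

# Path as it appears in the plan; for an S3 source, the first component is the
# bucket and the remainder is the object key.
path = "/lakehouse-bucket-42/data/iceberg/tpch-orders/metadata/v2.metadata.json"
bucket, key = path.lstrip("/").split("/", 1)

body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
table_metadata = json.loads(body)

# The "location" property is the table root; the metadata directory sits under it.
print(table_metadata.get("location"))
print(table_metadata.get("current-snapshot-id"))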