Overview
When the Iceberg support keys (dremio.execution.support_unlimited_splits and dremio.iceberg.enabled) are enabled, users may see an error when creating reflections if the storage underpinning the reflection directory is not a compatible type.
Applies To
Dremio 18.0 and onwards
Details
The following error may be seen in the job profile:
SYSTEM ERROR: UnknownFormatConversionException: Conversion = 'Unknown format (pdfs) conversion for path /opt/dremio/data/pdfs/accelerator/6c16bb24-56c4-4358-a533-ffb1c0899523/07a33f5a-f5bb-4171-a48a-9cf978517335_0/metadata/7affb554-8916-4023-8609-35cbb4eecc92.avro Error Message : No File System scheme matches'
SqlOperatorImpl MANIFEST_WRITER
Location 0:0:6
SqlOperatorImpl MANIFEST_WRITER
Location 0:0:6
Fragment 0:0
[Error Id: 2bfc43fa-b4a6-45f2-8c08-c60aeb9c24e2 on dremio-executor-7.dremio-cluster-pod.dremio.svc.cluster.local:0]
(java.util.UnknownFormatConversionException) Conversion = 'Unknown format (pdfs) conversion for path /opt/dremio/data/pdfs/accelerator/6c16bb24-56c4-4358-a533-ffb1c0899523/07a33f5a-f5bb-4171-a48a-9cf978517335_0/metadata/7affb554-8916-4023-8609-35cbb4eecc92.avro Error Message : No File System scheme matches'
com.dremio.exec.store.iceberg.IcebergUtils.getValidIcebergPath():416
org.apache.iceberg.hadoop.DremioOutputFile.<init>():39
com.dremio.exec.store.iceberg.DremioFileIO.newOutputFile():93
com.dremio.exec.store.iceberg.manifestwriter.ManifestWritesHelper.startNewWriter():113
com.dremio.exec.store.iceberg.manifestwriter.ManifestFileRecordWriter.setup():95
com.dremio.sabot.op.writer.WriterOperator.setup():116
com.dremio.sabot.driver.SmartOp$SmartSingleInput.setup():255
com.dremio.sabot.driver.Pipe$SetupVisitor.visitSingleInput():73
com.dremio.sabot.driver.Pipe$SetupVisitor.visitSingleInput():63
com.dremio.sabot.driver.SmartOp$SmartSingleInput.accept():200
com.dremio.sabot.driver.StraightPipe.setup():103
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.Pipeline.setup():68
com.dremio.sabot.exec.fragment.FragmentExecutor.setupExecution():405
com.dremio.sabot.exec.fragment.FragmentExecutor.run():269
com.dremio.sabot.exec.fragment.FragmentExecutor.access$1600():94
com.dremio.sabot.exec.fragment.FragmentExecutor$AsyncTaskImpl.run():747
com.dremio.sabot.task.AsyncTaskWrapper.run():112
com.dremio.sabot.task.slicing.SlicingThread.mainExecutionLoop():243
com.dremio.sabot.task.slicing.SlicingThread.run():171
SqlOperatorImpl MANIFEST_WRITER
Location 0:0:6
SqlOperatorImpl MANIFEST_WRITER
Location 0:0:6
Fragment 0:0
com.dremio.exec.store.iceberg.IcebergUtils(IcebergUtils.java:416)
org.apache.iceberg.hadoop.DremioOutputFile(DremioOutputFile.java:39)
com.dremio.exec.store.iceberg.DremioFileIO(DremioFileIO.java:93)
com.dremio.exec.store.iceberg.manifestwriter.ManifestWritesHelper(ManifestWritesHelper.java:113)
com.dremio.exec.store.iceberg.manifestwriter.ManifestFileRecordWriter(ManifestFileRecordWriter.java:95)
com.dremio.sabot.op.writer.WriterOperator(WriterOperator.java:116)
com.dremio.sabot.driver.SmartOp$SmartSingleInput(SmartOp.java:255)
com.dremio.sabot.driver.Pipe$SetupVisitor(Pipe.java:73)
com.dremio.sabot.driver.Pipe$SetupVisitor(Pipe.java:63)
com.dremio.sabot.driver.SmartOp$SmartSingleInput(SmartOp.java:200)
com.dremio.sabot.driver.StraightPipe(StraightPipe.java:103)
com.dremio.sabot.driver.StraightPipe(StraightPipe.java:102)
com.dremio.sabot.driver.StraightPipe(StraightPipe.java:102)
com.dremio.sabot.driver.StraightPipe(StraightPipe.java:102)
com.dremio.sabot.driver.StraightPipe(StraightPipe.java:102)
com.dremio.sabot.driver.StraightPipe(StraightPipe.java:102)
com.dremio.sabot.driver.StraightPipe(StraightPipe.java:102)
com.dremio.sabot.driver.Pipeline(Pipeline.java:68)
com.dremio.sabot.exec.fragment.FragmentExecutor(FragmentExecutor.java:405)
com.dremio.sabot.exec.fragment.FragmentExecutor(FragmentExecutor.java:269)
com.dremio.sabot.exec.fragment.FragmentExecutor(FragmentExecutor.java:94)
com.dremio.sabot.exec.fragment.FragmentExecutor$AsyncTaskImpl(FragmentExecutor.java:747)
com.dremio.sabot.task.AsyncTaskWrapper(AsyncTaskWrapper.java:112)
com.dremio.sabot.task.slicing.SlicingThread(SlicingThread.java:243)
com.dremio.sabot.task.slicing.SlicingThread(SlicingThread.java:171)
Cause
Dremio stores reflection data on a filesystem. If Iceberg features are enabled for storing this data, that filesystem must be of a type that supports Iceberg. In the error above, the reflection path uses the pdfs scheme, which does not match any supported file system scheme.
Iceberg support is controlled by the following support keys:
dremio.execution.support_unlimited_splits
dremio.iceberg.enabled
Solution
Iceberg tables must be hosted on a distributed file store. The supported storage types, with the catalogs available for each, are as follows:
ADLS - Hadoop, Hive catalogs
GCS - Hadoop catalog
HDFS - Hadoop catalog
Hive (Recommended) - Hive catalog
S3 - Hadoop, Hive catalogs
IMPORTANT:
When using Iceberg tables, Dremio recommends using the Hive catalog in production environments.
The user needs to use one of these storage types in the dist setting in the dremio.conf file.
See https://docs.dremio.com/deployment/dist-store-config/ for examples
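As a rough illustration of the dist setting described above, the fragment below is a minimal sketch that assumes an HDFS-backed distributed store. The host, port, and path are placeholders rather than values from this article, and the local path is only an example; check the linked page for the exact form required by your storage type and Dremio version.

```
paths: {
  # Local directory for node-specific data (example value, adjust as needed)
  local: "/var/lib/dremio"

  # Distributed store used for reflections and other shared data.
  # Placeholder HDFS URI: replace the host, port and path with your own.
  dist: "hdfs://<NAMENODE_HOST>:8020/<path>"
}
```

A dist value that points at local or pseudo-distributed storage (such as the pdfs path shown in the error above) would be expected to keep triggering the error while the Iceberg support keys are enabled.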
Further Reading
Support settings - https://docs.dremio.com/advanced-administration/support-settings/
Apache Iceberg - http://docs.dremio.com/data-formats/apache-iceberg/