Overview
When writing backups to S3-compatible object stores, an error may be encountered if the maximum number of multipart upload parts is reached before the backup completes. This technote explains how to tune the multipart part size.
Applies To
All versions of Dremio
Details
When writing to S3-compatible stores (e.g., S3, MinIO), the dremio-admin process uses multipart upload, per recommended S3 best practices, to write backup data to the object store. As the size of the backup grows over time, it can exhaust the part count, which is capped at a hard limit of 10,000. When this occurs, the following message will be displayed in server.log:
Number of parts in multipart upload exceeded. Current part count = 10001, Part count limit = 10000
The default part size used by Dremio is 64 MB, which is sufficient for most clusters. However, if the part-count limit is reached, the part size needs to be increased so the entire backup can be written to the object store.
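As a rough guide, the largest backup a multipart upload can hold is the part size multiplied by the 10,000-part limit. A minimal sketch of that arithmetic (the helper name is illustrative, not part of Dremio):

```python
# The S3 multipart API caps an upload at 10,000 parts, so the
# maximum object size is part_size * 10,000.
PART_COUNT_LIMIT = 10_000

def max_backup_size_bytes(part_size_bytes: int) -> int:
    """Largest backup a multipart upload can hold at this part size."""
    return part_size_bytes * PART_COUNT_LIMIT

default_part = 64 * 1024 * 1024  # Dremio default: 64 MB
print(max_backup_size_bytes(default_part) / 1024**3)  # ceiling in GiB
```

At the 64 MB default this works out to roughly 625 GiB per backup; doubling the part size doubles that ceiling.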
To increase the part size, set the following configuration in core-site.xml:
<property>
  <name>fs.s3a.multipart.size</name>
  <description>How big (in bytes) to split upload or copy operations up into. A suffix {K,M,G,T,P} may be used</description>
  <value>128M</value>
</property>
This sets the part size the S3A connector (via the AWS S3 SDK) uses at write time. Tune the parameter so that your total backup size divided by the part size stays comfortably below 10,000 parts.
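One way to pick a value is to divide the expected backup size by the 10,000-part limit and round up to the next convenient size. A hypothetical helper (not a Dremio utility) sketching that calculation:

```python
import math

PART_COUNT_LIMIT = 10_000  # S3 multipart hard limit
MIB = 1024 * 1024

def min_part_size_mib(backup_size_bytes: int) -> int:
    """Smallest whole-MiB part size that keeps the upload within 10,000 parts."""
    return math.ceil(backup_size_bytes / (PART_COUNT_LIMIT * MIB))

# A 1 TiB backup needs parts of at least 105 MiB, so rounding up to
# fs.s3a.multipart.size=128M leaves comfortable headroom.
print(min_part_size_mib(1024**4))  # 105
```

In practice you would round the result up to a power-of-two-ish value (128M, 256M) rather than using the exact minimum.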
To monitor the change, the following can be added to logback.xml to turn on DEBUG logging for the S3A filesystem:
<logger name="org.apache.hadoop.fs.s3a">
  <level value="debug"/>
</logger>
Note that this will require a restart of Dremio on the coordinator, unless you have dynamic log scanning configured (see Further Reading).
Once in place, you should see the new multipart write size, in bytes, logged for any block writes taking place, for example:
$ /opt/dremio/bin/dremio-admin backup -l -d dremioS3:///dremio/myCluster/backup
.....
2023-02-27 15:55:22,388 [pool-25-thread-1] DEBUG o.a.h.fs.s3a.S3ABlockOutputStream - Initialized S3ABlockOutputStream for 220/backup/dremio_backup_2023-02-27_15.55/roles_info.json output to FileBlock{index=1, destFile=/tmp/hadoop-dremio/s3a/s3ablock-0001-2983948922563525156.tmp, state=Writing, dataSize=0, limit=134217728}
2023-02-27 15:55:22,388 [pool-25-thread-1] DEBUG o.a.h.fs.s3a.S3ABlockOutputStream - S3ABlockOutputStream{WriteOperationHelper {bucket=dremio}, blockSize=134217728, activeBlock=FileBlock{index=1, destFile=/tmp/hadoop-dremio/s3a/s3ablock-0001-2983948922563525156.tmp, state=Writing, dataSize=263, limit=134217728} Statistics=counters=((stream_write_bytes=263) (stream_write_exceptions=0) (action_executor_acquired=0) (op_abort.failures=0) (stream_write_block_uploads=0) (stream_write_queue_duration=0) (op_hsync=0) (stream_write_exceptions_completing_upload=0) (object_multipart_aborted.failures=0) (multipart_upload_completed.failures=0) (object_multipart_aborted=0) (op_hflush=0) (action_executor_acquired.failures=0) (multipart_upload_completed=0) (op_abort=0) (stream_write_total_time=0) (stream_write_total_data=0));
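The `limit=134217728` field in the log output above is the configured part size expressed in bytes. A quick sanity check that it matches the 128M value set in core-site.xml:

```python
# fs.s3a.multipart.size=128M expressed in bytes: 128 * 1024 * 1024
part_size_bytes = 128 * 1024 * 1024
print(part_size_bytes)  # 134217728, matching limit=134217728 in the log
```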
Note that the default part size and limits for your S3-compatible store may vary by vendor.
Further Reading