Summary
When writing large filesets to S3, each file is uploaded as a series of smaller parts (a multipart upload) to improve efficiency. When very large filesets are written, the number of parts can exceed the S3 limit of 10,000 parts per upload. To work around this, increase the size of each part.
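As a rough guide, the maximum size of a single uploaded object is the part size multiplied by the 10,000-part limit. With the 128M part size suggested below, one upload can grow to approximately 128 MB x 10,000, or about 1.25 TB, before the limit is reached.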
Reported Issue
The following error is reported in the Dremio coordinator server.log when trying to write large backup sets to S3:
Number of parts in multipart upload exceeded. Current part count = 10001, Part count limit = 10000
Overview
When writing backups to S3-compatible object stores, an error may be encountered if the maximum number of multipart upload parts is reached before the backup has completed. This technote details how to tune the part size used for uploads.
Relevant Versions, Tools, and Integrations
All versions of Dremio
Steps to Resolve
To increase the part size, set the following configuration in core-site.xml:
<property>
  <name>fs.s3a.multipart.size</name>
  <description>How big (in bytes) to split upload or copy operations up into. A suffix {K,M,G,T,P} may be used.</description>
  <value>128M</value>
</property>
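Before starting a very large backup, the minimum part size needed to stay under the 10,000-part limit can be estimated from the size of the largest file that will be written. The sketch below is a standalone helper for that arithmetic only, not part of Dremio or Hadoop; the 1.5 TB example file size is hypothetical.

import math

S3_MAX_PARTS = 10_000  # hard limit imposed by S3 on parts per multipart upload

def min_part_size_mb(largest_object_bytes: int) -> int:
    """Smallest whole-MB part size that keeps an upload within the part limit."""
    part_bytes = math.ceil(largest_object_bytes / S3_MAX_PARTS)
    return math.ceil(part_bytes / (1024 * 1024))

# Example: a 1.5 TB backup file needs parts of at least 144 MB, so
# fs.s3a.multipart.size would be set to 144M or higher (e.g. 256M for headroom).
print(min_part_size_mb(1_500_000_000_000))  # -> 144

Note that changes to core-site.xml are typically only picked up after the Dremio processes are restarted.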
To monitor the upload behaviour, the following can be added to logback.xml to turn on DEBUG logging for the S3A client used by the S3 plugin:
<logger name="org.apache.hadoop.fs.s3a">
  <level value="debug"/>
</logger>
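DEBUG logging for org.apache.hadoop.fs.s3a is verbose, so revert the logger level (or remove the entry) once the multipart upload behaviour has been confirmed. If logback.xml is configured with scan="true", the change is picked up automatically; otherwise a restart is required for it to take effect.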