Summary
As Dremio adoption increases within your organization, you may find that a single engine, even one configured with many cores and a large amount of memory, is not sufficient for the volume and concurrency of metadata refresh activity. The indicators of this are metadata refreshes slowing down, accruing wait time in the queue, or being cancelled because they hit queue timeout limits.
Overview
This article describes an approach that uses multiple dedicated engines for metadata refreshes. It applies to environments where Dremio is configured to store Iceberg metadata in the distributed store and a single engine cannot keep up with the volume and concurrency of metadata refreshes being performed on the platform.
Relevant Versions, Tools, and Integrations
This article applies to all Dremio versions where it is possible to set up engines for workload isolation and where Iceberg metadata is enabled.
Steps to Implement
You can route the REFRESH DATASET queries executed by the $dremio$ user to different queues (and hence to different engines) with simple modular arithmetic on the query submission time. The following example shows how requests submitted one after another are spread across different queues:
Using 3 metadata refresh queues, each associated with a distinct metadata refresh engine, the routing rules are set up as follows:
Rule MR1 (routes to queue MR1):
query_type() IN ('Metadata Refresh') AND MOD(SECOND(CURRENT_TIMESTAMP), 3) = 2

Rule MR2 (routes to queue MR2):
query_type() IN ('Metadata Refresh') AND MOD(SECOND(CURRENT_TIMESTAMP), 3) = 1

Rule MR3 (routes to queue MR3) -- the catch-all:
query_type() IN ('Metadata Refresh')

Because routing rules are evaluated in order and the first match wins, Rule MR3 must be placed last: it matches any metadata refresh query and acts as the catch-all for seconds where MOD(SECOND(CURRENT_TIMESTAMP), 3) = 0.
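The routing logic above can be sketched in Python to show how it partitions queries. This is an illustrative model, not Dremio code: the route function below mimics the first-match-wins evaluation of the three rules against the second of the submission timestamp.

```python
from collections import Counter

def route(second: int) -> str:
    """Return the queue name for a metadata refresh query submitted
    at the given second of the minute, mimicking first-match-wins
    evaluation of rules MR1, MR2, MR3."""
    if second % 3 == 2:      # Rule MR1
        return "MR1"
    if second % 3 == 1:      # Rule MR2
        return "MR2"
    return "MR3"             # Rule MR3: catch-all (second % 3 == 0)

# Over a full minute, queries spread evenly across the three queues.
counts = Counter(route(s) for s in range(60))
print(counts)  # Counter({'MR1': 20, 'MR2': 20, 'MR3': 20})
```

Because consecutive refresh requests arrive at different seconds, the modulo on SECOND(CURRENT_TIMESTAMP) distributes them roughly evenly across the three queues and their engines.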