Overview
When working with large data sets, a Hash Join may fail with an error like this:
OUT_OF_MEMORY ERROR: Query was cancelled because it exceeded the memory limits set by the administrator
SqlOperatorImpl HASH_JOIN
Applies To
This solution is for Dremio version 24 and up.
Cause
The Hash Join operator requires more memory than the query is allowed to use. Once that memory limit is reached, the query fails. It may succeed on a retry, depending on the memory used by other concurrent queries at that time.
Workaround
No workaround; the solution is below.
Solution
Dremio Version 24 introduces functionality that can be enabled to permit Hash Joins to "spill" to disk. A spilling Hash Join writes data to disk when the memory allocated to the query is fully utilized, allowing very large Hash Joins to succeed.
The feature is not enabled by default in Dremio Version 24, but a support key is available to enable it. Set this support key to true:
exec.op.join.spill=true
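Support keys can be changed in the Dremio UI under Settings > Support, or with a SQL statement against the system options. The statement below is a sketch; it assumes the standard Dremio syntax for setting a system-level option, so verify it against your version's documentation before running it:

```
-- Enable Hash Join spilling cluster-wide (requires admin privileges)
ALTER SYSTEM SET "exec.op.join.spill" = true;
```

After changing the key, re-run the failing query to confirm that the Hash Join now spills instead of hitting the memory limit.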
As mentioned, the support key defaults to "false," so spilling to disk will not occur unless the key is enabled.
With this feature enabled, there may be a slight performance hit for all queries, as some setup is necessary for each query when query planning completes. If a query contains a very large Hash Join and needs to spill, it will use the assigned space on disk, and the query should finish without encountering the OUT_OF_MEMORY ERROR.