Summary
Queries may fail due to a size limit in the internal data structure used when Hash Aggregate spilling is enabled.
Reported Issue
When running a query the following error may be observed in the profile:
SYSTEM ERROR: StringIndexOutOfBoundsException: Not enough space to pivot single record. Allocated capacity for pivot: 131072 bytes.
SqlOperatorImpl HASH_AGGREGATE
Location 0:0:6
ErrorOrigin: EXECUTOR
[Error Id: 550abac4-57c4-4908-94c1-647e75c6967c on dremio-executor-1.dremio-cluster-pod.dremio.svc.cluster.local:0]
(java.lang.StringIndexOutOfBoundsException) Not enough space to pivot single record. Allocated capacity for pivot: 131072 bytes.
com.dremio.sabot.op.common.ht2.BoundedPivots.pivotVariableLengths():153
com.dremio.sabot.op.common.ht2.BoundedPivots.pivot():202
com.dremio.sabot.op.aggregate.vectorized.VectorizedHashAggOperator.consumeDataHelper():1281
com.dremio.sabot.op.aggregate.vectorized.VectorizedHashAggOperator.consumeData():1175
com.dremio.sabot.driver.SmartOp$SmartSingleInput.consumeData():312
com.dremio.sabot.driver.StraightPipe.pump():58
com.dremio.sabot.driver.Pipeline.doPump():134
com.dremio.sabot.driver.Pipeline.pumpOnce():124
com.dremio.sabot.exec.fragment.FragmentExecutor$DoAsPumper.run():690
com.dremio.sabot.exec.fragment.FragmentExecutor.run():595
com.dremio.sabot.exec.fragment.FragmentExecutor$AsyncTaskImpl.run():1274
com.dremio.sabot.task.AsyncTaskWrapper.run():130
com.dremio.sabot.task.slicing.SlicingThread.mainExecutionLoop():281
com.dremio.sabot.task.slicing.SlicingThread.run():186
A query that may have run successfully in the past will now fail consistently.
Relevant Versions
All versions of Dremio from 24.0.x onwards
Cause
When Hash Aggregation spilling is enabled, the data for this operator is always marshalled through an internal in-memory data structure, whether or not it is ultimately flushed to disk. This data structure imposes a bounded limit on the size of individual elements; when that limit is exceeded, the error shown above is thrown.
This problem usually does not surface because the single_field_size_bytes support key inherently keeps field sizes small. However, users may increase this limit when some table columns hold data that exceeds it. Note that keeping the field size at its default does not guarantee the error will not occur.
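To see how this limit is currently configured on a cluster, you can query Dremio's sys.options system table. This is a sketch: the LIKE filter is used because the fully qualified key name may vary by version, and the exact column set of sys.options can also differ between releases.

```sql
-- Inspect support keys related to the single-field size limit.
-- The LIKE filter avoids hard-coding the exact key name,
-- which may differ across Dremio versions.
SELECT name, num_val
FROM sys.options
WHERE name LIKE '%single_field_size%'
```

If the key has been raised above its default, very wide column values become more likely to exceed the pivot buffer capacity reported in the error.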
Steps to resolve
There is no adjustment for this limit. The user may truncate the data in the affected field using SQL logic such as SUBSTR. For example:
SELECT DISTINCT SUBSTR(error_msg,1,32000) FROM sys.jobs_recent
Alternatively, turn off Hash Aggregation spilling by setting the support key exec.operator.aggregate.vectorize.use_spilling_operator to false. (When this key is off, the internal in-memory data structure is not used.)
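Support keys can be changed in the Dremio UI, or, assuming ALTER SYSTEM SET is available in your Dremio edition, via SQL. A sketch of the SQL form:

```sql
-- Disable the vectorized Hash Aggregate spilling operator cluster-wide,
-- so the bounded in-memory pivot structure is no longer used.
ALTER SYSTEM SET "exec.operator.aggregate.vectorize.use_spilling_operator" = false
```

Note that disabling spilling means the Hash Aggregate operator can no longer spill to disk under memory pressure, so memory-constrained queries may fail for other reasons; weigh this trade-off before applying it cluster-wide.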
Additional Resources
Docs for Hash Agg spilling - https://docs.dremio.com/current/admin/hash-agg-spilling/