Summary
Each Dremio executor maintains a columnar cloud cache (C3). The amount of space consumed by the cache manager database (db) and data filesystem (fs) components of this cache can be configured in the dremio.conf file on each executor.
Reported Issue
System monitoring of the physical hosts, VMs, or pods that run Dremio executors may show high storage usage.
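To confirm which volume is filling up, usage can be measured directly on the executor host. The following is a minimal sketch using Python's standard `shutil.disk_usage`; it is not a Dremio tool. The helper name `usage_report` is hypothetical, and the `"."` argument is only a placeholder to be replaced with your own cache.path.db and cache.path.fs locations.

```python
import shutil
from typing import Dict, List

GIB = 1024 ** 3  # sizes are reported in GiB (2**30 bytes)


def usage_report(paths: List[str]) -> List[Dict[str, object]]:
    """Return total, used, and free space plus percent used for each path."""
    report = []
    for path in paths:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
        # Guard against filesystems that report a total size of zero.
        pct_used = round(100 * usage.used / usage.total, 1) if usage.total else 0.0
        report.append({
            "path": path,
            "total_gib": round(usage.total / GIB, 1),
            "used_gib": round(usage.used / GIB, 1),
            "free_gib": round(usage.free / GIB, 1),
            "pct_used": pct_used,
        })
    return report


# "." is a placeholder; pass the executor's cache paths instead.
for row in usage_report(["."]):
    print(row)
```

Running this against each cache path shows whether the volume holding the cache is the one reporting high usage before any configuration is changed.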
Overview
The Dremio columnar cloud cache (C3) is a caching system on each Dremio executor node. Chunks of data from Parquet and ORC datasets are cached on a volume attached to each executor and managed by a local RocksDB instance. The location of the data, how much space is used, and where the cache manager database files are written can be configured via dremio.conf.
Relevant Versions
All supported versions of Dremio.
Steps To Resolve
Set ensurefreespace.fs to an array with the same number of elements as the cache.path.fs array. Each element of ensurefreespace.fs is the amount of free space (in GB) that C3 should leave unused on the corresponding path in cache.path.fs.
executor: {
  enabled: true
  cache: {
    # enable/disable local cache manager
    enabled: true,
    # storage space for cache manager
    path: {
      db: ${paths.local},
      fs: [${services.executor.cache.path.db}]
    },
    # control max percentage of disk cache manager db instance and fs mount points can consume
    pctquota: {
      db: 70,
      fs: [${services.executor.cache.pctquota.db}]
    },
    ensurefreespace: {
      fs: [10]
    }
  }
}
For example, if three volumes are specified:
executor.cache.path.fs : [
  "/mnt/cachemanagerdisk/dir1",
  "/mnt/cachemanagerdisk/dir2",
  "/mnt/cachemanagerdisk/dir3"
]
The ensurefreespace setting would be the following, leaving 10 GB free on each volume:
ensurefreespace: {
  fs: [10,10,10]
}
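The rule that both arrays must be the same length can be checked before an executor is restarted. The following is a minimal Python sketch, not part of Dremio: the function names `build_ensurefreespace`, `check_lengths`, and `render_fs_line` are hypothetical, and the paths are the example values from this article.

```python
from typing import List


def build_ensurefreespace(cache_paths: List[str], reserve_gb: int = 10) -> List[int]:
    """Return one free-space reserve (in GB) per cache path."""
    if reserve_gb < 0:
        raise ValueError("reserve_gb must not be negative")
    return [reserve_gb] * len(cache_paths)


def check_lengths(cache_paths: List[str], ensurefreespace: List[int]) -> None:
    """Raise ValueError if the two arrays differ in length."""
    if len(cache_paths) != len(ensurefreespace):
        raise ValueError(
            f"cache.path.fs has {len(cache_paths)} entries but "
            f"ensurefreespace.fs has {len(ensurefreespace)}"
        )


def render_fs_line(ensurefreespace: List[int]) -> str:
    """Format the list as the fs line of the ensurefreespace block."""
    return "fs: [" + ",".join(str(value) for value in ensurefreespace) + "]"


paths = [
    "/mnt/cachemanagerdisk/dir1",
    "/mnt/cachemanagerdisk/dir2",
    "/mnt/cachemanagerdisk/dir3",
]
reserves = build_ensurefreespace(paths, reserve_gb=10)
check_lengths(paths, reserves)  # passes: three paths, three reserves
print(render_fs_line(reserves))
```

Generating the reserve list from the path list, rather than typing both by hand, keeps the two arrays in step when a volume is added or removed.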
Recommendations
Pointing the cloud cache paths of all executors to a common network-attached storage volume is not recommended. Doing so reduces data locality, which diminishes the benefit of the cache, and it can cause runaway disk usage if each executor does not have a complete accounting of the space actually available on the storage.
Additional Resources
Dremio docs - Configure Cloud Cache