Summary
Dremio Support requires that the various Dremio application logs (server.out, server.log, queries.json, audit.json, access.log and others) and JVM garbage collection logs (server.gc) be persisted for later collection and analysis.
Currently, the default configuration in Dremio's official Helm charts does not persist the logs that each coordinator and executor pod generates; they are sent to the console instead. Configuring log persistence is left to the Dremio admin. However, with only minor changes to the charts and a Helm upgrade of your Dremio release, you can have the logs written to the persistent volume claimed by each pod.
Reported Issue
The default configuration in Dremio's official Helm charts does not persist the logs generated by each coordinator and executor pod. Instead, the logs are sent to the console, which can lead to log data loss when pods restart or are rescheduled.
Relevant Versions
Dremio deployments on Kubernetes using the official Dremio Helm charts. The specific versions of
Steps to Resolve
- In the values.yaml file, under both the coordinator and executor sections, add the following parameters to extraStartParams:
- -Ddremio.log.path=/opt/dremio/data/log to specify the path for Dremio application logs.
- A GC log flag to specify the path for Java runtime GC logs: -Xlog:gc*,classhisto*=trace:file=/opt/dremio/data/log/gc.log for Dremio v25 and higher (Java 11), or -Xloggc:/opt/dremio/data/log/gc.log for Dremio v24 and lower (Java 8).
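Here are examples (the ellipses "..." mark lines skipped for brevity and are not valid YAML). The examples use the Java 8 GC flag for Dremio v24 and lower; for Dremio v25 and higher, substitute the -Xlog form listed above.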
# Dremio Coordinator
coordinator:
  ...
  extraStartParams: >-
    -Ddremio.log.path=/opt/dremio/data/log
    -Xloggc:/opt/dremio/data/log/gc.log
  ...
...
# Dremio Executor
executor:
  ...
  extraStartParams: >-
    -Ddremio.log.path=/opt/dremio/data/log
    -Xloggc:/opt/dremio/data/log/gc.log
  ...
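The choice between the two GC flags can be sketched as a small shell helper that prints the lines to paste under extraStartParams. This is only an illustration: the function names gc_log_flag and extra_start_params and the variable DREMIO_LOG_DIR are hypothetical and not part of Dremio or its Helm charts, and the version cutoff (v25 and higher use the Java 11 flag) is taken from the step above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Dremio): print the JVM flags to place
# under extraStartParams for a given Dremio major version.

# Log directory on the pod's persistent volume, as used in this article.
DREMIO_LOG_DIR="/opt/dremio/data/log"

# Print the GC log flag matching the Java runtime of the Dremio version.
gc_log_flag() {
  local major="$1"
  if [ "$major" -ge 25 ]; then
    # Dremio v25 and higher (Java 11): unified JVM logging syntax.
    printf '%s\n' "-Xlog:gc*,classhisto*=trace:file=${DREMIO_LOG_DIR}/gc.log"
  else
    # Dremio v24 and lower (Java 8): legacy GC log flag.
    printf '%s\n' "-Xloggc:${DREMIO_LOG_DIR}/gc.log"
  fi
}

# Print both flags, one per line, ready to paste under extraStartParams.
extra_start_params() {
  printf '%s\n' "-Ddremio.log.path=${DREMIO_LOG_DIR}"
  gc_log_flag "$1"
}

# Example: flags for a Dremio v25 deployment.
extra_start_params 25
```

Calling extra_start_params 24 instead prints the Java 8 variant that the YAML examples above use.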
- In the config/dremio-env file, uncomment and set the following variables:
- DREMIO_LOG_TO_CONSOLE=0 to redirect application logs away from the console.
- DREMIO_GC_LOG_TO_CONSOLE="no" to redirect GC logs away from the console.
- Upgrade your Dremio Helm release to apply the changes:
$ helm upgrade <your-dremio-release-name> <your-dremio-helm-chart-dir> -f <your-dremio-values-file>
- Verify that the logs are being written to the specified directory on the persistent volume (PV) by checking the logs directory on one of the coordinator or executor pods.
user@localhost$ kubectl exec -it dremio-master-0 -- bash
dremio@dremio-master-0:/opt/dremio/data/log$ ls -ltrh
total 16M
-rw-r--r-- 1 dremio dremio 0 Jul 12 2023 hive.deprecated.function.warning.log
-rw-r--r-- 1 dremio dremio 1023 Aug 16 2023 ad-music.json
-rw-r--r-- 1 dremio dremio 2.8K Aug 28 2023 queries.json
drwxr-xr-x 3 dremio dremio 4.0K Jul 1 00:00 json
drwxr-xr-x 2 dremio dremio 24K Jul 1 00:52 archive
-rw-r--r-- 1 dremio dremio 3.4M Jul 1 21:01 gc.log.0.current
-rw-r--r-- 1 dremio dremio 15K Jul 1 21:12 audit.json
-rw-r--r-- 1 dremio dremio 52K Jul 1 21:13 metadata_refresh.log
-rw-r--r-- 1 dremio dremio 62K Jul 1 22:08 tracker.json
-rw-r--r-- 1 dremio dremio 11M Jul 1 22:11 server.log
-rw-r--r-- 1 dremio dremio 1.5M Jul 1 22:11 access.log
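The manual check above can also be scripted. The following is a minimal sketch, assuming it is run inside a pod (or against any directory that holds the logs); the function name missing_logs is hypothetical, and the three file names it checks are simply taken from the listing above, so adjust the list to the logs you expect in your deployment.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Dremio): report which expected log files
# are missing from a directory. Returns 0 if all are present, 1 otherwise.
missing_logs() {
  local dir="$1" name missing=0
  # File names assumed from the directory listing in this article.
  for name in server.log queries.json access.log; do
    if [ ! -f "${dir}/${name}" ]; then
      echo "missing: ${name}"
      missing=1
    fi
  done
  return "$missing"
}

# Usage inside a pod:
#   missing_logs /opt/dremio/data/log && echo "all expected logs present"
```

For a quick one-off check without an interactive shell, kubectl exec dremio-master-0 -- ls -ltrh /opt/dremio/data/log lists the same directory directly.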