Overview
Some users may want to investigate the Dremio job history, including jobs from previous days that are no longer visible in the Jobs list.
Applies To
All versions
Details
In addition to searching via the search box on the Jobs page, you can upload the current and archived queries.json file(s) into Dremio and investigate them with SQL.
Note: queries.json is written to the coordinator log folder and rotated into the log/archive folder every 24 hours; 30 days of history are kept by default.
Caution: if the support key "planner.verbose_profile" is set to true, query profiles can become large, which in turn grows the KV store.
Copying queries.json nightly once it is rotated to your distributed store and then promoting that dataset via Dremio can help you run reporting about usage of Dremio across your organization.
Depending on the deployment type, you can specify where the logs reside, e.g. via conf/dremio-env or the Dremio YAML files (see Further Reading below for more information).
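As a sketch, in a tarball deployment the log location can be set in conf/dremio-env (the path /var/log/dremio below is an example value, not a requirement):

```shell
# conf/dremio-env -- directory where Dremio writes its logs,
# including queries.json and the log/archive rotation target
DREMIO_LOG_DIR=/var/log/dremio
```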
Uploading a single queries.json file for investigation:
From your Dremio UI use the Upload File icon to upload your queries.json file
Select the JSON format and save
The resulting promoted file can then be queried:
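For example (a sketch only: "@admin"."queries" is a hypothetical name for the uploaded file in a home space, and the field names assume the queries.json schema):

```sql
-- Hypothetical dataset name; adjust to where you uploaded the file.
-- queryId, queryText, outcome and "start" are queries.json fields.
SELECT queryId, queryText, outcome
FROM "@admin"."queries"
WHERE outcome = 'FAILED'
ORDER BY "start" DESC
LIMIT 20
```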
Uploading the queries.json history:
The queries.json file(s) are backed up each day into the $DREMIO_HOME/log/archive directory (along with other files)
Those archived files can be used to query past jobs.
Note that retention of the queries archive is configurable in two ways:
1) via the support key jobs.max.age_in_days
2) via maxHistory in $DREMIO_HOME/conf/logback.xml, i.e.
<appender name="query" class="ch.qos.logback.core.rolling.RollingFileAppender">
  <file>${dremio.log.path}/queries.json</file>
  <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
    <fileNamePattern>${dremio.log.path}/archive/queries.%d{yyyy-MM-dd}.%i.json.gz</fileNamePattern>
    <maxHistory>30</maxHistory>
    <timeBasedFileNamingAndTriggeringPolicy class="ch.qos.logback.core.rolling.QuerySizeAndTimeBasedFNATP">
      <maxFileSize>100MB</maxFileSize>
    </timeBasedFileNamingAndTriggeringPolicy>
  </rollingPolicy>
</appender>
You cannot perform bulk uploads into Dremio. Instead, groups of files with the same structure in a common directory can be queried together as if they were a single table. To learn more, see the chapter on Directories.
If you are running Dremio locally, you can create a NAS data source and connect to your local files. To learn more, see the chapter on NAS.
Some ideas for loading the history are as follows:
- copy the queries*.json.gz files from the log/archive directory to your distributed store
- promote and query that directory as a whole
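The copy step can be scripted; the sketch below is an assumption-laden example, not a supported tool. The function name, paths, and the copier command are all placeholders: the default copier is plain cp, and for a real distributed store you would pass e.g. "hdfs dfs -put -f" or an equivalent CLI.

```shell
#!/bin/sh
# Sketch: copy rotated queries.json archives to a destination directory.
# Usage: copy_queries_archives SRC DEST [COPY_CMD...]
copy_queries_archives() {
  src="$1"; dest="$2"; shift 2
  # Default copier is plain cp; pass e.g. "hdfs dfs -put -f" for HDFS.
  [ "$#" -gt 0 ] || set -- cp
  mkdir -p "$dest"
  for f in "$src"/queries.*.json.gz; do
    [ -e "$f" ] || continue   # no archives yet; skip the unexpanded glob
    "$@" "$f" "$dest/"
  done
}
```

This could be run nightly from cron against $DREMIO_HOME/log/archive so each rotated file lands in the promoted directory.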
The example below shows promotion of a directory containing the queries.<timestamp>.json.gz files.
All the history data can be queried at this point
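A hedged sketch of a reporting query over the promoted directory (the source name "dist" and dataset name "queries_history" are invented for illustration; field names assume the queries.json schema):

```sql
-- Hypothetical names: dist.queries_history is the promoted archive directory.
SELECT "username", outcome, COUNT(*) AS jobs
FROM dist.queries_history
GROUP BY "username", outcome
ORDER BY jobs DESC
```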
Each nightly copy of the rotated log file is picked up into the dataset and becomes available according to the metadata refresh interval.
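If you do not want to wait for the scheduled refresh, metadata can be refreshed on demand (dataset name below is the same hypothetical example as above):

```sql
-- Force a metadata refresh so newly copied archive files appear immediately.
ALTER TABLE dist.queries_history REFRESH METADATA
```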
Further Reading
Job History & Job Details
Configuring Directories as Datasets
Configuring Distributed Storage