Overview

Using workload "engines" to run particular queries and jobs on specific Dremio executors is an important aspect of managing a Dremio deployment. An engine is a named group of servers or pods that can be assigned as the target for one or more queues in your Workload Manager. This article provides details and examples for deploying different engines in a Kubernetes deployment.

Applies To

All currently supported versions of Dremio Enterprise Edition launched with Dremio V2 Helm charts, found at: https://github.com/dremio/dremio-cloud-tools/tree/master/charts/dremio_v2

Deploying Multiple Standardized Engines

In the linked values.yaml, the following parameter set can be found:

  # Engines
  # Engine names be 47 characters or less and be lowercase alphanumber characters or '-'.
  # Note: The number of executor pods will be the length of the array below * count.
  engines: ["default"]
  count: 3

  # Executor volume size.
  volumeSize: 128Gi

  # Kubernetes Service Account
  # Uncomment below to use a custom Kubernetes service account for executors.
  #serviceAccount: ""

  # Uncomment the lines below to use a custom set of extra startup parameters for executors.
  #extraStartParams: >-
  #  -DsomeKey=someValue

  # Extra Init Containers
  # Uncomment the below lines to use a custom set of extra init containers for executors.
  #extraInitContainers: |
  #  - name: extra-init-container
  #    image: {{ $.Values.image }}:{{ $.Values.imageTag }}
  #    command: ["echo", "Hello World"]

  # Extra Volumes
  # Uncomment below to use a custom set of extra volumes for executors.
  #extraVolumes: []

  # Extra Volume Mounts
  # Uncomment below to use a custom set of extra volume mounts for executors.
  #extraVolumeMounts: []

  # Uncomment this value to use a different storage class for executors.
  #storageClass:

  # Dremio C3
  # Designed for use with NVMe storage devices, performance may be impacted when using
  # persistent volume storage that resides far from the physical node.
  cloudCache:
    enabled: true

    # Uncomment this value to use a different storage class for C3.
    #storageClass:

    # Volumes to use for C3, specify multiple volumes if there are more than one local
    # NVMe disk that you would like to use for C3.
    #
    # The below example shows all valid options that can be provided for a volume.
    # volumes:
    # - name: "dremio-default-c3"
    #   size: 100Gi
    #   storageClass: "local-nvme"
    volumes:
    - size: 100Gi

  # These values, when defined and not empty, override the provided shared annotations, labels, node selectors, or tolerations.
  # Uncomment only if you are trying to override the chart's shared values.
  #annotations: {}
  #podAnnotations: {}
  #labels: {}
  #podLabels: {}
  #nodeSelector: {}
  #tolerations: []

In the above example, the name "default" is defined for a single engine, and the engine will deploy with 3 pods in a StatefulSet (STS). The rest of the parameters define resources (CPU, memory, storage) in Kubernetes for each pod in the set, and Dremio "C3" cache storage settings.
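
The naming rule quoted in the comment above (47 characters or less, lowercase alphanumeric characters or '-') can be checked before the chart is applied. The following is a minimal Python sketch that encodes only the rule as stated in that comment; the helper name is hypothetical and is not part of the Dremio chart, and the chart or Kubernetes may enforce further constraints that are not covered here.

```python
import re

# Rule taken from the values.yaml comment: at most 47 characters, each one a
# lowercase letter, a digit, or '-'. Requiring at least one character is an
# assumption of this sketch.
ENGINE_NAME_RULE = re.compile(r"[a-z0-9-]{1,47}")


def is_valid_engine_name(name):
    """Return True if the name satisfies the rule stated in values.yaml."""
    return ENGINE_NAME_RULE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["default", "example-override1", "Example1"]:
        print(candidate, is_valid_engine_name(candidate))
```

Running such a check against the engines list before helm install or helm upgrade may catch a bad name earlier than a failed deployment would.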

To change the names or number of engines, or the number of pods in every engine, update the engines list with one or more engine names and optionally adjust count and the other parameters, as below:

  # Engines
  # Engine names be 47 characters or less and be lowercase alphanumber characters or '-'.
  # Note: The number of executor pods will be the length of the array below * count.
  engines: ["example1","example2"]
  count: 2
  ...

Now we see 2 engines named "example1" and "example2" with 2 pods in each StatefulSet.
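
As a quick sanity check of the "length of the array * count" note, the expected executor pods can be listed ahead of time. This is a rough Python sketch, not part of the chart: the function name is hypothetical, and the dremio-executor-<engine>-<ordinal> naming pattern is an assumption inferred from the kubectl output shown in this article.

```python
def expected_executor_pods(engines, count):
    """Return the executor pod names expected for the given engines and count."""
    # Assumed pattern: dremio-executor-<engine>-<ordinal>, with ordinals
    # starting at 0 within each engine's StatefulSet.
    return [
        f"dremio-executor-{engine}-{ordinal}"
        for engine in engines
        for ordinal in range(count)
    ]


if __name__ == "__main__":
    pods = expected_executor_pods(["example1", "example2"], 2)
    print(len(pods))  # number of engines * count
    for name in pods:
        print(name)
```

Comparing this list with the output of kubectl get pods is one way to confirm that every engine came up with the intended number of pods.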

Kubernetes Results

The next time the chart is applied with helm install or helm upgrade, it creates one Kubernetes StatefulSet per engine, each with an identical group of pods that follows those parameters:

devin [ ~/repo/dremio-cloud-tools/charts/dremio_v2 ]$ kubectl get pods -n dremio
NAME                         READY   STATUS              RESTARTS   AGE
dremio-executor-example1-0   0/1     Init:1/2            0          2m35s
dremio-executor-example1-1   1/1     Running             0          2m35s
dremio-executor-example2-0   1/1     Running             0          2m35s
dremio-executor-example2-1   1/1     Running             0          2m35s
dremio-master-0              1/1     Running             0          113s
zk-0                         1/1     Running             0          2m35s
zk-1                         0/1     ContainerCreating   0          2m35s
zk-2                         1/1     Running             0          2m35s

In the above output we see two pods in the STS for engine "example1" and another two pods in the STS for engine "example2".

Dremio UI Results

On the Dremio “Node Activity” page, the engine name for each node is shown in the “Engine” column.

This configuration will allow platform operators to manage rules and queues to route queries and jobs to particular groups of executor nodes. Queues and rules are not detailed in this article, but you can learn more about them in our docs: https://docs.dremio.com/software/advanced-administration/workload-management/

Customizing Engines

It is also possible to create engines that differ from one another, for example in node count, CPU, memory, or annotations/tags/labels.

To change parameters for one specific engine, such as its Kubernetes resource limits (CPU, memory, and storage) or its pod count, find the engineOverride block in the values.yaml file. It looks like this:

  # The below example shows all valid options that can be overridden on a per-engine basis.
  # engineOverride:
  #   engineNameHere:
  #     cpu: 1
  #     memory: 122800
  #
  #     count: 1
  ...

If you uncomment the block and replace “engineNameHere” with the name of an engine listed in “engines” above, the STS for that engine is created according to the settings indented below the engine name instead of the common settings.

For example, with these engine names:

  # Engines
  # Engine names be 47 characters or less and be lowercase alphanumber characters or '-'.
  # Note: The number of executor pods will be the length of the array below * count.
  engines: ["example1","example2","example-override1"]
  count: 2

and with an “engineOverride” section like this:

  engineOverride:
    example-override1:
      cpu: 1
      memory: 8000
      count: 1
  ...
      volumeSize: 128Gi
  ...
      cloudCache:
        enabled: false

... you will get pods like this:

devin [ ~/repo/dremio-cloud-tools/charts/dremio_v2 ]$ kubectl get pods -n dremio
NAME                                  READY   STATUS    RESTARTS   AGE
dremio-executor-example-override1-0   1/1     Running   0          37m
dremio-executor-example1-0            1/1     Running   0          37m
dremio-executor-example1-1            1/1     Running   0          37m
dremio-executor-example2-0            1/1     Running   0          37m
dremio-executor-example2-1            1/1     Running   0          37m
dremio-master-0                       1/1     Running   0          37m
zk-0                                  1/1     Running   0          37m
zk-1                                  1/1     Running   0          37m
zk-2                                  1/1     Running   0          37m

You can see the STS named "example1" with two pods, the STS named "example2" with two pods, and the new STS named "example-override1", with only one pod.

For each engine whose parameters you need to customize (such as the pod count, CPU, or memory), do the following:

1. Add the name of the engine to the engines list in the first section, where engines are defined.
2. Under the engineOverride parameter, duplicate the block that begins with an engine name, and rename the copy to that engine.
3. Set the parameters for that engine in the new block.
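
The effect of a per-engine count override on the number of pods can be sketched in Python. This is an illustration under stated assumptions rather than the chart's actual template logic: the function name is hypothetical, the inputs are plain dictionaries that mirror the engines, count, and engineOverride values, and the only behavior modeled is that an engine's own count, when present, replaces the shared count.

```python
def pods_per_engine(engines, count, engine_override=None):
    """Map each engine name to the number of executor pods it should get."""
    # Assumption: a 'count' under engineOverride.<engine> takes precedence
    # over the shared count; engines without an override keep the shared one.
    engine_override = engine_override or {}
    return {
        engine: engine_override.get(engine, {}).get("count", count)
        for engine in engines
    }


if __name__ == "__main__":
    plan = pods_per_engine(
        ["example1", "example2", "example-override1"],
        2,
        {"example-override1": {"cpu": 1, "memory": 8000, "count": 1}},
    )
    for engine, pods in plan.items():
        print(engine, pods)
    print("total executor pods:", sum(plan.values()))
```

With the values from the example above, the sketch gives two pods each for "example1" and "example2" and one pod for "example-override1", which matches the kubectl output shown earlier.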

Then, as before, the engine name appears in the "Engine" column of the "Node Activity" page in the UI.