Summary/Reported Issue
For HDFS, S3, and other object storage sources, if 'Enable partition column inference' is enabled, shouldn't the dir# fields no longer appear?
Relevant Versions
Dremio 24.3.x
Cause
This is the expected behavior in Dremio: enabling partition column inference adds the inferred partition columns but does not remove the dir<n> columns. The documentation states:
...if you select the option Enable partition column inference in the advanced options for a data source, you change how Dremio handles partition columns.
In addition to appending a column named dir<n> for each partition level and using subfolder names for values in those columns, Dremio detects the name of the partition column, appends a column that uses that name, detects values in the names of subfolders, and uses those values in the appended column.
For example, before you enable the Enable partition column inference option, your orders table might have these columns:
orderID | multiple columns | dir0 |
---|---|---|
000001 | ... | state=CA |
000002 | ... | state=WA |
Suppose that you enable the Enable partition column inference option. The columns in the table remain the same.
...Dremio adds the column state:
orderID | multiple columns | dir0 | state |
---|---|---|---|
000001 | ... | state=CA | CA |
000002 | ... | state=WA | WA |
Steps to Resolve
To stop displaying the `dir0` column, create a view that omits this column.
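A minimal sketch of such a view, based on the orders example from the documentation extract. The view path `myspace.orders_clean` and the dataset path `mysource.sales.orders` are hypothetical names used only for illustration. The view lists each wanted column explicitly, which is how `dir0` is left out:

```sql
-- Hypothetical paths: replace myspace.orders_clean and mysource.sales.orders
-- with your own view location and partitioned dataset.
CREATE OR REPLACE VIEW myspace.orders_clean AS
SELECT
  "orderID",  -- list every data column you want to keep
  "state"     -- inferred partition column added by partition column inference
FROM mysource.sales.orders;  -- dir0 is not selected, so the view does not expose it
```

Queries against `myspace.orders_clean` would then return `orderID` and `state` but not `dir0`. For datasets with more partition levels, leave `dir1`, `dir2`, and so on out of the column list in the same way.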