Summary
When the Node.js library '@dsnp/parquetjs' is used to create Parquet files, Dremio cannot read the resulting files and fails with a decode error.
Reported Issue
When querying a data lake table backed by these Parquet files, the query fails with the following error:
DATA_READ ERROR: Failed to decode column application::varchar
Total records decoded and sent upstream 0
Normal value encoded pages read 0
DICTIONARY encoded pages read 0
Total records decoded in current page and sent upstream after passing filter 0
File path /data/mertics/telemetry/localStorage/logins.parquet
Rowgroup index 0
SqlOperatorImpl TABLE_FUNCTION
Location 0:0:40
Fragment 0:0
[Error Id: 58ce0016-08cd-4a38-b411-b8f7393a3608 on dremio-executor-10.dremio.local:0]
(java.lang.NullPointerException) null
Total records decoded and sent upstream 0
Normal value encoded pages read 0
DICTIONARY encoded pages read 0
Total records decoded in current page and sent upstream after passing filter 0
File path /data/mertics/telemetry/localStorage/logins.parquet
Rowgroup index 0
SqlOperatorImpl TABLE_FUNCTION
Location 0:0:40
Fragment 0:0
Relevant Versions
Dremio 25.0.x
Troubleshooting Steps
The error message above appears in the query profile and in the coordinator's server.log.
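To find every file affected by this error, the log can be scanned for the decode failure and the "File path" lines that follow it. The following is a minimal sketch, assuming the log text matches the layout of the excerpt under Reported Issue; the function name find_decode_failures and both patterns are illustrative and not part of Dremio.

```python
import re

# Patterns derived from the error excerpt in this article (assumption:
# other log layouts are not handled).
COLUMN_RE = re.compile(r"Failed to decode column (\S+?)::(\w+)")
PATH_RE = re.compile(r"File path\s+(.+?)\s*$")


def find_decode_failures(log_text):
    """Return sorted, de-duplicated (file_path, column, type) tuples."""
    failures = set()
    current = None  # (column, type) from the most recent decode error line
    for line in log_text.splitlines():
        column_match = COLUMN_RE.search(line)
        if column_match:
            current = (column_match.group(1), column_match.group(2))
            continue
        path_match = PATH_RE.search(line)
        if path_match and current:
            failures.add((path_match.group(1), current[0], current[1]))
    return sorted(failures)
```

Passing the contents of server.log to find_decode_failures returns one entry per distinct file and column pair, which helps identify the files that the producing application wrote.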
Cause
This issue is tracked as internal Jira DX-55188, "NPE in RowGroups$1.next():185 (Failed to decode column application::varchar)".
The fix is listed in the release notes as:
Fixed the NullPointerException in RowGroups querying Parquet files with incomplete stats.
Steps to Resolve
Upgrade to Dremio 25.0.9 or higher.
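To check a cluster's version against this threshold, a version string can be compared numerically rather than as text, since "25.0.10" sorts before "25.0.9" alphabetically. The following is a minimal sketch; the helper names parse_version and has_fix are hypothetical, and it assumes the version string begins with "major.minor.patch".

```python
import re

FIXED_IN = (25, 0, 9)  # first release containing the fix, per this article


def parse_version(text):
    """Parse the leading 'major.minor.patch' of a version string into a tuple."""
    match = re.match(r"\s*(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def has_fix(version_text):
    """True if the version is 25.0.9 or higher (numeric, not textual, compare)."""
    return parse_version(version_text) >= FIXED_IN
```

Any build suffix after the three numeric parts is ignored by this sketch.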
Next Steps
No additional steps required after the upgrade.