Lab: ANSI mode and legacy Parquet schema drift

Hands-on lab to reproduce how spark.sql.ansi.enabled changes failure behavior with legacy Parquet type drift.

Prerequisites

Goal

Reproduce the impact of spark.sql.ansi.enabled on type conversion failures when legacy Parquet partitions contain schema drift.

Setup

SET spark.sql.ansi.enabled = false;

CREATE OR REPLACE TEMP VIEW bronze_orders AS
SELECT *
FROM parquet.`dbfs:/mnt/lake/orders_legacy`;

Step 1: Permissive mode

SELECT
  order_id,
  CAST(amount AS DECIMAL(18,2)) AS amount_num
FROM bronze_orders;

Observe non-convertible values becoming NULL.

Step 2: Strict ANSI mode

SET spark.sql.ansi.enabled = true;

SELECT
  order_id,
  CAST(amount AS DECIMAL(18,2)) AS amount_num
FROM bronze_orders;

Observe explicit cast failures.

Validation