SELECT * FROM enrollments_updates
To track table changes, use the DESCRIBE HISTORY command on the enrollments_updates table.
DESCRIBE HISTORY enrollments_updates
%sql
-- drop the enrollments_updates table
DROP TABLE enrollments_updates
# remove the checkpoint location associated with our Auto Loader stream
dbutils.fs.rm("dbfs:/mnt/DEA-Book/checkpoints/enrollments", True)
The medallion architecture, also called multi-hop architecture, is a layered data design that incrementally improves the structure and quality of data as it moves through successive stages. It consists of three layers: bronze, silver, and gold.
Each layer adds value, ensuring a structured and scalable transformation process.
Bronze Layer
The bronze layer is the first stage of the medallion architecture, where raw data is ingested and stored without transformation. It preserves the original format for auditing and traceability. Data sources include files, databases, and streaming platforms like Kafka. The goal is to capture all data, regardless of quality, as a single source of truth.
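As a rough illustration of the "store everything, transform nothing" idea, the sketch below uses plain Python rather than Spark so that it runs anywhere. The function name ingest_to_bronze, the metadata fields, and the sample records are all hypothetical; in Databricks this step would typically write to a Delta table instead of a list.

```python
from datetime import datetime, timezone

def ingest_to_bronze(raw_lines, source_name):
    """Keep every raw record unchanged and attach only ingestion metadata."""
    ingested_at = datetime.now(timezone.utc).isoformat()
    return [
        {"raw": line, "source": source_name, "ingested_at": ingested_at}
        for line in raw_lines
    ]

# Hypothetical raw input. The malformed last line is kept, not rejected,
# because the bronze layer captures data regardless of quality.
raw_lines = [
    '{"student_id": 1, "course": "SQL", "fee": "100"}',
    '{"student_id": 2, "course": "spark", "fee": "abc"}',
    'not valid json',
]
bronze = ingest_to_bronze(raw_lines, "enrollments.json")
```

Keeping the payload as an untouched string means a later parsing bug can be fixed and the data reprocessed without going back to the source system.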
Silver Layer
The silver layer processes raw data to improve its quality and make it ready for analysis. This includes cleaning, normalizing, validating, and enriching the data—often by joining it with other sources. The goal is to ensure accuracy and consistency, creating a reliable foundation for analytics and reporting.
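The cleaning, validating, normalizing, and enriching steps can be sketched in plain Python as follows. This is a minimal illustration under assumed names: to_silver, the courses lookup, and the record fields are hypothetical, and a real pipeline would usually express the same logic as Spark transformations and joins.

```python
import json

def to_silver(bronze_rows, courses):
    """Parse, validate, normalize, and enrich bronze rows; skip invalid ones."""
    silver = []
    for row in bronze_rows:
        try:
            record = json.loads(row["raw"])               # clean: parse the raw text
            student_id = int(record["student_id"])        # validate: must be an integer
            fee = float(record["fee"])                    # validate: must be numeric
            course = str(record["course"]).strip().upper()  # normalize the course code
        except (ValueError, KeyError, TypeError):
            continue  # invalid rows remain available in bronze only
        if course not in courses:
            continue  # no match in the lookup, so the row cannot be enriched
        silver.append({
            "student_id": student_id,
            "course": course,
            "fee": fee,
            "category": courses[course],  # enrich: join with another source
        })
    return silver

# Hypothetical bronze rows and lookup table.
bronze_rows = [
    {"raw": '{"student_id": 1, "course": " sql ", "fee": "100"}'},
    {"raw": '{"student_id": 2, "course": "Spark", "fee": "abc"}'},
    {"raw": "not valid json"},
]
courses = {"SQL": "Databases", "SPARK": "Big Data"}
silver = to_silver(bronze_rows, courses)
```

Only the first row survives here: the second has a non-numeric fee and the third is not valid JSON.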
Gold Layer
The gold layer contains fully refined, business-ready data. Here, data is aggregated and summarized to support decision-making through outputs such as KPIs, financial reports, and customer analytics. This layer is optimized for reporting, dashboards, and advanced use cases like machine learning and AI.
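A gold-layer aggregation might look like the following plain-Python sketch, which summarizes cleaned rows into per-category enrollment counts and revenue. The function to_gold, the field names, and the sample rows are hypothetical; in Spark the equivalent would typically be a groupBy with aggregate functions.

```python
from collections import defaultdict

def to_gold(silver_rows):
    """Aggregate silver rows into enrollment count and revenue per category."""
    totals = defaultdict(lambda: {"enrollments": 0, "revenue": 0.0})
    for row in silver_rows:
        totals[row["category"]]["enrollments"] += 1
        totals[row["category"]]["revenue"] += row["fee"]
    return dict(totals)

# Hypothetical silver rows that have already been cleaned and enriched.
silver_rows = [
    {"student_id": 1, "category": "Databases", "fee": 100.0},
    {"student_id": 2, "category": "Databases", "fee": 50.0},
    {"student_id": 3, "category": "Big Data", "fee": 200.0},
]
gold = to_gold(silver_rows)
```

The result has one entry per category, which is the shape a dashboard or report would usually query directly.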
Benefits of Medallion Architecture