Introduction
In one of my projects I've built a logging mechanism with stored procedures in Snowflake. It will log the start of a run, the start of load procedure, the end of the load procedure and the end of the run. You can do this also with hooks in DBT.
From DBT docs : Hooks are snippets of SQL that are executed at different times:
- pre-hook: executed before a model, seed or snapshot is built.
- post-hook: executed after a model, seed or snapshot is built.
- on-run-start: executed at the start of dbt build, dbt compile, dbt docs generate, dbt run, dbt seed, dbt snapshot, or dbt test.
- on-run-end: executed at the end of dbt build, dbt compile, dbt docs generate, dbt run, dbt seed, dbt snapshot, or dbt test.
This blogpost is a blogpost in a serie of blogposts :
- Build a data warehouse with DBT and Snowflake on AdventureWorksLT (PART I)
- Build Staging and Dimensions in DBT for Snowflake (PART II)
- Sources in DBT for Snowflake (PART III)
- Data freshness for DBT in snowflake (Part IV)
- Snapshots in DBT for Snowflake (Part V)
- Generic tests in DBT for Snowflake (PART VI)
- Singular tests in DBT for Snowflake (Part VII)
- Add packages to your DBT project (Part VIII)
- Adding documentation to your DBT project (Part IX)
- Working with hooks in DBT (Part X).
Hooks
--Auditing
CREATE TABLE IF NOT EXISTS RUN (
RUN_ID INT AUTOINCREMENT START 1 INCREMENT 1,
RUN_STARTDATE TIMESTAMP_NTZ,
RUN_ENDDATE TIMESTAMP_NTZ
);
CREATE TABLE IF NOT EXISTS MESSAGE (
MESSAGE_ID INT autoincrement start 1 increment 1,
RUN_ID INT, -- Runnumber of the complete run
MESSAGE_STATUS STRING, -- START, END, ERROR, INTERMEDIATE
MESSAGE_DESCRIPTION STRING, -- Logging message or error message
MESSAGE_LOCATION STRING, -- Individual Stored procedurename
MESSAGE_LAYER STRING, -- Database layer : DATAVAULT, DATAMART
MESSAGE_AREA STRING, -- Subject Area : LEGACY, FINANCE, etc
MESSAGE_LEVEL NUMBER, -- Stored procedure level : 1 = MASTER, 50 = DATAVAULT or DATAMART, 100 = Low level stored procedure
MESSAGE_DATE TIMESTAMP_NTZ, -- Date of logging
MESSAGE_ROWCNT NUMBER -- Affected rows of SQL executed
);
Now, I'm not rebuilding the complete logic into DBT, but I will take some key elements of the logging and build that into DBT.
on-run-start: "INSERT INTO RUN (RUN_STARTDATE) SELECT CURRENT_TIMESTAMP();"
on-run-end: "INSERT INTO RUN (RUN_STARTDATE) SELECT CURRENT_TIMESTAMP();"
And then run the following command in the project folder :
dbt run -m Dim_Customer
Resulting in the following :
How does it look in Snowflake when the command has been executed?
Now the lets check the other hooks in DBT : pre-hook and post-hook. Here I added some code in the dbt_project.yml file : "INSERT INTO MESSAGE table".
dbt run -m Dim_Customer
Geen opmerkingen:
Een reactie posten