zaterdag 11 november 2023

Generating DROP Statements in Snowflake (Part I)

Introduction

I want to drop certain objects in certain schemas of a database (and I don't want to drop the schema) and therefore I decided to build a piece of code that could do that in a dynamic way. The script in this blogpost drops tables, views and stored procedures. 

As an improvement, you can expand this code with functions and other objects. thats up to you. Tasks are different than other database objects. It's a challenge to integrate Tasks in the code. There is not a metadata view available in the INFORMATION_SCHEMA like tables, views and stored procedures. I haven't found a solution for this yet.

Setup a demo database

For this blogpost I've setup a small demo DB for which  I executed the following code :

CREATE OR REPLACE DATABASE BUILDDROPS;

CREATE TABLE A(A STRING);
CREATE TABLE B(A STRING);
CREATE TABLE C(C STRING);
CREATE TABLE D(D STRING);
CREATE TABLE E(E STRING);

CREATE VIEW VW_A AS SELECT * FROM A;
CREATE VIEW VW_B AS SELECT * FROM B;
CREATE VIEW VW_C AS SELECT * FROM C;

CREATE PROCEDURE LOAD_A(PARTIME TIMESTAMP_NTZ) 
RETURNS STRING
LANGUAGE SQL 
AS 
BEGIN 
    SELECT 1; 
    RETURN ('SUCCEEDED');
END;

CREATE PROCEDURE LOAD_B(PARTIME TIMESTAMP_NTZ, ANOTHERPARAMETER STRING) 
RETURNS STRING
LANGUAGE SQL 
AS 
BEGIN 
    SELECT 1; 
    RETURN ('SUCCEEDED');
END;

Just a couple of tables, with views and stored procedures with different parameters and the latter one leads to troubles when you want to drop the stored procedures. 


Problem with dropping procedures

Dropping stored procedures in Snowflake can be cumbersome because you can't just say DROP PROCEDURE <naam>  and that is because Snowflake uses the concept of overloading (instead of optional parameters). If a parameter is optional you need to recreate the stored procedure again but with different parameters and with the same name. I don't like this concept. Anyway, because you can have multiple stored procedures with the same name but with different parameters you have to specify the type of  parameters of the stored pocedure you want to drop.


For instance it's not possible to drop this stored procedure.

DROP PROCEDURE IF EXISTS  LOAD_A;


It will lead to an error:




So you have to specify the type of the parameter :

DROP PROCEDURE IF EXISTS BUILDDROPS.PUBLIC.LOAD_B( TIMESTAMP_NTZ , VARCHAR ); 


The script

I've included a script in the blogpost. This is presented in the codebox below.  In this script I deal with three types of objects (you can expand these with more types like Functions if you want) : tables, views and stored procedures.

As mentioned in the paragraph before, dropping stored procedures can be cumbersome because you have to specify the types of the parameters. As an inspiration I used this article on the site of snowflake : Stored procedure on how to delete multiple procedures in the schema at once. In this blogpost is a script that manipulates the argument_signature field in the meta data table of stored procedure. It will extract the fields, removes the field name and serialize the data types in a string with parentheses.

DECLARE TAB_CURSOR CURSOR FOR
        WITH CTE_TABLES AS (
        SELECT 
                TABLE_CATALOG ||'.'|| TABLE_SCHEMA ||'.'|| TABLE_NAME AS TABLE_NAME,
                T.TABLE_TYPE AS OBJECT_TYPE, 
                1 AS SORT_ORDER
            FROM INFORMATION_SCHEMA.TABLES T
            WHERE T.TABLE_SCHEMA NOT IN ('INFORMATION_SCHEMA')
            AND TABLE_TYPE = 'BASE TABLE'
        ),
        CTE_VIEWS AS (
            SELECT 
                TABLE_CATALOG ||'.'|| TABLE_SCHEMA ||'.'|| TABLE_NAME AS VIEW_NAME,
                T.TABLE_TYPE AS OBJECT_TYPE, 
                2 AS SORT_ORDER
            FROM INFORMATION_SCHEMA.TABLES T
            WHERE T.TABLE_SCHEMA NOT IN ('INFORMATION_SCHEMA')
            AND TABLE_TYPE = 'VIEW'
        
        ),
        CTE_PROC_BASE AS (
            SELECT 
                PROCEDURE_CATALOG , 
                PROCEDURE_SCHEMA, 
                PROCEDURE_NAME, 
                ARGUMENT_SIGNATURE, 
                SPLIT(TRIM(ARGUMENT_SIGNATURE, '()'), ',') AS ARG_ARRAY 
            FROM INFORMATION_SCHEMA.PROCEDURES
        ),
        CTE_PROC_ARRAY_VALUES AS ( 
            SELECT 
                MAX(PROCEDURE_CATALOG) AS PROCEDURE_CATALOG, 
                MAX(PROCEDURE_SCHEMA) AS PROCEDURE_SCHEMA, 
                PROCEDURE_NAME, 
                ARRAY_AGG(SPLIT_PART(LTRIM(d.value,' '),' ', 2)) AS FILTERED_VALUES,
                3
            FROM  CTE_PROC_BASE, LATERAL FLATTEN(ARG_ARRAY) d
            GROUP BY ARG_ARRAY, PROCEDURE_NAME
        ),
        CTE_PROCEDURES AS (
            SELECT 
                PROCEDURE_CATALOG ||'.'|| PROCEDURE_SCHEMA ||'.'|| PROCEDURE_NAME|| '(' || REPLACE (TRIM (FILTERED_VALUES::varchar,'[]'),'"', ' ' ) || ')' AS PROCEDURE_NAME,
                'PROCEDURE' AS OBJECT_TYPE,
                3 AS SORT_ORDER
            FROM CTE_PROC_ARRAY_VALUES
        )
        SELECT OBJECT_NAME, OBJECT_TYPE
        FROM 
            (SELECT TABLE_NAME AS OBJECT_NAME, OBJECT_TYPE, SORT_ORDER FROM CTE_TABLES
            UNION ALL
            SELECT VIEW_NAME,  OBJECT_TYPE, SORT_ORDER FROM CTE_VIEWS
            UNION ALL 
            SELECT PROCEDURE_NAME, OBJECT_TYPE, SORT_ORDER FROM CTE_PROCEDURES
            )
        ORDER BY SORT_ORDER;
        

  var_OBJECT_NAME VARCHAR;
  var_OBJECT_TYPE VARCHAR;
  
  QUERY STRING DEFAULT '';

BEGIN
  OPEN TAB_CURSOR;
  LOOP
    FETCH TAB_CURSOR into var_OBJECT_NAME, var_OBJECT_TYPE ;
    IF (var_OBJECT_TYPE = 'BASE TABLE') THEN
       QUERY := QUERY  || 'DROP TABLE IF EXISTS ' || var_OBJECT_NAME || '; \r';  
    ELSEIF (var_OBJECT_TYPE = 'VIEW') THEN
       QUERY := QUERY  || 'DROP VIEW IF EXISTS ' || var_OBJECT_NAME|| '; \r';  
    ELSEIF(var_OBJECT_TYPE = 'PROCEDURE') THEN   
       QUERY := QUERY  || 'DROP PROCEDURE IF EXISTS ' || var_OBJECT_NAME || '; \r';  
    ELSE
      BREAK;   
    END IF;
  END LOOP;
  CLOSE TAB_CURSOR;
  RETURN (:QUERY);
END;


This script wil return a string that can be copied and executed. I'm a bit cautious with directly executing the DROP statements on the database. With this approach you can inspect the code that is returned and you can make some adjustments if you want.


DROP TABLE IF EXISTS BUILDDROPS.PUBLIC.C; 
DROP TABLE IF EXISTS BUILDDROPS.PUBLIC.E; 
DROP TABLE IF EXISTS BUILDDROPS.PUBLIC.B; 
DROP TABLE IF EXISTS BUILDDROPS.PUBLIC.D; 
DROP TABLE IF EXISTS BUILDDROPS.PUBLIC.A; 
DROP VIEW IF EXISTS BUILDDROPS.PUBLIC.VW_C; 
DROP VIEW IF EXISTS BUILDDROPS.PUBLIC.VW_A; 
DROP VIEW IF EXISTS BUILDDROPS.PUBLIC.VW_B; 
DROP PROCEDURE IF EXISTS BUILDDROPS.PUBLIC.LOAD_B( TIMESTAMP_NTZ , VARCHAR ); 
DROP PROCEDURE IF EXISTS BUILDDROPS.PUBLIC.LOAD_A( TIMESTAMP_NTZ );


Final Thoughts

This blogpost describes a simple approach of dropping objects in a database. 

Geen opmerkingen:

Een reactie posten