zondag 27 februari 2022

Snowflake : Is time travel a replacement for backups (Part II) ?

Introduction

In a couple of blogpposts I've discovered the time travel- and clone capabilities of Snowflake. There are several ways to execute timetravel: time travel on a table, time travel on a schema and time travel on a database. There is also the possibillity of cloning and there is, what I call, "the swipe a version of the table away in order to retrieve an earlier version" (rename and undrop). 


This investigation is due to an event that I've experienced at my current project. In short, we recreated schemas with tables with constraints, but using time travel and cloning resulted that the constraints were gone.


I've already tried to time travel on table level and on database level, but not on schema level. And, this blogpost is about the latter one. What happens when a tablein a schema is recreated and you want to go back to the previous version of the table (with time travel on schema level).


The experiment

First setup the experiments with some scripts. It is a script with statements that create a database in the schema TST.  A couple of insert statements follows.


ALTER WAREHOUSE IF EXISTS COMPUTE_WH RESUME IF SUSPENDED;
USE ROLE SYSADMIN;
USE WAREHOUSE Compute_WH;

-- Create the database
CREATE OR REPLACE DATABASE BackupTimeTravel10;

-- Select the right database
USE DATABASE BackupTimeTravel10;

CREATE SCHEMA TST;

USE SCHEMA TST;

-- Create the table
CREATE OR REPLACE TABLE TST.Test (
  id integer,
  name varchar,
  street varchar,
  PRIMARY KEY (id));

--Insert some data into the table
INSERT INTO TST.Test(id, name, street) VALUES(1, 'Hennie', 'Spicyvalley');
INSERT INTO TST.Test(id, name, street) VALUES(2, 'Peter', 'bakerstreet');


The next step is to drop the TST schema and recreate the schema again. The next step is a table creation but with another structure and data.

-- DROP the schema
DROP SCHEMA TST;

CREATE OR REPLACE SCHEMA TST;

USE SCHEMA TST;

-- Create the table
CREATE OR REPLACE TABLE TST.Test (
  id integer,
  name varchar,
  street varchar,
  town varchar,
  PRIMARY KEY (name));
  
--Insert some data into the table
INSERT INTO TST.Test(id, name, street, town) 
VALUES(1, 'Hennie', 'Spicyvalley', 'Amsterdam');

select * from TST.Test; 

SELECT GET_DDL('Table', 'TST.Test')

The next step is trying to time travel throught the schema, but I can tell you it will not succeed.

create OR REPLACE SCHEMA TST1 clone TST
  at(offset => -120);

USE SCHEMA TST1;

select * from Test;

Results in :



And let's check again if we can retrieve the table if we clone the database. I've done this 

create OR REPLACE database BackupTimeTravel10_clone clone BackupTimeTravel10
  at(offset => -220);
USE DATABASE BackupTimeTravel10_clone;
select * from TST.Test;
Results in :


And we have our previous version of the table back.

SELECT GET_DDL('Table', 'TST.Test')

Results in :

create or replace TABLE TEST (
	ID NUMBER(38,0) NOT NULL,
	NAME VARCHAR(16777216),
	STREET VARCHAR(16777216)
);

The constraint is not present.

Final Thoughts

I've conducted several experiments with time travel, cloning and undrop :
My observations leads to the following conclusions: In the cases when a table is recreated in a database in Snowflake you can't use time travel on table- and schemalevel. It's not possible to retrieve the previous version. Time travel (and cloning) on database level is possible, but the constraints will not be there.

Rename and undrop a table is also possible to retrieve a previous version of the table (and with constraints).

Hennie

zondag 20 februari 2022

Snowflake timetravel and constraints revisited (DROP/UNDROP table)

Introduction

In a previous blogpost about time travel and constraints, I explained that the constraints were lost when timetravel was used (and a clone). We talked with a Snowflake salesperson about this behaviour and he came up with an alternative to retrieve tables which are deleted after they are recreated (the previous version).


The basic scenario is as follows:

  • Create a table with a primary key constraint (eg. A).
  • Drop the table (A).
  • Recreate the table with another structure but with the same name (A).
  • Rename the table to another name (eg. B).
  • Undrop the table (A).
  • The old table is back with (!) the primary key constraint.

Let's try this in an experiment with Snowflake.

Experiment

First start with setting up a small environment in Snowflake. Here is a script with a database, table and insertion of some data.

ALTER WAREHOUSE IF EXISTS COMPUTE_WH RESUME IF SUSPENDED;
USE ROLE SYSADMIN;
USE WAREHOUSE Compute_WH;

-- Create the database
CREATE OR REPLACE DATABASE TestPKFKTimetravel5;

-- Select the right database
USE DATABASE TestPKFKTimetravel5;

-- Create the table
CREATE OR REPLACE TABLE Test (
  id integer,
  name varchar,
  street varchar,
  PRIMARY KEY (id));

--Insert some data into the table
INSERT INTO Test(id, name, street) VALUES(1, 'Hennie', 'Spicyvalley');


Results in :




Now check the history of the table.

-- Show the history of the table
SHOW TABLES HISTORY;


This statement results in the following information as shown in this screenshot.



Check the constraints on the table.

-- Get the DDL  of the table
SELECT GET_DDL('Table', 'Test');


Results in :

create or replace TABLE TEST (
	ID NUMBER(38,0) NOT NULL,
	NAME VARCHAR(16777216),
	STREET VARCHAR(16777216),
	primary key (ID)
);


The next step is to recreate the table by dropping the table and creating the table again.

--Drop the table
DROP TABLE Test;

-- Select the data from the table
SELECT * FROM Test; 

-- Again create the table again with the same name
CREATE OR REPLACE TABLE Test (
  id integer,
  name varchar,
  PRIMARY KEY (name)); 

-- Insert some data into the table
INSERT INTO Test(id, name) VALUES(2, 'Peter');

-- Select some data from the table
SELECT * FROM Test;


Results in  :




Let's recreate the table again by dropping the table and creating the table again.


-- Drop the table
DROP TABLE Test;

--Select data from the table
SELECT * FROM Test;

-- Again create the table 
CREATE OR REPLACE TABLE Test (
  id integer,
  name varchar,
  town varchar,
  PRIMARY KEY (town));

-- Insert some data into the table
INSERT INTO Test(id, name, town) VALUES(3, 'Pim', 'Amsterdam');

-- Show history of the table
SHOW TABLES HISTORY;


The command SHOW TABLES results in :



The table has now three versions in time (travel).


Okay, now comes the trick, rename the table 'test' to 'test1' and then run undrop.

-- Now rename the table
ALTER TABLE Test RENAME TO Test1;

-- Undrop the table
UNDROP TABLE Test; 

SELECT * FROM Test;


And now Peter is back! 


So what happened, we moved a version on to the side and we now we have a view on a previous version of the table.


Next thing is to confirm whether the DDL is correct now.

SELECT GET_DDL('Table', 'Test')


Results in :

create or replace TABLE TEST (
	ID NUMBER(38,0),
	NAME VARCHAR(16777216) NOT NULL,
	primary key (NAME)
);

And can we go one step deeper (or back in time) ?

ALTER TABLE Test RENAME TO Test2;
UNDROP TABLE Test;

SELECT * FROM Test;

SELECT GET_DDL('Table', 'Test')

Results in 

create or replace TABLE TEST (
	ID NUMBER(38,0) NOT NULL,
	NAME VARCHAR(16777216),
	STREET VARCHAR(16777216),
	primary key (ID)
);

In my mind I would represent it something like the following diagram.



Final thoughts

In the specific case that happened with the constraints I described in a previous blogpost, it could be a solution to retrieve a previous version of a table back (with the constraints).  

I tried this also on schemas and it works with schemas too. Constraints were also in place.

Hennie

The complete script


Here is the script I used for the experiments.

ALTER WAREHOUSE IF EXISTS COMPUTE_WH RESUME IF SUSPENDED;
USE ROLE SYSADMIN;
USE WAREHOUSE Compute_WH;

-- Create the database
CREATE OR REPLACE DATABASE TestPKFKTimetravel5;

-- Select the right database
USE DATABASE TestPKFKTimetravel5;

-- Create the table
CREATE OR REPLACE TABLE Test (
  id integer,
  name varchar,
  street varchar,
  PRIMARY KEY (id));

--Insert some data into the table
INSERT INTO Test(id, name, street) VALUES(1, 'Hennie', 'Spicyvalley');

-- select some data from the table
select * from Test;

-- Show the history of the table
SHOW TABLES HISTORY;

-- Get the DDL  of the table
SELECT GET_DDL('Table', 'Test');

--Drop the table
DROP TABLE Test;

-- Select the data from the table
SELECT * FROM Test; 

-- Again create the table again with the same name
CREATE OR REPLACE TABLE Test (
  id integer,
  name varchar,
  PRIMARY KEY (name)); 

-- Insert some data into the table
INSERT INTO Test(id, name) VALUES(2, 'Peter');

-- Select some data from the table
SELECT * FROM Test;

-- Drop the table
DROP TABLE Test;

--Select data from the table
SELECT * FROM Test;

-- Again create the table 
CREATE OR REPLACE TABLE Test (
  id integer,
  name varchar,
  town varchar,
  PRIMARY KEY (town));

-- Insert some data into the table
INSERT INTO Test(id, name, town) VALUES(3, 'Pim', 'Amsterdam');

-- Show history of the table
SHOW TABLES HISTORY;

-- Now rename the table
ALTER TABLE Test RENAME TO Test1;

-- Undrop the table
UNDROP TABLE Test; 

SELECT * FROM Test;

SELECT GET_DDL('Table', 'Test')

ALTER TABLE Test RENAME TO Test2;
UNDROP TABLE Test;

SELECT * FROM Test;

SELECT GET_DDL('Table', 'Test')


ALTER WAREHOUSE IF EXISTS COMPUTE_WH SUSPEND;



//=============================================================================
// Experiment 8 : recreate schema and see if we could get back a previous version of the schema
//=============================================================================
ALTER WAREHOUSE IF EXISTS COMPUTE_WH RESUME IF SUSPENDED;
USE ROLE SYSADMIN;
USE WAREHOUSE Compute_WH;

-- Create the database
CREATE OR REPLACE DATABASE BackupTimeTravel8;

-- Select the right database
USE DATABASE BackupTimeTravel8;

CREATE SCHEMA TST;

USE SCHEMA TST;

-- Create the table
CREATE OR REPLACE TABLE TST.Test (
  id integer,
  name varchar,
  street varchar,
  PRIMARY KEY (id));

--Insert some data into the table
INSERT INTO TST.Test(id, name, street) VALUES(1, 'Hennie', 'Spicyvalley');
INSERT INTO TST.Test(id, name, street) VALUES(2, 'Peter', 'bakerstreet');

-- DROP the schema
DROP SCHEMA TST;

CREATE OR REPLACE SCHEMA TST;

USE SCHEMA TST;

-- Create the table
CREATE OR REPLACE TABLE TST.Test (
  id integer,
  name varchar,
  street varchar,
  town varchar,
  PRIMARY KEY (name));
  
--Insert some data into the table
INSERT INTO TST.Test(id, name, street, town) VALUES(1, 'Hennie', 'Spicyvalley', 'Amsterdam');

select * from TST.Test; 

--Rename the schema.
ALTER SCHEMA TST RENAME TO TST1;

UNDROP SCHEMA TST;

select * from TST.Test 

select * from TST1.Test 


ALTER WAREHOUSE IF EXISTS COMPUTE_WH SUSPEND;



zaterdag 19 februari 2022

Snowflake : Is time travel a replacement for backups ?

Introduction

Snowflake offers a great feature, called time travel. I blogged about some events that happen at my current project with primary and foreign constraints that disappeared when we used time travel. Primary key and foreign key constraints are not enforced in Snowflake. Except one thing: you can't create a  foreign key if the referenced table doesn't exists. Okay, it's annoying when the PK en FK are gone, and if you have them stored separately you could reapply the constraints again, but it can a complex task in case of time traveling. The next thing I was wandering about is that could time travel be a replacement for backups as we are used to in the on - premise world? Let's do some experiments.

After reading a blog about timetravelling and backups I was curious about changing structure and time travel. 

Experiment : add and remove a column

I've created some scripts to test the timetravelling. First I created a database and a table with data.


ALTER WAREHOUSE IF EXISTS COMPUTE_WH RESUME IF SUSPENDED;
USE ROLE SYSADMIN;
USE WAREHOUSE Compute_WH;

-- Create the database
CREATE OR REPLACE DATABASE BackupTimeTravel;

-- Select the right database
USE DATABASE BackupTimeTravel;

-- Create the table
CREATE OR REPLACE TABLE Test (
  id integer,
  name varchar,
  street varchar,
  PRIMARY KEY (id));

--Insert some data into the table
INSERT INTO Test(id, name, street) VALUES(1, 'Hennie', 'Spicyvalley');

-- select some data from the table
select * from Test;

And this script results in the following:




Next step is to insert some extra data in the table.

INSERT INTO Test(id, name, street) VALUES(2, 'Peter', 'bakerstreet');

select * from Test at(offset => -60);


Results in (after 60 seconds waiting) : 




Now let's add a column to the table and insert data in the table:

ALTER TABLE Test  ADD COLUMN town varchar;

--Insert some data into the table
INSERT INTO Test(id, name, street, town) VALUES(1, 'Hennie', 'Spicyvalley', 'Amsterdam');

select * from Test


Results in  : 



Let's execute some timetravelling.

select * from Test at(offset => -160);


Results in :



So this seems all perfect. Now drop the column 'street' in the table.

ALTER TABLE Test  DROP COLUMN street

select * from Test at(offset => -260);


Results in :


The column 'street' is gone with time travelling through the table. 

Now, let's experiment with the database and see whether we can time travel and clone the database to get the columns back. 


create database BackupTimeTravel_clone clone BackupTimeTravel
  at(offset => -500);
  
-- Select the right database
USE DATABASE BackupTimeTravel_clone;

select * from Test


Results in :


And the column is back in the table in the cloned database of the original database. 

Now check if we have the PK available.


SELECT GET_DDL('Table', 'Test');

Resuting in :

create or replace TABLE TEST (
	ID NUMBER(38,0) NOT NULL,
	NAME VARCHAR(16777216),
	STREET VARCHAR(16777216),
	primary key (ID)
);

And now we still have the PK.

Experiment : Recreate a table

Another item that is mentioned in the blog is that when a object is recreated the time travel is gone. Let's do some experiments with this mind. First setup the database and the table.

ALTER WAREHOUSE IF EXISTS COMPUTE_WH RESUME IF SUSPENDED;
USE ROLE SYSADMIN;
USE WAREHOUSE Compute_WH;

-- Create the database
CREATE OR REPLACE DATABASE BackupTimeTravel2;

-- Select the right database
USE DATABASE BackupTimeTravel2;

-- Create the table
CREATE OR REPLACE TABLE Test (
  id integer,
  name varchar,
  street varchar,
  PRIMARY KEY (id));

--Insert some data into the table
INSERT INTO Test(id, name, street) VALUES(1, 'Hennie', 'Spicyvalley');

-- select some data from the table
select * from Test;

INSERT INTO Test(id, name, street) VALUES(2, 'Peter', 'bakerstreet');

select * from Test


Results in :



Now wait for 60 seconds or so (or use the query_id).


select * from Test at(offset => -60);

DROP TABLE Test

-- Create the table
CREATE OR REPLACE TABLE Test (
  id integer,
  name varchar,
  street varchar,
  town varchar,
  PRIMARY KEY (id));
  
--Insert some data into the table
INSERT INTO Test(id, name, street, town) VALUES(1, 'Hennie', 'Spicyvalley', 'Amsterdam');

select * from Test 
Results in :



Timetravelling before the recreation of the table is not possible. The statement (offset => -60) doesn't work, only when the age of recreated table becomes older than 60 seconds.

select * from Test at(offset => -60);

Resulting in : 


The next thing I would like to try is whether I could create a clone of the database at the moment of the initial creation of the table (the first one).

create OR REPLACE database BackupTimeTravel2_clone clone BackupTimeTravel2
  at(offset => -220);
  
-- Select the right database
USE DATABASE BackupTimeTravel2_clone;

select * from Test

Results in 




I have the old table back! So it seems that it's not possible to timetravel back on table level when the table is recreated, but it is still possible when you time travel back on database level.

We only lost the PK (I blogged  earlier about this) .

SELECT GET_DDL('Table', 'Test')

Results in :

create or replace TABLE TEST (
	ID NUMBER(38,0) NOT NULL,
	NAME VARCHAR(16777216),
	STREET VARCHAR(16777216)
);


Final thoughts

At first, I thought that with the help of time travelling through the table, the column could not be retrieved, but with time travelling on database level I was able get the table back in original state. So the blog is true in case of timetravelling on table level but not on database level.

When a table is recreated it's not possible to timetravel back to previous version of the table on table level but it's possible to get the table back when timetravelling is used on database level (but no PK/FK).

There is one thing I would like to test is the following : "If the precision of a column is decreased below the maximum precision of any column data retained in Time Travel, you will not be able to restore the table without first increasing the precision.". This is for a future blogpost. 

The complete script 


//=============================================================================
// Experiment 6 : change the table and see if we can time travel
//=============================================================================
ALTER WAREHOUSE IF EXISTS COMPUTE_WH RESUME IF SUSPENDED;
USE ROLE SYSADMIN;
USE WAREHOUSE Compute_WH;

-- Create the database
CREATE OR REPLACE DATABASE BackupTimeTravel;

-- Select the right database
USE DATABASE BackupTimeTravel;

-- Create the table
CREATE OR REPLACE TABLE Test (
  id integer,
  name varchar,
  street varchar,
  PRIMARY KEY (id));

--Insert some data into the table
INSERT INTO Test(id, name, street) VALUES(1, 'Hennie', 'Spicyvalley');

-- select some data from the table
select * from Test;

INSERT INTO Test(id, name, street) VALUES(2, 'Peter', 'bakerstreet');

select * from Test at(offset => -60);

ALTER TABLE Test  ADD COLUMN town varchar;

--Insert some data into the table
INSERT INTO Test(id, name, street, town) VALUES(1, 'Hennie', 'Spicyvalley', 'Amsterdam');

select * from Test 

select * from Test at(offset => -160);


ALTER TABLE Test  DROP COLUMN street



select * from Test at(offset => -500);

create database BackupTimeTravel_clone clone BackupTimeTravel
  at(offset => -500);
  
-- Select the right database
USE DATABASE BackupTimeTravel_clone;

select * from Test

ALTER WAREHOUSE IF EXISTS COMPUTE_WH SUSPEND;
//=============================================================================
// Experiment 7 : recreate the table and see if we can time travel
//=============================================================================
ALTER WAREHOUSE IF EXISTS COMPUTE_WH RESUME IF SUSPENDED;
USE ROLE SYSADMIN;
USE WAREHOUSE Compute_WH;

-- Create the database
CREATE OR REPLACE DATABASE BackupTimeTravel2;

-- Select the right database
USE DATABASE BackupTimeTravel2;

-- Create the table
CREATE OR REPLACE TABLE Test (
  id integer,
  name varchar,
  street varchar,
  PRIMARY KEY (id));

--Insert some data into the table
INSERT INTO Test(id, name, street) VALUES(1, 'Hennie', 'Spicyvalley');

-- select some data from the table
select * from Test;

INSERT INTO Test(id, name, street) VALUES(2, 'Peter', 'bakerstreet');

select * from Test

select * from Test at(offset => -60);

DROP TABLE Test

-- Create the table
CREATE OR REPLACE TABLE Test (
  id integer,
  name varchar,
  street varchar,
  town varchar,
  PRIMARY KEY (id));
  
--Insert some data into the table
INSERT INTO Test(id, name, street, town) VALUES(1, 'Hennie', 'Spicyvalley', 'Amsterdam');

select * from Test 

select * from Test at(offset => -120);



create OR REPLACE database BackupTimeTravel2_clone clone BackupTimeTravel2
  at(offset => -220);
  
-- Select the right database
USE DATABASE BackupTimeTravel2_clone;

select * from Test

ALTER WAREHOUSE IF EXISTS COMPUTE_WH SUSPEND;