- Set up a warehouse with sample data.
- Connect the warehouse to dbt Cloud.
- Add a Git repository to dbt Cloud (not in the blogpost).
- Execute a dbt transformation using dbt run.
- Schedule a job or transformation.
Set up a warehouse with sample data
CREATE WAREHOUSE IF NOT EXISTS transforming; CREATE DATABASE IF NOT EXISTS raw; CREATE DATABASE IF NOT EXISTS analytics; CREATE SCHEMA IF NOT EXISTS raw.jaffle_shop; CREATE SCHEMA IF NOT EXISTS raw.stripe; create table raw.jaffle_shop.customers ( id integer, first_name varchar, last_name varchar ); copy into raw.jaffle_shop.customers (id, first_name, last_name) from 's3://dbt-tutorial-public/jaffle_shop_customers.csv' file_format = ( type = 'CSV' field_delimiter = ',' skip_header = 1 ); create table raw.jaffle_shop.orders ( id integer, user_id integer, order_date date, status varchar, _etl_loaded_at timestamp default current_timestamp ); copy into raw.jaffle_shop.orders (id, user_id, order_date, status) from 's3://dbt-tutorial-public/jaffle_shop_orders.csv' file_format = ( type = 'CSV' field_delimiter = ',' skip_header = 1 ); create table raw.stripe.payment ( id integer, orderid integer, paymentmethod varchar, status varchar, amount integer, created date, _batched_at timestamp default current_timestamp ); copy into raw.stripe.payment (id, orderid, paymentmethod, status, amount, created) from 's3://dbt-tutorial-public/stripe_payments.csv' file_format = ( type = 'CSV' field_delimiter = ',' skip_header = 1 );
Connect the warehouse to dbt Cloud
It seems there is a need for a small tweak in dbt. Change the name of the warehouse and database to analytics and transforming, respectively.
And now we can initialize the project and start the development. After pressing Start developing the following window appears.
Execute a dbt transformation using dbt runNow let's build a model and a model is a select statement (!). Strange but ok. W'll see why dbt a select statement calls a model. Let's press on Create a new branch.
with customers as ( select id as customer_id, first_name, last_name from raw.jaffle_shop.customers ), orders as ( select id as order_id, user_id as customer_id, order_date, status from raw.jaffle_shop.orders ), customer_orders as ( select customer_id, min(order_date) as first_order_date, max(order_date) as most_recent_order_date, count(order_id) as number_of_orders from orders group by 1 ), final as ( select customers.customer_id, customers.first_name, customers.last_name, customer_orders.first_order_date, customer_orders.most_recent_order_date, coalesce(customer_orders.number_of_orders, 0) as number_of_orders from customers left join customer_orders using (customer_id) ) select * from final
version: 2 models: - name: customers description: One record per customer columns: - name: customer_id description: Primary key tests: - unique - not_null - name: first_order_date description: NULL when a customer has not yet placed an order. - name: stg_customers description: This model cleans up customer data columns: - name: customer_id description: Primary key tests: - unique - not_null - name: stg_orders description: This model cleans up order data columns: - name: order_id description: Primary key tests: - unique - not_null - name: status tests: - accepted_values: values: ['placed', 'shipped', 'completed', 'return_pending', 'returned']