...
Lets create the table now. I have created this as a sproc, this is so I can include it as a failsafe in the main etl code: If the table is not present, then recreate it so as to prevent the main etl bugging-out when it comes to load the data to a table that does not exist.
We won't create this straight away, we'll let the etl script handle this later; this way, we know the failsafe is working:
...
Also, you will notice that this script creates two identical tables - the actual table, and a staging table. This is so we can load the data to the staging table first, check to ensure the data is present before then loading into the main table. This process allows for a failsafe, to retain the existing data in the event that no newer data is available at the time of the refresh.
We won't create this straight away, we'll let the etl script handle this later; this way, we know the failsafe is working:
Paste code macro | ||||
---|---|---|---|---|
| ||||
Create Procedure sp_Company_KPI_Data_Table As Create Table dbo.Company_KPI_Data_Load ( Company_KPI_DL_ID Int Identity(1,1) Not Null Constraint pk_Company_KPI_DL Primary Key Clustered(Company_KPI_DL_ID) ,Order_ID Varchar(20) Null ,Order_Date Date Null ,Ship_Date Date Null ,Ship_Mode Varchar(20) Null ,Customer_ID Varchar(10) Null ,Customer_Name Varchar(30) Null ,Segment Varchar(20) Null ,Country_Region Varchar(20) Null ,City Varchar(20) Null ,[State] Varchar(20) Null ,Postal_Code Int Null ,Region Varchar(10) Null ,Product_ID Varchar(20) Null ,Category Varchar(20) Null ,Sub_Category Varchar(20) Null ,Product_Name Varchar(130) Null ,Sales Real Null ,Quantity Smallint Null ,Discount Real Null ,Profit Real Null ) |
Part 1c: The ETL Script
So now we have the log table, lets now get on with the code.
This code doesn't do anything special, simply selects some data from one table and moves it to another, the only difference is how much data is being moved: On the 1st of the month, the table is truncated and the last 365 days are transferred, whereas any other day, we are just transferring the number of days that have lapsed and become available since the last run, which we would normally expect to be one day.
Paste code macro | ||||
---|---|---|---|---|
| ||||
Create Procedure sp_Company_KPI_Build As -- Part 1: Identify the last run date: Declare @Run_Date Date -- In line with the requirement, when the date is the 1st of the month, run_date needs to be set to 365 days ago Select @Run_Date = RecordSet_Max_Date From Company_KPI_Log Where KPI_Log_ID = (Select Max(KPI_Log_ID) From Company_KPI_Log) If Datepart(day,GetDate()) = 1 Set @Run_Date = DateAdd(day,-365,@Run_Date) -- Part 2: Prepare the table: -- Step 1: If the table does not exist, create it If Object_ID('Company_KPI_Data_Load') Is Null Exec sp_Company_KPI_Data_Table;; -- Checks for the presence of the staging table, drops and recreates If Object_ID('Company_KPI_Data_Load_stg') Is Not Null Drop Table Company_KPI_Data_Load_stg; Create Table dbo.Company_KPI_Data_Load_stg ( Company_KPI_DL_ID Int Identity(1,1) Not Null Constraint pk_Company_KPI_DL Primary Key Clustered(Company_KPI_DL_ID) ,Order_ID Varchar(20) Null ,Order_Date Date Null ,Ship_Date Date Null ,Ship_Mode Varchar(20) Null ,Customer_ID Varchar(10) Null ,Customer_Name Varchar(30) Null ,Segment Varchar(20) Null ,Country_Region Varchar(20) Null ,City Varchar(20) Null ,[State] Varchar(20) Null ,Postal_Code Int Null ,Region Varchar(10) Null ,Product_ID Varchar(20) Null ,Category Varchar(20) Null ,Sub_Category Varchar(20) Null ,Product_Name Varchar(130) Null ,Sales Real Null ,Quantity Smallint Null ,Discount Real Null ,Profit Real Null ); |
Part 1c: The ETL Script
So now we have the log table, lets now get on with the code.
This code doesn't do anything special, simply selects some data from one table and moves it to another, the only difference is how much data is being moved: On the 1st of the month, the table is truncated and the last 365 days are transferred, whereas any other day, we are just transferring the number of days that have lapsed and become available since the last run, which we would normally expect to be one day.
Paste code macro | ||||
---|---|---|---|---|
| ||||
Create Procedure sp_Company_KPI_Build As -- Part 1: Identify the last run date: Declare @Run_Date Date -- In line with the requirement, when the date is the 1st of the month, run_date needs to be set to 365 days ago Select @Run_Date = RecordSet_Max_Date From Company_KPI_Log Where KPI_Log_ID = (Select Max(KPI_Log_ID) From Company_KPI_Log) If Datepart(day,GetDate()) = 1 Set @Run_Date = DateAdd(day,-365,@Run_Date) -- Part 2: Prepare the table: -- Step 1: If the table does not exist, create it If Object_ID('Company_KPI_Data_Load') Is Null Exec sp_Company_KPI_Data_Table; -- Step 2: Truncate the staging table Truncate Table Company_KPI_Data_Load_stg -- Step 3: Load the data to staging Insert Into Company_KPI_Data_Load_stg Select Order_ID ,Order_Date ,Ship_Date ,Ship_Mode ,Customer_ID ,Customer_Name ,Segment ,Country_Region ,City ,[State] ,Postal_Code ,Region ,Product_ID ,Category ,Sub_Category ,Product_Name ,Sales ,Quantity ,Discount ,Profit From dbo.Orders Where Order_Date > @Run_Date; -- Part 3: -- Step 1: Now check the data in the staging table. This is a simple row-count to ensure that data has been loaded. Exit script if the check fails If (Select Count(1) From Company_KPI_Data_Load_stg) < 1 Return -- Step 2: Now we'll truncate the data if its the first of the month: If Datepart(day,GetDate()) = 1 Truncate Table Company_KPI_Data_Load -- PartStep 3: Load Now we can load the data to the production table Insert Into Company_KPI_Data_Load Select Order_ID ,Order_Date ,Ship_Date ,Ship_Mode ,Customer_ID ,Customer_Name ,Segment ,Country_Region ,City ,[State] ,Postal_Code ,Region ,Product_ID ,Category ,Sub_Category ,Product_Name ,Sales ,Quantity ,Discount ,Profit From dbo.Orders Where Order_Date > @Run_DateCompany_KPI_Data_Load_stg |
So now we have the code to run the etl process, and the log file that controls it, we now have two final scripts to write: the Tableau ones.
...