Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Lets create the table now. I have created this as a sproc, this is so I can include it as a failsafe in the main etl code: If the table is not present, then recreate it so as to prevent the main etl bugging-out when it comes to load the data to a table that does not exist.
We won't create this straight away, we'll let the etl script handle this later; this way, we know the failsafe is working:

...

Also, you will notice that this script creates two identical tables - the actual table, and a staging table. This is so we can load the data to the staging table first, check to ensure the data is present before then loading into the main table. This process allows for a failsafe, to retain the existing data in the event that no newer data is available at the time of the refresh.
We won't create this straight away, we'll let the etl script handle this later; this way, we know the failsafe is working:

Paste code macro
languagesql
themeVs
Create Procedure sp_Company_KPI_Data_Table As

Create Table dbo.Company_KPI_Data_Load (
	Company_KPI_DL_ID Int Identity(1,1) Not Null
	Constraint pk_Company_KPI_DL Primary Key Clustered(Company_KPI_DL_ID)
	,Order_ID Varchar(20) Null
	,Order_Date Date Null
	,Ship_Date Date Null
	,Ship_Mode Varchar(20) Null
	,Customer_ID Varchar(10) Null
	,Customer_Name Varchar(30) Null
	,Segment Varchar(20) Null
	,Country_Region Varchar(20) Null
	,City Varchar(20) Null
	,[State] Varchar(20) Null
	,Postal_Code Int Null
	,Region Varchar(10) Null
	,Product_ID Varchar(20) Null
	,Category Varchar(20) Null
	,Sub_Category Varchar(20) Null
	,Product_Name Varchar(130) Null
	,Sales Real Null
	,Quantity Smallint Null
	,Discount Real Null
	,Profit Real Null
)

Part 1c: The ETL Script

So now we have the log table, lets now get on with the code.

This code doesn't do anything special, simply selects some data from one table and moves it to another, the only difference is how much data is being moved: On the 1st of the month, the table is truncated and the last 365 days are transferred, whereas any other day, we are just transferring the number of days that have lapsed and become available since the last run, which we would normally expect to be one day.

Paste code macro
languagesql
themeVs
Create Procedure sp_Company_KPI_Build As

-- Part 1: Identify the last run date:
Declare @Run_Date Date

-- In line with the requirement, when the date is the 1st of the month, run_date needs to be set to 365 days ago

Select @Run_Date = RecordSet_Max_Date From Company_KPI_Log Where KPI_Log_ID = (Select Max(KPI_Log_ID) From Company_KPI_Log)
If Datepart(day,GetDate()) = 1 Set @Run_Date = DateAdd(day,-365,@Run_Date)

-- Part 2: Prepare the table:
-- Step 1: If the table does not exist, create it
If Object_ID('Company_KPI_Data_Load') Is Null Exec sp_Company_KPI_Data_Table;;

-- Checks for the presence of the staging table, drops and recreates
If Object_ID('Company_KPI_Data_Load_stg') Is Not Null Drop Table Company_KPI_Data_Load_stg;

Create Table dbo.Company_KPI_Data_Load_stg (
	Company_KPI_DL_ID Int Identity(1,1) Not Null
	Constraint pk_Company_KPI_DL Primary Key Clustered(Company_KPI_DL_ID)
	,Order_ID Varchar(20) Null
	,Order_Date Date Null
	,Ship_Date Date Null
	,Ship_Mode Varchar(20) Null
	,Customer_ID Varchar(10) Null
	,Customer_Name Varchar(30) Null
	,Segment Varchar(20) Null
	,Country_Region Varchar(20) Null
	,City Varchar(20) Null
	,[State] Varchar(20) Null
	,Postal_Code Int Null
	,Region Varchar(10) Null
	,Product_ID Varchar(20) Null
	,Category Varchar(20) Null
	,Sub_Category Varchar(20) Null
	,Product_Name Varchar(130) Null
	,Sales Real Null
	,Quantity Smallint Null
	,Discount Real Null
	,Profit Real Null
);

Part 1c: The ETL Script

So now we have the log table, lets now get on with the code.

This code doesn't do anything special, simply selects some data from one table and moves it to another, the only difference is how much data is being moved: On the 1st of the month, the table is truncated and the last 365 days are transferred, whereas any other day, we are just transferring the number of days that have lapsed and become available since the last run, which we would normally expect to be one day.

Paste code macro
languagesql
themeVs
Create Procedure sp_Company_KPI_Build As

-- Part 1: Identify the last run date:
Declare @Run_Date Date

-- In line with the requirement, when the date is the 1st of the month, run_date needs to be set to 365 days ago

Select @Run_Date = RecordSet_Max_Date From Company_KPI_Log Where KPI_Log_ID = (Select Max(KPI_Log_ID) From Company_KPI_Log)
If Datepart(day,GetDate()) = 1 Set @Run_Date = DateAdd(day,-365,@Run_Date)

-- Part 2: Prepare the table:
-- Step 1: If the table does not exist, create it
If Object_ID('Company_KPI_Data_Load') Is Null Exec sp_Company_KPI_Data_Table;

-- Step 2: Truncate the staging table
Truncate Table Company_KPI_Data_Load_stg

-- Step 3: Load the data to staging
Insert Into Company_KPI_Data_Load_stg
Select
Order_ID
,Order_Date
,Ship_Date
,Ship_Mode
,Customer_ID
,Customer_Name
,Segment
,Country_Region
,City
,[State]
,Postal_Code
,Region
,Product_ID
,Category
,Sub_Category
,Product_Name
,Sales
,Quantity
,Discount
,Profit

From dbo.Orders
Where Order_Date > @Run_Date;

-- Part 3:
-- Step 1: Now check the data in the staging table. This is a simple row-count to ensure that data has been loaded. Exit script if the check fails
If (Select Count(1) From Company_KPI_Data_Load_stg) < 1 Return
 
-- Step 2: Now we'll truncate the data if its the first of the month:
If Datepart(day,GetDate()) = 1 Truncate Table Company_KPI_Data_Load

-- PartStep 3: Load Now we can load the data to the production table
Insert Into Company_KPI_Data_Load
Select
Order_ID
,Order_Date
,Ship_Date
,Ship_Mode
,Customer_ID
,Customer_Name
,Segment
,Country_Region
,City
,[State]
,Postal_Code
,Region
,Product_ID
,Category
,Sub_Category
,Product_Name
,Sales
,Quantity
,Discount
,Profit

From dbo.Orders
Where Order_Date > @Run_DateCompany_KPI_Data_Load_stg

So now we have the code to run the etl process, and the log file that controls it, we now have two final scripts to write: the Tableau ones.

...