...
Note |
---|
Please ensure you are using ANSI SQL for this task |
Expected result [tbl_output]:
Customer_Name | Order_ID | Order_Date | Order_Value |
---|---|---|---|
Aaron Bergman | CA-2017-140935 | 2017-11-10 | 563.94 |
Aaron Bergman | CA-2015-152905 | 2015-02-18 | 12.624 |
Aaron Hawkins | CA-2018-164000 | 2018-12-18 | 18.704 |
Aaron Hawkins | CA-2015-122070 | 2015-04-22 | 257.752 |
Aaron Smayling | CA-2018-101749 | 2018-10-03 | 171.288 |
Aaron Smayling | US-2015-150126 | 2015-07-27 | 65.78 |
Adam Bellavance | CA-2018-107174 | 2018-11-06 | 2595.388 |
Adam Bellavance | CA-2016-150511 | 2016-09-18 | 18.54 |
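The text of question 2 is not shown here, but the two rows per customer above suggest it asks for each customer's earliest and latest orders. Below is a minimal sketch of one way to produce tbl_output under that assumption; the source table tbl_orders and its column names are hypothetical:

```sql
-- Rank each customer's orders from both ends of the date range,
-- then keep only the earliest (rn_first = 1) and latest (rn_last = 1).
SELECT Customer_Name, Order_ID, Order_Date, Order_Value
FROM (
    SELECT Customer_Name, Order_ID, Order_Date, Order_Value,
           ROW_NUMBER() OVER (PARTITION BY Customer_Name ORDER BY Order_Date ASC)  AS rn_first,
           ROW_NUMBER() OVER (PARTITION BY Customer_Name ORDER BY Order_Date DESC) AS rn_last
    FROM tbl_orders
) ranked
WHERE rn_first = 1 OR rn_last = 1;
```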
...
Using the result from question 2 [tbl_output], please write a query that will return, on the same row, the following (one approach is sketched after the expected result):
- Customer Name
- First Order Value
- Last Order Value
- Percentage difference between the first and last orders
Expected result:
Customer_Name | Latest_Value | Earliest_Value | Perc_Diff |
---|---|---|---|
Aaron Bergman | 563.94 | 12.624 | 4367.20532319392 |
Aaron Hawkins | 18.704 | 257.752 | -92.7434122722617 |
Aaron Smayling | 171.288 | 65.78 | 160.395256916996 |
Adam Bellavance | 2595.388 | 18.54 | 13898.8565264293 |
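A minimal sketch of one approach, assuming the tbl_output columns shown above; FIRST_VALUE and LAST_VALUE are ANSI SQL window functions:

```sql
-- Pull each customer's earliest and latest order values onto one row,
-- then compute the percentage difference between them.
SELECT Customer_Name,
       Latest_Value,
       Earliest_Value,
       (Latest_Value - Earliest_Value) / Earliest_Value * 100 AS Perc_Diff
FROM (
    SELECT DISTINCT
           Customer_Name,
           LAST_VALUE(Order_Value) OVER (
               PARTITION BY Customer_Name ORDER BY Order_Date
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS Latest_Value,
           FIRST_VALUE(Order_Value) OVER (
               PARTITION BY Customer_Name ORDER BY Order_Date
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS Earliest_Value
    FROM tbl_output
) t;
```

The explicit ROWS frame matters for LAST_VALUE: without it, the default window frame ends at the current row, and LAST_VALUE would simply return the current row's value.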
...
First_Name | Last_Name | Order_Date | Order_ID |
---|---|---|---|
Chuck | Clark | 2018-12-30 | CA-2018-126221 |
Erica | SMITH | 2018-12-30 | CA-2018-115427 |
Jill | Matthias | 2018-12-30 | CA-2018-156720 |
Patrick | O'Donnell | 2018-12-30 | CA-2018-143259 |
Michael | Chen | 2018-12-29 | US-2018-102638 |
James | Galang | 2018-12-29 | CA-2018-118885 |
Katherine | hughes | 2018-12-29 | US-2018-158526 |
Ken | Brennan | 2018-12-29 | CA-2018-158673 |
Bruce | Stewart | 2018-12-29 | CA-2018-130631 |
Ben | Peterman | 2018-12-29 | CA-2018-146626 |
Christopher | Schild | 2018-12-28 | CA-2018-135111 |
Erica | Hernandez | 2018-12-28 | CA-2018-127516 |
Greg | Hansen | 2018-12-28 | CA-2018-136539 |
Harry | Marie | 2018-12-28 | CA-2018-129805 |
Kristen | Smith | 2018-12-30 | CA-2018-163979 |
Lori | Olson | 2018-12-28 | CA-2018-163860 |
Jason | Gross | 2018-12-28 | CA-2018-101322 |
Jennifer | smith | 2018-12-30 | CA-2018-164826 |
Pamela | Coakley | 2018-12-28 | US-2018-162068 |
Susan | SMITH | 2018-12-28 | CA-2018-122798 |
The above table contains 1 trillion rows of data and has indexes on the Last_Name and Order_Date columns.
The Last_Name field is not stored with consistent casing: entries can be UPPERCASE, lowercase, or MiXed-CaSe.
You are looking for all customers whose Last_Name is 'smith' (in any case). Text searches are case-sensitive, meaning that searching in just one case may not return the complete result. You are also only interested in orders made on 2018-12-30.
An initial analysis of the data shows that, of the 1 trillion rows, 2018-12-30 accounts for just 500,000 (five hundred thousand) rows of data.
Which of these queries is the most efficient way to fetch the required data, given the performance considerations above? Explain why.
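As a hedged illustration only (the actual candidate queries are elided below), the shape that exploits the available indexes filters on Order_Date first: the equality predicate on the indexed date column is SARGable and cuts the trillion rows down to the 500,000 for 2018-12-30, after which the case-normalised Last_Name comparison only has to touch that small set. The table name `customers` is hypothetical.

```sql
-- Illustrative sketch; not one of the candidate queries from the question.
-- The Order_Date predicate can use its index, reducing 1 trillion rows to
-- ~500,000 before the case-insensitive Last_Name comparison is evaluated.
SELECT First_Name, Last_Name, Order_Date, Order_ID
FROM customers                        -- hypothetical table name
WHERE Order_Date = DATE '2018-12-30'
  AND UPPER(Last_Name) = 'SMITH';     -- UPPER() defeats the Last_Name index,
                                      -- so the date index does the heavy lifting
```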
...
Query Order | Execution Order |
---|---|
Select | |
From | |
Join | |
Where | |
Group By | |
Having | |
Order By | |
Question 5b:
What is a Union, and a Union All? How do they differ? And can you provide an example of their use?
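For reference, a minimal sketch of the difference, using hypothetical table names:

```sql
-- UNION removes duplicate rows from the combined result;
-- UNION ALL keeps every row, duplicates included, and is typically
-- cheaper because no de-duplication step is needed.
SELECT Customer_Name FROM tbl_2017_orders
UNION
SELECT Customer_Name FROM tbl_2018_orders;    -- each name appears once

SELECT Customer_Name FROM tbl_2017_orders
UNION ALL
SELECT Customer_Name FROM tbl_2018_orders;    -- duplicates preserved
```

Both forms require the two SELECTs to return the same number of columns with compatible types.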
There are five types of join. Can you name them and provide an example of their use?
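One common enumeration is sketched below; t1 and t2 are hypothetical tables:

```sql
SELECT a.id, b.val FROM t1 a INNER JOIN t2 b ON a.id = b.id;  -- rows matched in both tables
SELECT a.id, b.val FROM t1 a LEFT JOIN t2 b ON a.id = b.id;   -- all t1 rows; NULLs where t2 has no match
SELECT a.id, b.val FROM t1 a RIGHT JOIN t2 b ON a.id = b.id;  -- all t2 rows; NULLs where t1 has no match
SELECT a.id, b.val FROM t1 a FULL JOIN t2 b ON a.id = b.id;   -- all rows from both sides
SELECT a.id, b.val FROM t1 a CROSS JOIN t2 b;                 -- Cartesian product; no join condition
```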
Question 6:
The table below is a transaction stream log:
dbo.txn
Row_ID | Txn_ID | Account_ID | Txn_DateTime | Status |
---|---|---|---|---|
1 | 123abc | 112233 | 2000-01-01 08:08:45 | Initiated |
2 | 895jkl | 112233 | 2000-01-01 08:09:03 | Initiated |
3 | 747ghd | 456789 | 2000-01-01 08:12:21 | Initiated |
4 | 123abc | 112233 | 2000-01-01 08:13:15 | In Process |
5 | 696pol | 456892 | 2000-01-01 08:14:37 | Initiated |
6 | 123abc | 112233 | 2000-01-01 08:17:43 | Successful |
7 | 895ghd | 784521 | 2000-01-01 08:22:51 | Initiated |
8 | 696pol | 456892 | 2000-01-01 08:23:04 | In Process |
9 | 696pol | 456892 | 2000-01-01 08:23:08 | Failed |
...
Thinking about Tableau, you have a workbook connected to a data source that is more than 3.2 billion rows and around 180 columns wide.
This data needs to be refactored each day, so the entire table must be dropped and rebuilt in a job taking more than 12 hours each night.
Data this large is too much to be loaded into an extract, and too much for Tableau Server to unpack into temp memory.
...
Looking at this table:
Which of these two calculations is the most performant in achieving the highlighted row?
...
Note |
---|
The version 2020.2 changes to the Tableau data model have superseded this, but it is still pertinent to a Tableau developer |
Consider this ER diagram:
...
Note |
---|
We appreciate that you may be unfamiliar with charts 3, 4 & 5. If you are unable to build them, please build replacements that you feel will complement the rest of the charts in the collection. |
It would be awesome if you could add some interactivity to this too.
...