Beginning in early 2015, Tableau has seen an exponential rise in use particularly by Excel users. Much of the training material available, indeed, even this very site, is geared to helping you get the most from Tableau, however, whilst most BI tools are broadly the same, Tableau is one of the least Excel-centric available on the market, which means that greater emphasis needs to be made in helping those users migrating from Excel and Excel-centric tools adapt to the tool far more easier and quicker.
Here, I shall be adressing some of the most common problems I have been asked by migrating users; hopefully, you will find this helpful towards your path as a BI developer.
Whilst this article is Tableau-centric, being a BI tool, this article is also useful for Excel users moving to a competitor BI tool such as QlikSense, Business Objects, Looker etc
What is Tableau?
Before we delve into this, lets turn the question around and ask: what is Excel?
The first question that no-one thinks to ask:
When first introduced to Excel (and other spreadsheet tools such as Googe Sheets and OpenOffice Calc), we don't really question the tool, we just get to work, learning as we go, yet still never too sure about what a spreadsheet really is: Is it a calculator? A data entry and manipulation tool? An analysis tool? A visualisation tool? Knowing this, I hope to help you settle much quicker than without it.
So we know that Excel is a spreadsheet tool. So really, this isn't so much a question of what Excel is but rather, a question of what a spreadsheet is, and how does this compare to Tableau.
So spreadsheets are multi-purpose tools that allow for data entry, data manipulation, analytics and visualisation.
So going back to the original question What is Tableau, we can say that Tableau neither allows for data entry nor data manipulation (at least not directly), though through calculations, we can alter the output and meaning of the data, whilst allowing the original data to persist.
Tableau is a presentation layer. Or rather, a pivoting-tool, in that it presents the output of your data model, whether this be a direct read of your data, or where you have manipulated the meaning of the data through calculations
This is the key answer and the one that should always remain on your mind when you are working with either tool: Tableau is a presentation layer which means, it is the final step in any data-work whether this be:
A data-driven dashboard
An info-chart etc
Where you would use a spreadsheet to enter and manipulate data and then analyse data, Tableau simply visualises data and the output of your calculations. This means that any alterations to the data through calculation, editing and manipulation are either semi-permanent, or permanent and this able to be used by other tools, whereas the data that Tableau display's has only been read by Tableau, it's state has not altered.
This brings us neatly onto the next section: Where are my calculations processed?
Where are my calculations processed?
Great question which allows us to take a peak at the architecture:
Spreadsheets (Excel, OpenOffice Calc, Google Sheets etc) contain all the necessary programming to evaluate and execute the expressions you enter be it arithmetic, logic, string, date, manipulation, or a combination of any of these.
Tableauhas only a rudimentary calculation function for Table Calculations, which are largely intersect and lookup calculations; other than these, all other calculations are sent as SQL queries to the data-engine of the data-source (such as Excel or SQL Server) for execution. Once executed, the source-engine returns the result-set back to Tableau for Tableau to consume and plot as your visual representation.
With Tableau however, we have a third type of datasource: the Tableau extract (Hyper), this is the process of snapshotting your data as it stands today, either as a pre-filtered portion, or in its entirety, into an extract layer, saved as a .hyper archive file (or .tde for pre v10.5).
Although the data has been extracted by Tableau, and exists in the same archive as the Tableau workbook, Hyper (and tde) are processed by a different module than that used for the workbook and visualisation. The extract is expanded into a temporary memory-space as a column-store NOSQL database when your workbook is opened, and expressions formed in the visualisation layer are then sent outside the workbook to the data-processing layer exactly as they are for external sources such as to a database, or flat-file such as text, Excel or Access.
Can Tableau write data?
A classic question:
The short answer (in 2020) is no, Tableau cannot write-data. That said, with a little creative thinking, and an ETL tool (extract, transform and load) to hand, or using or creating a third-party extension, data-writing can be achieved external to Tableau, but based on the results of user analysis, for further future analysis.
Excel users are used to being able to enter data into cells - short of connecting to a database or OLAP cube, or importing data from an alternate source, this is the only other way (and tends to be the primary method), of getting data into Excel, furthermore, once the data has been written into Excel, it can be made available for other workbooks to read from almost as a writeable database.
Natively, being a presentation layer, Tableau can only read from a data-source:
Just remember, Tableau cannot natively write data, neither should it be able to so any method that allows for the capture of data is currently unsupported.
There appears to be loads of things I can easily do with Excel but not or not very easily if at all with Tableau
Tableau is a BI tool...
So you keep saying, but this still doesn't help me to understand why Tableau cannot do seemingly simple tasks
As mentioned above, spreadsheets are self-contained programs that have all the necessary functions and libraries needed to allow them to calculate the expressions that you enter into the cells, the result is, you can do a lot more with Excel in this way than you can with Tableau.
Tableau is a presentation layer - a very pretty presentation layer but nothing more. The easiest way to understand Tableau is to think of it as a pivot-table and pivot-chart. When using either of these in Excel, you define a source typically several rows and columns of data, you then drag the defined elements into place as a table and then, if you need to, place a chart atop the pivoted data.
Tableau is no different as such, Tableau is only really restricted by two items (which are largely the same two items that block full creativity in Excel):
What can the data source do
How to work with the result-set of the pivoted data as behind the scenes, Tableau generates the data as a table and then places your charts atop this.
So if Tableau is nothing more than a Pivot Table and I can already use this in Excel, why do I need Tableau?
You don't necessarily need to use Tableau...
.. if your data is already in Excel, and you are comfortable with the pivot-table functions there however, with Tableau you can:
Analyse the data in situ with no further transformations required beyond expressions
Control outputs from one-to-many charts
Provide interactivity to cut some charts but not others without modifying the source data (illustrated here as the Raw Pivot Table
Use functions beyond their designed scope to add further information to your chart
Access pre-built analytical functions to add further dimensions to your data
Reduce raw processing power as much of these will be handled by the source
All of which is perfectly accomplishable with an Excel pivot-table however, to do so with Excel, you must first define your pivot-table, then use this pivot-table as the basis for new pivot-tables that will provide your analyses and then build against the new tables eg:
So as you can see, it can be done although the question becomes one of ease of use, ease to maintain and, data availability, where Tableau is unrestricted on the numbers of rows it can read (restricted only on the spool size and pipeline width from the source).
If however this is a question of whether Tableau is right for you when you have no Tableau expertise, are very proficient with Excel, have sharepoint installed and configured and, are proficient using PowerBI, then this becomes a question of do you want to move from one BI tool to another?
So, if I can do more with Tableau than I can with Excel, can't I just ditch Excel?
Are you mad?
This would be inadvisable unless you were thinking about ditching and moving to an alternate spreadsheet tool.
Excel (spreadsheet tools) are more than just a means of presenting data, they are data storage, data manipulation, and data calculation tools; they are your first port of call for most things data and as your visualisation career progresses, will be where you can quickly test ideas on sample sets of data to ensure correct functionality before moving the calculations to Tableau.
In short, Tableau should be working with Excel, and it will never be a replacement for it.
Tableau is NEVER wrong
That's a bold statement, I hope you have some rock-solid evidence to back this up.
I do - you see, Tableau is a presentation layer...
Tableau's only function is to visualise data. Tableau does not
Allow for data input
Manipulate the meaning of the data at source (and save this new meaning in place of the original).
As mentioned previously, any expressions we create, are transmitted by Tableau as SQL to the source data engine, this engine evaluates the expression and then returns the result to Tableau to plot.
Your expressions can initially be wrong, however, once these have been verified and the resultant numbers checked (a spreadsheet will help you here), then the future outputs will unlikely ever be wrong - though they can suffer from rounding errors that can generate some interesting results - something to be mindful of.
The data can be wrong though.
Which is where we can use Tableau for Data Quality.
One of the major problems developers will face is user's who complain that there is a problem with Tableau.
The problem is never with Tableau but, we as developers have the unique placement in investigating the data-failure and either fixing ourselves, or instructing the BI team on how to resolve