When Tableau throws an error, or starts to underperform, then what? How do you go about investigating what the error was, or why the performance has suddenly fallen off a cliff?
For performance matters, there's the performance recorder, though this is only really the first port of call. The best place for all your investigations is actually the Tableau logs: (Windows Default) Documents\My Tableau Repository\Logs
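As a quick way to see what is currently sitting in that folder, here is a minimal Python sketch. The helper names `default_log_dir` and `list_logs` are my own, and the sketch assumes the Documents folder lives directly under your user profile, which may not hold if Documents has been redirected (for example by a sync client).

```python
from pathlib import Path


def default_log_dir(home=None):
    """Build the default Windows log path quoted above under a home folder.

    Assumption: Documents sits directly under the user profile.
    """
    home = Path(home) if home is not None else Path.home()
    return home / "Documents" / "My Tableau Repository" / "Logs"


def list_logs(log_dir):
    """Return (file name, size in bytes) for each .txt file, largest first."""
    log_dir = Path(log_dir)
    if not log_dir.is_dir():
        return []  # nothing to list if the folder does not exist
    files = [(p.name, p.stat().st_size)
             for p in log_dir.glob("*.txt") if p.is_file()]
    return sorted(files, key=lambda item: item[1], reverse=True)
```

Calling `list_logs(default_log_dir())` returns an empty list if the folder is not where the sketch expects it, so it is safe to try on any machine.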
Tableau creates three log files. Only the file named "Log.txt" (or "Log_n.txt") keeps the same function throughout; the others, "hyperd.txt" and "tabprotosrv.txt", change their function slightly depending on your data source:
|All data sources||Log.txt||The basic log file. This contains all actions in the Tableau dev environment relating to graphics rendering, from the layout of the object panes to how data is drawn on the page.|
This file has less use in error tracing and query performance optimisation, although we can use the information captured in this log to determine how much resource a viz took, how long Tableau took to render it, and which CPU and graphics hardware were involved.
This is one of the logs that Tableau is likely to ask for when you report a problem, as it contains everything about the PC environment that Tableau could need without the user sending their PC to Tableau.
|Flat file (txt, csv, Excel, Access, Google Sheets etc)||Hyperd.txt||Contains everything related to the query used to build the visualisation, including how Tableau prepared the query, the connections made, where the query was sent, and the data pipes and server worker information involved; then the query execution time, the number of CPUs, cores and workers involved in compiling and executing the query, and whether the query succeeded. If the query errored, the error information is also captured in this log.|
The tabprotosrv.txt log, on the other hand, is much more about the data source metadata combined with the server hardware environment.
Many users connecting to flat or Excel files connect to local files, in which case the hardware environment configuration is captured in both the tabprotosrv.txt log and the Log.txt log. When the data resides externally, the hardware config recorded here matters to Tableau (Software) as they try to identify bugs and errors with the software.
Beyond the hardware config, the tabprotosrv log captures all the metadata of the data source, from its type and location to where within the file the data currently being interrogated is stored (for example, which sheet of a multi-sheet Excel workbook). This file also contains the data types, column ordinal positions and data cardinality, which are all VERY important when it comes to optimising data sources and Tableau workbooks.
No query information is captured here, nor is there any information about hardware use during the query, though this log does capture more detail about errors, including unexpected or unaccounted-for ones.
As with most of what this log captures, tabprotosrv is much more useful for identifying errors than for investigating performance.
|Database (SQL Server, Oracle, Teradata, PostgreSQL et al)||Hyperd.txt||Now captures the metadata of the query output AS A PREPARATION FOR creating an extract. In this way, Hyperd records the queries, the elapsed time, data source metadata etc. to be included with an extract if one is created, much like the execution plan cache that many database servers keep across successive runs of the same query.|
This information is useful for error tracing, though its primary purpose is query performance optimisation.
|tabprotosrv.txt (1)||This log, on the other hand, now comes as TWO files, one created a split second after the other, so when using the logs it is important to work out what you need and then identify and open the correct one.|
Typically, the first tabprotosrv file contains the same information as is captured in this file when working with Excel: the data environment, including query time, plus all the table metadata, errors encountered etc.
|tabprotosrv.txt (2)||The second tabprotosrv file contains the counterpart of what is captured in Hyperd.txt when querying a flat file; that is to say, this is the file needed to identify the queries issued by Tableau to the data source, and it plays the pivotal role in performance optimisation.|
So now we know where the logs are stored, and broadly what information they contain, let's take a look at using them.
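Since picking the right file matters, particularly with two tabprotosrv files, here is a small sketch that groups file names by log family and orders the tabprotosrv files by the timestamp in their names. It relies only on the `tabprotosrv_yyyy_mm_dd_hh_mm_ss.txt` naming used in this article; the function names and the grouping keys are hypothetical choices of mine.

```python
import re
from datetime import datetime

# File name pattern as quoted in this article: tabprotosrv_yyyy_mm_dd_hh_mm_ss.txt
TABPROTOSRV_RE = re.compile(
    r"^tabprotosrv_(\d{4}_\d{2}_\d{2}_\d{2}_\d{2}_\d{2})\.txt$", re.IGNORECASE
)
LOG_RE = re.compile(r"^log(_\d+)?\.txt$", re.IGNORECASE)


def tabprotosrv_timestamp(name):
    """Return the datetime encoded in a tabprotosrv file name, or None."""
    match = TABPROTOSRV_RE.match(name)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y_%m_%d_%H_%M_%S")


def classify_logs(names):
    """Group file names into log / hyperd / tabprotosrv / other."""
    groups = {"log": [], "hyperd": [], "tabprotosrv": [], "other": []}
    for name in names:
        lower = name.lower()
        if lower.startswith("tabprotosrv"):
            groups["tabprotosrv"].append(name)
        elif lower.startswith("hyperd"):
            groups["hyperd"].append(name)
        elif LOG_RE.match(name):
            groups["log"].append(name)
        else:
            groups["other"].append(name)
    # Oldest first; names without a parsable timestamp sort to the front.
    groups["tabprotosrv"].sort(
        key=lambda n: tabprotosrv_timestamp(n) or datetime.min
    )
    return groups
```

When connected to a database, the last entry in `groups["tabprotosrv"]` is the file with the later timestamp, which is the one described above as holding the queries.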
Usage and Retention
Because the logs are named only by the date and time at which they were created, retaining logs beyond your current work serves very little purpose, especially as multiple workbooks share and write to the same log, making entries hard to attribute. As a result, when the time comes to use the logs, for error tracing or optimisation, you can just go ahead and clear out the logs directory.
Get a source code editor
Given their size, both in row count and in row length, the logs are much easier to work with in a source code editor than in Windows Notepad. Just be sure that your editor can open and refresh files that are actively in use.
Though there are plenty of free and proprietary source code editors available, my favourite is Notepad++, based on its size, minimal resource consumption, great feature list, language detection and support, and the fact that it can be run from a portable exe, so no install is necessary.
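If you would rather script the "refresh" than rely on an editor, the idea can be sketched in Python as below: remember how far you have read, and on each refresh read only what has been appended since. The function name `read_new_text` is hypothetical, and the sketch assumes the logs are UTF-8 text, which I have not verified for every log type.

```python
import os


def read_new_text(path, offset=0):
    """Return (text appended since offset, new offset) for a growing file.

    Assumption: the file is UTF-8; undecodable bytes are replaced
    rather than raising an error.
    """
    size = os.path.getsize(path)
    if offset > size:
        offset = 0  # file was cleared or recreated, so start again
    with open(path, "rb") as handle:
        handle.seek(offset)
        data = handle.read()
    return data.decode("utf-8", errors="replace"), offset + len(data)
```

Start with `text, pos = read_new_text(path)`, then call `read_new_text(path, pos)` each time you want to see what has been logged since the last look.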
Referring back to the table above, usage will depend on the types of data you are connecting to, though the process I suggest is:
- Open the workbook that you are testing, create a new sheet, then save and close. This way, when you next open the workbook, Tableau will not begin performing actions, and thus logging, until you are ready
- Close all open workbooks, to ensure nothing is being logged
- Head on over to the logs directory, and delete all the log files and the crashdumps folder. We now have a clean setup: any logs that appear in the folder will have been created by the workbook we are investigating
- Now open the workbook to be investigated. Tableau will create, as a minimum, a Log.txt, a Hyperd.txt and a tabprotosrv_yyyy_mm_dd_hh_mm_ss.txt
- Remember, if connecting to a database, Tableau will create a second tabprotosrv file when you begin working; this is where your queries will be written, and it is identified by the later timestamp encoded in the file name
- Open all three files in your text editor
- Now move on to the viz or dashboard that prompted the investigation. If connecting to a database, this is when the second tabprotosrv log will be created; go ahead and open this file as well in your text/source code editor, refreshing the other three open logs as you go.
- Work through your workbook until you have finished. You can keep refreshing and investigating the logs as you go, or wait until you have completed your work actions; either way, once complete, move the logs out of the Logs directory to preserve them for further analysis.
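The two housekeeping steps in this process, clearing the folder before you start and moving the logs out when you finish, can be scripted. The Python sketch below is one way to do that. It assumes the log files are the `.txt` files in the folder and that the crash dump folder is named `crashdumps`; the function names are my own. It deletes files permanently, so close Tableau and check the path before running it.

```python
import shutil
from pathlib import Path


def clear_logs(log_dir):
    """Delete every .txt log and the crashdumps folder; return names removed."""
    log_dir = Path(log_dir)
    removed = []
    for path in list(log_dir.glob("*.txt")):
        if path.is_file():
            path.unlink()
            removed.append(path.name)
    crashdumps = log_dir / "crashdumps"  # assumed folder name
    if crashdumps.is_dir():
        shutil.rmtree(crashdumps)
        removed.append(crashdumps.name)
    return removed


def archive_logs(log_dir, archive_dir):
    """Move every .txt log into archive_dir; return the names moved."""
    log_dir, archive_dir = Path(log_dir), Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in list(log_dir.glob("*.txt")):
        if path.is_file():
            shutil.move(str(path), str(archive_dir / path.name))
            moved.append(path.name)
    return moved
```

Run `clear_logs` after closing all workbooks and `archive_logs` once the investigation is complete, pointing the archive at a folder named for the workbook so the logs stay attributable.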
Video Walkthrough (15:43)
Ok, so this video was actually created to analyse the performance differences between using Custom SQL and Tableau's relational model; however, in order to carry out this analysis, all of the above steps needed to be undertaken.
You will likely benefit from watching the complete video, as many of the techniques I demonstrate are those you would need to follow to optimise your workbooks and data sources.
The Confluence built-in video player outputs at a much lower bitrate than the video itself. For the best viewing quality, I recommend that you download the video rather than watching it here, though you can watch here if you wish.