When Tableau throws an error, or starts to underperform, then what? How do you go about investigating what the error was, or why the performance has suddenly fallen off a cliff?
For performance matters there is the performance recorder, though that is only really the first port of call; the best place for all your investigations is actually the Tableau logs (Windows default: Documents\My Tableau Repository\Logs).
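If you find yourself digging through this folder regularly, a few lines of Python can list the logs newest-first so you can jump straight to the file your current session is writing. This is only a sketch: the `LOGS_DIR` path below is the assumed Windows default mentioned above, so adjust it if your repository lives elsewhere.

```python
from pathlib import Path

# Assumed default Tableau Desktop repository location on Windows;
# adjust LOGS_DIR if your "My Tableau Repository" lives elsewhere.
LOGS_DIR = Path.home() / "Documents" / "My Tableau Repository" / "Logs"

def list_logs(logs_dir: Path) -> list[Path]:
    """Return the .txt log files in logs_dir, newest first."""
    files = [p for p in logs_dir.glob("*.txt") if p.is_file()]
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)

for log in list_logs(LOGS_DIR):
    print(log.name)  # e.g. log.txt, hyperd.txt, tabprotosrv.txt
```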
Tableau creates three log files. Only the file named "Log.txt" (or "Log_n.txt") keeps the same function regardless of data source; the others, "hyperd.txt" and "tabprotosrv.txt", change their function slightly depending on what you are connected to:
| Data source | Log file | What it captures |
| --- | --- | --- |
| All data sources | Log.txt | The basic log file. It records everything the Tableau development environment does around graphics rendering, from the layout of the object panes to how data is drawn on the page. It is of less use for error tracing and query performance optimisation, but we can use it to determine exactly how much resource a viz consumed, how long Tableau took to render it, and which CPU and graphics hardware were involved. This is one of the logs Tableau is likely to ask for with a support report, as it contains everything about the PC environment they could need without you shipping them your machine. |
| Flat file (txt, csv, Excel, Access, Google Sheets etc.) | Hyperd.txt | Contains everything related to the query used to build the visualisation: how Tableau prepared the query, the connections made, where the query was sent, and the data pipes and server worker information involved; then the query execution time, the number of CPUs, cores and workers used to compile and execute the query, and whether the query succeeded. If the query errored, the error information is captured here too. |
| Flat file | tabprotosrv.txt | On the other hand, this is much more about the data-source metadata combined with the server hardware environment. No query information is captured here, nor any information about the hardware used by the query, but this log captures more detail around errors, including unexpected or unaccounted-for ones. |
| Database (SQL Server, Oracle, Teradata, PostgreSQL et al.) | Hyperd.txt | Now captures the metadata of the query output as preparation for creating an extract: Hyperd records the queries, their elapsed times, the data-source metadata and so on, to be included with an extract if one is created, much like the execution-plan cache that database servers keep across successive runs of the same query. This information is useful for error tracing, though its primary purpose is query performance optimisation. |
| Database | tabprotosrv.txt (1) | For databases, tabprotosrv now comes as TWO files, created a split second apart, so when using the logs it is important to determine your need and identify and open the correct one. Typically the first tabprotosrv file contains the same information as it does for flat files: the data environment including query time, plus all the table metadata, errors encountered, and so on. |
| Database | tabprotosrv.txt (2) | The second tabprotosrv file contains the other side of what Hyperd.txt captures when querying a flat file: this is the file you need to identify the queries Tableau issues to the data source, and it plays the pivotal role in performance optimisation. |
So now we know where the logs are stored, and broadly what information they contain, let's take a look at using them.
Because log names are tied only to the date and time at which they were created, retaining logs beyond the task at hand serves very little purpose, especially as multiple workbooks will share and write to the same log, making entries hard to attribute. So when the time comes to use the logs, whether for error tracing or optimisation, you can simply clear out the logs directory first.
Given their size, both in row count and in row length, the logs are far easier to work with in a source code editor than in Windows Notepad; just be sure your editor can open, and refresh, files that are actively in use.
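Recent versions of Tableau Desktop write most log entries as one JSON object per line, which makes the logs scriptable as well as readable. Assuming that JSON-lines format (do verify it against your own log files, as the exact keys vary by version and log), a small filter like the sketch below pulls out just the entries mentioning a term of interest, such as "error" or "end-query":

```python
import json
from pathlib import Path

def grep_log(path: Path, needle: str) -> list[dict]:
    """Return parsed log entries whose raw line contains `needle`.

    Assumes the newline-delimited JSON format recent Tableau Desktop
    logs use; lines that are not valid JSON are silently skipped.
    """
    hits = []
    for line in path.read_text(encoding="utf-8", errors="replace").splitlines():
        if needle not in line:
            continue
        try:
            hits.append(json.loads(line))
        except json.JSONDecodeError:
            continue  # banner lines, truncated writes, etc.
    return hits
```

For example, `grep_log(LOGS_DIR / "log.txt", "error")` would give you the error entries as Python dictionaries, ready to inspect or pretty-print.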
Referring back to the table above, usage will depend on the type of data you are connecting to, though the process I suggest is:

1. Head over to the logs directory and delete all the log files, along with the crashdumps folder.
2. We now have a clean setup, where any logs that appear in the folder will have been created by the workbook we are investigating.
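The clean-out step can be scripted so it becomes a reflex before each investigation. A minimal sketch, assuming the default logs path used earlier; close Tableau first so no files are locked:

```python
import shutil
from pathlib import Path

def clear_logs(logs_dir: Path) -> None:
    """Delete every file and subfolder (e.g. crashdumps) in logs_dir,
    so the next Tableau session starts with a clean Logs directory."""
    for entry in logs_dir.iterdir():
        if entry.is_dir():
            shutil.rmtree(entry)  # the crashdumps folder, if present
        else:
            entry.unlink()

# Example (uncomment to run against the assumed Windows default path):
# clear_logs(Path.home() / "Documents" / "My Tableau Repository" / "Logs")
```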
Ok, so this video was actually created to analyse the performance differences between using Custom SQL and Tableau's relational model; however, in order to carry out that analysis, all of the above steps needed to be undertaken.
You will likely benefit from watching the complete video, as many of the techniques I demonstrate are the ones you would follow to optimise your own workbooks and data sources.
The Confluence built-in video player outputs at a much lower bitrate than the source video: for the best viewing quality I recommend downloading the video rather than watching it here, though you can watch here if you wish.