Table of Contents

  1. Always test the performance using the performance recorder
  2. Consider the use-case and extract data where possible
  3. AVOID using published extracts for published workbooks
  4. Additive calculations belong in the source
  5. Tableau is a presentation layer, not a calculation engine: it will not, and should not be expected to, work like Excel
  6. Try to keep your visuals as simple as possible
  7. Do not over-use filters; they are hugely costly
  8. Dashboards work best when there is less ink on the page
  9. Keep your dashboards to the point: avoid huge numbers of charts and a large scrollable page
  10. Keep calculation names short and descriptive, replacing spaces with underscores
  11. Keep your work-space tidy by using folders wherever possible

...

(5.) Always test the performance using the performance recorder

Note: About the Performance Recorder

The recorder simply records all Tableau events to help you identify performance issues.

The recorder is NOT a screen-capture device: it will not record screen activity, webcam, or microphone, and all recorded activity is saved locally.


The performance recorder will tell you everything you need to know about what Tableau, and thereby the data-server, needs to do to build your visual. It can be accessed from Help > Settings and Performance > Start Performance Recording to begin recording actions; return to the same location and select Stop Performance Recording when you are ready to review.

I would recommend you also use the Tableau Server performance recorder when publishing extracts, after you have tuned using the desktop version, to ensure your server and web pipeline are also optimal. This is reached by first activating the toolbar with a URL command:

Open the workbook you want to test at the first chart you want to record, e.g.:

https://tableau.com/#/divisionsite/financeanalytics/views/fiscal_performance/HighLevelSummary?:iid=1

and replace the ":iid=1" parameter with ":record_performance=yes".
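If you test many views, this swap is easy to script. Below is a minimal sketch in Python; the URL is just the illustrative example from above, not a real endpoint:

    # Minimal sketch: build a performance-recording URL from a view URL.
    # The view URL is the illustrative example above, not a real endpoint.
    view_url = (
        "https://tableau.com/#/divisionsite/financeanalytics"
        "/views/fiscal_performance/HighLevelSummary?:iid=1"
    )

    # ":record_performance=yes" activates the recording toolbar for the session.
    recording_url = view_url.replace(":iid=1", ":record_performance=yes")
    print(recording_url)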

Once active, you will be able to start and stop recording at will. Unlike the desktop recorder, which records only the actions that take place between explicit start and stop points, server-side recording is continuous, allowing for better traceback.

(6.) Understand your use-case carefully and consider the best methods for connecting to your data

Tableau provides three connection methods:
  1. Direct
  2. Hyper Extract (Tableau Data Extract (.tde) before v10.5)
  3. Published Extract

Usage frequency, data size, and complexity are the three considerations you need to weigh when choosing the best connection method.

Generally speaking, the Direct connection is often the best, as it ensures:

  • Data security and governance: Tableau only reads the data rather than extracting it
  • You can leverage the power of your data-server / processing farm to reach the best performance
  • You are not limited by Tableau Server memory
  • Tableau does not need to unpack the data into memory before use
  • As the data changes, so does your report

This method has its own considerations, though:

  • The report cannot leave company premises unless the data is cloud-based, as it contains no physical data
  • The report needs to be fully optimised to reduce server impact, which might not suit situations where tactical unblocking reports have been deployed ahead of a completed, optimised strategic solution
  • High user activity could cause the data-server to throttle a single pipeline
  • In the event of a catastrophic data failure, the report will remain blank until the data is restored (this should be protected against through back-ups and correctly defined ETL processes)

Hyper extracts, by comparison, offer a safe middle ground, presenting mostly the reverse of the Direct connection's trade-offs.

However, keep in mind that extracts:

  • Take considerably longer to load and are affected by pipe size, lag, and network use
  • Can suffer performance drawbacks at even small sizes
    • On a personal note, I have noticed on many occasions that extracts begin to lag at anything more than 5 million rows and 20 columns of mixed alphanumeric data; you will need to test this yourself
  • Are formed of in-memory column-store data-sets:
    • Column-store is considerably faster than traditional row-based storage because each column is stored contiguously rather than whole rows sharing a page; however, optimal performance still relies on one or more indexes, and as Tableau has no indexing facility, it must still touch every row in a column to generate the result-set
  • Can only be queried by the workbook to which they are attached: users wanting the same data must create their own extracts in their own workbook(s), resulting in unnecessary reloads of the same data
  • Must be refreshed to remain current
  • Require a completely new extract to be generated before any alterations, additions, or deletions of fields can be used, which can prove very time-consuming even when empty extracts (Create an Empty Extract) are used
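To make the extract life-cycle concrete, here is a minimal sketch using Tableau's tableauhyperapi Python package to build a small .hyper file from scratch; the file, schema, table, and column names are all illustrative assumptions:

    from tableauhyperapi import (
        Connection, CreateMode, HyperProcess, Inserter,
        SchemaName, SqlType, TableDefinition, TableName, Telemetry,
    )

    # Illustrative two-column table; in a column-store, a query on "amount"
    # scans only that column, but still every row of it (no indexes).
    table = TableDefinition(
        table_name=TableName("Extract", "sales"),
        columns=[
            TableDefinition.Column("region", SqlType.text()),
            TableDefinition.Column("amount", SqlType.double()),
        ],
    )

    with HyperProcess(telemetry=Telemetry.DO_NOT_SEND_USAGE_DATA_TO_TABLEAU) as hyper:
        # CREATE_AND_REPLACE rebuilds the file: any field addition, deletion,
        # or alteration means regenerating the whole extract, as noted above.
        with Connection(endpoint=hyper.endpoint,
                        database="sales.hyper",
                        create_mode=CreateMode.CREATE_AND_REPLACE) as connection:
            connection.catalog.create_schema(SchemaName("Extract"))
            connection.catalog.create_table(table)
            with Inserter(connection, table) as inserter:
                inserter.add_rows([["North", 120.0], ["South", 340.5]])
                inserter.execute()

The resulting sales.hyper behaves exactly as described in the list above: fast columnar scans, but frozen until the next refresh.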

Finally, Published extracts.

Published extracts can exist either as extracted data-sets, like Hyper Extracts, or simply as a published data-source holding the connectivity information for a Direct-connect source. They function near-identically to their workbook-based counterparts; being server-based, they add accessibility, allowing all users to connect to and use the same source you are using.

But published extracts and sources are not without their issues either, and should still only be used after careful consideration.
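If you do settle on a published source, the publish step can at least be automated. Here is a hedged sketch using Tableau's tableauserverclient package; the server URL, site, token, project id, and file name are all placeholders you would substitute:

    import tableauserverclient as TSC

    # Placeholders: substitute your own server, site, personal access token,
    # project id, and extract file.
    auth = TSC.PersonalAccessTokenAuth("token-name", "token-secret",
                                       site_id="financeanalytics")
    server = TSC.Server("https://tableau.example.com", use_server_version=True)

    with server.auth.sign_in(auth):
        datasource = TSC.DatasourceItem(project_id="project-id-goes-here")
        # Overwrite replaces the existing published source of the same name,
        # so every workbook connected to it picks up the refreshed data.
        published = server.datasources.publish(
            datasource, "sales.hyper", mode=TSC.Server.PublishMode.Overwrite
        )
        print("Published data source id:", published.id)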

Warning

Seemingly the best method of connectivity for on-prem workbooks, published sources appear the most appealing option; however, their use will generally double, triple, and sometimes quadruple or more your query load times.

Tableau Software is consistently improving this connection method; however, as recently as November/December 2018 I personally worked with two Tableau Software second-line engineers on a performance problem that was attributed entirely to the use of the published extract:

When using a hyper extract of 300 million rows of data, 1.7 GB in size, dashboard turn-around speeds were around 3.2 seconds; however, once loaded to Tableau Server this slowed to around 53 seconds. It was during this deep investigation that we found Tableau Server was converting the 200-line SQL query into a 2,000-line XML document, passing this to the data-source over a secure-HTTP pipe (despite the source being on the same server), and waiting for the results to be returned: Tableau Server decoded the XML into usable code, executed it, and returned the results as XML for the workbook side to decode back into usable data.

Switching back to a published extract did bring load times down to around 12 seconds rather than the desktop's 3.2, but still, this was substantially quicker than 53 seconds.

These findings were reported back to Tableau Software for further investigation and remedy.


(7.) Additive Calculations belong in the source

Additive calculations, and even parts of semi-additive and non-additive calculations, should exist in the source.


Data drives everything.

Including your reports. But data is not always optimised for the job in hand, and often cannot be; so treat data and data performance as the main drivers behind your reports, and don't try to over-wrangle Tableau.
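As a concrete illustration of pushing additive work upstream, here is a small sketch (pandas, with invented example data) that computes a revenue column before the data ever reaches Tableau, instead of as a row-level calculated field:

    import pandas as pd

    # Invented example data standing in for your source table.
    sales = pd.DataFrame({
        "region": ["North", "North", "South"],
        "units":  [10, 5, 8],
        "price":  [2.5, 2.5, 3.0],
    })

    # The additive measure is computed in the source (here a prep step; in
    # practice a column or view in your database), not in Tableau.
    sales["revenue"] = sales["units"] * sales["price"]

    # Tableau then only needs SUM(revenue), which aggregates cleanly at any
    # level of detail instead of evaluating row-level logic per mark.
    print(sales.groupby("region", as_index=False)["revenue"].sum())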

...