Let's Talk: Level-of-Detail Expressions

Beginning with version 9.0 in April 2015, Tableau rolled out what was then its newest calculation feature: level-of-detail expressions. This powerful extension lets developers add an extra level of detail to their visualisations that, crucially, is not bound to the framework of the existing view. In other words, it can be calculated on an altogether different set of dimensions, yet still be incorporated into the current visualisation.

Syntax & Scope

Like table calculations, level-of-detail expressions (LOD expressions) need to be scoped. But unlike table calcs, which can consider cells, panes and windows, LODs are scoped to one or more dimensions. Furthermore, the scope is defined in the calculation itself, and so persists wherever the calculation is used.

Syntax:

The syntax to initiate an LOD remains the same regardless of scope and, just like a regular calculation, you are free to pre-determine the aggregation used or to let Tableau apply the default aggregation (typically Sum) when you add the calculation to the view. The one caveat is that any measures used inside the expression must be aggregated:

Level-of-detail Expression: Basic Syntax
{<Scope> <Dimension 1>, <Dimension 2> etc : <Aggregation>(<Measure>)}
  • All level-of-detail expressions must begin and end with curly braces
  • Only one scope can be defined per expression
  • Multiple scoped dimensions can be included in the expression, separated by commas
  • The scoped dimensions and the aggregate expression must be separated by a colon
  • As with regular calculations, you choose how the data is aggregated; unlike regular calculations, the expression after the colon must be an aggregate
  • You can nest multiple expressions in one overall calculation
  • LOD expressions can be used inside table calculations
Example Syntax
{Fixed Category : (Sum(Sales) - Sum(Profit)) / Sum(Profit)}

This calculates the difference between Sales and Profit, relative to Profit, for each category member (Superstore: Furniture, Office Supplies, Technology).
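As a rough illustration of what the example expression evaluates to, the following Python sketch aggregates first and then applies the arithmetic per category. The rows and the function name `fixed_category_ratio` are made up for illustration; they are not real Superstore figures and not how Tableau computes the result internally.

```python
from collections import defaultdict

# Hypothetical sample rows (not real Superstore figures): (category, sales, profit)
rows = [
    ("Furniture", 100.0, 20.0),
    ("Furniture", 200.0, 40.0),
    ("Technology", 300.0, 100.0),
]

def fixed_category_ratio(rows):
    """Mimic {Fixed Category : (Sum(Sales) - Sum(Profit)) / Sum(Profit)}."""
    sales = defaultdict(float)
    profit = defaultdict(float)
    for category, s, p in rows:
        sales[category] += s
        profit[category] += p
    # Aggregate first, then apply the arithmetic to the per-category totals.
    return {c: (sales[c] - profit[c]) / profit[c] for c in sales}

ratios = fixed_category_ratio(rows)
print(ratios)  # {'Furniture': 4.0, 'Technology': 2.0}
```

Note that the measures are summed before the division, which is why every measure inside the braces has to be wrapped in an aggregation.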

Scope:

Four types of scope exist for level-of-detail expressions:

  • Fixed (fixed grain)
  • Include (lower grain)
  • Exclude (higher grain)
  • None (highest grain)

The Fixed type calculates only on the specified dimension(s), regardless of the dimensions used in the view.

{Fixed Region : Sum(Sales)}
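To make the Fixed behaviour concrete, here is a minimal Python sketch, assuming a view on Region and Segment. The data and the helper names `fixed_region_sum` and `view_with_fixed` are hypothetical; the point is only that every mark receives the total for its Region, whatever else is in the view.

```python
from collections import defaultdict

# Hypothetical rows: (region, segment, sales)
rows = [
    ("East", "Consumer", 10.0),
    ("East", "Corporate", 20.0),
    ("West", "Consumer", 5.0),
]

def fixed_region_sum(rows):
    """Mimic {Fixed Region : Sum(Sales)}: one total per region."""
    totals = defaultdict(float)
    for region, _segment, sales in rows:
        totals[region] += sales
    return dict(totals)

def view_with_fixed(rows):
    """One mark per (region, segment), each carrying its region's fixed total."""
    per_region = fixed_region_sum(rows)
    marks = sorted({(r, s) for r, s, _ in rows})
    return [(r, s, per_region[r]) for r, s in marks]

marks = view_with_fixed(rows)
# [('East', 'Consumer', 30.0), ('East', 'Corporate', 30.0), ('West', 'Consumer', 5.0)]
```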

Include calculates at the grain of the view's dimensions plus those named in the expression. As a result, the effect of the Include type is only noticeable when the result is re-aggregated in the view with something other than Sum.

{Include Segment : Sum(Sales)}

The polar opposite of the Include scope, Exclude removes the named dimensions from the calculation, so it calculates at a higher grain than that of the given view.

{Exclude Category : Sum(Sales)}
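A small Python sketch of the Exclude behaviour, assuming a view on Category and Region. The rows and the function name `exclude_category_sum` are invented for illustration only.

```python
from collections import defaultdict

# Hypothetical rows: (category, region, sales)
rows = [
    ("Furniture", "East", 10.0),
    ("Technology", "East", 30.0),
    ("Furniture", "West", 5.0),
]

def exclude_category_sum(rows):
    """Mimic {Exclude Category : Sum(Sales)} in a view on Category and Region."""
    # Dropping Category from the view's dimensions leaves Region as the grain.
    per_region = defaultdict(float)
    for _category, region, sales in rows:
        per_region[region] += sales
    # Each (category, region) mark receives the total for its region.
    marks = sorted({(c, r) for c, r, _ in rows})
    return [(c, r, per_region[r]) for c, r in marks]

marks = exclude_category_sum(rows)
# [('Furniture', 'East', 40.0), ('Furniture', 'West', 5.0), ('Technology', 'East', 40.0)]
```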

The highest grain of all: where no scope (and therefore no dimension) is defined, the result is a single value calculated across the whole result set (for Sum, the grand total).

{Sum(Sales)}

Notice the difference here: using Include with a Sum aggregation is functionally the same as a standard Sum; only when the aggregation is set to Avg do we see a difference.
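The Sum versus Avg difference can be sketched in Python, assuming a view on Region only and the expression {Include Segment : Sum(Sales)}. The rows and the helper `include_segment_sums` are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (region, segment, sales)
rows = [
    ("East", "Consumer", 10.0),
    ("East", "Consumer", 30.0),
    ("East", "Corporate", 20.0),
]

def include_segment_sums(rows):
    """Mimic {Include Segment : Sum(Sales)} in a view on Region only."""
    sums = defaultdict(float)
    for region, segment, sales in rows:
        sums[(region, segment)] += sales
    # Collect the per-segment totals under each region (the view's grain).
    by_region = defaultdict(list)
    for (region, _segment), total in sums.items():
        by_region[region].append(total)
    return dict(by_region)

segment_totals = include_segment_sums(rows)["East"]  # [40.0, 20.0]
plain_sales = [s for r, _, s in rows if r == "East"]

sum_of_include = sum(segment_totals)                         # 60.0
plain_sum = sum(plain_sales)                                 # 60.0, identical
avg_of_include = sum(segment_totals) / len(segment_totals)   # 30.0
plain_avg = sum(plain_sales) / len(plain_sales)              # 20.0, differs
```

Summing the per-segment totals simply rebuilds the regional total, whereas averaging them gives the average segment total rather than the average row.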

Architecture

This is the tech part. Let's look at how level-of-detail expressions work behind the scenes.

Consider this view, the SQL required to generate it, and the resulting plan (with some of the key measurements highlighted):

Select
Region
,Segment
,Sum(Sales) As Sales

From Superstore
Where Category In('Office Supplies', 'Technology')

Group By Region, Segment

Are LOD Expressions the Only Option?

Let us now consider a second view, this one uses the same Sum calculation, a Fixed LOD calculation and finally, a Window_Sum:

The view, the SQL and, finally, the plan. The SQL is quite big:
SELECT 
[t0].[Region] AS [Region]
,[t0].[Segment] AS [Segment]
,[t0].[sum:Sales:ok] AS [TEMP(TC_)(623101596)(0)_Window_Sum]
,[t3].[__measure__2] AS [sum:Calculation_1163336127751135237:ok_Fixed]
,[t0].[sum:Sales:ok] AS [sum:Sales:ok_Sales]

FROM (
	SELECT 
	[Orders].[Region] AS [Region]
	,[Orders].[Segment] AS [Segment]
	,SUM([Orders].[Sales]) AS [sum:Sales:ok]
	
	FROM [dbo].[Orders] [Orders]
	WHERE ([Orders].[Category] NOT IN ('Furniture'))
	GROUP BY [Orders].[Region], [Orders].[Segment]
	
	) [t0]
INNER JOIN (
	SELECT 
	[t1].[Region] AS [Region]
	,[t1].[Segment] AS [Segment]
	,SUM([t2].[__measure__1]) AS [__measure__2]
		
	FROM (
		SELECT 
		[Orders].[Category] AS [Category]
		,[Orders].[Region] AS [Region]
		,[Orders].[Segment] AS [Segment]
			
		FROM [dbo].[Orders] [Orders]
		WHERE ([Orders].[Category] NOT IN ('Furniture'))
		GROUP BY [Orders].[Category], [Orders].[Region],[Orders].[Segment]
		
	) [t1]
	INNER JOIN (

		SELECT 
		[Orders].[Category] AS [Category]
		,[Orders].[Segment] AS [Segment]
		,SUM([Orders].[Sales]) AS [__measure__1]
			
		FROM [dbo].[Orders] [Orders]
		GROUP BY [Orders].[Category],[Orders].[Segment]
	
	) [t2] ON ((([t1].[Category] = [t2].[Category]) OR (([t1].[Category] IS NULL) AND ([t2].[Category] IS NULL))) 
				AND (([t1].[Segment] = [t2].[Segment]) OR (([t1].[Segment] IS NULL) AND ([t2].[Segment] IS NULL))))
		
	GROUP BY [t1].[Region], [t1].[Segment]
) [t3] ON ((([t0].[Region] = [t3].[Region]) OR (([t0].[Region] IS NULL) AND ([t3].[Region] IS NULL))) 
		AND (([t0].[Segment] = [t3].[Segment]) OR (([t0].[Segment] IS NULL) AND ([t3].[Segment] IS NULL))))

I think we can agree that, for such a tiny table, the code is quite extensive. But did you notice that both the Fixed and the Window_Sum calculations produce the same output?

Let's look at this in some more detail:


Fixed LOD Expression:


Can you guess the code?
SELECT 
[t1].[Region] AS [Region]
,[t1].[Segment] AS [Segment]
,SUM([t2].[__measure__1]) AS [__measure__2]
		
FROM (
	SELECT 
	[Orders].[Category] AS [Category]
	,[Orders].[Region] AS [Region]
	,[Orders].[Segment] AS [Segment]
			
	FROM [dbo].[Orders] [Orders]
	WHERE ([Orders].[Category] NOT IN ('Furniture'))
	GROUP BY [Orders].[Category], [Orders].[Region],[Orders].[Segment]
		
) [t1]
INNER JOIN (

	SELECT 
	[Orders].[Category] AS [Category]
	,[Orders].[Segment] AS [Segment]
	,SUM([Orders].[Sales]) AS [__measure__1]
			
	FROM [dbo].[Orders] [Orders]
	GROUP BY [Orders].[Category],[Orders].[Segment]
	
) [t2] ON ((([t1].[Category] = [t2].[Category]) OR (([t1].[Category] IS NULL) AND ([t2].[Category] IS NULL))) 
			AND (([t1].[Segment] = [t2].[Segment]) OR (([t1].[Segment] IS NULL) AND ([t2].[Segment] IS NULL))))
		
GROUP BY [t1].[Region], [t1].[Segment]

And now the plan for the Fixed LOD expression:

Notice the double table scan: yup, regardless of whether the table is optimised, this query generates two scans.
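The shape of that query can be reproduced on a toy table. The sketch below uses Python's built-in sqlite3 module with a simplified version of the Fixed LOD query; the four rows are invented, and SQLite's null-safe `IS` operator stands in for the longer `= ... OR (... IS NULL AND ... IS NULL)` join condition in the generated SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (Category TEXT, Region TEXT, Segment TEXT, Sales REAL)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?, ?)",
    [  # Hypothetical rows, not real Superstore data
        ("Technology", "East", "Consumer", 10.0),
        ("Technology", "West", "Consumer", 5.0),
        ("Office Supplies", "East", "Consumer", 20.0),
        ("Furniture", "East", "Consumer", 100.0),
    ],
)

query = """
SELECT t1.Region, t1.Segment, SUM(t2.m) AS fixed_sales
FROM (
    -- First scan: the filtered keys at the grain needed for the join
    SELECT Category, Region, Segment
    FROM Orders
    WHERE Category NOT IN ('Furniture')
    GROUP BY Category, Region, Segment
) t1
INNER JOIN (
    -- Second scan: the fixed aggregate, computed without the filter
    SELECT Category, Segment, SUM(Sales) AS m
    FROM Orders
    GROUP BY Category, Segment
) t2 ON t1.Category IS t2.Category AND t1.Segment IS t2.Segment
GROUP BY t1.Region, t1.Segment
ORDER BY t1.Region, t1.Segment
"""
result = conn.execute(query).fetchall()
print(result)  # [('East', 'Consumer', 35.0), ('West', 'Consumer', 15.0)]
```

Both subqueries read from Orders, which is where the two scans come from, and the join is what glues the fixed aggregate back onto the view's rows.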

Window_Sum Calculation:

SELECT 
[Orders].[Region] AS [Region]
,[Orders].[Segment] AS [Segment]
,SUM([Orders].[Sales]) AS [sum:Sales:ok]
	
FROM [dbo].[Orders] [Orders]
WHERE ([Orders].[Category] NOT IN ('Furniture'))
GROUP BY [Orders].[Region], [Orders].[Segment]

The Window_Sum plan:

This plan is a lot smaller, almost 60% smaller than the LOD plan, with only a single table scan (or index scan / seek if indexed) and, crucially, no joins.

So, when considering performance and server impact, which calculation would be the better to use here?


Yup, the Window_Sum. Why?

Well, because the query behind the Window_Sum is nothing more than a straightforward aggregation; the windowing function itself is applied inside Tableau, on the result set that the query returns.
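A minimal Python sketch of that idea: the database returns the aggregated rows once, and the window total is then computed locally over them. The result set is invented, and partitioning by Region is an assumption made for illustration; in Tableau the partitioning depends on how the table calculation is configured.

```python
from collections import defaultdict

# Hypothetical result set of the single GROUP BY query: (region, segment, sum_sales)
result_set = [
    ("East", "Consumer", 30.0),
    ("East", "Corporate", 20.0),
    ("West", "Consumer", 5.0),
]

def window_sum_by_region(result_set):
    """Apply a WINDOW_SUM-style total per Region to already-aggregated rows."""
    totals = defaultdict(float)
    for region, _segment, sum_sales in result_set:
        totals[region] += sum_sales
    # Every row in the partition carries the partition total; no extra query is issued.
    return [(r, s, totals[r]) for r, s, _ in result_set]

windowed = window_sum_by_region(result_set)
# [('East', 'Consumer', 50.0), ('East', 'Corporate', 50.0), ('West', 'Consumer', 5.0)]
```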

But despite the clear performance drawback of level-of-detail expressions, you must always consider your use case. Window calculations still need your data to be structured such that the scope can be applied, whereas LOD expressions, particularly Fixed expressions or Min/Avg/Max aggregations, give you the power to provide additional information without a supporting structure.

Furthermore, it is possible to include LOD expressions inside a window calc if needed; this is particularly useful when performing lookups and ranking operations.

Concluding Remarks

It is plain to see that level-of-detail expressions are a very valuable tool to include in your development arsenal. That said, as with all Tableau features, developers need to consider whether their use is justified in the view they are currently building.

It is clear from the demonstrations above that alternatives do exist and, where possible, these alternatives should be explored, especially when connecting Tableau to large datasets in excess of 1 million rows, as these costly queries will greatly increase time-to-render and may also impact other resources.

I would also suggest you take time to assess the calculations you are building and see whether the LOD expression can be rewritten and included in the underlying data set.