What are those DAX expressions with the X at the end? They are called DAX iterators and once you get past the slightly confusing way to write the expressions, they are a fundamental building block for any tabular model including Power BI, PowerPivot and MS Analysis Services. Here is how to master SUMX in DAX.
What is the difference between SUMX and SUM in DAX?
It’s much different than it looks.
SUM looks at a single column as a whole and provides a result.
SUMX takes a different approach. You give SUMX a table of data that tells SUMX how often you want to evaluate an expression. SumX then sums each result of all of the iterations of that expression. It’s hard to write so let’s go through some examples.
We will start with a very basic table of sales data. The table has item quantities, item prices, discounts and taxes. Let’s say that we want to know what the extended price is before discount and taxes.
The basic equation will be Item Price * Quantity for each line. So now we need to figure out how to write that in DAX using SUMX.
Where you should not use SUMX (DAX)
If you are simply trying to calculate the sum of a column of data, SUMX should be avoided. Here’s a sample we’ll call expensiveSUMXexample.
ExpensiveSUMXExample:=SUMX(SalesLines,SalesLines[Quantity])
Result: 217 <— Correct
It works so why not use it? The reason is that SUMX is evaluating the expression for every line in the table using the Formula engine. In our table of a whopping 7 rows, the cost to the processor is nil. When we start to apply this to millions of rows it gets expensive.
Instead, we can use Sum(SalesLines[Quantity]) which provides exactly the same result while looking at the column as a whole.
Why not just add a calculated column, it’s so much easier?
You can, and there will be times that you will be required to do so. I’m just going to say for now. If you can avoid it, avoid it. More on this in another post.
The SUMX (DAX) equation
The Microsoft documentation says this:
SUMX(<Table>, <Expression>)
We have two arguments to deal with and it’s not obvious what they are so let’s break it down.
<Table>
The table can be confusing but once you get the hang of it is pretty basic. The table creates the level at which we want to create our steps or iterations. Also, keep in mind that the table can be a single column and when possible I recommend this.
So, if we want to calculate the expression for every single line. We can use several approaches, here are a couple:
- Just provide the table ‘SalesLines’
- Provide just the ID’s as there is a unique ID for every line. We’ll write this as Values(‘SalesLines'[ID]).
<Expression>
It is natural to think that you just enter the expression as per above. So it would be IteratorItem Price * Quantity. However, when you try to do this you will find that DAX won’t work. It won’t even suggest the columns. Now what?
Let’s go through some examples to find out what works.
Example 1 – aka Epic fail #1
Well let’s see if we can use Max(Item Price) * Max(Quantity) DAX will let us enter it so we get.
So the total equation is:
Example 1:=sumx(VALUES(SalesLines[id]),MAX(SalesLines[Item Price])*MAX(SalesLines[Quantity]))
It seems to be working until we look at the total. We know the correct value to be 9416 so we clearly have a problem.
Let’s change the table and see if it makes a difference.
Example 2 – Full table provided to SUMX – and still doesn’t work
Example 2:=sumx(‘SalesLines’,MAX(SalesLines[Item Price])*MAX(SalesLines[Quantity]))
So, in this case, we’ve gone away from using the single column of distinct values and we’re giving it the entire table. And we get the exact same incorrect result.
Maybe Max() is the problem. Why did we even use Max() in the first place?
Example 3 – Change the equations from Max() to Sum() – and still doesn’t work!?!?!
Example 3:=sumx(VALUES(‘SalesLines'[id]),SUM(SalesLines[Item Price])*SUM(SalesLines[Quantity]))
Ok, all I did here was change the individual expressions from Max to Sum. And I’ve run it for both table approaches. And the result is getting worse!
Now we have a result of 449,624. This is clearly a long way away from our target of 9416.
The solution – create the expressions outside of the SUMX first
Ok, so now we’re going to create stand-alone DAX measures for the maximum Item Price and the maximum Quantity.
MaxQty:=max(SalesLines[Quantity])
MaxItemPrice:=max(SalesLines[Item Price])
These are exactly the same as what we had put in our equations for Example1 and Example 2. We just created them as separate measures.
Example 4 – How to master SumX the right way – IT WORKS!!!
Example4:=sumx(VALUES(SalesLines[id]),[maxqty] *[maxitemprice])
And finally, this approach works. The replacement of measures which have exactly the same functionality causes SUMX to be evaluated properly.
Tips for Table Selection in SUMX
Think about granularity
When choosing the table for any Dax Iterator you want to think about what level you want your expression to be calculated at. In our case, we needed to calculate the expression separately for every line in the table. The best approach was to use Values(SalesLines[id]) which provides a table with one line for each entry.
In other cases, we may want to calculate at the order level. (Note, this is not appropriate for our expression) To iterate at the order level we would use Values(SalesLines[Order]) as our table.
Keep the table Narrow
The calculation will perform better if the table you use has a limited number of columns. This is why Values(Table[Column]) or All(Table[Column)] are often a good approach. There will also be times where you don’t have enough granularity in a single column to make this work.
Tips for the Expression in SUMX
Step one should always be to create measures that you want to use in the expression argument of your SUMX measure.
Why did we use Max() in the individual measures?
In the example expression, we could have used also used min, sum and average without affecting the result. This works when SUMX is iterating through the rows the expression Max(Quantity) is only applied to one line of the column. Depending on your expression and the result you desire you may need to select a different function.
The same applies to all of the DAX iterators
There is a huge number of DAX iterators (All designated by the X at the end of the term) and they all operated in a similar fashion. If you can master SUMX, the others become much easier to master.
Thanks for dropping in, you can read more about NFORM Business Intelligence and don’t forget to subscribe for more great content and how-to’s.
Leave a Reply