DAX (Data Analysis Expressions) is the language used to write formulae in Power BI when you want to create new Calculated Columns or Measures. In many respects it is similar to the way you write formulae in Excel. Stated more simply, DAX helps you create new information from data already in your model. While it is easy to make visuals in Power BI from your existing data model, we regularly want to see information for particular time periods, or with specific filters applied or create new calculated values based on our raw data. There are some super free online resources available to learn DAX, which will help you to turn your simple Power BI model into something great. Learning just a few basic functions of DAX can assist you with unlocking numerous new bits of knowledge from your data. Here I introduce my favourite DAX functions that I use in the Accounts Receivable Solution for Sage 50 and the Power BI Template for Sage 50.
SUM vs SUMX
SUM is a simple function that just adds up the total of all rows in a column. It knows nothing about any other columns in the table or any other related tables. It is fast and efficient.
SUMX is more complex. It examines every single row of the colum, is aware of the other columns in the table and is also aware of other related tables. Like SUM, by default it simply adds the rows in a column of numbers. You can however use any valid DAX expression in place of the column name. This expression can be a calculation across a number of columns in this and/or related tables. As such, it it much more powerful and flexible than SUM.
Where SUM and SUMX will give you different answers
Let's say you have two columns of data [UnitPrice] and [Quantity] and you want to calculate the total [ExtendedPrice] by multiplying the two.
You can't write SUM ( Sales[UnitPrice] * Sales[Quantity] ) as SUM just expects one column name.
SUM ( Sales[UnitPrice] ) * SUM ( Sales[Quantity] ) will give you the wrong answer as you will be multiplying the total of all the unit prices with the total of all the quantities.
SUMX ( Sales, Sales[UnitPrice] * Sales[Quantity] ) will calculate the Extended Price for each row and then add up all the Extended Prices to give you the correct total.
There are some circumstances where SUM may be faster and more efficient than SUMX but we can avoid this by using filters and the CALCULATE function as illustrated here.
In my models, unless I would have a good reason to do otherwise (and I haven't found one yet), I use SUMX rather than SUM. I do this for three reasons.
- SUMX is so much more flexible and powerful than SUM
- Using CALCULATE and FILTER, I can avoid bad DAX where I am iterating within an iteration.
- I expect my code to be read and understood by other people (for example where I am supplying a template file to a customer), so I try to use the same basic constucts with the same functions again and again in my DAX code.
|SUMX (<table>, <expression>)|
The SUMX function takes its first argument as a table, or an expression that returns a table. The second argument is a column that contains the numbers you want to sum, or an expression that evaluates to a column. Only numbers in the column are counted. Blanks, logical values, and text are ignored.
The two measures to evaluate [Debtors] below are equivalent and will simply add up the balance of all of my customers.
Debtors = SUMX ( BICustomers, BICustomers[Balance] )
Debtors = SUM ( BICustomers[Balance] )
Let’s say though that we wanted to do a simple (or complex) manipulation on a column before summing it. For this we need to use SUMX. We discussed earlier the case of calculating the ExtendedPrice from UnitPrice and Quantity. Here is another example, taken from my Accounts Receivable template. In this example, I just want to switch the sign for my Receipts data so that in a chart for “Receipts this Month” the values are positive (in my dataset invoices have positive values and receipts negative values).
GrossAmount*-1 = SUMX ( BITransSplits, BITransSplits[GrossAmount] * -1 )
Since I am performing a calculation (multiplying by minus 1) on my [GrossAmount] column, I need to use SUMX rather than SUM.
The CALCULATE function evaluates an expression in a context that is modified by specific filters. In his DAX 101 tutorial on YouTube, Alberto Ferrari calls CALCULATE the "Queen of DAX functions" because it is so powerful, like the queen on a chessboard.
|DAX = CALCULATE ( <expression> , <filter1>,<filter2>…)|
My favorite construct is to use CALCULATE and SUMX together. This is incredibly powerful and I use it throughout my models. What it does is to take the already flexible SUMX function and allow you to modify the results by applying filters. You can use this technique very easily to create aggregations.
The following example is taken from the Suntico BI Template for Sage 50 and is the formula that I use to calculate SalesThisYTD that you see on the first page of the report.
The first parameter,'BITransSplits’ identifies the table that we wish to aggregate.
The second parameter,[BITransSplits[NetAmount]’, represents the column of the table that we wish to aggregate.
The third parameter "BITransSplits[IsReceivable]" is a filter to select only those transaction records that are sales invoices, credit notes or discount adjustments to sales invoices. BITransSplits[IsReceivable] is a Calculated Column in the BITransSplits table that is TRUE if the transaction is a sales invoice (or credit or discount to an invoice).
The fourth parameter "BIDates[ThisYTD]" is another filter to select only those dates so far this financial year. BIDates[ThisYTD] is a column in the BIDates table that is TRUE if the transaction date is any date up to and including yesterday in the current financial year. Notice that this filter is referencing a different table 'BIDates' to the SUMX function that references 'BITransSplits'. This is ok, because there is a relationship between these two tables.
As you can see, this is a very simple yet flexible construction and I use it very often. Notice that the two filters I used were Boolean. That is they were columns with TRUE/FALSE values. This makes for very easy DAX code writing and understanding.
There are a couple of rules which applies to the CALCULATE function:
- Filter parameters cannot reference measures
- Expressions cannot use any functions that scan or return a table.
The CALCULATE function is typically used with aggregation functions, and although the filter parameters are optional, at least one is typically used.
Filters are extremely important in Power BI. Every time you slice or dice your data, or perform an aggregation over time, you are applying a filter or filters to your data. The power of "our queen of functions" CALCULATE comes from the ability to add filters to an expression.
Since filters are so important in Power BI, then so are filter functions (there are quite a few, of which FILTER is one).
The FILTER function is used to return a filtered subset of a table anywhere where a DAX expression expects a table, or where the DAX expression is expecting a filter (for example at the end of a CALCULATE function).
You use FILTER to reduce the number of rows in the table that you are working with, and use only specific data in calculations. FILTER is not used independently, but as a function that is embedded in other functions that require a table as an argument.
The example below shows the FILTER function in action in the Suntico BI Template for Sage 50. In this case the expression is evaluating net sales for the last 52 weeks. The FILTER function is being used as a filter at the end of a CALCULATE function.
The <table> argument ALL (BIDates) is removing all filters from the BIDates table, before applying the <filter> argument that only selects the previous 364 dates from the date in the current context.
|Gross Sales - running total 52 weeks =
BIDates[Date] > MAX( BIDates[Date] ) - 364,
BIDates[Date] <= MAX( BIDates[Date])