I have covered the topic of Lineage in DAX several times in the past (in other articles). Today I thought of writing a single article dedicated to the lineage.
How will you know, Power BI has several different modes of operation, some of which do not require data to be loaded into the data model. This article does not refer to operating modes that do not load data; specifically refers to the use of Power BI when data is loaded into the data model.
What does the lineage mean?
The lineage is pronounced in 3 syllables: lin-e-age (hear it spoken here). The word Lineage most often used when referring to the bloodline, which means "the linear descent of an ancestor; ancestry or extraction ". With this definition in mind, lineage in DAX refers to virtual tables (created at runtime execution within a formula) that keep a link to their origin (they preserve their lineage). This concept of lineage does not exist in traditional database tools like SQL and, Thus, it's quite unique to DAX.
Physical tables and relationships in DAX
Before moving on to the subject of lineage, Physical tables and relationships are worth taking a moment to reflect on because understanding how they work is critical to understanding lineage. When you upload data to Power BI or Power Pivot, data is loaded into physical tables using one of several different approaches, including Power Query, Enter data and new table (Power BI), Power Query, Linked tables or Power Pivot data. load (Excel). During the charging process, data is first compressed and then loaded into RAM within Vertipaq Storage Engine.
Once the data is loaded, it is possible to create relationships between the tables as shown below.
These relationships are very important in DAX; are those that allow the data from the different tables to act as if they were in a single table. The way relationships work can be deceptively complex to master and can be quite confusing for newcomers to the DAX language.. The most important concepts to understand are
- Relationships are always 1 to much (there is also a relationship type of 1 a 1, but its use is not common).
- There is no support for many or many relationships in DAX.
- Filter propagation
- Relationships allow filters to flow from the side 1 from the relationship to the multiple side of the relationship, but not the other way around.
- You can enable bi-directional cross filtering for a relationship, but nevertheless, that is outside the scope of this article.
Filter the spread in action
Better to demonstrate the above key points with an example. Using the Adventure Works data model shown above, I created the following 2 measures:
Total sales = SUM (Sales[ExtendedAmount])
Product Count = COUNTROWS (products)
Single table filtering
The matrix below has the products[Category] Product table column placed in Rows in a matrix and [Count of Products] added to stock section.
The products[Category] column and the [Count of Products] measure both come from the Products table. The products[Category] column filters the Products table and then the measure ‘count’ how many rows. Technically, the way the process works is as follows:
- A filter is created and applied to the Products[Category] column. In this case, the filter comes from Rows in the Matrix, but it could come from a segmenter, other visual element, the filter card or within a CALCULATE function.
- After filtering the column (and the whole table), the [Count of Products] counts how many rows exist in the filtered copy of the table for each row in the array.
Filtering multiple tables from one to many
Things get a bit more complex when columns and measures come from different tables. In the image below the [Total Sales] The measure is related to the Sales table, but the filters (matrix rows) come from the Products table.
Technically, the way the process works now is as follows:
- A filter is created and applied to the Products[Category] column (Rows in the Matrix).
- Because there is a physical relationship of 1 to many among the Products
table and Sales table, the filter “it spreads” from the Products table
to the sales table. All products that belong to each product[Category]
in the Sales table, Thus, are also filtered for each row of the Matrix.
- Measure [Total Sales] Then it is evaluated for each row in the Matrix and
returns sales only for those filtered products that have records in the
Note: there is 189 components in the Products table that have no sales for these products. This can happen, of course.
Tables don't automatically filter backwards
Relationships are always automatically filtered from one relationship side to the many side, as shown above, but they don't leak in the other direction. To prove the point, I have removed the Products[Category] Matrix column below and Aggregate Sales[OrderDate] instead.
Technically, the way the process works now is as follows:
- A filter is created and applied to sales[OrderDate] column (Rows in the Matrix).
- There is a physical relationship of many to 1 between the Sales table and the Products table. Filters are not “propagate” in this direction and, Thus, the Sales table is filtered but the Products table is not.
- Measures [Count of Products] Y [Total Sales] then they are evaluated against their respective tables and return the results shown above.
Of course, bi-directional cross filtering can be enabled in Power BI Desktop and this will force the filters to propagate back and forth, but there are many reasons not to, including potential negative impacts on performance. as potential problems with circular relationships.
The DAX language
Tables and scalar values
- When you write a measure or calculated column in DAX, the result must always be a scalar value. A measure and a calculated column cannot return a table as the final result. Note that you can display the measurement results in a visual like a table or an array, but the value returned by the measure is always a scalar value.
- You can use DAX as a query language by using a suitable query tool, como DAX Studio. When using DAX as the query language, the query result is always a table. A DAX query cannot return a scalar value. Note that you can display a scalar value as a result of a DAX query by first creating a single row, single column table and placing the scalar value in that table.
Table functions in DAX
- There are many functions that return tables in DAX even if they cannot be returned as a final result in a measure or column. Examples include ALL, VALUES, FILTER, ABSTRACT, to name a few.
- Table functions form the basis of DAX as a query language and can be used natively to return the resulting table in a query tool.
- You can also use table functions with the New Table feature in Power BI to create a new physical table in the data model.
- You can use table functions within measures and calculated columns to help generate the required result as long as the result is a scalar value.
Let's see an example. The following measure returns a scalar value: total sales in Australia. But nevertheless, note that the line 4 below is a table function, in this case FILTER.
The FILTER function takes a table as its first parameter (in this case the Territory Table) and then apply a filter to the table. Later, the CALCULATE function takes the filtered copy of the Territory table and propagates the filters to the Sales table (through the relationship of 1 to much). The line 4 above is a virtual table. The table is created in memory during the execution of the formula. It is used to perform the task in question. When the job is done, the table ceases to exist.
Note: Power BI has sophisticated caching capabilities and virtual tables can remain cached for reuse in subsequent evaluations.
Virtual tables and lineage in DAX
Agree, enough talk about the relationship, What does this have to do with lineage ?!
Lineage refers to a virtual relationship that exists between virtual tables and physical tables in DAX. One more time, an example is the best way to explain this in detail. Consider the following:
The territory table in Adventure Works looks like this.
Note that Country is a column, not a table. There is no table containing all unique countries in the database.
Now consider the following measure:
Average invoice value = SPLIT ([Total Sales],[Total Invoices])
When this measure is added to a table with Territory[Country] in rows, The results are shown below.
But here's the point. The total of $ 748,73 it is not the average of all values for each country, but it is the average of all invoices in all countries. This is not right or wrong, it is just a statement of fact that the total in this example is the average value of the invoice in all countries. So what if I want to know the average invoice value by country and then also the average of country values. The answer would look like this.
With this second measure, the total of $ 758,76 is the average of all country averages above. It's not right or wrong, it is only different from the first measure. If you are familiar with X functions, you will know that to get this answer you need to check the list of countries, one at a time, and calculate the average for each country before finally calculating the average of the averages. . AVERAGEX is a great function to do this.
The DAX formula I used to create this second measure above is as follows:
The first parameter within an X function is always a table or table function. In this case, is a VALUES function that returns a list of all unique countries in the current filter context (line 3 up). This new table created by VALUES is a virtual table that preserves the lineage of the table from which it was created. In other words, the virtual table can be considered to have a virtual relationship with the Territory.[Country] column where he was born. You can consider it like this (Please note that this is an illustration that shows you how you can conceive of the virtual table; in reality, you can't see it there and it never physically exists).
In the illustration above, the virtual table created by VALUES is shown in blue as 1, and the virtual relationship (lineage) It is shown as 2. None of these 2 stuff is physically visible in the data model, but nevertheless, you can think of them conceptually as shown above.
Here is the formula again.
With the conceptual model illustrated above in mind, the above DAX formula works as follows:
- AVERAGEX creates a row context over the virtual table created by VALUES on the line 3.
- Because each measure has an implicit computation involved, the [Average Invoice Value] The measure creates a context transition and converts the row context to a filter context.
- Now there is a filter context in the unique country for the first iteration and, due to the lineage between the virtual VALUES table and the Main Physical Territory table, the filter is propagated through the Territory table to the Sales Table.
- Then, the measure is evaluated for the single row in the virtual VALUES table.
- The process is repeated for each row of the virtual table VALUES. For every step, implicit computation takes advantage of the lineage between the virtual table and the physical table.
- Once all the rows in the virtual table have been evaluated, AVERAGEX finds the average of all the values in the previous steps.
Hopefully you can see that the concept of lineage is very powerful. Means you don't need to load physical dimension tables for everything you might need in your calculations. Instead, you can create virtual tables on the fly within your formulas and, but nevertheless, make these virtual tables behave exactly like a physical table and relation behave.
if(!f._fbq)f._fbq = n;n.push=n;n.loaded=!0;n.version=’2.0′;