Connection to Text Analytics from Power BI fails?

Last weekend I was a little bit confused. I deployed a report that uses the Azure Cognitive Service “Text Sentiment” to the Power BI service and wanted to configure a scheduled refresh. But I couldn’t set up the credentials, because the wizard told me that the URL was not found:

Failed to update data source credentials: Web.Contents failed to get contents from 'https://southcentralus.api.cognitive.microsoft.com/text/analytics/v3.1-preview.1/languages' (404): Resource Not Found

I tested the connection with Postman, and it worked … The reason was very simple: I needed to deactivate the connection test, and after that the refresh worked:

If you want to test it with the newest API, you can go to the Quick Start menu, where you must select the region. The region must be entered into the URL:

https://southcentralus.dev.cognitive.microsoft.com/docs/services/TextAnalytics-v3-1-Preview-1/operations/Languages/console
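
For reference, the M query behind such a report calls the region-specific endpoint with Web.Contents, roughly like in this sketch. The key, region and request body are placeholders, and the shape of the call is an assumption based on the v3.1-preview.1 API, not a copy of my report:

let
    // assumption: key and region are placeholders, replace them with your own values
    apiKey   = "<your-subscription-key>",
    baseUrl  = "https://southcentralus.api.cognitive.microsoft.com/text/analytics/v3.1-preview.1",
    body     = Json.FromValue([documents = {[id = "1", text = "Hello Power BI"]}]),
    // RelativePath keeps the base URL stable, which also matters for the credential check in the service
    response = Web.Contents(baseUrl,
        [
            RelativePath = "languages",
            Headers = [#"Ocp-Apim-Subscription-Key" = apiKey, #"Content-Type" = "application/json"],
            Content = body
        ]),
    result   = Json.Document(response)
in
    result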

Here is some more information about this feature:
https://docs.microsoft.com/en-us/azure/cognitive-services/text-analytics/tutorials/tutorial-power-bi-key-phrases
https://www.flip-design.de/?p=677
https://azure.microsoft.com/en-us/services/cognitive-services/

Why should you be aware of the connection details in a Power BI file?

Yesterday a colleague of mine had a problem setting up a scheduled refresh for a Power BI file on a Power BI Report Server. The refresh failed every time with an error message saying that the credentials for the data source are missing. But the credentials were set up for the data source. It looked like we had run into a bug from the October release (https://community.powerbi.com/t5/Report-Server/Power-BI-Report-Server-October-2020-update-has-been-released/td-p/1503433). But that bug was already fixed, and we had installed the updated release.
The fix was easy: when you create connections to the same data source, but with a different notation, you get two different connections:

data source settings

For the Power BI service this is not an issue, but the Report Server thinks there are two connections, while you will only see one of them in the data source wizard. So, when you harmonize the notations and publish the report again to the Report Server, the scheduled refresh works correctly.
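
As a hypothetical illustration (the server and database names are made up), the two steps below point to the same database, but because the server name is written differently, the PBIX ends up with two data source entries – and the Report Server only lets you configure one of them:

let
    // same database, two notations for the server name -> two connections in the PBIX metadata
    SourceShortName = Sql.Database("MySqlServer", "AdventureWorksDW"),
    SourceFqdn      = Sql.Database("mysqlserver.mydomain.local", "AdventureWorksDW")
in
    SourceShortName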

A first look at Azure Purview and how to connect local data sources

Azure Purview is the next generation of the Azure Data Catalog. Now we have a Public Preview of this service, and there are some changes in how local data sources are catalogued.
Yesterday I started with Power BI, which is a good way to test a catalogue with all data source types. Currently the Power BI support is a little bit limited: for an import mode report you only see the dataset, without any tables, columns and so on. If you use a DirectQuery dataset, you see the source, but also without tables and columns. Another thing was that you get every workspace and report, regardless of whether it is deleted or active. But we are talking about a preview, and with the Data Catalog there was no way at all to get information about Power BI.

Okay, let’s go back to the main topic … With Purview you can register online and on-premises data sources. For the online sources there is no need to install or register additional components. For on-premises sources you need to install a self-hosted integration runtime. It is the same runtime that we know from Data Factory. The link for the runtime gives you more information about how this setup behaves.
When you register the runtime in the Purview portal, you get the keys and the download link:

The download is a little bit confusing, because you can download different versions; I used the latest one. After starting the setup and providing the key, the setup finds the runtime that is configured in the portal.

Currently you can discover SQL Server sources, but no sources of the type Analysis Services. I think one reason for this behavior is that the runtime comes from Data Factory and only covers its sources? I’m not sure, but I think a connector will come.
Next we need a SQL-authenticated login to discover the SQL Server. The user needs high permissions; you can check the details in the link above. After that, you need to store the login information of this account in your Azure Key Vault:

It is so great that you can only use credentials from the Key Vault – no passwords stored in the code :)

After that I created a new data source collection on the Purview page, with a data source of the type SQL Server.

For the scheduling you need the Key Vault entry:

After the discovery of my source I get all tables of the database:

It is a little bit confusing that my SQL Server name is shown as MSSQLSERVER; the reason is that I installed the server as an unnamed (default) instance. Now you can browse the tables and columns, set up classifications and so on.
I think this tool gives us more flexibility than the Data Catalog, and the way you inventory local and online data sources is a little bit different. Hopefully we will get the GA version very soon 😊

DirectQuery Realtime Datasets and the composite model

I really like the Realtime datasets in Power BI, especially because it is really easy to create one 😊 In the past you created one and used it in the service, or used it via the Desktop with the Datasets connector. But there were not many possibilities to do more with the data, e.g. to merge it with other sources. So you built “complex” processes via Azure, pushed the Realtime data into a SQL DB, and only then were you able to merge the data.
With the DirectQuery mode for datasets it is much easier. You can connect to the Realtime dataset, import other data and analyze everything together.

A small example:

Create a streaming dataset in Power BI

With the provided examples for PowerShell and so on, it is very easy to add data to this dataset.

Now you can connect to this dataset via Power BI Desktop (you must enable the preview feature) and load other data from different data sources into your model:

Then you can create relationships between the data and analyze them.

Unfortunately it is not possible to modify the metadata of the streaming dataset, for example to add translations via the Tabular Editor. That is only possible for the data from the other sources.

Building a Slowly Changing Dimension with Power BI

Okay, I think this is normally not a topic for Power BI, because I usually build an SCD in a data mart with T-SQL and not with M/Power Query in a self-service environment. But in some cases it can be very useful, for example for a PoC. Another reason in a self-service scenario can be that you have a small amount of data and you want to analyze it with a history and a correct mapping.

Well, let’s take a look at the data:

On the left you see the employees and which employee is assigned to which department, with a timestamp. On the right side you have the facts, so Wilma has a sales amount on the 5th of November. The correct department for this timestamp is the Sales department. When you build a data model in Power BI with these tables, you can only join them via the EmpId, so you only see the employee, not which department made the sale, and you do not get the correct assignment. So you need a surrogate key which is unique per timestamp and can be used to merge the two tables.

We need a data model like this:

… with this result:

The Counter Table

First we must import the Emp table and group it by EmpId with a row count. We need this to check how many rows we have for each employee and to mark the latest record.

The name of this table is “MaxIndex”

This table is only needed for our ETL process, and we can disable its loading because we don’t need it in our data model.
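
A minimal sketch of this counter query could look like this, assuming the raw employee table is available as EmpSource (a placeholder name) and the key column is EmpId:

let
    // placeholder for the raw employee source (Excel, CSV, ...)
    Source = EmpSource,
    // one row per employee with the number of assignment rows
    #"Grouped Rows" = Table.Group(Source, {"EmpId"}, {{"MaxIndex", each Table.RowCount(_), Int64.Type}})
in
    #"Grouped Rows"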

The Emp-Table

We start with the employee (Emp) table, load it into Power Query and sort it by EmpId and FromDate:

  1. To sort by two or more columns you need to do some M scripting:

#"Sorted by EmpId and FromDate" = Table.Sort(#"Changed Type",{{"EmpId", Order.Ascending}, {"FromDate", Order.Ascending} }),

2. Next we must group the data by the EmpId

3. Then we must add an index

  4. Then we can remove all other columns except the new Custom column and expand it.
  5. Next we can merge it with the MaxIndex table and expand the row count (MaxIndex).
  6. Now we must determine the next row, so we add a new column with the index plus one.
  7. After that we can join this table with itself by EmpId, matching the new column against the index. Then we expand the joined column; we only need the FromDate column, which we rename to EndDate.
  8. Then we can check whether the index equals the MaxIndex column; if it matches, we know this is the last entry.
  9. Next we adjust the EndDate: if it is null, we set it to 12/31/9999.
  10. Now we can add the surrogate key.
  11. After that there is only some renaming left, and we subtract one day from the end date so that the start and end dates don’t overlap. A condensed M sketch of steps 2 to 10 follows below.
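
Putting steps 2 to 10 together, a condensed sketch of the Emp query could look like the following. It is only a sketch under assumptions: the source placeholder (EmpSource), the column names (FromDate, Department) and the surrogate key name (EmpSK) are taken from the screenshots or invented, and your generated steps will look slightly different:

let
    // the first steps (source, changed type, sort) are the ones shown above
    Source = EmpSource,
    #"Changed Type" = Table.TransformColumnTypes(Source, {{"EmpId", Int64.Type}, {"FromDate", type date}}),
    #"Sorted by EmpId and FromDate" = Table.Sort(#"Changed Type", {{"EmpId", Order.Ascending}, {"FromDate", Order.Ascending}}),

    // 2./3. group by EmpId and add an index (starting at 1) inside every group
    Grouped = Table.Group(#"Sorted by EmpId and FromDate", {"EmpId"},
        {{"Custom", each Table.AddIndexColumn(_, "Index", 1, 1), type table}}),
    // 4. expand the grouped rows again (EmpId stays as the group key here)
    Expanded = Table.ExpandTableColumn(Grouped, "Custom", {"FromDate", "Department", "Index"}),

    // 5. merge with the MaxIndex counter table
    MergedMax = Table.NestedJoin(Expanded, {"EmpId"}, MaxIndex, {"EmpId"}, "Max", JoinKind.LeftOuter),
    WithMax = Table.ExpandTableColumn(MergedMax, "Max", {"MaxIndex"}),

    // 6./7. self join on EmpId and Index + 1 to pick up the FromDate of the next row as EndDate
    WithNext = Table.AddColumn(WithMax, "NextIndex", each [Index] + 1, Int64.Type),
    SelfJoined = Table.NestedJoin(WithNext, {"EmpId", "NextIndex"}, WithNext, {"EmpId", "Index"}, "Next", JoinKind.LeftOuter),
    WithEndDate = Table.ExpandTableColumn(SelfJoined, "Next", {"FromDate"}, {"EndDate"}),

    // 8./9. mark the last entry per employee and close open records with 12/31/9999
    MarkedLast = Table.AddColumn(WithEndDate, "IsLast", each [Index] = [MaxIndex], type logical),
    FixedEndDate = Table.ReplaceValue(MarkedLast, null, #date(9999, 12, 31), Replacer.ReplaceValue, {"EndDate"}),

    // 10. surrogate key over the whole table; the final renaming and the minus-one-day
    // adjustment of the end date (step 11) are left out of this sketch
    WithSurrogateKey = Table.AddIndexColumn(FixedEndDate, "EmpSK", 1, 1)
in
    WithSurrogateKey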

The fact table

  1. After you have loaded the fact table into Power Query, you can merge it with the Emp table by using the EmpId. You can expand the columns start date, end date and the surrogate key.
  2. After that you can add an M step to filter the data:

#"Filtered Rows" = Table.SelectRows(#"Expanded Emp", each [Date] >= [Emp.FromDate] and [Date] < [Emp.EndDate]),

  3. Now you can remove the start and end date columns and load the data into your model (a condensed sketch of the whole query follows below).
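
Here is the condensed sketch of the fact query, again with assumed names (FactSource as the raw source placeholder, Emp as the dimension query from above, EmpSK as the surrogate key):

let
    // placeholder for the raw fact source (Excel, CSV, ...)
    Source = FactSource,
    // merge with the Emp dimension on the business key
    Merged = Table.NestedJoin(Source, {"EmpId"}, Emp, {"EmpId"}, "Emp", JoinKind.LeftOuter),
    #"Expanded Emp" = Table.ExpandTableColumn(Merged, "Emp", {"FromDate", "EndDate", "EmpSK"}, {"Emp.FromDate", "Emp.EndDate", "Emp.EmpSK"}),
    // keep only the dimension row whose validity interval contains the fact date
    #"Filtered Rows" = Table.SelectRows(#"Expanded Emp", each [Date] >= [Emp.FromDate] and [Date] < [Emp.EndDate]),
    // the helper dates are no longer needed, only the surrogate key stays in the model
    Removed = Table.RemoveColumns(#"Filtered Rows", {"Emp.FromDate", "Emp.EndDate"})
in
    Removed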

That’s it. There are many steps to do, and you will find the PBIX file for download. I think the steps could be simplified further, but for this approach it’s okay. But please be aware of the amount of data: if you have more than 1 million fact rows, please build it with T-SQL or another database engine.

Power BI: Translations and the XMLA Endpoint

When you need translations for your Power BI reporting in a multilingual scenario, you need a Premium capacity. When you have edited your PBIX with Tabular Editor and added translations, this works in Power BI Desktop, but unfortunately not with Power BI Pro or Free. If you use an SSAS cube with translations as your data source, you shouldn’t have this problem and you can use Pro licenses. 😊

But in most cases in a self-service BI environment, Excel, CSV and so on are used as data sources. These sources are not able to provide translations, so you must do something in your data model with M or DAX. With the Tabular Editor from Daniel Otykier you can add the needed translations to your data model.

With Power BI Premium you can use the XMLA endpoint. This endpoint gives you the possibility to read the data from your model and re-use it for Reporting Services and so on, and now you are also able to write back to your model. You can write back perspectives, translations and measure groups. To try this, you must enable the Read/Write option for your Premium capacity. You can also try this with an Azure SKU for Power BI Embedded – with an A4 or higher.


After that, and after assigning the capacity to your workspace, you can consume the translations. Otherwise you get the original labels from your model.

In the properties dialogue of your workspace or your dataset, you get the connection URL for your XMLA endpoint. With this URL you can connect to your Power BI data model via the Tabular Editor.

Now you can add the translations for your tables, columns and measures.

If you want to add a default translation for every object in your data model, Daniel has written an awesome script to do this: https://tabulareditor.com/2019/10/23/TabularEditorTricks3.html
From SQLBI you can get a good editor that lets you easily edit the exported translation file without an active connection to the model: https://www.sqlbi.com/tools/ssas-tabular-translator/
So another department can do the translations for you, and after that you can re-import the file to add them.

After saving your modifications, you can consume your report:

And here is the report with the added translations: