In self-service BI projects, business users often import data from unverified sources like Excel. In a project with such a source, the data can have problems, because in a CSV or Excel file the data format can be anything. Take a look at the following screenshot: there are two tables. One holds the facts (sales per date and branch). The second table provides the branches, and here I’ve added a row with the ID “-2”. This serves as a key which doesn’t exist in most sources and can be used to map error rows.
So, if you import the data, everything is fine and you can gain insights from it:
But after some time it can happen that a user inserts data which cannot be used by the data model:
Then you get the following error when you refresh the data: the row is not loaded, and the end users think everything is fine.
To avoid this problem, you can edit your Power Query ETL process and replace the error value with “-2”:
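In Power Query (M), such a replacement can look like the following — a minimal sketch, assuming the previous query step is called Source and the key column is named BranchId:

```m
// Replace any error values in the BranchId column with the
// error key -2 that exists in the branch dimension
ReplacedErrors = Table.ReplaceErrorValues(Source, {{"BranchId", -2}})
```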
Now the row is loaded and mapped to the key from the dimension:
To notify the users, or to build a QA dashboard, you can write some DAX code to count the errors and show a notice to the users:
Missings =
VAR missings =
    CALCULATE (
        COUNTA ( Facts[BranchId] ),
        FILTER ( Branch, Branch[BranchId] = -2 )
    )
VAR message =
    IF ( missings > 0, "Some branches have incomplete data", "" )
RETURN
    message
Very often I need a demonstration model for presentations or workshops, and I use the Adventure Works SSAS model. Mostly this is for customers based in Germany, and I’m not happy when I have to use a model in English. With PowerBI.com (Premium) or Analysis Services (Tabular) you can use translations. So, with Tabular Editor and the SSAS Tabular Translator, I’ve created a JSON file for the Adventure Works Power BI model which provides a translation into German: https://github.com/philipplenz/AdventureWorksASTranslation
Okay, the title looks like marketing stuff! At my last session for the Power BI user group of the Netherlands, an attendee asked me if it’s possible to combine Power BI, the Azure Sentiment Analysis and the incremental refresh functionality. I said yes, but to be honest, I wasn’t sure. Now I have tested it, and I can say YES! The story for this session was very simple: you receive mails, put them into storage, read the data into a Power BI report, use the Azure Sentiment Analysis API to get the language and the sentiment, and analyze the data. This blog post shows this step by step, including the incremental refresh of a Power BI dataset.
The mailbox: For this approach I used an Office 365 mailbox for the received mails:
Next, I created an Excel file in my OneDrive.
Then I created a Power Automate flow which connects to the mailbox and to the Excel file:
It’s a good idea to convert the body to text, because when you receive HTML mails, you may get confusing results from the text analysis. You can test it: when you send an email to this mailbox, the trigger should write the data to the file. Next, I created a Text Analysis service in Azure. For how to do this, please take a look at this post.
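To give an idea of what a call to this service looks like from Power Query, here is a rough sketch of a sentiment request via Web.Contents; the region in the endpoint and the API key are placeholders you have to replace with the values of your own Text Analysis service:

```m
let
    // Placeholders – use the endpoint region and key of your own service
    endpoint = "https://<region>.api.cognitive.microsoft.com/text/analytics/v3.0/sentiment",
    apiKey   = "<your-api-key>",
    // The API expects a JSON body with a list of documents
    body     = Json.FromValue([documents = {[id = "1", text = "I love this product!"]}]),
    response = Web.Contents(endpoint,
        [
            Headers = [
                #"Ocp-Apim-Subscription-Key" = apiKey,
                #"Content-Type" = "application/json"
            ],
            Content = body
        ]),
    result = Json.Document(response)
in
    result
```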
To make testing easier, I also created with Power Query (M) a calculated column with the import date/time:
If you created this column with DAX, it would be recalculated on every refresh. This way, the column is only refreshed when the row is imported or gets an update.
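In M, such a column can be added like this — a sketch assuming the previous query step is called Source and the new column is named ImportDate:

```m
// DateTime.LocalNow() is evaluated when the row is loaded,
// so rows that are not refreshed keep their original import timestamp
AddedImportDate = Table.AddColumn(Source, "ImportDate",
    each DateTime.LocalNow(), type datetime)
```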
After I set up the incremental refresh, published the report to PowerBI.com and configured the refresh, it’s time to wait for new data from your mailbox and for a new timeframe … After that you can view your data:
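For completeness: incremental refresh requires the two datetime parameters RangeStart and RangeEnd and a filter step on a datetime column. A minimal sketch, assuming the mails carry a ReceivedTime column:

```m
// Power BI substitutes RangeStart/RangeEnd per partition,
// so only rows in the current refresh window are loaded
FilteredRows = Table.SelectRows(Source,
    each [ReceivedTime] >= RangeStart and [ReceivedTime] < RangeEnd)
```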
When you check the metrics in Azure for the sentiment service, you only see calls for newly received mails. The Azure API is available in different pricing plans, including a free plan without cost, but it is limited. When you receive many emails, you reach the limit very quickly, because every mail needs three calls (Language, Key Phrases, Sentiment). With incremental refresh you only send new rows to the API, and you can save money. The refresh is also very quick, because not all the data has to be refreshed.
I think this can be a good solution for companies to analyze the sentiment of customers who send them emails about their services, products, etc. You only need a Power BI Pro license, a free Azure subscription and an Office 365 account.
Last weekend I was a little bit confused. I had deployed a report using the Azure service “Text Sentiment” to the Power BI service and wanted to configure a scheduled refresh. But I couldn’t set up the credentials, because the wizard told me that the URL was not found:
Failed to update data source credentials: Web.Contents failed to get contents from 'https://southcentralus.api.cognitive.microsoft.com/text/analytics/v3.1-preview.1/languages' (404): Resource Not Found
I tested the connection with Postman, and it worked … But the reason was very simple: I needed to deactivate the connection test, and after that, the refresh worked:
If you want to test it with the newest API, you can go to the Quick Start menu, where you must select the region. The region must be entered into the URL:
Yesterday a colleague of mine had a problem setting up a scheduled refresh for a Power BI file on a Power BI Report Server. The refresh failed every time with an error message saying that the credentials were missing for the data source, although the credentials were set up. It seemed that we had run into an error from the October release (https://community.powerbi.com/t5/Report-Server/Power-BI-Report-Server-October-2020-update-has-been-released/td-p/1503433). But that error had been fixed, and we had installed the updated release. The cause was simple: when you create connections to the same data source, but with a different notation, you get two different connections:
For the Power BI service this is not an issue, but the Report Server thinks there are two connections, and you will only see one of them in the data source wizard. So, when you harmonize the notations and publish the report again to the report server, the scheduled refresh works correctly.
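To illustrate what “another notation” means, assume two reports point at the same server, but one uses the short name and one the fully qualified name (the server and database names here are made up):

```m
// Report A
Source = Sql.Database("SQLSRV01", "SalesDB")

// Report B – same server, different notation,
// which the Report Server treats as a second data source
Source = Sql.Database("sqlsrv01.contoso.local", "SalesDB")
```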
Azure Purview is the next generation of the Azure Data Catalog. Now we have a public preview of this service, and there are some changes in how local data sources are catalogued. Yesterday I started with Power BI, which is a good way to create a catalogue with all data source types. Currently Power BI support is a little bit limited, because for an import mode report you only see the dataset without any tables, columns and so on. If you use a DirectQuery dataset, you see the source, but also without tables and columns. Another thing was that you get every workspace and report, regardless of whether it was deleted or active. But we are talking about a preview, and with the Data Catalog there was no way at all to get information about Power BI.
Okay, let’s go back to the main topic … With Purview you can register online and on-premises data sources. For the online sources there is no need to install or register additional components. For on-premises sources you need to install a self-hosted integration runtime. It is the same runtime we know from the Data Factory. At the link for the runtime you will get more information about the behavior of this setup. By registering the runtime at the Purview portal, you get the keys and the download link:
The download is a little bit confusing, because you can download different versions. I used the latest one. After starting the setup and providing the key, the setup finds the runtime which is configured in the portal.
Currently you can discover SQL Server sources, but no sources of type Analysis Services. I think one reason for this behavior is that the runtime comes from the Data Factory, and this limits the supported sources? I’m not sure, but I think a connector will come. Next, we need a SQL-authenticated login to discover the SQL Server. The user needs high permissions; you can check this at the link above. After that, you need to store the login information of this account in your Azure Key Vault:
After that, I created on the Purview page a new data source collection with a data source of type SQL Server.
For the scheduling you need the Key Vault entry:
After a discovery of my source, I get all tables of the database:
It is a little bit confusing that my SQL Server name is MSSQLSERVER; I installed the server as an unnamed (default) instance, which is the reason. Now you can discover the tables and columns, set up classifications and so on. I think this tool gives us more flexibility than the Data Catalog. Inventorying local data sources as well as online sources is a little bit different now. Hopefully we will get the GA version very soon 😊