Wednesday, August 12, 2015

Why I still like Excel PowerPivot models in SharePoint, even though Tabular solutions in SSAS are better

[2015-Aug-12] It's quite obvious that there is currently a growing audience for all the latest changes and updates within the Microsoft Power BI sphere. I think someone has already built a data model showing that the number of tweets about Power BI on a given day prevailed over other, more heavily discussed software development topics. However, I would still like to share some thoughts that came as a result of building PowerPivot data models in Excel/SharePoint vs. Tabular solutions in SSAS.

This post was also inspired by (Tabular Models vs PowerPivot Models) and (When to choose tabular models over PowerPivot models).

So here are a few risks or disadvantages that we may face with Excel PowerPivot data models in SharePoint:

Performance/Availability to users

  • Excel PowerPivot in SharePoint is just a physical file that internally holds a backup of a tabular model; that backup automatically gets restored into memory and is used whenever users interact with a linked PowerView report or with the Excel file itself. Once this in-memory tabular database hasn't been used for a while, it gets dropped by an internal garbage collector. That means that after a period of inactivity, users have to wait again while that "hibernated" data model is loaded back into memory before it becomes available to them (a quick way to check which workbook models are currently resident in memory is sketched right after this list).
  • A server Tabular model is a different type of in-memory model, and it may still be cleaned up by an internal memory manager; proper performance tests comparing an Excel PowerPivot model vs. a Tabular model still need to be built. However, preliminary tests showed that the Tabular model was more responsive during user interaction.
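
To illustrate the point about models being evicted from memory, here is a minimal PowerShell sketch (not from the original post) that lists the workbook databases currently loaded on the SharePoint PowerPivot SSAS instance; the instance name is an assumption:

# Hypothetical instance name - replace with your own SharePoint PowerPivot instance.
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices") | Out-Null
$srv = New-Object Microsoft.AnalysisServices.Server
$srv.Connect("SPSERVER\POWERPIVOT")
# Each restored workbook model shows up as a database (usually with a GUID-suffixed name);
# if your model is not in this list, it has been evicted and will be reloaded on next use.
$srv.Databases | Select-Object Name, EstimatedSize, LastProcessed
$srv.Disconnect()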


Stability

  • SharePoint Excel PowerPivot workbooks live under the control of Excel Services, PowerPivot Services, and Analysis Services within the SharePoint environment and may be affected by user activity on other SharePoint objects; SharePoint disk and memory space add further dependencies to the life of Excel PowerPivot models.
  • So having a dedicated SQL Server Tabular environment, set apart from SharePoint, may bring peace of mind to all users who rely on their reporting data being available.


Scalability 

  • Even with the 2 GB limit of a standalone Excel file, we face additional file size limits on what SharePoint will handle (2 GB is still possible in SharePoint, but it has to be explicitly configured; see the sketch after this list).
  • With a tabular database we only have to deal with how its hardware infrastructure is organized (Hardware Sizing a Tabular Solution (SQL Server Analysis Services)).
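
For reference, a minimal PowerShell sketch (not from the original post) of raising the SharePoint upload limit; the web application URL and the 2,047 MB value are assumptions, and the Excel Services "Maximum Workbook Size" setting of the trusted file location has to be raised separately as well:

Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue
# Raise the maximum upload size (in MB) for the web application that hosts the PowerPivot Gallery.
$webApp = Get-SPWebApplication "http://sharepoint.contoso.com"   # assumed URL
$webApp.MaximumFileSize = 2047
$webApp.Update()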


Development/Deployment

  • I like both the Excel development environment and Visual Studio for tabular solutions. In both places I can define all the metrics, build relationships between tables, and define the necessary calculations.
  • Deployment can be tough with Excel PowerPivot workbooks: I've built a PowerShell script that prepares and updates metadata within each Excel PowerPivot data model prior to uploading it to different environments in SharePoint (http://datanrg.blogspot.ca/2015/06/updating-powerpivot-model-and-powerview.html). However, it still depends on how the Excel XML document is structured in Excel 2013, and this may change in later versions.


Manageability

  • You're on your own with managing an Excel PowerPivot model: it's you and Excel. Yes, SharePoint will let you schedule a data model refresh once a day; however, these schedules may be jeopardized by frequent updates and interactions with the Excel files themselves (https://support.microsoft.com/en-us/kb/2874957).
  • Tabular solutions can be backed up, restored, detached, and attached, and PowerShell cmdlets can be applied. Most importantly, we can trigger a partial or complete data refresh of our server data models multiple times a day (XMLA scripting is a great help here; see the sketch after this list).
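
For illustration, a minimal sketch (assuming the SQLASCMDLETS module that ships with the SQL Server Analysis Services PowerShell components, and made-up server/database names) of the kind of management automation that becomes possible with a server Tabular model:

Import-Module SQLASCMDLETS -DisableNameChecking

# Back up the tabular database to an .abf file.
Backup-ASDatabase -Server "TABSERVER\TABULAR" -Name "SalesModel" -BackupFile "D:\Backups\SalesModel.abf" -AllowOverwrite

# Trigger a full data refresh via an XMLA command; this can be scheduled several times a day.
$xmla = @"
<Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Type>ProcessFull</Type>
  <Object>
    <DatabaseID>SalesModel</DatabaseID>
  </Object>
</Process>
"@
Invoke-ASCmd -Server "TABSERVER\TABULAR" -Query $xmla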


Security

  • When an Excel PowerPivot model file is deployed to SharePoint, users either do or do not have full access to the data model, depending on the permission scope defined for them in SharePoint.
  • With a tabular solution we can utilize all the benefits of a role-based security model and manage access to our server data models at a more granular level (see the sketch after this list).
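
As a simple illustration of the role-based model, here is a hedged AMO sketch (all names are assumptions) that adds a Windows group to an existing reader role of a tabular database:

[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices") | Out-Null
$srv = New-Object Microsoft.AnalysisServices.Server
$srv.Connect("TABSERVER\TABULAR")
$db   = $srv.Databases.FindByName("SalesModel")
$role = $db.Roles.FindByName("Report Readers")       # an existing role with Read permission
$role.Members.Add((New-Object Microsoft.AnalysisServices.RoleMember("CONTOSO\Report Users"))) | Out-Null
$role.Update()
$srv.Disconnect()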


With all this being said, I still see that an initial PowerPivot data model can be designed and developed in Excel; however, once it's defined and tested, it's wiser to migrate this model to a tabular solution for better manageability and data availability for end users.


Monday, July 27, 2015

Single value filters and custom error messages in Power BI

"Let me be the one for you, otherwise we need to work on communication". Basically this was a very brief description of a recent use case for a power view report which had been later replicated in a Power BI slide as well.

Users were aware of the PowerView / Power BI limitation (as of 2015-July-27) of not having filters restricted to a single value selection. PowerView slicers could be a good solution for this, since they allow that single-value selection functionality (Power BI slicers still need to be improved); however, regular filters were the preferred choice in order to leave more room for useful visualizations within the report.

However, users often got confused and selected multiple values for a particular filter, which might lead to improper data discoveries. And then they simply asked: "Yes, we understand the PowerView behaviour, and we understand that we make mistakes with multiple filter value selections. Then show us an error message in the report so that we are aware of the mistake."

It would have been easier to tell them to operate their filters properly; however, before taking the easy path, I decided to explore the PowerView / Power BI functionality a bit more to see if it would allow some custom error messages.

First, I pulled sample data with countries' population statistics from a Wikipedia page (https://simple.wikipedia.org/wiki/List_of_countries_by_population) into my data model, and also loaded some lookup data for parameters.
Then I created a simple visualization of this data along with 2 slicers: Population Group & Region.

Then, for the 'Country by Population' table, I added 3 DAX measures:
Region Selected = COUNTROWS(FILTERS(Regions[Region]))
PopulationGroup Selected = COUNTROWS(FILTERS(PopulationGroup[Population Group]))
MultipleSelected = if(OR([Region Selected]>1,[PopulationGroup Selected]>1),1,0)

And one measure for the CurrentDate table:
ShowError = if([MultipleSelected], [MultipleSelected]*MIN([MessageType])+1, MAX([MessageType]))
Then I applied ShowError to the top-left table region with the CurrentDate table data, and MultipleSelected to the main chart, in order to control the visual behaviour of that top table region and the main chart correspondingly.
And it worked! If I selected more than one value in either of the slicers, the top table region would show the defined error message (instead of the report date value; please see the slide with this visualization at the beginning of the post) and the main chart would become blank.
Also, while hiding the chart's bars, I found what looks like a bug in Power BI charts in general (which I believe will be fixed soon): data labels were still visible while the data bars were not, which didn't make any sense.
This very same report was originally developed in PowerView:

And both Power BI and PowerView files can be found here: 

So at least some communication back to users was possible in Power BI, and I think there will be more and more improvements in this Power BI area :-)

Happy data adventures !

Tuesday, July 21, 2015

Setting specific colors for PowerView charts

I was listening to Phil Collins' "True Colors" in my car this morning, and it reminded me of an issue that PowerView currently has: it doesn't let you set specific colors for different elements of its visuals. More and more announcements are being made about coming changes and improvements in this area for Power BI; however, if you're still using SharePoint or Excel based PowerView charts, this might be a challenge for you.


There is a way to change colors in PowerView by applying different themes (Format Power View reports); however, I don't really like this approach, since one theme change affects all of your PowerView slides and there is not much control over setting different colors for different sections of your report.

So I have come up with a quick workaround for how certain colors can still be chosen for my PowerView visuals.

For this demo test I have pulled 2 data sets from Wikipedia using PowerQuery and pushed them further into a PowerPivot model.
1) Top 10 Countries by Population (https://simple.wikipedia.org/wiki/List_of_countries_by_population)
2) Top 10 Countries by Territory (https://simple.wikipedia.org/wiki/List_of_countries_by_area)




Then I created simple PowerView charts to show this data:
And here is the big question: what if I want to color the top chart with Color A and the second one with Color B? There aren't many options for this in the existing PowerView interface. But I've noticed that each color theme has a particular order of colors, so even if I'm not allowed to set a specific color, I still have some control: I can choose colors from the existing theme and apply them based on my customers' needs.

The default color theme starts with Blue, then Red, Yellow, and the other colors. So I added blank columns (Color 1 & Color 2) to the PowerPivot model's tables:

And then I could place those empty columns into the Values set of my PowerView chart, so that by default the [Color 1] column would be colored Blue, the [Color 2] column Red, and the real Value would become Yellow. Here is the trick: since both [Color 1] & [Color 2] are empty columns, no colors are actually rendered for them, and only the Yellow color prevails in the chart :-)

You can play a little bit more with this example; I've saved all PowerQueries, PowerPivot model and PowerView charts in this Excel file (Top10Countries.xlsx).

Eventually your PowerView charts can get close to your "true colors" and that's why I like it :-)

Thursday, June 18, 2015

Updating PowerPivot model and PowerView report with PowerShell

Before I start describing the problem that my PowerShell script helps to resolve, here are a few blog posts by other people that inspired me to look at the whole issue of automating changes to Excel PowerPivot models (especially when you have many such Excel files and changing them manually would take a significant amount of time):

So here is my problem: we have built several Excel PowerPivot models and we want to use them in different testing environments (DEV, UAT, etc.) after we save those models to SharePoint. For each environment the models have to be deployed and each data connection has to be repointed to a different data source; and this can only be done manually, by opening each of the Excel files and making all the changes through the PowerPivot management window. If you have many Excel files, this manual change takes a significant amount of time.

Let me show my current approach to developing SharePoint PowerView reports based on PowerPivot models:

(A) Excel PowerPivot model deployed to SharePoint (.xlsx file) 
(B) BISM data connection to (A) Excel PowerPivot model (.rsds file) 
(C) SharePoint PowerView report based on (B) data connection (.rdlx file) 



Deployment prerequisites: 
- PowerPivot data models source their data from different databases in each environment
- PowerPivot/PowerView files for different environments are to be saved in different SharePoint folders (or SharePoint servers).

Deployment issues: 
- References to the source databases have to be changed in each of the (A) Excel PowerPivot models.
- Once an updated PowerPivot model is deployed to a different SharePoint folder (or server), the reference to this Excel file has to be updated in the (B) BISM data connection as well.
- The (C) SharePoint PowerView report contains an internal reference to its data source that has to be changed via the SharePoint user interface.
Updating many files at every deployment can be an issue, because it's a manual process and it takes time if we have to deploy many reporting items.

We can always do things manually 

Required changes for deployment: 
- Change the database references in the PowerPivot models
- Change each BISM data connection's reference to point to the Excel PowerPivot file in its new folder (or server)
- Change each PowerView report's reference to point to the BISM data connection file in its new folder (or server)

Manual process to perform required changes: 
  • Each Excel file with a PowerPivot model has to be opened in desktop Excel, and then each data source connection has to be changed through the PowerPivot management window.
  • Change the connection string via the SharePoint menu “Edit Data Source Definition”.
  • Change the data source reference via the SharePoint menu “Manage Data Source”.





Let’s try to automate something:  
  • There is no out-of-the-box way to update PowerPivot data connections programmatically; normally this has to be done manually for every single Excel file.
  • The BISM connection file is an XML file and can be updated programmatically prior to its deployment to SharePoint.
  • The PowerView .rdlx file is an archive that contains an XML \Reports\report.rdl file and can also be updated programmatically prior to its deployment to SharePoint.



Possible way to change PowerPivot models programmatically (a short sketch of the first steps follows this list):
1) Each Excel (.xlsx) file can be seen as an archive (.zip) that contains an \xl\model\item.data file, and this item.data file holds the PowerPivot model of that Excel file.
2) This item.data file is effectively a backup (.abf) of an SSAS tabular database, and we can restore it (as item.abf) to a database on a tabular SSAS server.
3) Then we can change the settings of the newly restored database (including its connection strings).
4) And finally back it up again and save the file back into our Excel workbook.
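
A hedged sketch of getting to that embedded model backup, assuming .NET 4.5's ZipFile class is available (the file paths are made up):

Add-Type -AssemblyName System.IO.Compression.FileSystem

$xlsx = "D:\Deploy\SalesModel.xlsx"
$work = "D:\Deploy\SalesModel_unzipped"

# Treat the workbook as a zip archive and extract it to a working folder,
# keeping the original .xlsx untouched.
Copy-Item $xlsx "$xlsx.zip"
[System.IO.Compression.ZipFile]::ExtractToDirectory("$xlsx.zip", $work)

# The PowerPivot model lives in \xl\model\item.data; copy it out as .abf so that
# it can be restored to a tabular SSAS instance with an XMLA Restore command.
Copy-Item (Join-Path $work "xl\model\item.data") "D:\Deploy\SalesModel_item.abf"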



PowerShell script steps: 

PowerPivot change PowerShell automation (a hedged sketch of the XMLA part, steps 5-8, follows this list): 

  1. Scan through a folder for Excel files 
  2. Rename .xlsx file to .zip file 
  3. Unzip .zip file to a model folder 
  4. Take item.data file and rename it to item.abf file 
  5. Connect to a SSAS Tabular server 
  6. Run XMLA command to restore item.abf archive file to a tabular database on server 
  7. Run XMLA command to update required database connection strings 
  8. Run XMLA command to backup tabular database back to item.abf 
  9. Rename item.abf file to item.data file  
  10. Copy modified item.data file to the model folder 
  11. Archive model folder to .zip file 
  12. Rename .zip file to .xlsx file 
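
A hedged sketch of steps 5-8 (the server, database and connection-string values are made up; the complete script mentioned later in the post handles the details differently):

Import-Module SQLASCMDLETS -DisableNameChecking
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices") | Out-Null

$asServer = "TABSERVER\TABULAR"
$dbName   = "SalesModel_Temp"
$abfFile  = "D:\Deploy\SalesModel_item.abf"

# Steps 5-6: restore the extracted item.abf as a temporary tabular database.
Invoke-ASCmd -Server $asServer -Query @"
<Restore xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <File>$abfFile</File>
  <DatabaseName>$dbName</DatabaseName>
  <AllowOverwrite>true</AllowOverwrite>
</Restore>
"@

# Step 7: repoint the model's data source to the target environment
# (done here through AMO instead of a hand-written XMLA Alter command).
$srv = New-Object Microsoft.AnalysisServices.Server
$srv.Connect($asServer)
$db = $srv.Databases.FindByName($dbName)
$db.DataSources[0].ConnectionString = "Provider=SQLNCLI11;Data Source=UATSQL;Initial Catalog=SalesDW;Integrated Security=SSPI"
$db.DataSources[0].Update()
$dbId = $db.ID
$srv.Disconnect()

# Step 8: back the modified database up again; the resulting .abf file goes back
# into the workbook as \xl\model\item.data.
Invoke-ASCmd -Server $asServer -Query @"
<Backup xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object><DatabaseID>$dbId</DatabaseID></Object>
  <File>$abfFile</File>
  <AllowOverwrite>true</AllowOverwrite>
</Backup>
"@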

PowerView change PowerShell automation (a hedged sketch follows this list):

  1. Scan through a folder for .rdlx files 
  2. Rename .rdlx file to .zip file 
  3. Unzip .zip file to a model folder 
  4. Update <DataSourceReference> tag in the XML \Reports\report.rdl file 
  5. Replace modified report.rdl file in the model archive 
  6. Rename .zip file to .rdlx file 
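
A hedged sketch of these steps; note that, as described at the end of this post, the report.rdl entry has to be replaced inside the existing .rdlx archive rather than re-zipping a folder. The file path, entry path and target URL are assumptions:

Add-Type -AssemblyName System.IO.Compression
Add-Type -AssemblyName System.IO.Compression.FileSystem

$rdlx   = "D:\Deploy\SalesReport.rdlx"
$newRef = "http://sharepoint.contoso.com/UAT/DataConnections/SalesModel.rsds"

$archive = [System.IO.Compression.ZipFile]::Open($rdlx, [System.IO.Compression.ZipArchiveMode]::Update)
try {
    # Read the report definition out of the archive (\Reports\report.rdl).
    $entry  = $archive.GetEntry("Reports/report.rdl")
    $reader = New-Object System.IO.StreamReader($entry.Open())
    $rdl    = $reader.ReadToEnd()
    $reader.Close()

    # Repoint the report to the BISM data connection of the target environment.
    $rdl = $rdl -replace "<DataSourceReference>.*?</DataSourceReference>", "<DataSourceReference>$newRef</DataSourceReference>"

    # Replace the entry content in place, without re-creating the whole archive.
    $entry.Delete()
    $writer = New-Object System.IO.StreamWriter($archive.CreateEntry("Reports/report.rdl").Open())
    $writer.Write($rdl)
    $writer.Close()
}
finally { $archive.Dispose() }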


BISM data connection change PowerShell automation (a hedged sketch follows this list):

  1. Scan through a folder for .rsds files 
  2. Update <ConnectionString> tag in the XML file 
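
A hedged sketch of this change (the .rsds file name and the new connection string are assumptions, and the tag name follows the description above):

# Repoint a BISM (.rsds) data connection to the workbook deployed
# in the target environment's SharePoint folder.
$rsdsFile = "D:\Deploy\SalesModel.rsds"
$newConnection = "Data Source=http://sharepoint.contoso.com/UAT/PowerPivot/SalesModel.xlsx"

[xml]$rsds = Get-Content $rsdsFile
($rsds.GetElementsByTagName("ConnectionString") | Select-Object -First 1).InnerText = $newConnection
$rsds.Save($rsdsFile)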

Here is a link to the complete PowerShell script that updates all the PowerPivot/PowerView/BISM data connection files.



Some thoughts about the PowerShell script: 

1) Main area that shows all key steps:

2) Easy way of updating XML files:

3) Sending XMLA command was an easy step too:

4) There was one issue: a difference in dealing with the zipped RDLX file compared with the zipped XLSX file. Both desktop Excel and SharePoint had no problem accepting an .xlsx file that was created by zipping a whole folder; however, the PowerView .rdlx file didn't like this approach and SharePoint refused to accept it (an incompatibility error). So in the case of .rdlx files I had to update the existing .rdlx (.zip) file rather than creating a new one from a folder; this was the only approach that worked, and afterwards I was able to see my modified PowerView report in SharePoint without any issues.


And in the end, all PowerPivot (.xlsx), PowerView (.rdlx), and BISM data connection (.rsds) files were automatically prepared to be saved to SharePoint, and everything worked well.

Happy data adventures!