I’m impressed with the current generation of semantic modeling technology from Microsoft. I’ve had the opportunity to use both PowerPivot and SSAS Tabular models on a handful of client projects and have had very good results so far. The xVelocity in-memory aggregation and data compression engine seems to be everything that was promised. PowerPivot doesn’t allow us to do everything we can do with Analysis Services cubes, but for simple aggregate values and calculations performed over moderately large volumes of data, it is fast and easy to implement. In some ways, we can do more with Tabular models than with multidimensional models, and with less effort. I’m eager to keep testing the limits of this technology and to find its practical boundaries and capabilities.
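To give a concrete sense of what I mean by simple aggregate values and calculations, a typical measure is only a line or two of DAX. The table and column names below are hypothetical, just a sketch of the kind of formula involved:

    Total Sales := SUM ( Sales[SalesAmount] )
    Sales YTD := TOTALYTD ( [Total Sales], 'Date'[Date] )

The first measure is a plain aggregate; the second layers a time-intelligence calculation on top of it, and the xVelocity engine evaluates both in memory against compressed column data.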
This past week, Microsoft released Office 2013 with PowerPivot and the Power View visualization tool built right into Excel. Power View makes it really easy to analyze PowerPivot and Tabular models and to present results in interesting and impactful ways. Every time I start working with Power View, I can’t put it down because I keep finding more creative ways to present the information.
I’ve given three presentations so far on using PowerPivot, Power View and Tabular models at SQL Saturday events in Dallas and Salt Lake City, and I have four more events* on the horizon. One of the reasons I enjoy giving these presentations is that it gives me a good opportunity to find out what people are thinking about when they consider using different database tools. Being in front of an audience challenges me and makes me think hard about the right and wrong ways to use software. Some really good questions came out of these sessions that I’ll share with you. Thank you to those who have attended; I sincerely appreciate your questions and challenges.
At the Dallas SQL Saturday BI Edition, I was delivering a session titled PowerPivot & Power View – High-octane Self-service Reporting and demonstrating how to mash up different sets of data. I used a bridge table sourced from a text file to resolve a data quality issue: two product numbers that represented the same product. I explained that this is a common practice in an ad-hoc solution. A hand went up and a gentleman asked, “Isn’t that something that should be resolved in your ETL process and stored in a central data warehouse?” Absolutely, yes! Ideally, data quality and master data issues should be managed centrally and according to a business-wide standard.
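To give a rough idea of what that kind of bridge table looks like, here is a sketch; the product numbers, file contents and column names are made up for illustration, not the actual data from the demo:

    SourceProductNumber,MasterProductNumber
    AB-1001,AB-1001
    AB-1001X,AB-1001

One way to apply it, once the text file is imported into the PowerPivot window, is a calculated column on the sales table that resolves every variation to the master number:

    =LOOKUPVALUE ( Bridge[MasterProductNumber], Bridge[SourceProductNumber], Sales[ProductNumber] )

Both product numbers then roll up to a single product in any pivot table or Power View report built on the model.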
The reality is often that when a user has a set of data containing some inaccuracies, they can’t always wait for the perfect solution; we simply do what must be done at the time to resolve issues in the short term. Exceptions to these rules of “proper” design may be evidence of a business requirement that should be carefully reviewed so we can “do it right” in the future.

In another session, I was explaining that one important purpose of PowerPivot is to put powerful yet simple data modeling tools in the hands of business users so they can do the work themselves, rather than waiting for IT to build solutions for them. This brings a very interesting question to bear: isn’t this what we did with Microsoft Access? Are we setting the stage for a new generation of poorly-designed, user-created applications that promote desktop data sprawl and a general lack of standardization?
Access was and still is a great tool for organizing and presenting information. Since it is installed on so many desktops with the rest of the Office applications, it’s available to many undisciplined and untrained users who create poorly implemented solutions. Isn’t the same thing true for PowerPivot and Power View? Now that they are both on the desktop and accessible to users, won’t they also be used to create incorrectly designed applications and to spread outdated and inaccurate data around the organization? Sure they will. The difference is that these users have a simple path to move from the desktop to the workgroup and the enterprise by storing these models in SharePoint.
To keep things under control in the long term, IT must be a part of this process. Solutions should be sandboxed so that data models and reports created by business users are stored in isolated document libraries designated for prototypes and proofs of concept. After formal review and, if necessary, redesign, models and reports are moved to designated document libraries where they carry the stamp of approval of both the business and IT. These reports use data governed by IT and vetted by the business, having met certain standards of accuracy and reliability. User-generated data models can also be migrated to use enterprise data marts and IT-governed data models rather than the outdated, offline data models created by users.
Users will always create their own solutions using whatever tools are at their disposal. Some will be wrong and some will be very valuable to the business. The key to achieving the right balance is proactive user education, the appropriate level of security and control, the separation of “experimental” models and reports from those guaranteed to be accurate and reliable, and a regular cadence of IT and business review of all assets to migrate, correct, or redesign these data models and reports and move them from one category to the other.
*I’ll be speaking on these topics at SQL Saturday #172 in Portland next week, at the PASS Global Summit, the following week at SQL Saturday #166 in Olympia, Washington, and then in December at the DevTeach conference in Montreal.