SSAS Tabular Modeling Article Series on SQL Server Pro Magazine

2 05 2015

I’ve just finished a series of four articles for SQL Server Pro Magazine, along with sample projects and hands-on exercises. The series will take you through SSAS Tabular model design from start to finish, using the Adventure Works sample data in SQL Server 2012 or 2014.  Here are links to all four articles followed by an excerpt from each.

Part 1 – Getting Started with SSAS Tabular
Part 2 – Easy DAX – Getting Started with Data Analysis Expressions
Part 3 – Tabular Model Administration
Part 4 – Deep Dive DAX – Solving Complex Business Problems with Data Analysis Expressions

Download the sample projects here:  Part 1, Part 2, Part 3, Part 4

Starting Your Modeling Career with Analysis Services Tabular Models Part 1

Four-part series

This is the first of a four-part series about getting started with Tabular model design using SQL Server Analysis Services in SQL Server 2012 and 2014.  You will learn the basics from start to finish and build a complete solution.  A sample project is provided for for each stage of the solution so you can follow-along with each article.

Article 1 excerpt:

 

 

Getting Started with SSAS Tabular

Microsoft’s introduction of the xVelocity in-memory analytics engine in 2010 is a significant game-changer for business intelligence and data management.  The availability of this impressive technology and its future application in new product features will change the way you store and consume data, and it’s at the heart of SSAS Tabular.  In this first in a series of four articles about designing and using Tabular models, you will experience how to create a model and analyze data.  In the articles that follow, we will add more sophisticated features to deliver production-scale solutions.

Why Choose Tabular?

On a server equipped with sufficient memory and CPU horsepower, Tabular SSAS can be very fast.  xVelocity, also known as VertiPaq, is the engine behind Tabular and Power Pivot.  In simple terms, it applies some foundational data principles elegantly to achieve high performance.  If you work with SQL Server databases, you know that table lookups are much faster when the columns used in a query are indexed.  Tabular take this principle to the next level.  The xVelocity column store essentially creates a separate index for every column and then compresses the data, eliminating redundant and sparse values.  Unlike other compression methods, the storage for each column is unaffected by adjacent columns.  For numeric, date and short text values; this approach can drastically reduce the size of the stored data.  All of the compressed data is loaded into memory so queries incur no disk IO cost and return results quickly.

In the sixteen year history of multidimensional Analysis Services, many features and capabilities were added to the platform to address some specific business needs and “newer” is not always “better”.  SSAS Tabular is not designed to be a comprehensive replacement for multidimensional Analysis Services.  It is, however, a streamlined platform for analytic reporting and data exploration.  Tabular is a ground-up re-engineered platform with a lot of solid business value today and even more promise for the future.  Think if it as a Formula One race car without cup holders and seat warmers; just designed to go really fast.  A number of publications and resources are at your disposal to compare both options and determine which best meets your requirements.  I encourage you to take Tabular for a test drive.

Create a Tabular Project

The Developer, Business Intelligence and Enterprise editions of SQL Server 2012 and 2014 support SSAS Tabular projects.  SQL Server Data Tools for BI installs with SQL Server 2012.  For SQL Server 2014, install any edition of Visual Studio 2012 or 2013 and then install the SSDT-BI add-in.  You should also have Excel 2010 or 2013 installed.  To follow-along, you’ll need the AdventureWorksDW2012 or AdventureWorksDW2014 sample database.  Links to these downloads and more background information are included in the Resources at the end of this article.

 

Article 2 excerpt:

Easy DAX: Getting Started with Data Analysis Expressions

The DAX language really serves three different purposes. Calculated columns perform row-by-row operations that are processed when data is loaded and the calculated results are persisted into the Tabular model. Measures are calculated at query time on a filtered set of rows. DAX is also a query language and can return an entire result set for reporting.

Continuing our Tabular modeling journey, we will use the project created in the first article, “Getting Started with SSAS Tabular.”

This article will give you a good start with DAX measures, and then we’ll resume and go deeper in the fourth article: “Deep Dive DAX – Solving Complex Business Problems with Data Analysis Expressions.”

Simple Measures

At its heart, DAX is a very simple expression language that, much like Excel, uses functions to perform operations on values and sets of data. The most fundamental functions are simple and easy to use, and then they get more complex and powerful. The first group of DAX functions you’ll learn are aggregates that roll up a range of values into a single result. These functions include SUM, AVERAGE, COUNT, DISTINCTCOUNT, MIN and MAX.

Article 3 excerpt:

Part 3 – Tabular Model Administration & Security

It’s common knowledge that SSAS Tabular and Power Pivot use the same in-memory engine and support the same essential features.  Tabular supports a number of features which take it from desktop and small group scale to a truly enterprise class business solution.  We explore two important enterprise features in the following exercise.

The first thing to understand is that a Tabular solution sits on the SQL Server Analysis Services foundation, which is a 14-year-old mature technology.  When a project is deployed, an SSAS database is created.  In SQL Server Management Studio (SSMS), you may notice the lack of objects that may be familiar to SSAS multidimensional users; like cubes and dimensions.  The database simply contains tables (some of which may be “measure group” type tables and others serve as dimensions.)

To follow-along, use the solution you completed in Part 2 or download the Article 2 Completed Project.

Managing the Workspace Database

When opening the Model.BIM file in the SSDT model designer, a user-specific workspace database is automatically deployed to the server specified in the model properties.  As a general rule, I don’t recommend the workspace database be on a production server. It should be on an instance where you have administrative rights.

Just a word about model stability, recovery and maintenance…

 

Article 4 excerpt:

Part 4 – Deep Dive DAX

The heart and soul of Tabular models is the DAX language.  In part 2 of this series you learned to use some of the basic DAX functions like SUM and CALCULATE.  You also learned about row and filter context and how functions like ALL() and ALLEXCEPT() can be used to override natural row context to change the scope of a calculation.  In simple terms, when you use aggregate functions like SUM(), AVERAGE(), MIN() or MAX() with a numeric measure value, the calculation is applied to all the rows in the context of current filters and groups.  This functionality is much like the behavior of a T-SQL query using a SUM() function and GROUP BY clause but that’s about where the similarities end.

The power of DAX goes well beyond emulating simple grouping and filtering logic.  It can be used to apply advanced business logic to perform complex calculations that would be difficult to duplicate with the same performance in relational tools.  For this final exercise, I’m going to queue some important topics you are likely to encounter, I’ll describe a reporting scenario and the calculations needed to address requirements and then I’ll step you through the exercise using the project we completed in the previous article.

User Interaction and Data Browsing

Analytical reporting is pretty simple in concept.  Typically every measure is going to be grouped and/or filtered by something so when designing calculations, measures and KPIs; we must anticipate different ways users might interact with a model.  The easiest way to think about this is in the context of an Excel Pivot Table.  Connected to the model, a user could place any of the table attributes on rows or columns. They could place any measure or KPI in the intersecting data cells to perform calculations grouped by the rows or column attribute values; and they can apply filters and slicers.

Conditional Calculations

Some calculations only make sense when performed in a certain context.  For example, if financial budgets are calculated for the end of each fiscal year and actual expenses are recorded for each month period, the logic for reporting on a daily, monthly or annual basis will be different and in some cases, may just not make any sense.

Read the entire series starting with Part 1 – Getting Started with SSAS Tabular










SSAS Tabular Models: The Good, the Bad, the Ugly & the Beautiful

24 06 2013

After working on a couple of serious tabular model projects, I’ve been talking to other BI practitioners lately about their thoughts regarding tabular model technology and I’m finding that there is little consensus among all these industry experts about which option to recommend for new BI projects.  At SolidQ, I work with some of the most experienced Business Intelligence experts in the industry.  At conferences and in discussion groups, there has been a lot of discussion about the choice between Analysis Services multidimensional models and the newer tabular semantic model technology introduced in SQL Server 2012.  There are pros and cons for either side but I think there are some important factors to consider.  Among some of the most experienced Analysis Services consultants, I have heard that some are primarily using and recommending tabular for most new projects and others are arguing that tabular offers little significant value over the tried-and-true multidimensional cubes and MDX script solutions.

As is typical for a newer product, the usability experience for the model designer isn’t perfect.  In some ways, it’s different and just may take some getting used to, but in other ways there is clearly room for improvement.  The question now is; do the advantages of tabular vs multidimensional outweigh the somewhat rocky design experience?  I’m not taking sides in this comparison but merely offering an objective analysis of the state of tabular model option in SQL Server 2012 Analysis Services, Service Pack 1.  I expect this product to go through changes so for future reference, this information is for the released product at the time of this posting in June of 2013.

 

The Good

  • The core VertiPaq (or xVelocity) query and aggregation engine is stable & reliable.  Originally developed about five years ago and released with PowerPivot for Excel and SharePoint in SQL Server 2008R2 over three years ago, this technology has proven to be ready for serious use.

  • Under most conditions, for analytic reporting, data scanned and aggregated from an in-memory data structure performs faster than other conventional options; including relational and multidimensional storage.

  • Tabular can be less complex than multidimensional, OLAP SSAS.  The core design and usage concepts are easier for both those who design models and for those use use them for analysis and reporting.

  • Tabular models can be easier & faster to implement because the model structure is simpler and there may be fewer steps in the design process.

  • DAX, the core calculation expression language for tabular models, is fairly easy to learn.  Fundamental DAX expression concepts can be easier to understand than equivalent MDX commands used in multidimensional modeling and calculations.

 

The Bad 

  • Comparing features, tabular can be hard to justify when compared to multidimensional.  Traditional SSAS cubes still have more capabilities than tabular models and for someone who already has OLAP skills and background, tabular design is a new learning curve.

  • PowerPivot is a version 2 product.  As such, it’s simple and fairly reliable in the majority of design scenarios.  In some, mostly rare, cases, error handling and debugging capabilities aren’t as mature and robust as more tenured technologies like SSAS multidimensional or relational.

  • Tabular SSAS is a version 1 product.  As such, it has a list of well-known design annoyances and features that could (and should) improve in the next product go-round.

  • The recommended design patterns & best practices, both from Microsoft development and support teams and from members of the expert practitioner community,  are still emerging.                         

  • One model.bim file = one developer working at a time.  A departure from the long-time SSAS project development environment where every object (such as data sources, dimensions and cubes) were defined in isolated files managed by the project; SSAS tabular manages all of the database objects in a single model definition file.  The model designer treats the model is more of a black box than a solution comprised of independent objects.  However, the fact is that most of the same objects we work with in multidimensional/cube projects are all defined as XML elements in this file.   It can be opened and properties manipulated,and there are a growing number of third-party tools to provide enhancements.  Regardless, it is one big project file tat would need to be checked-out in source code management as a single project asset.

  • The tabular SSAS support community is thriving but still small.  A core group of trusted professionals from all over the world are the loudest voices right now.  They promote tabular solutions and provide active and collective support for one another.

  • The DAX expression editor in PowerPivot & the Visual Studio-based SSDT designer is quirky.  You have to know is strengths and weaknesses and be willing to use it in that context.  It attempts to assists with code completion but in the end, it doubles-up brackets and duplicates, rather then replaces old code, that it suggests.  No doubt that the experience will get better as enhancements are developed and released but we must live with a product today that is useful and reliable a lot of the time but, it leaves plenty of opportunity for improvements.

  • The entire tabular model must fit in memory.  There’s no caching, swapping or distributed processing option for large models.  This means that very large models need very large hardware and there is currently no scale-out option to distribute work loads or fail-over.  Got a lot of data?  Get a lot of memory.

 

The Ugly

  • After you get into serious data with 30, 40 or 50 tables and some complexity to your model, the version 1.0 SSDT tabular model designer can be cumbersome and error-prone.  I’ve seen it stop responding and throw errors when there were no errors.  I will say that it’s helpful and reliable most of the time but on occasion, when it falls down, I often save and close my work; shut down Visual Studio all together and the fire it back up.

  • My biggest peeve about the SSDT model designer is that all design work is performed while connected to the workspace database.  This means that for every table and column rename,and for every single property setting change, this information is written to the model workspace database on the local SSAS tabular server, and after the changes have been committed, control is given back to the model designer.

  • Some key SSAS features are not currently supported.  These include things like parent-child hierarchies, many-to-many relationships, cube actions and display folders.  Actually, this is not entirely true; actions and display folders can be added using after-market tools (like BIDS Helper, available in the CodePlex library) and by hand-writing the XMLA script, but they are not currently supported through the SSDT model designer.  There is simply a category of features that didn’t find their way into this first version of the product.  There are work-arounds and methods to enable these capabilities but they’re not supported, or at least not to the the same degree as they are in multidimensional SSAS.

 

The Beautiful

  • There is no doubt that in-memory, tabular model technology is the promise of the future.  It just makes sense.  Several vendors have come to the same conclusion and are developing products following this paradigm.  Oracle just made a big announcement about adding in-memory column store to their future 12C product. 

  • Data residing and processed in memory is faster than data residing in disk.  This is what VertiPaq does; whether implemented as PowerPivot, an SSAS tabular model or as a SQL Server column store, it works efficiently and elegantly without the complexities and overhead of indexes, partitions, file groups and other techniques typically used to optimize on-disk data stores.

  • Even though tabular is fairly new, many useful & valuable features are supported today and work well.

  • PowerPivot models upgrade seamlessly to tabular models.  This provides a path for business users and IT professionals to author models in familiar tools (Excel or Visual Studio) and then promote them to a server hosted environment.

  • Tabular models are managed and stored by SQL Server Analysis Services!  Although some components of the tabular engine and the designer are new and still have wrinkles to be ironed-out, the core product is based on the solid and time-tested foundation of SSAS.  This means that many of the features not implemented now will be available in future builds.

  • Client applications that support SSAS multidimensional will also support tabular.  In fact, any application built to work with SSAS cubes will natively work with PowerPivot and tabular as if it were a cube.  This is because SSAS tabular uses the same data provider that understands both MDX & DAX queries.

Digg This







Follow

Get every new post delivered to your Inbox.

Join 306 other followers