Mar 292013
 

I’m fairly new to the business analytics space, so multidimensional expressions (MDX) are not even close to second nature to me.  This week however, I had a request for how to calculate a rolling average for the last five time periods for a particular measure.

Clearly, I was going to need the Avg() function, so I checked the reference for it:

Avg( Set_Expression [ , Numeric_Expression ] )

The numeric expression was the easy part – that would simply be the measure (in this case aggregated by count) I want the average of.  The set expression I was less sure about.  After a bit of trial and error (and some guidance from Bo Borland), I discovered that I needed to supply a range that identified the set to average based on the date, and that one great way to do that was to leverage “:” operator and the Lag() function.  In the end, my new calculated measure looked like this:

Avg({[Date].[Date].CurrentMember.Lag(5):[Date].[Date].CurrentMember}, [Measures].[Events])

In this case, the first occurrence of “Date” refers to the name of the dimension referenced by the cube, and the second instance of the word “Date” refers to the level that describes the actual date of the event.  Here are the relevant sections of the schema file I used:

<Cube name="Events" visible="true" cache="true" enabled="true">
 <Table name="fact_events">
 </Table>
 <DimensionUsage source="Date" name="Date" visible="true" foreignKey="date_key" highCardinality="false">
 </DimensionUsage>
 <Measure name="Events" column="id" formatString="#,###;(#,###)" aggregator="count">
 </Measure>
 <CalculatedMember name="Rolling Avg Events" formula="Avg({[Date].[Date].CurrentMember.Lag(5):[Date].[Date].CurrentMember}, [Measures].[Events])" dimension="Measures" visible="true">
 </CalculatedMember>
</Cube>
<Dimension type="TimeDimension" visible="true" highCardinality="false" name="Date">
 <Hierarchy name="Date" visible="true" hasAll="true" primaryKey="date_id">
 <Table name="dim_date">
 </Table>
 <Level name="Year" visible="true" column="year_key" type="Integer" uniqueMembers="false" levelType="TimeYears" hideMemberIf="Never">
 </Level>
 <Level name="Quarter" visible="true" table="dim_date" column="quarter_key" nameColumn="quarter_desc" type="Integer" uniqueMembers="false" levelType="TimeQuarters" hideMemberIf="Never">
 </Level>
 <Level name="Month" visible="true" column="month_key" nameColumn="month_desc" type="Integer" uniqueMembers="false" levelType="TimeMonths" hideMemberIf="Never">
 </Level>
 <Level name="Date" visible="true" column="date_id" nameColumn="date_desc" type="Integer" uniqueMembers="false" levelType="TimeDays" hideMemberIf="Never">
 </Level>
 </Hierarchy>
</Dimension>

As a result, I’m now able to show the rolling average number of events in a column line combo chart (or any number of our other visualizations) as well as a convenient tabular view, such as below.

One improvement I haven’t quite figured out yet is how to make the value supplied to the Lag() function dynamic instead of it always being 5.  But that’s a post for another day.  Hopefully this one is helpful for anyone looking to create a rolling average calculation for use in Analyzer!

Update, 4/9/2013 – there is an easier way, though it has slightly different results.  See my post about it here: http://robinsontechnology.com/blog/2013/04/05/average-trend-measures-in-pentaho-analyzer/.

 

  One Response to “Pro Tip: Rolling Averages in Pentaho Analyzer”

  1. [...] week I posted about how to define rolling averages in Pentaho Analyzer. Since then, my colleague Andy Grohe pointed out that this could also be done via a trend measure [...]

 Leave a Reply

(required)

(required)

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>