So a few weeks ago Robert and I posted an episode called ‘Five Things about Performance’. We went through a number of things – five, to be exact – that can impact performance. We got some great feedback on that episode and also a couple of comments asking us to turn our attention to performance of Analytics and Reports. Why is it any different? Mike Lairson joins me to explore this topic.

Mike, as you surely know, is the guru of CRM On Demand reporting and author of the best selling guide. Together, we came up with five key things that influence reports performance.

It turns out there are a lot of commonalities between our five things a few weeks ago and what we’re talking about today. The difference with Analytics is that you have a lot more control over these things because you are designing the query as well as the presentation. You are in control of what you are asking the application to do, and that is the key.

If you can understand exactly what the application has to do to bring back the report you are asking for, you can identify the times when you are simply asking too much… or perhaps just asking in the wrong way and need to rephrase the question.

1. Data Volume. Don’t ask for more than you need!
This sounds familiar! So we know that millions of records is different from hundreds of records – that’s pretty much common sense. But while an average user may not need to search among them or page through lists with that many records, it’s reasonable you’d want to build analytics that take a lot of data in, even if just to get a summary view.

That’s why a lot of companies invoke “all data access” in analytics for some roles. This keeps your UI experience limited but exposes everything in reporting.

Part of the message here is that you need to prepare for the volume. If a report is going to scan millions of records, you need to plan time for testing and optimization, and also set expectations that it might not be instantaneous.

You also need to think about how the report is going to be used. If my goal is to see the total amount of revenue in the pipeline or the service request volumes, do I really need to display each individual opportunity or service request record in the report?

Are my users really going to read through ten thousand service request records and use that information in some meaningful way? The answer is no.

Design your reports to answer your business question. This is probably the most powerful advice we can give – what is the question your users are trying to answer? Instead of “show me a list of Opportunities closing in the quarter,” maybe the question is “what Opportunities should management be focusing on?”

If you absolutely have to have hundreds or thousands of rows in your report result, just know what you are asking for and maximize performance by using the right fields and filters. This brings us to number 2.

2. Some Fields are Better Than Others
Again, a repeat from our last performance podcast. CRMOD gives you a number of “enhanced” fields – they’ve been juiced up to provide superior performance.

Indexed fields, particularly when used in filters, increase the performance of your reports. Fields from the Date dimension are better than the date fields on the record tables. Owner name from the Owned by User dimension are better than the owner fields on the record table. These dimensions are provided in the reporting subject areas for this reason. Use them when possible.

Row ID’s (Account ID, Opportunity ID, etc) can also be a benefit, but only in certain circumstances. If you add the ID field to your report, you are forcing the report to examine and include each individual record in your report. If that is the detail you need, then by all means, the ID field is great. If, however, you are aggregating data across a number of records and do not need to display every record, don’t force the report to use them.

The metrics in the subject areas provide aggregated data are work really well without the need to include every row in your report data.

3. Filter Order
So we know that filters limit the data coming back in the report. Did you know that the order of the filters can impact performance? I didn’t!

Let’s think about what we are asking a report to do for a moment. We have a dataset of one million records and I need to see in my report the current pipeline revenue amount for the East territory. I need a few filters here to get to this data. Now, of the million records, I only have 1,000 opportunities that are in an open sales stage meaning that is it in the current pipeline. Of this 1,000 opportunities, 300 of them are in the East territory.

When I add filters to this report, I will filter on Sales Stage first and then on territory. The filters are included in the query in order so the first filter brings my dataset down to 1000 records and the second filter only has to examine 1000 records to get to the 300 I need. In the other order, the first filter might only eliminate 700,000 records leaving 300,000 records for the next filter to work with.

The point here is that you want the filter that excludes the most data to run first. If it is not clear which filter is the most exclusionary, and it is a toss-up, go with whichever filter is working with an indexed or reporting-specific field since those will run faster.

So it really helps to know your data and what sort of data volumes you are working with. I know Oracle also recommends limiting the use of cross-object filters in reports where possible. We know this is not always possible, but there are some strategies you can use with your report design that I read in Mike’s book – like filtering based on another report.

But I digress… we could go for days talking about filters and how they affect report performance. The key, again, is to try to understand what you are asking the report to do, and sometimes that requires some knowledge of SQL and databases that comes with practice and patience.

4. Server Side vs Client Side Formulae
What I really like about this is that Mike wrote this as “Form-u-lay”. So proper. I’d have said “forumulas”.

We often want to see metric data in our reports segmented in several ways. For instance, if I am building a service request report, I might want to see the number of service requests opened for specific sets of products by specific types of customers. One option is to group the products using a CASE statement and group the customer types with another CASE statement or a calculated item on a Pivot Table.

I have options here. The problem with this is that the CASE statement and pivot table formats are applied to the query results after the results are returned from the server. What I really want to do here is make the server do most of the work, not my report. The solution here is to use the FILTER function on the number of Service Request field.

With the Filter function I tell the report to only collect the number of service requests for the specified products and account types rather than grab them all and sort them out later. The server does the work of grouping and aggregating the number of service requests for you. Simply add a column to your report for each group and apply the Filter function accordingly.

5. Let’s Get Historical
I’ve always been a big fan of history – you know, those who do not study history are doomed to repeat it. Is that relevant here?

Not in the least. This is probably the most well-known area of performance impact. Simply put, things in the historical subject areas run faster. There are several reasons for this. First, it’s supported by a different server, so it’s not competing for resources with the users doing their business in the UI.

Second, the data has actually been changed. The technical term is “ETL” – extract, transform and load. Every night your data is pulled out of the real-time system, smushed up a bit, reshaped, etc and then loaded into the historical data warehouse.

So basically, this last one on the list is the place to start. Build a report in the historical subject area and then apply all the other stuff to further optimize. Any drawbacks or gotchas?

Well, yes. It all comes back to planning. Historical data is just that -history. One day old, at most. Sometimes that’s too old for certain applications. Also, not all data gets put in the data warehouse. Custom objects are a good example.You need to check the list of subject areas to be certain the records and columns you want to report on are supported.

Episode Wrap-up
So those are your Five Things about Reports Performance. Big thanks to the reporting guru Mike Lairson for joining me again on the podcast.

You can find Mike’s excellent book ‘Oracle CRM On Demand Reporting’ at Amazon.com, and while you are there, look for Oracle CRM On Demand Dashboards which will be in print in August and Oracle CRM On Demand Combined Analysis scheduled to go to print in December. Go now!

Share and Enjoy:
  • Digg
  • LinkedIn
  • del.icio.us
  • Facebook
  • Mixx
  • Google Bookmarks
  • Twitter
  • StumbleUpon
  • Technorati
  1. [...] This post was mentioned on Twitter by Mike Lairson. Mike Lairson said: Check out my latest podcast appearance. Discussing CRM On Demand Report Performance! http://tinyurl.com/3xoq4z5 [...]

  2. [...] original here:  Podcast 012: Five Things About Reports Performance — The Oracle … By pivot | category: pivot, pivot table | tags: are-returned, case, from-the-server, [...]