Sunday, July 4, 2010

Mark Your Calendars - Oracle BI 11g Launch - 7th July

BI 11g is here finally. Definitely worth the wait....

Register here



See Why Oracle's Business Intelligence Technology Is the Best‑in‑Class BI Solution
Don't miss this live Webcast as Charles Phillips, President, and Thomas Kurian, Executive Vice President, Product Development, mark the release of Oracle's latest business intelligence software--Oracle Business Intelligence 11g.
Be the first to hear about the new, industry-leading technology platform for business intelligence, which offers:
  • A powerful end-user experience with rich visualization, search, and actionable collaboration
  • Advancements in analytics, online analytical processing (OLAP), and enterprise reporting--all with unmatched performance and scalability
  • Simplified system configuration, lifecycle management, and performance optimization

Event Details:


Wednesday 7 July 2010
9:00 - 11:00 GMT


Live Webcast

Wednesday, June 30, 2010

Data warehousing with the Oracle Database - Compression

Data warehousing has always been synonymous with the database. Data warehousing inevitably involves terabytes of data and powerful DB machines, and to a certain extent this is justified, for what is analysis without a large data set? When dealing with such huge sets of data, it is always recommended to leverage the natively available functionality of the DB rather than depending on the BI front end to do this job. The Oracle Database is continuously evolving into a complete data warehousing system. In this series of articles I will introduce various features of the DB that specifically cater to data warehousing.

Oracle Advanced Compression

The volume of data being generated and retained by businesses has grown exponentially in the last few years. On average, it is not uncommon to see data growing 2-3 times every couple of years, thanks to a variety of factors: government regulations that mandate storing historical data for a certain number of years, massive amounts of user-generated content from Web 2.0 applications, and the increasing adoption of multimedia content, enabled in turn by the growing penetration of broadband technologies. Consequently, terabyte-sized databases – a novelty in the past – have become commonplace. A key challenge for enterprises today is how to manage this enormous volume of data cost-effectively on a limited IT budget. On a more technical level, there is the associated challenge of ensuring that application performance does not degrade unacceptably while accessing such a large volume of data.

This brings compression technology to the forefront. Oracle has offered compression in some form or other for years, so the concept itself should not need much explanation. The most prominent benefit of compression, of course, is reduced storage consumption. However, compression technology needs to be highly efficient and intelligent before it can be used with enterprise applications, in view of their demanding performance and scalability requirements.



Let’s first talk about compression of regular relational data. Oracle was an early pioneer of database compression when it introduced table compression for bulk load operations in Oracle9i. Using this feature, customers can compress data at the time of bulk loading using operations such as direct path load, CTAS, etc. However, until now, compression was not available for regular data manipulation operations such as INSERT and UPDATE. In Oracle Database 11g, Oracle has extended the compression technology to support these operations as well. Consequently, compression in Oracle Database 11g can be used for all kinds of workloads – be it OLTP or DW.

It is important to mention here that the table compression enhancements introduced in Oracle Database 11g are not just incremental changes. Rather, an enormous amount of work has gone into making sure that the new compression technology has negligible impact on DML, since any noticeable write-time penalty due to compression would not be acceptable in an OLTP environment. As a result, compression in Oracle Database 11g is very efficient and can reduce space consumption by a factor of 2 to 3. And not only does your write performance not degrade, your read (query) performance actually improves. This is because, unlike desktop compression techniques where you must wait for data to be uncompressed, this technology reads the compressed data directly and requires no uncompression step, so query performance does not degrade because of compression. So how does query performance actually improve? Because less data has to be fetched to answer a query. Another advantage of fetching less data is that the memory used to buffer data read from disk is used more efficiently, maximizing memory utilization.

Finally, just to make sure I do not miss anything obvious: Oracle’s compression technology is completely application transparent. That means you can use it with any home-grown or packaged application such as SAP, Siebel, EBS, etc.

Sample Syntax

·         OLTP Table Compression Syntax:

CREATE TABLE emp (
emp_id NUMBER
, first_name VARCHAR2(128)
, last_name VARCHAR2(128)
) COMPRESS FOR ALL OPERATIONS;


·         Direct Load Compression Syntax (default):
CREATE TABLE emp (
emp_id NUMBER
, first_name VARCHAR2(128)
, last_name VARCHAR2(128)
) COMPRESS [FOR DIRECT_LOAD OPERATIONS];
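
One practical note, with a hedged sketch: declaring COMPRESS FOR ALL OPERATIONS on an existing table affects only rows inserted from then on. To compress the rows already in place, you would typically rebuild the table (reusing the sample emp table from above):

-- Enable OLTP compression for future DML on an existing table
ALTER TABLE emp COMPRESS FOR ALL OPERATIONS;

-- Rebuild the table so the existing rows are compressed too
-- (MOVE leaves indexes unusable; rebuild them afterwards)
ALTER TABLE emp MOVE COMPRESS FOR ALL OPERATIONS;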

Best Practices

·         Compress your 10 Largest Tables
·         80/20 Rule – 20% of your Tables Consume 80% of your Space
·         Better Compression with Bigger Block Sizes
·         Higher Probability of Duplicate Values
·         B-Tree Index Compression
·         Validate Index and Review INDEX_STATS (see the sketch after this list)
·         INDEX_STATS.OPT_CMPR_COUNT
·         INDEX_STATS.OPT_CMPR_PCTSAVE
·         Bitmap Indexes are Highly Compressed
·         Good for low and medium cardinality
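
To act on the INDEX_STATS bullets above, here is a minimal sketch (emp_idx is a hypothetical index name; note that ANALYZE ... VALIDATE STRUCTURE locks the underlying table while it runs, so do this off-peak):

-- Populate INDEX_STATS (visible only in the current session)
ANALYZE INDEX emp_idx VALIDATE STRUCTURE;

-- OPT_CMPR_COUNT   = optimal number of leading columns to compress
-- OPT_CMPR_PCTSAVE = % of leaf-block space saved at that setting
SELECT name, opt_cmpr_count, opt_cmpr_pctsave
FROM   index_stats;

-- Apply the recommendation (assuming OPT_CMPR_COUNT came back as 1)
ALTER INDEX emp_idx REBUILD COMPRESS 1;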

Compression Advisor

·         Shows projected compression ratio for uncompressed tables
·         Reports actual compression ratio for compressed tables
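
Assuming Oracle Database 11g Release 2, where the advisor is exposed as the DBMS_COMPRESSION package, here is a hedged sketch of projecting the ratio for an uncompressed table (SCOTT.EMP and the USERS scratch tablespace are just placeholders):

SET SERVEROUTPUT ON
DECLARE
  l_blkcnt_cmp   PLS_INTEGER;
  l_blkcnt_uncmp PLS_INTEGER;
  l_row_cmp      PLS_INTEGER;
  l_row_uncmp    PLS_INTEGER;
  l_cmp_ratio    NUMBER;
  l_comptype_str VARCHAR2(100);
BEGIN
  -- Loads a sample of the table into a scratch tablespace, compresses
  -- it, and reports the projected ratio
  DBMS_COMPRESSION.GET_COMPRESSION_RATIO(
    scratchtbsname => 'USERS',
    ownname        => 'SCOTT',
    tabname        => 'EMP',
    partname       => NULL,
    comptype       => DBMS_COMPRESSION.COMP_FOR_OLTP,
    blkcnt_cmp     => l_blkcnt_cmp,
    blkcnt_uncmp   => l_blkcnt_uncmp,
    row_cmp        => l_row_cmp,
    row_uncmp      => l_row_uncmp,
    cmp_ratio      => l_cmp_ratio,
    comptype_str   => l_comptype_str);
  DBMS_OUTPUT.PUT_LINE('Projected compression ratio: ' || l_cmp_ratio);
END;
/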

Quick Links

http://www.oracle.com/technology/products/database/compression/index.html
http://www.oracle.com/technology/oramag/oracle/08-jul/o48advcomp.html
http://www.oracle.com/technology/products/database/compression/compression-advisor.html
http://www.oracle.com/consulting/library/data-sheets/oracle-compression-quick-start.pdf

Tuesday, June 22, 2010

Oracle's BI Stack



With a handful of acquisitions by Oracle in the BI/data warehousing space in the recent past, it is beneficial to know where each product fits in the overall Business Intelligence stack. In this discussion I will try to shed more light on the Oracle BI stack as it stands today, incorporating as many applicable technologies as possible. I have deliberately left out the middleware portion of the stack to keep the architecture simple. However, the components of the BI stack are highly pluggable and configurable with the Fusion Middleware suite of products, thereby enhancing the BI capabilities significantly. Enterprise-wide middleware components like the web server and security architecture integrate seamlessly with the entire BI stack, decreasing the manageability costs associated with the infrastructure.




Business Intelligence is derived from data about the past and present, and from predictive data about the future. Different products cater to these different sets of data. A combination of data from across these timelines gives the complete 360-degree view of the business and provides the necessary intelligence to take decisive actions.





The data can be sourced from various systems, ranging from simple Excel sheets to complex data warehouses/marts. Various OLTP systems like Oracle E-Business Suite, SAP or custom databases can feed the system with data. Real-time data can also be sourced from web services with this architecture.

The ETL layer is required to post the data from the various data sources to a centralized data warehouse. ETL tools like Informatica and ELT tools like Oracle Data Integrator are options available to carry out this process. Apart from these, a recent acquisition, GoldenGate, can be used to provide near real-time data replication.

ODI - High Performance, High Productivity Data Integration


Oracle Data Integrator Enterprise Edition delivers unique next-generation, Extract Load and Transform (E-LT) technology that improves performance and reduces data integration costs, even across heterogeneous systems. Unlike conventional ETL tools, Oracle Data Integrator EE offers the productivity of a declarative design approach, as well as the benefits of an active integration platform for seamless batch and real-time integration. In addition, hot-pluggable Knowledge Modules provide modularity, flexibility, and extensibility.
Oracle Data Integrator Enterprise Edition is optimized for Oracle Database to provide real-time data warehousing with advanced ETL/ELT and data modeling. Use Oracle Data Integrator EE together with Oracle Data Profiling and Data Quality for advanced data governance capabilities. Use Oracle Data Integrator EE together with Oracle SOA Suite to enable service-oriented data integration and management that gives companies even greater flexibility for their data services.
Oracle Data Integrator Enterprise Edition, a foundational component of the Oracle Data Integration Suite, is Oracle Fusion Middleware's strategic data integration offering, which has emerged from combining the best of both Oracle Data Integrator (ODI) and Oracle Warehouse Builder (OWB) into a single unified product.

GoldenGate - Real-time Access to Real-time Information


Oracle GoldenGate provides real-time, log-based change data capture and delivery between heterogeneous systems, enabling cost-effective, low-impact real-time data integration and continuous availability solutions.

Oracle GoldenGate moves committed transactions with transaction integrity and minimal overhead on your existing infrastructure. The architecture supports multiple data replication topologies such as one-to-many, many-to-many, cascading and bidirectional. Its wide variety of use cases includes real-time business intelligence; query offloading; zero-downtime upgrades and migrations; disaster recovery; and active-active databases for data distribution, data synchronization and high availability.

As data accumulates, discrepancies are quite likely to percolate into it over time, leading to data issues and bad analytics. Fields like PIN codes or telephone numbers, for example, have to be consistently formatted and represented to ensure uniform interpretation of the data. This is achieved by implementing data quality techniques during the data load process.
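
As a trivial illustration of such standardization during the load, a hedged sketch (the staging table and column are made-up names, and the 3+8 digit pattern is just an example format):

-- Normalize free-form phone numbers to a single NNN-NNNNNNNN format
-- e.g. '(080) 2345 6789' and '080-23456789' both become '080-23456789'
UPDATE stg_customer
SET    phone = REGEXP_REPLACE(
                 REGEXP_REPLACE(phone, '[^0-9]', ''),  -- keep digits only
                 '^([0-9]{3})([0-9]{8})$',             -- 3-digit code + 8 digits
                 '\1-\2');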

Also, important dimension data like Customer and Product should ideally be the same across the organization. For example, Scott Tiger and S. Tiger may mean the same thing in the context of a customer, but they may show up as different records in the customer dimension, which can lead to analytics errors. Such scenarios are avoided by implementing Master Data Management.
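
To make the Scott Tiger / S. Tiger scenario concrete, here is a hedged sketch of flagging candidate duplicates with the database's UTL_MATCH package (dim_customer and its columns are hypothetical, and the similarity threshold of 90 is arbitrary):

-- Candidate duplicate parties by name similarity (0-100 scale)
SELECT a.cust_id, a.cust_name, b.cust_id AS dup_id, b.cust_name AS dup_name
FROM   dim_customer a
JOIN   dim_customer b
  ON   a.cust_id < b.cust_id   -- consider each pair only once
WHERE  UTL_MATCH.JARO_WINKLER_SIMILARITY(a.cust_name, b.cust_name) >= 90;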

Oracle Data Quality and Oracle Data Profiling - Improved Visibility and Control of Your Data


Oracle Data Profiling and Data Quality for Data Integrator extend Oracle Data Integration to provide advanced governance capabilities. Both products are fully integrated with Oracle Data Integration to place quality at the center of all your information initiatives and projects.
Oracle Data Profiling is a data investigation and quality monitoring tool. It allows business users to assess the quality of their data through metrics, to discover or infer rules based on this data, and to monitor historical metrics about data quality.
Oracle Data Quality for Data Integrator is the leading data quality platform that covers even the most complex data quality needs. Its powerful rule-based engine and its robust and scalable architecture places data cleansing at the heart of an enterprise data integration strategy.

Oracle MDM Data Quality


Data cleansing is at the heart of Oracle MDM’s ability to turn your data into an enterprise asset. Only standardized, de-duplicated, accurate, timely, and complete data can effectively serve an organization’s applications, business processes, and analytical systems. From point-of-entry anywhere across a heterogeneous IT landscape to end usage in a transactional application or a key business report, Oracle MDM’s Data Quality tools provide the fixes and controls that ensure maximum data quality.
There are two primary data categories: relatively structured party data and relatively unstructured item data. Party data includes people and company names such as customers, suppliers, partners, organizations, contacts, etc., as well as addresses, hierarchies and other attributes that describe who a party is. Pattern matching tools are best for cleansing this kind of data. Item data includes products, services, assets, and the full range of attributes that describe what an item is. Semantic matching tools are required for cleansing this kind of data. This is why Oracle provides data quality tools specifically designed to handle these two kinds of data: one is a suite of Customer Data Quality servers and the other is a state-of-the-art Product Data Quality server.


The heart of the BI system is the database. Oracle's own Database or an Exadata machine can house the expected terabytes of data optimally to cater to fast BI.

Oracle Exadata


Oracle Exadata Storage Servers combine Oracle's smart storage software and Oracle's industry-standard Sun hardware to deliver the industry's highest database storage performance. To overcome the limitations of conventional storage, Oracle Exadata Storage Servers use a massively parallel architecture to dramatically increase data bandwidth between the database server and storage. In addition, smart storage software offloads data-intensive query processing from Oracle Database 11g servers and does the query processing closer to the data. The result is faster parallel data processing and less data movement through higher bandwidth connections. This massively parallel architecture also offers linear scalability and mission-critical reliability.

Oracle Database


Complex data warehousing features like aggregation, bitmap indexes, change data capture, CUBE, data marts, data mining, Data Pump, dimensions, direct path loads, ETL, ILM, materialized views, OLAP, partitioning, query rewrite, ranking, ROLLUP and VLDB support are available OOTB in the latest releases of the database.
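
Two of those features side by side, as a minimal sketch against a hypothetical sales_fact table:

-- ROLLUP: per-region subtotals and a grand total alongside the detail rows
SELECT region, product, SUM(amount) AS sales
FROM   sales_fact
GROUP  BY ROLLUP (region, product);

-- Materialized view + query rewrite: summary queries against SALES_FACT
-- can be transparently answered from this pre-aggregated structure
CREATE MATERIALIZED VIEW mv_sales_by_region
ENABLE QUERY REWRITE
AS
SELECT region, SUM(amount) AS sales
FROM   sales_fact
GROUP  BY region;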


Reports and dashboards that drive critical decisions are the public face of an analytics solution. Though OBIEE remains the standard analytics solution, tools like Interactive Reporting and Web Analysis that were part of the Hyperion set of tools also cater to this space. BI Apps, built using the OBIEE tool, provides an OOTB analytics solution for predefined sources like Oracle E-Business Suite, Siebel CRM, PeopleSoft and JD Edwards. Oracle publishes a compatibility matrix mapping the availability of the predefined ETLs to specific versions of the OLTP system. There are also ETLs grouped as Universal Adapters that cater to sourcing data from systems other than those listed above (e.g. SAP, custom OLTP systems, etc.).

OBIEE


Oracle Business Intelligence Suite Enterprise Edition Plus (Oracle BI EE Plus) is a comprehensive suite of enterprise BI products that delivers a full range of analysis and reporting capabilities. Featuring a unified, highly scalable, modern architecture, Oracle BI EE Plus provides intelligence and analytics from data spanning enterprise sources and applications—empowering the largest communities with complete and relevant insight.
     
OBIEE Applications


Oracle Business Intelligence Applications support over a dozen different functional areas with complete, pre-built, best-practice analytics, and a rich set of KPIs, metrics and analytic workflows. By accessing data from across the enterprise and delivering deep insight directly to business users, Oracle Business Intelligence Applications enable faster and more informed decisions that help the organization optimize resources, reduce costs, and improve the effectiveness of front- and back-office activities ranging from sales to human resources (HR) to procurement.

Oracle's ERP Analytics help front line managers improve cash flow, control expenses, manage headcount and employee performance, streamline spend and supply chain operations, and track the financial performance of major projects. Oracle's CRM Analytics provide fact-based insight into the entire sales process and into product demand, customer price sensitivity, and overall pricing effectiveness. They enable firms to manage and track campaign performance, and to segment and retain the most valuable customers. Organizations can assess the effectiveness of loyalty promotions and partner relationships, track and analyze key service center metrics, and generate a complete picture of contact center performance.

NOTE:
·         SAP is not supported OOTB in the current release
·         Informatica is the ETL tool that is used to populate the data warehouse from the OLTP systems
·         In the current version, Essbase is not integrated for storing aggregated data. However this can be customized.


Essbase is placed between the two timelines intentionally. An Essbase cube can be used to store aggregated data as well as to analyze present data using Excel along with Smart View. Essbase can also form a source for OBIEE, where sophisticated analysis can be carried out.
NOTE: Essbase is not a solution to store transactional level data.

Essbase


Oracle Essbase is the industry-leading OLAP (online analytical processing) server. The newest release of Oracle's Hyperion Essbase product, Oracle Essbase provides an environment for rapidly developing custom analytic and enterprise performance management applications. With a rapid application development environment that is managed by the business, Oracle Essbase enables the business to quickly model complex business scenarios. For example, Oracle Essbase makes it easy for line-of-business personnel to develop and manage analytic applications that model complex scenarios, forecast the business and perform "what-if" analyses in order to look into the future. Oracle Essbase supports extremely fast query response times for vast numbers of users, for large data sets, and for complex business models. It is hot-pluggable across any data source.




·         What will be the most likely project cost?
·         In the insurance business, given certain attributes of a person (e.g. job, salary, age, locality), what is the person's risk profile?
·         In a call center where a caller is ordering a product, wouldn't it be nice if you could suggest what else this person may be interested in and create an up-sell opportunity?

All these are instances of forecasting and predictive modeling. Three major products cater to this space, apart from the Hyperion application products like Planning or Financial Management. Essbase itself provides forecasting and trending capabilities, as discussed earlier.

Oracle Real-Time Decisions


Oracle's Real-Time Decisions (RTD) platform combines both rules and predictive analytics to power solutions for real-time enterprise decision management. It enables real-time intelligence to be instilled into any type of business process or customer interaction. A high-performance transactional server delivers real-time decisions and recommendations. This server automatically renders decisions within a business process and reveals insights, creating actionable intelligence from data flowing through the process in real time.   

Oracle Crystal Ball


Oracle Crystal Ball is the leading spreadsheet-based application suite for predictive modeling, forecasting, simulation, and optimization. It gives you unparalleled insight into the critical factors affecting risk. With Crystal Ball, you can make the right tactical decisions to reach your objectives and gain a competitive edge under even the most uncertain market conditions.
Oracle Crystal Ball solutions begin with Oracle Crystal Ball, the base package for predictive modeling, Monte Carlo simulation and forecasting. Oracle Crystal Ball Enterprise Performance Management builds on that set of tools by adding the power of Oracle Crystal Ball to your Oracle Enterprise Performance Management (EPM) and Oracle Business Intelligence (BI) applications.



Oracle Data Mining


Oracle Data Mining (ODM)—an option to Oracle Database 11g Enterprise Edition—enables customers to produce actionable predictive information and build integrated business intelligence applications. Using data mining functionality embedded in Oracle Database 11g, customers can find patterns and insights hidden in their data. Application developers can quickly automate the discovery and distribution of new business intelligence—predictions, patterns and discoveries—throughout their organization.
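
To give a flavor of how embedded this is, a hedged sketch using the DBMS_DATA_MINING package (the table, columns and model name are illustrative, and default algorithm settings are assumed):

-- Train an in-database classification model on historical data
BEGIN
  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'CHURN_MODEL',
    mining_function     => DBMS_DATA_MINING.CLASSIFICATION,
    data_table_name     => 'CUSTOMERS_TRAIN',
    case_id_column_name => 'CUST_ID',
    target_column_name  => 'CHURNED');
END;
/

-- Score new rows directly in SQL
SELECT cust_id,
       PREDICTION(churn_model USING *) AS predicted_churn
FROM   customers_new;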

Finally, the enterprise-wide static reporting solution is BI Publisher.

Oracle Business Intelligence (BI) Publisher

Oracle Business Intelligence (BI) Publisher (formerly XML Publisher) is an enterprise reporting solution for authoring, managing, and delivering all your highly formatted documents, such as operational reports, electronic funds transfer documents, government PDF forms, shipping labels, checks, sales and marketing letters, and much more. Built on open standards, Oracle BI Publisher also allows IT Staff and developers to create data models against practically any data source and build custom reporting applications that leverage existing infrastructure. Oracle BI Publisher can generate tens of thousands of documents per hour with minimal impact to transactional systems. Reports can be designed using familiar desktop products and viewed online or scheduled for delivery to a wide range of destinations.


Oracle is the leader in Enterprise Performance Management (EPM), unifying Performance Management and Business Intelligence (BI), supporting a broad range of strategic, financial and operational management processes. Oracle provides a complete and integrated system for managing and optimizing enterprise-wide performance. This allows organizations to achieve a state of management excellence - being smart, agile and aligned - which provides competitive advantage and leverages their operational investments.
  • Smart: Integrate information from your financial performance management, operational intelligence, and transactional applications
  • Agile: Enable advanced integration that improves agility and lowers costs of ownership
  • Aligned: Drive pervasive intelligence across the enterprise by linking strategic, financial and operational management processes



Tuesday, June 8, 2010

Data warehousing Architect – A Beginner's Guide


“Anywhere between 40% and 90%.” These are unflattering numbers from various past studies on the percentage of data warehousing projects that either failed to achieve all the objectives they started out with, or failed altogether and were shelved completely. From experience, the root cause of most of these failures lies in the very initial conceptualization of the project. The most fundamental questions are not asked, and the urge to deliver a BI solution precedes any logical comprehension of the actual requirement. This is what I call the “putting the cart before the horse” syndrome that afflicts most BI projects.

Introduction

What constitutes this project? This is not an uncommon dilemma at the start of a data warehousing project. The answer to this very important question, in my experience, makes or breaks the entire implementation. I have seen projects run for two years and then eventually break because this question was never adequately answered. The answer tends to be a bit tricky in data warehousing implementations, as in the real world we deal with multiple technologies integrating in different ways to get the desired project results. In this paper I will try to put together an outline methodology that can help kick-start the project.

  

Define and Scope


What exists already


The groundwork is the most important aspect of a successful project. No fancy technologies should be proposed here. Absolutely no jargon here, please.
 
·         A simple pen and paper would be your tool of choice.
·         Listening and patience will be your virtues.
·         Silly questions (trust me, they won't look all that silly later) will be the agenda for the day.

Try finding out what already exists and what the pain points are. You do not want to deliver an inferior solution after spending thousands of dollars and producing countless analysis reports.

Define Data warehousing scope


Data warehousing is an ocean of capabilities and scope. Varied aspects like setting up the database, creating a dashboard, embedding predictive analysis, optimizing and scheduling all come under the data warehousing umbrella. Broadly, the project may involve any or all of the following:

  • Sizing and database design
  • Define the data warehousing Model (generally a star schema)
  • Populating the data warehouse using ETL (or ELT)
  • Representing the data to the users using BI tools
  • Creating aggregation content and optimization

It is very important to clearly define which of these concepts form a part of the project. This is also the stage where you will map technologies to the requirements. Remember, technology must never be a yardstick that restricts the necessary deliverables for a business. In today's scenario, where one technology can seamlessly talk to another, deficiencies can easily be overcome.

Design

Data to Dashboards


This is the stage where we start talking code and technical jargon. But before you jump in and start recruiting developers, take a pause and assimilate all the study that has already taken place. These are the questions you need to be clear on:

  • Where is the data coming from?
  • Where does the data ultimately have to go?
  • How many parallel threads of development can happen at the same time and what are their dependencies?
  • Most importantly, what are the success criteria for each of the smaller code pieces? Once the smaller pieces are taken care of, the bigger picture generally becomes clearer.

Define Sources and Target


I would consider this the most important piece of definition work in the context of the entire project. Merely listing the sources and targets is not sufficient here. This is where you will get answers regarding complexity, appropriate staffing plans and the correct testing methodology, among others. Let us split the two and look at the critical questions you need to ask and the parameters that need to be factored in.
  

Source System


  • How fragmented is the source?
  • How many technologies are involved in a source? It can range from simple flat files to complex web services.
  • How good is the source data in terms of quality? If the incoming data quality is very bad, data quality mechanisms need to be put in place to get meaningful results.
  • Particularly when dealing with flat files, you may need to massage the file to get appropriate results. You may choose to do so in the ETL process, or you may do it upfront in the OS itself. For example, if these transformations need to be carried out on a UNIX OS, having a resource skilled in UNIX handy never hurts.

Target System       

           
  • It is possible that you need not build a target system at all, and that the BI system accesses the data sources directly rather than an intermediate data warehouse. This is a decision that would have already been taken in the previous steps; the important factors affecting it are most likely the cost and time of the implementation. In such cases, OBIEE's Common Enterprise Information Model is your knight in shining armor. Of course, you need to be extra careful while designing your metadata for the OBIEE Presentation layer.

  • Data warehouse or no data warehouse, this should lead to a common goal: "This system must answer all the business questions in a reasonable amount of time". This is what should drive a Business Intelligence project, no matter what decision is taken. You should probably write this down on a Post-it and stick it on your monitor. Then, whenever a developer asks you a question, first have a look at the Post-it, ponder, and only then answer.

It is advisable to freeze the target design before you start a parallel thread of BI design, else you will end up in a never-ending cycle of development.

ETL - Define Transporting Mechanism


This is perhaps the most undervalued aspect of a data warehousing implementation. More emphasis is always laid on the colorful BI piece of the project, and ETL is often overlooked. This may later prove to be a single point of failure in the final stages of the project, so extra care needs to be taken at this stage. Here is my suggested methodology for the design:
 
  1. Create the first cut of the Data warehouse aka star schema
  2. Figure out how each of the properties defined in the entities would get their data i.e. a Source to Target Mapping
  3. Next, look at the post-it you created above. Try to map your business questions to actual data and see if that makes sense. If you can answer all of the questions the very first time, kudos to you, but that is seldom the case.
  4. Redefine your data warehouse
  5. Go back to step 2

At the end of this cycle you will have a fixed source system for your BI front end. During this design you may want to think about materialized views or cubes to store aggregate data. Strategies to populate these structures have to be documented here as well.
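
As one example of such a population strategy, a hedged sketch that lets the database maintain an aggregate incrementally (the fact table and columns are illustrative, and fast refresh has further prerequisites in practice):

-- Materialized view log captures changes for incremental refresh
CREATE MATERIALIZED VIEW LOG ON sales_fact
  WITH ROWID (region, amount) INCLUDING NEW VALUES;

-- Aggregate kept in sync as each load transaction commits
CREATE MATERIALIZED VIEW mv_region_sales
REFRESH FAST ON COMMIT
AS
SELECT region,
       SUM(amount)   AS sales,
       COUNT(amount) AS cnt_amount,   -- required for fast refresh of SUM
       COUNT(*)      AS cnt
FROM   sales_fact
GROUP  BY region;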

Do look up data warehousing jargon like CDC, SCD, ELT, trickle feed, partitioning, MDM, conformed dimensions, etc. They may sound complex, but in reality they are small yet important concepts that encapsulate real-world scenarios.

(B)I-cing on the cake


With all the hard work done, you can now start on the BI part of your project. This should generally not give you many sleepless nights if all the previous steps were done right. A point to stress here: if the number of rows displayed on your dashboard scrolls beyond one page, this is possibly not the solution for you. If you are looking at static reports, consider solutions like BI Publisher that are better suited to static reporting. Your BI dashboard should cater to immediate, intuitive decision making, and must not involve scrolling through and interpreting data as in older times.
  
I have to address one question that keeps getting asked repeatedly in my sessions, particularly on OBIEE: where do I define the business transformations? Not a simple question to answer. My answer is always "it depends". Here are the considerations:
  
  • Define the logic in the BMM layer of your RPD if it is complex and reused by many requests
  • If the logic is a one-off, it makes sense to define it in the front end itself
  • A less famous option is to embed the logic in the data warehouse itself and populate it in the ETL stage. Use this option if your business logic remains fairly constant. Also, if the derived metrics are used across BI solutions, it is worth the extra effort in the back end.

Finally, define "what to access" and "who will access". This is taken care of by implementing security. You really don't want your colleague to access the HR database and find out your real age, do you? Just kidding! But in reality, only information relevant to the privileges assigned to the user must be displayed.

Remember two things here: authentication and authorization. They mean two different things. LDAP, ADSI and custom authenticators are some of the authentication methods, while VPD and dashboard web groups signify authorization.
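
On the VPD side, a hedged sketch of row-level authorization (the schema, table, policy and function names are made up; FN_EMP_PREDICATE would return a WHERE-clause fragment such as 'dept_id = 10' for the current user):

-- Attach a policy so every SELECT is silently filtered per user
BEGIN
  DBMS_RLS.ADD_POLICY(
    object_schema   => 'HR',
    object_name     => 'EMPLOYEES',
    policy_name     => 'EMP_OWN_DEPT_ONLY',
    function_schema => 'HR',
    policy_function => 'FN_EMP_PREDICATE',
    statement_types => 'SELECT');
END;
/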


Conclusion


As we have seen, data warehousing is a process, not just beautiful dashboards and colorful graphs. There is no such thing as a perfect data warehousing solution that fits all. There are no right or wrong answers here, and this makes the job of the architect all the more important. I firmly believe a data warehousing technologist is more like a psychiatrist, only we earn a lot less. The job is mostly about listening and guiding the customer to the obvious BI solution that achieves their ultimate business KPIs.
Just remember: your planning decides whether your data warehousing solution stands the test of time and runs for ten years, or your project runs for three years, runs out of funding and closes down.

Happy data warehousing!!!!




Thursday, May 27, 2010

Excel To Intelligence using OBIEE



Business Intelligence is nothing but gathering useful insights about the business from the tons of data all organizations have. This information gathering must not take a lot of time, nor require very sophisticated deductions to make sense of.
Legacy BI systems have always been plagued with problems like fragmented, inconsistent data. The analysis was very report-centric and hence restricted. This led to lots of data that was non-intuitive to a new user.

Oracle today is a market leader in the BI space because it has addressed the basic needs of BI reporting that have been lacking all these years. The basic pillars contributing to an insight-driven enterprise are:

·         Empower Everyone – Every person is provided with relevant, complete information tailored to their role.
·         Provide Real-time Intelligence – Deliver insight that predicts the best next step, and deliver it in time to influence the business outcome
·         Use Insight to Guide Actions – Lead people to take action based on facts to optimize decisions, actions and customer interactions

These pillars of a BI-enabled enterprise will drive the next level of value creation and competitive advantage for organizations.




Generally, any Business Intelligence discovery session starts off with an Excel sheet. The impact on the opportunity is greatly enhanced if the sales consultant can immediately demonstrate the capabilities of the very powerful (yet simple) features of the product.

A BI deduction derived from OBIEE reports definitely adds the awe factor to the product. If, during the demonstration, you hear lines like "I didn't know that about my department!!", consider your selling done.

This document will help you do just that. Taking the customer's sample Excel sheet, designing an OBIEE solution based on it, and gaining BI insights from the data creates a far greater impact than running through 100 slides listing the capabilities of the product. Remember, this may well be the very first time the customer is seeing a BI solution, and "simplicity" is probably the feature that needs to be demonstrated. At this point the customer is probably not interested in jargon like disconnected analytics, query federation or the very powerful caching capabilities of the product. These are definitely powerful built-in features that need to be mentioned, but only after the customer is hooked on the idea of a BI solution.

In this document I have taken a simple skills bank survey Excel sheet from a fictitious organization and used OBIEE to design requests based on the data. Features like the Common Enterprise Information Model, guided analytics, and dashboarding capabilities are some of those highlighted in this simulated solution.




We start off with a simple Excel sheet on our journey of BI discovery.



The Excel sheet has the following data columns.

  • Employee
  • Acquired Level Rating
  • Job Family
  • Required Level
  • Job Role
  • Gap
  • Job Role Grade
  • Business
  • Skill Domain
  • Department Id
  • Skill
  • Department Id Description
  • Survey Status
  • Location
  • Years In Current Job
  • Director Employee Number
  • Years In Current Job Range
  • Grade
  • Years Of Service
  • Outcomes Rating
  • Years Of Service Range
  • Behavior Rating
  • Acquired Level
  • Overall Rating

At this point, the data warehousing consultant in you may be calling out to design facts and dimensions. Remind yourself that the customer is new to BI itself and does not understand much about data warehousing and star schemas. In an ideal implementation, however, this is the point where you would design your data warehouse and put in place the various ETL techniques to populate it, forming the source for OBIEE.


You may directly use the Excel sheet as a source to OBIEE using ODBC. However, in this example I have sourced the data into an Oracle database table using Oracle Data Integrator (ODI); a sketch of a possible target table follows the two reasons listed below.



I have done this primarily for two reasons:

  1.  An Oracle Database gives me greater flexibility to remodel this solution at a later point in time if required.
  2.  I am introducing the customer to an ETL tool that is usually incorporated when dealing with large enterprise-wide BI solutions.
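
For reference, here is a hedged sketch of what the ODI target table might look like (a hypothetical subset of the survey columns; the real table would carry all of them):

-- Staging table loaded from the survey spreadsheet via ODI
CREATE TABLE skill_survey (
  employee       VARCHAR2(64)
, job_family     VARCHAR2(128)
, job_role       VARCHAR2(128)
, job_role_grade VARCHAR2(32)
, skill_domain   VARCHAR2(128)
, skill          VARCHAR2(128)
, required_level NUMBER
, acquired_level NUMBER
, gap            NUMBER        -- assumed: acquired minus required
);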




  
Next we need to put in place the metadata that will form the source for the Answers layer. We first have to import the table we just created in the back end. This step assumes you have already created an ODBC connection to the respective Oracle schema.
Point out to the customer the option to import a multidimensional data source. Again, bearing in mind that the customer is new to data warehousing, explain briefly the need for a multidimensional view of data (e.g. number of units sold viewed against a combination of region/time/customer profile/product profile/customer satisfaction). This may get the ball rolling for up-selling Oracle Essbase at a later point in time.





Point out the three separate layers in the Oracle BI Server Admin Tool you have just opened - Physical, Business Model & Mapping, and Presentation.

The Physical layer is an exact depiction of the target physical database(s). The Oracle BI EE platform is 100% data agnostic and has no preconceived notion of how the data is modeled. The key point here is that the analytics platform should not dictate the data design. Some BI tools require the data to be modeled in a particular way, such as a star or snowflake schema. Oracle BI EE maps in the data exactly the way the customer designed it to support their business processes. No data is moved; Oracle BI EE simply references the way the data looks from the database system catalog.


Next we move to defining the Business Model in the BMM portion of the BI Admin tool.

·         It provides a simplified “business view” of information available to all BI tools and all users in the organization regardless of physical data locations or structures.  In this way, you can almost think of the business model as a sort of firewall between the users of the information and all the mechanics of the physical data access layer.
·         Data is separate from the model. So let’s say that you have data on inventory in a certain operational data store that is mapped in, and then someone decides that system is going to be retired, so you have to map to a different source to get that inventory data. You can simply remap to the new schema, and it won’t impact all the end user reports because there is a separation or abstraction layer between the physical data layer and the logical business model layer. This allows a high degree of flexibility for IT to make changes to data structure without breaking thousands of end user reports, dashboards, etc.
·         It provides a place where business metrics and calculations are defined, enabling more consistency and alignment in the organization.  For example, let’s say the organization has a particular way in which they define customer value, or product profitability.  These calculations and metrics can be defined in the metadata and made available to users – define once, deploy everywhere.

We can create a logical table in the BMM layer and drag the required columns from the Physical layer we imported previously. From the analysis of the Excel sheet, there are three distinct areas of information:

  1. The Skill Details
  2. The Job details
  3. The rating for an employee against each skill. Apart from the rating, other employee-specific information like region, job grade, etc. also forms part of this logical table.

We will hence create three different logical tables, one for each of the above logical groupings.







We now need to relate these three different logical sources. The BI Admin tool provides a GUI which enables you to create these relations with ease. There is a concept called a "Complex Join" that applies when you join two sources. Put simply, this type of join paves the way for the BI server to intelligently relate two sources that may come from very different technologies. This is the way to establish joins in cases like ours, where we are using a flat Excel sheet.





Next we need to define the keys that will differentiate the rows in the supporting logical tables, namely Jobs and Skills.

Referring back to the Excel sheet: in Jobs, the combination of JOB_ROLE and JOB_FAMILY forms a unique key. Similarly, for Skills, the SKILL column has unique values.
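
Before declaring these keys in the Admin Tool, it is worth verifying them against the data. A hedged sketch, reusing the hypothetical skill_survey table from earlier (any rows returned would mean the pair is not a clean key):

-- Each JOB_ROLE/JOB_FAMILY pair should describe exactly one job;
-- conflicting grades would make the pair unusable as a logical key
SELECT job_role, job_family
FROM   skill_survey
GROUP  BY job_role, job_family
HAVING COUNT(DISTINCT job_role_grade) > 1;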


The Business Model is also where the administrator creates the default drill paths or hierarchies. In our Excel sheet there are two distinct hierarchies of information:

  1. Skill Domain Drills down to Skill
  2. Job Family Drills down to Job Role

E.g. if I click on Consulting Service Delivery, it should drill down to the following:
·         Executive Director
·         Senior Director
·         Director
·         IC4


These hierarchies are called "Dimensions" in the OBI tool. We can create them by simply right-clicking the corresponding supporting logical table and selecting the "Create Dimension" option.



The resultant dimension is not complete. As JOB_FAMILY is the parent of JOB_ROLE, you need to drag that field to the higher level.

         

You may follow the same steps to define the Skills hierarchy as well. Below is the drill-down effect we are trying to achieve in the front end.




Finally, we can populate the Presentation layer of the tool. The Presentation layer is a second layer of abstraction that sits on top of the Business Model. You simply drag and drop the logical tables you have just created from the Business Model into the Presentation layer. The Presentation layer is presented to end users in a complete WYSIWYG framework.

The goal is to make the navigation of the Subject Area as easy as possible for end users, enabling self-service analysis. The Presentation layer is completely masked from changes in the physical data, so there is no need to rebuild reports every time the physical data changes. Put this in context when you have a couple of thousand reports within an application: the amount of re-work would be mind-boggling.



Now we have almost (note "almost") completed the metadata setup. All of this is saved in a file on the file system with an extension of .rpd. This file is loaded by the BI Server when it starts up, and it is the core of all the magic we are about to see in the front end.


The end user usually accesses the presentation service. The entire BI solution is lightweight and accessed via a browser. This is the very first screen they will be prompted with.


This is the username/password of the .rpd we created earlier. The default user name is "Administrator" with a blank password.

After a successful log-in the user is looking at the "Answers" module of OBI. This is the section where ad-hoc queries are designed and executed. The first step is the selection of a "Subject Area". This is nothing but one of the parent folders we defined in our Presentation layer in the previous steps; in our case we have just one, called "SkillDB". If the solution were more complex, we could have chosen to distribute the sources from the BMM (Business Model and Mapping) layer into different Subject Areas, just to ease manageability.

           


After this selection we see an exact replica of the columns we defined in our Presentation layer.




We see all the columns are in capital case... not so pleasing to the eye. Remember I told you before that we were only "almost" done with the .rpd setup. There are some neat utilities within the BI Admin tool that can be used for little tasks like this. We could have chosen to manually rename each of the columns, but why do manual work when there is automation? The Rename Wizard is available under Tools > Utilities.




You will see in the above figure that all the elements are marked in red. This is because I have opened the .rpd in "online mode" and checked the elements out before making the changes.

After the rename process is done, you will need to check in the changes and refresh the presentation services before you see them take effect. The "Refresh Display" and "Reload Server Metadata" options are available below the Presentation layer table listing on the Answers page.



Now let's get to the serious business of creating ad-hoc reports and gaining insights into the Excel data. I want to find out which of the employees are not performing so well. Add the following columns in the Criteria:

·         Employee, Gap from Employee Facts
·         Skill Domain and Skill from Skills

Then click on the Pivot table option



All the columns in your criteria are grouped under rows by default. Using the simple drag-and-drop interface, distribute the columns as in the figure below.


If you do not see the gap values, change the “Aggregation Rule” to Sum in the Measures section.




What we have ended up with is a pivot table with the knowledge gap data. Though this is useful, it is not as intuitive as is expected of a BI tool.


We will now put in traffic lights to make the data more interpretable. We shall add conditional formatting to the GAP column, showing a red, yellow or green traffic light for gap values of <0, 0 and >0 respectively. The steps are shown below.









Lo and behold, below is the very user-friendly report that results from the conditional formatting we applied above.




Some conclusions that can be deduced from the results:

  • Emp0000002 has exceptional leadership qualities. Probably he/she should be encouraged to move into a managerial stream.

  • Emp0000001 is lagging behind in most of the skills. It would be ideal to send this employee to training to fill in the knowledge gaps. We can extend this report by adding conditional guided navigation that takes the end user to an internal training enrolment page, perhaps.


This report will go away as soon as we close the browser, and we don't want to lose all our hard work, do we? So we go ahead and save this report (or request, in OBIEE terms). Before we can do so there is one small step to carry out: the request first needs to be added to something called the "Compound View". This is the view that enables the request to be published to the outside world. Multiple such requests may be added to the compound view, and they can be of different types, e.g. graphs, pivot tables, simple tabular views, narratives, simple text, etc.

The steps to add the request to the compound view and save it are demonstrated below.







I have chosen to delete the tabular view that is added by default and retain just the pivot table before saving this request.






By now the customer is pretty awestruck by the product's capabilities. Drive the point home by adding these requests to dashboards and demonstrating the ease of putting together very complex enterprise-wide dashboards in very little time. You may add things like the view selector, dashboard prompts, drill-down graphs, etc. to demonstrate the sophistication of this tool.







Now is the time to talk to the customer about other advanced topics like star schemas, caching capabilities, aggregation setups, query federation, security models, multi-user development environments, BI Publisher, Disconnected Analytics, Oracle Delivers (e.g. iBots), the BI Office Add-In, etc.



To conclude, these are the points that need to be driven home about OBIEE at the end of this demo:

  • Simplifies BI tools infrastructure
  • Integrates BI foundation with IT architecture
  • Offers consistent view of information/single-source accountability
  • Improves customer choice for BI solutions