Data warehousing has always been synonymous with the database. Data warehousing is inevitably related with TB's of data and powerful DB machines. To a certain extent this is true, as what is analysis without a large data set. When dealing with such huge sets of data, it is always recommended to leverage the already natively available functionalities of the DB rather than depending of the BI frontend to do this job. The Oracle Database is continuously evolving as a complete Data warehousing system. In this series of articles I will introduce various features of the DB that specifically cater to Data warehousing.
Oracle Advanced Compression
The volume of data being generated and retained by businesses has seen an exponential growth in the last few years. On an average, it is not uncommon to see data growing by 2-3 times every couple of years thanks to a variety of factors such as government regulations that mandate storing historical data for a certain number of years, massive amount of user generated content through web 2.0 applications and increasing adoption of multimedia content – enabled in turn by growing penetration of broadband technologies. Consequently, terabytes sized database – that were a novelty in the past – have becomes so commonplace these days. A key challenge before enterprises today is how to manage this enormous volume of data in a cost effective manner as they still have a limited IT budget. On a more technical level, there is also an associated challenge of ensuring the application performance does not degrade unacceptably while accessing this enormously large volume of data.
This brings compression technology at forefront. Oracle has known and used compression technology in some form of other so it should not need much explanation. The most prominent benefit of compression of course is the reduced storage consumption. However, the compression technology needs to be highly efficient and intelligent before it can be used with enterprise application in view of their demanding performance and scalability requirements.
Let’s first talk about compression of regular relational data. Oracle was the early pioneer in terms of compression technology for databases when it introduced table compression for bulk load operations in Oracle9i. Using this feature, customers can compress data at the time of doing bulk load using operations such as DIRECTLOAD, CTAS, etc. However, until now, compression was not available for regular data manipulation operations such as INSERT and UPDATE. In Oracle database 11g, Oracle has extended the compression technology to support these operations as well. Consequently, compression in Oracle Database 11g can be used for all kinds of workload – be it OLTP or DW.
It is important to mention here that table compression enhancements introduced Oracle database 11g are not just incremental changes. Rather enormous amount of work has gone into making sure that the new compression technology has negligible impact on updates since any noticeable write time penalty due to compression won’t be acceptable in OLTP environment. As a result compression technology in Oracle database 11g is very efficient which could reduce the space consumption up to 2 to 3 times. And while you do that, not only your write performance does not degrade, your read performance or queries actually improves. This is because unlike desktop based compression techniques where you have to wait for data to uncompressed, our technology reads the compressed data directly and does not require any uncompressing. This makes sure that our query performance does not degrade because of compression. Now, the question is, How does the query performance improve?. It does so because we need to fetch less data to answer a query due to compression. Another advantage of fetching less data to satisfy a query is that we can now use the memory used to buffer the data read from disk more efficient thereby maximizing the memory utilization efficiency.
Finally, just to make sure that I do not miss anything obvious, Oracle’s compression technology is completely application transparent. That mean that you can use this technology with any home grown or packaged application such as SAP, Siebel , EBS, etc.
Sample Syntax
· OLTP Table Compression Syntax:
CREATE TABLE emp (
emp_id NUMBER
, first_name VARCHAR2(128)
, last_name VARCHAR2(128)
) COMPRESS FOR ALL OPERATIONS;
· Direct Load Compression Syntax (default):
CREATE TABLE emp (
emp_id NUMBER
, first_name VARCHAR2(128)
, last_name VARCHAR2(128)
) COMPRESS [FOR DIRECT_LOAD OPERATIONS];
Best Practices
· Compress your 10 Largest Tables
· 80/20 Rule – 20% of your Tables Consume 80% of your Space
· Better Compression with Bigger Block Sizes
· Higher Probability of Duplicate Values
· B-Tree Index Compression
· Validate Index and Review INDEX_STATS
· INDEX_STATS.OPT_CMPR_COUNT
· INDEX_STATS.OPT_CMPR_PCTSAVE
· Bitmap Indexes are Highly Compressed
· Good for low and medium cardinality
Compression Advisor
· Shows projected compression ratio for uncompressed tables
· Reports actual compression ratio for compressed tables
Quick Links
http://www.oracle.com/technology/products/database/compression/index.html
http://www.oracle.com/technology/oramag/oracle/08-jul/o48advcomp.html
http://www.oracle.com/technology/products/database/compression/compression-advisor.html
http://www.oracle.com/consulting/library/data-sheets/oracle-compression-quick-start.pdf
No comments:
Post a Comment