Thursday, July 2, 2015

Partitioning to Perfection


In Oracle Database 11g, your partitioning choices are now virtually limitless.


"Divide and conquer"—that figurative principle has never been better illustrated than by Oracle Database's partitioning features. Since version 8, you can partition a table or index into multiple segments and then place them in different tablespaces. The table is still addressed as a logical entity while the individual partitions are stored as separate segments, which allows for easier manipulation of data.
In version 11, enhancements such as reference partitioning, interval partitioning, partitioning virtual columns, and extended composite partitioning enable infinite partitioning design possibilities and boost manageability.

Extended Composite Partitioning
With composite partitioning—a scheme introduced in Oracle8i Database—you can create subpartitions from partitions, allowing further granularity of the table. But in that release, you could subpartition range-partitioned tables only via hash subpartitioning. In Oracle9i, composite partitioning was expanded to include range-list subpartitioning.
These schemes cater to most circumstances but not all. For example, say you have a table called SALES with many columns including two special ones that are candidates for partitioning: state_code, which stores a two-digit code for the state where the sale was made, ostensibly for the purpose of calculating the sales tax; and product_code, a three-digit number identifying the product sold by that sales record. Users query on the table filtering on both columns equally, and the archival requirements are also based on both these two columns. When you apply the principles of partitioning decisions, you find that both these columns are good candidates for partitioning keys.
In Oracle Database 11g, you can solve the problem fairly easily. In this release, you are not limited to range-hash and range-list composite partitioning. Rather, your choices are virtually limitless; you can create composite partitions in any combination.
In this example case, you can decide to LIST partition the table on product_code, as that column has more discrete values and then subpartition on state_code, again as list. This code sample shows how to do that:
create table sales
(
   sales_id     number,
   product_code number,
   state_code   varchar2(2)
)
partition by list (product_code)
subpartition by list (state_code)
(
   partition p101 values (101)
   (
      subpartition p101_ct values ('CT'),
      subpartition p101_ny values ('NY'),
      subpartition p101_def values (default)
   ),
   partition p201 values (201)
   (
      subpartition p201_ct values ('CT'),
      subpartition p201_ny values ('NY'),
      subpartition p201_def values (default)
   )
)


The options are not limited to what is shown here. You can also create LIST-RANGE composite partitions. Suppose, in the example above, product code is not discrete but is more of a range. You would want to list partition on state_code and then subpartition on product_code. Here is the code sample that does it.

create table sales1
(
   sales_id     number,
   product_code number,
   state_code   varchar2(2)
)
partition by list (state_code)
subpartition by range (product_code)
(
   partition CT values ('CT')
   (
      subpartition ct_100 values less than (101),
      subpartition ct_200 values less than (201)
   ),
   partition NY values ('NY')
   (
      subpartition NY_100 values less than (101),
      subpartition NY_200 values less than (201)
   )
)


You can create range-range composite subpartitions as well, which come in very handy when you may have two date fields. Consider, for instance, a table for a sales processing system that has a transaction date and delivery date. You may want to range partition on one date and then also range subpartition on the other. This scheme allows you to do backups, archiving, and purges based on dates.

In summary, you can create the following types of composite partitions available in Oracle Database 11g:
Range-range
Range-hash
Range-list
List-range
List-hash
List-list
Reference Partitioning
Here is a typical problem in designing partitioning schemes: not all the tables have the same columns on which you need to partition. Suppose you are creating a sales system with two simple tables, sales and customers:

create table customers
(
   cust_id   number primary key,
   cust_name varchar2(200),
   rating    varchar2(1) not null
)
partition by list (rating)
(
   partition pA values ('A'),
   partition pB values ('B')
);


The table sales is created as shown below. This is a child table of the customers table.

create table sales
(
   sales_id    number primary key,
   cust_id     number not null,
   sales_amt   number,
   constraint  fk_sales_01
    foreign key (cust_id)
    references customers
);


Ideally, you would want to partition the table sales in the same manner as table customers: list partitioned on the column rating. But there is a serious problem: table sales does not have a column called rating! So how do you partition it on a non-existent column?

In Oracle Database 11g you can, using a new feature called Reference Partitioning. Here is an example to show how you can apply it to the sales table:
create table sales
(
   sales_id    number primary key,
   cust_id     number not null,
   sales_amt   number,
   constraint  fk_sales_01
    foreign key (cust_id)
    references customers
)
partition by reference (fk_sales_01);


This creates partitions identical to those in the parent table, customers. Note that there is no column called rating, yet the table has been partitioned on that column. The clause partition by reference (fk_sales_01) has the name of the foreign key in the partition definition. This instructs Oracle Database 11g to confirm the partitioning is done per the scheme used in the parent table—in this case, customers. Note the NOT NULL constraint for column cust_id.; this is required for reference partitioning.

If you check the partition boundaries for the partitions in sales table:
SQL> select partition_name, high_value
  2  from user_tab_partitions
  3  where table_name = 'SALES';

PARTITION_NAME  HIGH_VALUE
---------------        -------------------------------
PA
PB


The high value is null, meaning that the boundaries here are derived from the parent table. The partitions have the same names as in the parent table. You can check the type of partitioning by querying the view user_part_tables. A special column called ref_ptn_constraint_name shows the foreign key constraint name.

SQL> select table_name, partitioning_type, ref_ptn_constraint_name
  2  from user_part_tables
  3  where table_name in ('CUSTOMERS','SALES');

TABLE_NAME                     PARTITION REF_PTN_CONSTRAINT_NAME
------------------------------                ---------     --------------------------
CUSTOMERS                      LIST
SALES                          REFERENCE FK_SALES_01


Reference partitions come extremely handy when you want to partition a child table in the same fashion as in the parent table but do not have the same columns, and you do not want to introduce them just for the sake of partitioning. Furthermore, you do not need to explicitly declare a long partitioning clause for each child table.
Interval Partitioning
Range partitioning allows you to create partitions based on ranges of the values of the partition key column. Here is an example of the range partitioned table:


create table sales6
(
   sales_id    number,
   sales_dt    date
)
partition by range (sales_dt)
(
   partition p0701 values less than (to_date('2007-02-01','yyyy-mm-dd')),
   partition p0702 values less than (to_date('2007-03-01','yyyy-mm-dd'))
);


Here you have defined partitions for January 2007 and February 2007 only, so what happens if a record is inserted into the table that has the sales_dt in March 2007? The insert will fail with the following error:

ORA-14400: inserted partition key does not map to any partition


Obviously you need to add a partition for March 2007 before you can insert a record. But this is often easier said than done. Often you can't afford to create a lot of partitions beforehand and too few of them may result in this error.

Wouldn't it be better if Oracle somehow automatically sensed the need for new partitions and then created them? Oracle Database 11gdoes, with a feature called Interval Partitioning. Here, you don't define partitions and their boundaries but merely an interval that defines each partition's boundaries. Here is the same example in interval partitioning:
create table sales6
(
   sales_id    number,
   sales_dt    date
)
partition by range (sales_dt)
                             
interval (numtoyminterval(1,'MONTH'))
(
   partition p0701 values less than (to_date('2007-02-01','yyyy-mm-dd'))
);
                           
Note the clause: interval followed by the interval. Here you have instructed Oracle to create intervals of one month each. You have also created the initial partition named p0701, for the January 2007 data. Now, suppose you insert a record with June 2007 data:

SQL> insert into sales6 values (1,'01-jun-07');

1 row created.


Oracle does not return an error; rather; it successfully executes the statement. So where does the record go to? The partition p0701 can't have the record and we haven't defined a partition for June 2007. But at this time if you check the partitions of the table:

SQL> select partition_name, high_value
  2  from user_tab_partitions
  3  where table_name = 'SALES6';

PARTITION_NAME  HIGH_VALUE
---------------        ----------------------------------------------------------------
P0701           TO_DATE(' 2007-02-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_C
                ALENDAR=GREGORIA

SYS_P41         TO_DATE(' 2007-07-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_C
                ALENDAR=GREGORIA


Note the partition named SYS_P1 with a high value of July 1, 2007, which will accommodate data up to the end of June. This partition was created dynamically by Oracle and has a system generated name.

Now suppose you enter a value lower than highest value, such as May 1, 2007. It should ideally have its own partition, as your partition interval is a month.
SQL> insert into sales6 values (1,'01-may-07');

1 row created.

SQL> select partition_name, high_value
  2  from user_tab_partitions
  3  where table_name = 'SALES6';

PARTITION_NAME  HIGH_VALUE
---------------        ----------------------------------------------------------------
P0701           TO_DATE(' 2007-02-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_C
                ALENDAR=GREGORIA

SYS_P41         TO_DATE(' 2007-07-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_C
                ALENDAR=GREGORIA

SYS_P42         TO_DATE(' 2007-06-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_C
                ALENDAR=GREGORIA


Note the new partition SYS_P42, which has the upper bound as June 1—thus the partition can hold the May 2006 data. This partition was created by splitting the SYS_P41 partitions (for June). Thus, Oracle automatically creates and maintains the partitions when you define an interval partitioning scheme.

If you want to store the partitions in specific tablespaces, you can do so by using the store in clause:
interval (numtoyminterval(1,'MONTH'))
store in (TS1,TS2,TS3)


which stores the partitions in tablespaces TS1, TS2, and TS3 in a round robin manner.

How would an application developer address a specific partition? One way is to know the name, which may not be possible, and even if you know, it is highly error prone. To facilitate the access to specific partition, Oracle Database 11g offers a new syntax for partitioning SQLs:
SQL> select * from sales6 partition for (to_date('15-may-2007','dd-mon-yyyy'));

  SALES_ID SALES_DT
  ----------    ---------
      1    01-MAY-07


Note the new clause for (value), which allows you to directly reference partitions without explicitly calling them by their exact name. If you want to truncate or drop a partition, you can call this extended portioning syntax.

After the table is created in this manner, the PARTITIONING_TYPE column in view DBA_PART_TABLES shows INTERVAL.

System Partitioning
Although Oracle anticipates few if any practical uses for this feature, I want to describe it anyway because it's really cool.
Here is a rare but not inconceivable of use case: Imagine that you have a table that just can't be partitioned in any logical way. The result is a huge, monolithic table, which poses problems such as the need fo extended index maintenance and other operations.
So, the developers come forward with a solution: They promise that if the table can be partitioned somehow, they can write to the partitions in an intelligent manner. By doing so, the application can control which partition a specific record goes to. The DBA need merely define the partitions. Here is an example:
create table sales3
(
    sales_id   number,
    product_code number,
    state_code   number
)
partition by system

   partition p1 tablespace users,
   partition p2 tablespace users
);

Note that there is no partition key or the boundaries. So, the table is physically divided into two segments but is still a logical table. Once defined this way, the database creates two segments for the table, instead of just one monolithic table. You can check it by:

SQL> select partition_name
  2  from user_segments
  3  where segment_name = 'SALES3';

PARTITION_NAME
------------------------------
P1
P2

When you create a local index, it is also partitioned the same way.

SQL> create index in_sales3_state on sales3 (state_code) local;

Index created.

SQL> select partition_name
  2  from user_segments
  3  where segment_name = 'IN_SALES3_STATE';

PARTITION_NAME
------------------------------
P1
P2

You can check the type of partitioning by checking in user_part_tables:

SQL> select partitioning_type
  2  from user_part_tables
  3  where table_name = 'SALES3';

PARTITION
---------
SYSTEM

This shows up as SYSTEM, indicating system partitioning, of course. One point to note that the high_value column values are NULL for these types of tables.

SQL> select partition_name, high_value
  2  from user_tab_partitions
  3  where table_name = 'SALES3';

PARTITION_NAME HIGH_VALUE
--------------        ---------------------
P1              
P2

Here's an interesting question: if there is no partitioning key or scheme such as range, list or hash,how does Oracle know to which partition the incoming record should go?

The answer is: Oracle doesn't. Here is an example of what happens if you want to insert a record into the table:
SQL> insert into sales3 values (1,101,1);
insert into sales3 values (1,101,1)
            *
ERROR at line 1:
ORA-14701: partition-extended name or bind variable must be used for DMLs on
tables partitioned by the System method
The partitions bounds are not known so the application must provide that information by using the partition-aware syntax while inserting data. You would need to rewrite that statement as:


SQL> insert into sales3 partition (p1) values (1,101,1);

1 row created.

While deleting, you do not have to provide the partition-aware syntax—but remember, there is no concept of partition boundaries. So, when you issue a statement like this:

SQL> delete sales3 where state_code = 1;

Oracle has to scan all the partitions to see where the row resides. To avoid that, you should write it as:

SQL> delete sales3 partition (p1) where state_code = 1;

The same goes for updates. This limits the partitions where the record is searched.
System partitions offer tremendous advantages when a table can't be partitioned in any logical way. They let you take advantage of the benefits of partitioning while allowing free rein to the developers in deciding to which partition a record should go.
Tablespace Transport for a Single Partition
In earlier versions of Oracle Database, you gained the ability to transport a tablespace and later plug it into a different database or to the same one. The process involves the copying of datafiles so it is the fastest way to transfer data across databases. However, until now, you didn't have the ability to transport the tablespace of a single partition and then plug it back. In Oracle Database 11g, you can.

Suppose you have a table called SALES5, with several partitions named CT, NY, etc.
SQL> select partition_name, tablespace_name
  2  from user_tab_partitions
  3  where table_name = 'SALES5';


PARTITION_NAME TABLESPACE_NAME
-------------- ---------------
CT             TS1
NY             TS2

Now, you can transport the partition CT using the command shown below:


$ expdp tables=scott.sales5:ct transportable=always directory=data_pump_dir dumpfile=p_ct.dmp Export:
Release 11.1.0.4.0 - Beta on Sunday, 10 June, 2007 16:05:40 Copyright (c) 2003, 2005, Oracle.
All rights reserved. Username: / as sysdba

Connected to: Oracle Database 11g Enterprise Edition Release 11.1.0.4.0 - Beta
With the Partitioning, Oracle Label Security, OLAP, Data Mining
and Oracle Database Vault options
Starting "SYS"."SYS_EXPORT_TABLE_01":  /******** AS SYSDBA tables=scott.sales5:ct transportable=
  always directory=data_pump_dir dumpfile=p_ct.dmp
Processing object type TABLE_EXPORT/TABLE/PLUGTS_BLK
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/END_PLUGTS_BLK
Master table "SYS"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
****************************************************************************
Dump file set for SYS.SYS_EXPORT_TABLE_01 is:
  /home/oracle/oracle/admin/PROBE2/dpdump/p_ct.dmp
******************************************************************************
Datafiles required for transportable tablespace TS1:
  /home/oracle/oradata/PROBE2/PROBE2/ts1_01.dbf
Job "SYS"."SYS_EXPORT_TABLE_01" successfully completed at 16:05:55

Now, you can take these two files—p_ct.dmp and ts1_01.dmp—to another system and try to plug into the database. For learning purposes, let's try to plug into the same database. First, you need to drop the table and then the tablespace ts1.

SQL> drop table scott.sales5;

Table dropped.

SQL> drop tablespace ts1 including contents;

Tablespace dropped.

Now, plug the tablespace into the database. But here's a little problem: the table sales5 no longer exists and you had initially exported only one partition (ct), not the entire table. So how can you import just one partition of a non-existent table?
In Oracle Database 11g, a new command line option in Data Pump Import called partition_options makes that possible. If you specify the value departition, Data Pump will create a new table from the partitions exported. In a way this approach "breaks" partitions, so it's appropriately named departition. Let's see how it works.
$ impdp partition_options=departition dumpfile=p_ct.dmp
   transport_datafiles='/home/oracle/oradata/PROBE2/PROBE2
/ts1_01.dbf'

Import: Release 11.1.0.4.0 - Beta on Sunday, 10 June, 2007 21:58:08

Copyright (c) 2003, 2005, Oracle.  All rights reserved.

Username: / as sysdba

Connected to: Oracle Database 11g Enterprise Edition Release 11.1.0.4.0 - Beta
With the Partitioning, Oracle Label Security, OLAP, Data Mining
and Oracle Database Vault options
Master table "SYS"."SYS_IMPORT_TRANSPORTABLE_04" successfully loaded/unloaded
Starting "SYS"."SYS_IMPORT_TRANSPORTABLE_04":  /******** AS SYSDBA partition_options=
   departition dumpfile=p_ct.dmp transport_datafiles=/home/oracle/oradata/PROBE2/PROBE2/ts1_01.dbf
Processing object type TABLE_EXPORT/TABLE/PLUGTS_BLK
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/END_PLUGTS_BLK
Job "SYS"."SYS_IMPORT_TRANSPORTABLE_04" successfully completed at 21:58:23

This SQL creates a table called sales5_ct, which is nothing but the ct partition of SALES5 table exported by transportable tablespace earlier. The table name, as you can see, is a combination of the original table and the partition names. You can conform the presence of the segment by checking the DBA_SEGMENTS view.

SQL> select segment_name
  2  from dba_segments
  3  where tablespace_name = 'TS1';

SEGMENT_NAME
-----------------
SALES5_CT

You can use single-partition transportable tablespace feature to plug in a single partition of a table to a different database. After plugging it in, you may want to perform an exchange partition operation to put that as a partition on some table there.

Partitioning on Virtual Columns
Let's see another common problem. In your table called sales, you have the following columns:

SQL> desc sales
 Name                                      Null?    Type
 -----------------------------------------                     --------     ------
 SALES_ID                                  NOT NULL NUMBER
 CUST_ID                                   NOT NULL NUMBER
 SALES_AMT                                          NUMBER

Suppose you want to partition this table by some scheme that allows you to purge and archive is based on the amount of sale. Here are the four categories of sale:



If sale_amt is and cust_id is Then sale_category is
0-10000 anything LOW
10001-100000 0-100 LOW
10001-100000 101-200 MEDIUM
10001-100000 >200 HIGH
100001-1000000 0-100 MEDIUM
100001-1000000 101-200 HIGH
100001-1000000 >200 ULTRA
>1000000 Anything ULTRA

You want to partition this table on the column sale_category, but there is a problem: there is no column called sale_category. This is something you derived from the sale_amt column. How can you partition this table, then?
In earlier versions of Oracle you could have put a new column in the table called sale_category, and used a trigger to populate the column using the logic shown in the table. But the presence of the new column would have caused another performance hit due to the triggers.
In Oracle Database 11g, a new feature called Virtual Columns allows you to create a column that is not stored in the table but rather is computed at run time. You can also partition on this column. Using this feature, it's a breeze to partition this table.
create table sales
(
   sales_id      number,
   cust_id       number,
   sales_amt     number,
   sale_category varchar2(6)
   generated always as
   (
      case
         when sales_amt <= 10000
            then 'LOW'
         when sales_amt > 10000
            and sales_amt <= 100000
            then case
               when cust_id < 101 then 'LOW'
               when cust_id between 101 and 200 then 'MEDIUM'
               else 'MEDIUM'
            end
         when sales_amt > 100000
            and sales_amt <= 1000000
            then case
               when cust_id < 101 then 'MEDIUM'
               when cust_id between 101 and 200 then 'HIGH'
               else 'ULTRA'
            end
         else 'ULTRA'
      end
    ) virtual
)
partition by list (sale_category)
(
   partition p_low values ('LOW'),
   partition p_medium values ('MEDIUM'),
   partition p_high values ('HIGH'),
   partition p_ultra values ('ULTRA')
)

Now if you insert rows:
SQL> insert into sales (sales_id,cust_id,sales_amt) values (1,1,100);  1 row created.
SQL> insert into sales (sales_id,cust_id,sales_amt) values (2,1,1500);  1 row created.
SQL> insert into sales (sales_id,cust_id,sales_amt) values (3,102,1500);  1 row created.
SQL> insert into sales (sales_id,cust_id,sales_amt) values (4,102,10000);  1 row created.
SQL> commit;  Commit complete.

The record was placed in the appropriate partition.

Partitioning on virtual columns allows you to create partitions that make sense for business even though the column itself is not present. Here you have used a very simple calculation for the virtual column but it can be as complex as you like. In those cases, partitioning on a virtual column becomes even more valuable.

No comments:

Post a Comment