Site Network: Home | About

SQL Loader has basically two types of data load modes... - Conventional path load method and - Direct path load method By default it is first method...which we've been using in our project to load the data. In my previous posts, I think I had explained about how we do data load, which strategy we follow in distributing the files across 3 different instances. Let me just quickly take you through...

  • Ours is RAC with 3 instances.
  • Need to utilize maximum h/w capacity.
  • and data being loaded is a partitioned table(LIST)...so if we randomly pick the files and split into 3 groups and start data load...obviously there would be a lot contention on partition level. How do we do here?
each file will also into a single partition and we also know that into which partition files go into. So, we distribute the files carefully in a such a way that no two instances at any point of time during the load, loads the data into a same partition. Confused!! even I can't get my words clearly...I will put it in simple indirectly we make every instance to use only a set of partitions exclusively. Then what?...absolutely tremendous throughput..total time was cut down from 15hr to 5 hrs!!. Now, everything is going to change. In my last post (about JB's meeting) I talked few about our new design where we are going to use HASH partitioning instead of LIST. List partitioning is fully under our control - we know into a which a partition will be put, so files distribution was possible. But, Hash is different - only Oracle knows it. Basically, HASH itself is a slow process when it comes to data load - because Oracle needs to apply Hash algorithm on every record and decide the partition. Now it became a challenge implementing the similar for Hash, we don't even know the partition of a record, how can we distribute the files??. - absolutely not possible!! So, I tried to use DIRECT PATH Load option in SQL Loader. But again this did not work. We have a sequence generator (for primary key column) in sql loader and this is not allowed in Direct path load - it simply inserts NULLS. I tried to find alternatives but no use. http://www.orafaq.com/forum/t/50054/0/ - this thread has few discussions on the same. There it was suggested to a SEQUENCE() function - but not recommended, performance would again be bad. And I also felt that it is not easily maintainable between the tables.
....
deptno CHAR TERMINATED BY ':',
projno,
loadseq  SEQUENCE(MAX,1))
The SEQUENCE function generates a unique value in the column loadseq. This function finds the current maximum value in column loadseq and adds the increment (1) to it to obtain the value for loadseq for each row inserted.
- Karteek

0 Comments:

Post a Comment