Thursday, November 11, 2010

SQl Loader

SQLLDR Parameters

BAD

The name of a file that will contain rejected records at the end of the load. If you do not specify a name for this, the BAD file will be named after the CONTROL file (see later in the chapter for more details on control files), we use to load with. For example, if you use a CONTROL file named foo.ctl, the BAD file will default to foo.bad, which SQLLDR will write to (or overwrite if it already exists).


BINDSIZE

The size in bytes of the buffer used by SQLLDR to insert data in the conventional path loader. It is not used in a direct path load. It is used to size the array with which SQLLDR will insert data.


CONTROL


The name of a CONTROL file, which describes to SQLLDR how the input data is formatted, and how to load it into a table. You need a CONTROL file for every SQLLDR execution.

DATA

The name of the input file from which to read the data. DIRECT Valid values are True and False, with the default being False. By default, SQLLDR will use the conventional path load method.


DISCARD

The name of a file to write records that are not to be loaded. SQLLDR can be used to filter input records, allowing you to specify that only records meeting a specified criteria are loaded.


DISCARDMAX

Specifies the maximum number of discarded records permitted in a load. If you exceed this value, the load will terminate. By default, all records may be discarded without terminating a load.

ERRORS

The maximum number of errors encountered by SQLLDR that are permitted before the load terminates. These errors can be caused by many things, such as conversion errors (by trying to load ABC into a number field for example), duplicate records in a unique index, and so on. By default, 50 errors are permitted, and then the load will terminate. In order to allow all valid records to be loaded in that particular session (with the rejected records going to the BAD file), specify a large number
such as 999999999.


FILE

When using the direct path load option in parallel, you may use this to tell SQLLDR exactly which database data file to load into. You would use this to reduce contention for the database data files during a parallel load, to ensure each loader session is writing to a separate device.

LOAD

The maximum number of records to load. Typically used to load a sample of a large data file, or used in conjunction with SKIP to load a specific range of records
from an input file.


LOG

Used to name the LOG file. By default, SQLLDR will create a LOG file named after the CONTROL file in the same fashion as the BAD file.


PARALLEL

Will be TRUE or FALSE. When TRUE, it signifies you are doing a parallel direct path load. This is not necessary when using a conventional path load – they may be done
in parallel without setting this parameter.

PARFILE

Can be used to specify the name of a file that contains all of these KEYWORD=VALUE pairs. This is used instead of specifying them all on the command line.


READSIZE

Specifies the size of the buffer used to read input data.

ROWS

The number of rows SQLLDR should insert between commits in a conventional path load. In a direct path load, this is the number of rows to be loaded before performing a data save (similar to a commit). In a conventional path load, the default is 64 rows. In a direct path load, the default is to not perform a data save until
the load is complete.

SILENT

Suppresses various informational messages at run-time.


SKIP

Used to tell SQLLDR to skip x number of records in the input file. Most commonly used to resume an aborted load (skipping the records that have been already loaded),
or to only load one part of an input file.

USERID

The USERNAME/PASSWORD@DATABASE connect string. Used to authenticate to the database.


SKIP_INDEX_MAINTENANCE


Does not apply to conventional path loads – all indexes are always maintained in this mode. In a direct path load, this tells Oracle not to maintain indexes by marking them as unusable. These indexes must be rebuilt after the load.

SKIP_UNUSABLE_INDEXES

Tells SQLLDR to allow rows to be loaded into a table that has unusable indexes, as long as the indexes are not unique indexes.

Control file Informations:-

LOAD DATA

LOAD DATA – This tells SQLLDR what to do, in this case load data. The other thing SQLLDR can do is CONTINUE_LOAD, to resume a load. We would use this option only when continuing a multi-table direct path load.


INFILE *

INFILE * – This tells SQLLDR the data to be loaded is actually contained within the control file itself (see below). Alternatively you could specify the name of another file that contains the data. We can override this INFILE statement using a command line parameter if we wish. Be aware that command line options override control file settings, as we shall see in the Caveats section.

INTO TABLE DEPT

INTO TABLE DEPT – This tells SQLLDR to which table we are loading data, in this case the DEPT table.


FIELDS TERMINATED BY ‘,’


FIELDS TERMINATED BY ‘,’ – This tells SQLLDR that the data will be in the form of commaseparated values. There are dozens of ways to describe the input data to SQLLDR, this is just one of the more common methods.
(DEPTNO,
DNAME,
LOC
)
(DEPTNO, DNAME, LOC) – This tells SQLLDR what columns we are loading, their order in the input data, and their data types. The data types are of the data in the input stream, not the data types in the database. In this case, they are defaulting to CHAR(255), which is sufficient.


BEGINDATA

BEGINDATA – This tells SQLLDR we have finished describing the input data, and that the very next line will be the actual data to be loaded into the DEPT table:

10,Sales,Virginia
20,Accounting,Virginia
30,Consulting,Virginia
40,Finance,Virginia


Load Delimited Data

Delimited data, data that is separated by some special character, and perhaps enclosed in quotes, is the most popular data format for flat files today. On a mainframe, a fixed length, fixed format file would probably be the most recognized file format, but on UNIX and NT, delimited files are the norm. In this
section, we will investigate the popular options used to load delimited data.
The most popular format for delimited data is the CSV format where CSV stands for comma-separated values. In this file format, where each field of data is separated from the next by a comma, text strings can be enclosed within quotes, thus allowing for the string itself to contain a comma. If the string must contain a quotation mark as well, the convention is to double up the quotation mark (in the following
code we use “” in place of just a “).

A typical control file to load delimited data will look much like this:


LOAD DATA
INFILE *
INTO TABLE DEPT
REPLACE
FIELDS TERMINATED BY ‘,’ OPTIONALLY ENCLOSED BY ‘“‘
(DEPTNO,
DNAME,
LOC
)
BEGINDATA
10,Sales,”””USA”””
20,Accounting,”Virginia,USA”
30,Consulting,Virginia
40,Finance,Virginia
50,”Finance”,””,Virginia
60,”Finance”,,Virginia


The following line performs the bulk of the work:
FIELDS TERMINATED BY ‘,’ OPTIONALLY ENCLOSED BY ‘“‘

It specifies that a comma separates the data fields, and that each field might be enclosed in double quotes. When we run SQLLDR using this control file, the results will be:
SQL> select * from dept;
DEPTNO DNAME LOC
---------- -------------- -------------
10 Sales “USA”
20 Accounting Virginia,USA
30 Consulting Virginia
40 Finance Virginia


Another popular format is tab-delimited data: data that is separated by tabs rather than commas. There are two ways to load this data using the TERMINATED BY clause:

a) TERMINATED BY X’09’, which is the tab character using hexadecimal format (ASCII 9 is a tab character), or you might use



b) TERMINATED BY WHITESPACE

======================================
LOAD DATA
INFILE *
INTO TABLE DEPT
REPLACE
FIELDS TERMINATED BY WHITESPACE
(DEPTNO,
DNAME,
LOC)
BEGINDATA
10 Sales Virginia
======================================

On the other hand, if you were to use FIELDS TERMINATED BY X’09’, as the following control file does:

LOAD DATA
INFILE *
INTO TABLE DEPT
REPLACE
FIELDS TERMINATED BY X’09’
(DEPTNO,
DNAME,
LOC
)
BEGINDATA
10 Sales Virginia

You would find DEPT loaded with the following data:
SQL> select * from dept;
DEPTNO DNAME LOC
---------- -------------- -------------
10 Sales

Here, once SQLLDR encountered a tab, it output a value. Hence, 10 is assigned to DEPTNO, and DNAME gets Null since there is no data between the first tab, and the next occurrence of a tab. Sales gets assigned to LOC.

This is the intended behavior of TERMINATED BY WHITESPACE, and TERMINATED BY . Which is more appropriate to use will be dictated by the input data, and how you need it to be interpreted.


Lastly, when loading delimited data such as this, it is very common to want to skip over various columns in the input record. For example, you might want to load columns 1, 3, and 5, skipping columns 2 and 4. In order to do this, SQLLDR provides the FILLER keyword. This allows us to map a column in an input record, but not put it into the database. For example, given the DEPT table from above, the following control file contains 4 delimited fields but will not load the second field into the
database:


LOAD DATA
INFILE *
INTO TABLE DEPT
REPLACE
FIELDS TERMINATED BY ‘,’ OPTIONALLY ENCLOSED BY ‘“‘
( DEPTNO,
FILLER_1 FILLER,
DNAME,
LOC
)
BEGINDATA
20,Something Not To Be Loaded,Accounting,”Virginia,USA”


The resulting DEPT table is:
SQL> select * from dept;
DEPTNO DNAME LOC
---------- -------------- -------------
20 Accounting Virginia,USA


Loading Dates
=====================
LOAD DATA
INFILE *
INTO TABLE DEPT
REPLACE
FIELDS TERMINATED BY ‘,’
(DEPTNO,
DNAME,
LOC,
LAST_UPDATED date ‘dd/mm/yyyy’
)
BEGINDATA
10,Sales,Virginia,1/5/2000
20,Accounting,Virginia,21/6/1999
30,Consulting,Virginia,5/1/2000
40,Finance,Virginia,15/3/2001
The resulting DEPT table will look like this:

SQL> select * from dept;
DEPTNO DNAME LOC LAST_UPDA
---------- -------------- ------------- ---------
10 Sales Virginia 01-MAY-00
20 Accounting Virginia 21-JUN-99
30 Consulting Virginia 05-JAN-00
40 Finance Virginia 15-MAR-01

===============================
LOAD DATA
INTO TABLE DEPT
APPEND
FIELDS TERMINATED BY ‘,’
(DEPTNO,
DNAME “upper(:dname)”,
LOC “upper(:loc)”,
LAST_UPDATED “my_to_date( :last_updated )”
)
===============================


Ref : Expert Oracle - Thomas Kyte
Also look the below on faq on sqlldr


Problem:-

field in data file exceeds maximum length in sql loader

Solution by tom:-

The default datatype in SQLLDR is char(255). Simply code:
...
colname char(1000) NULLIF colname=BLANKS,
colname char(4000) NULLIF colname=BLANKS,
....

in the control file to have sqlldr allocate a big enough buffer to hold it



http://www.orafaq.com/wiki/SQL*Loader_FAQ

No comments:

Post a Comment