Bagi Chandrakasan

SQLLDR Options


SQL*Loader (SQLLDR):

My daily workflow involves receiving flat files, normally pipe delimited, loading them into staging tables, and then either upserting into the main table or using partition exchange. Since the input files vary widely, so do the options used - anywhere from calling functions to parse fields, to defaulting values, to skipping fields that are no longer in use. Below are a few methods I use frequently.

The SQLLDR utility is commonly used to fast-load flat files into a staging table, after which PL/SQL or other programming languages update the final tables. Recent releases of Oracle also have external tables, which use the SQL*Loader engine internally to provide seamless access to the underlying data.

Let’s call my sqlldr control file my_big_loader.ctl. To run it from the command prompt: sqlldr control=my_big_loader.ctl userid=user/pass@db. If userid is omitted, sqlldr will prompt for the missing information.
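
A typical invocation then looks like this (the log file name is my choice here; sqlldr picks a default log name if the log option is omitted):

sqlldr userid=user/pass@db control=my_big_loader.ctl log=my_big_loader.log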

Commonly used options and variations are listed below; not every possibility is covered.

file: my_big_loader.ctl

Specify options:

  options (rows=1000, errors=1000) -- run in conventional path
  options (rows=1000, errors=1000, direct=true) -- run in direct path.
  options (SKIP_UNUSABLE_INDEXES=TRUE) -- allows loading into table when indexes are in unusable state. Unique indexes/constraints will still throw an error.

Direct path load is good for loading large files, but it has its drawbacks as well. It bypasses the conventional SQL engine and formats database blocks directly, and it can also be parallelized.
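
The same flags can also be passed on the command line instead of the OPTIONS clause; for example, a parallel direct path load (the parallel option applies only to direct path loads):

sqlldr control=my_big_loader.ctl userid=user/pass@db direct=true parallel=true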

Redo logs:

unrecoverable – When run in direct path, redo log generation is suppressed. Check your backup and recoverability requirements before using this.
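
In the control file, unrecoverable goes in front of the load data clause - a minimal sketch reusing the file and table names used below:

unrecoverable load data
infile my_large_file.dat
into table my_staging_table
...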

Input and error files:

load data
infile my_large_file.dat       -- file to load
badfile my_bad_file.bad        -- Bad records stored here (optional)
discardfile my_discards.dsc    -- Discarded records go here (optional)

infile can be * if the data is specified inline (within the control file, after BEGINDATA). If no path is specified, the input file is expected in the current directory. Full paths can be specified for the bad and discard files as well.
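
A small inline-data sketch (the table and field names are placeholders):

load data
infile *
into table my_staging_table
fields terminated by "|" trailing nullcols
(field1, field2)
begindata
abc|123
def|456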

When errors occur, bad records are written to the specified badfile until the maximum error count (errors= in options) is reached; at that point the load stops.

load data can specify a CHARACTERSET if the file's encoding differs from the database NLS_CHARACTERSET.

load data     -- default characterset
load data CHARACTERSET UTF8 -- use UTF8 as characterset
load data CHARACTERSET TH8TISASCII  -- Thai
load data CHARACTERSET WE8ISO8859P9 -- Western European/Turkish (ISO 8859-9)

When a character set is specified, the incoming data is assumed to be in that character set; after loading, it is stored in the database character set. This is particularly useful when dealing with files from a variety of data sources in different markets.

One of my load programs parsed the main input file and split it into around 7 different data files depending on the country code of each record. Each file was then loaded into the DB with a different character set specified. It was hard to maintain (adding a field required changing all 7 files, etc.).

The final solution, after moving the DB to UTF8, was to load everything as UTF8 into a staging table and use Oracle's CONVERT function twice to repair the text (first from UTF8 to the actual character set of the file, and then back to UTF8). Once all the transfer systems were able to handle UTF8, these problems went away.

select
    convert (convert (in_text, 'WE8ISO8859P1', 'UTF8'),  -- first: UTF8 -> the file's actual character set
             'UTF8',
             'WE8ISO8859P1')                              -- then: back to UTF8
from DUAL;

Specify the table to be loaded

append/insert/replace/truncate    -- one of these
into table my_staging_table
fields terminated by "|" trailing nullcols

Choose one of the load methods. truncate is the fastest for full loads. append adds data to an existing table. insert inserts into the table, but the table must be empty. replace acts like a delete followed by insert - delete triggers will fire if defined, and it is a full-table operation, not a subset. Use the right method for the job, and read up on the restrictions on their use (primary keys, foreign keys, unique keys, etc.).

I commonly use append for adding small batch files and truncate to bulk load data into the staging area.

Appropriate privileges on the table being loaded must exist. The fields terminated by delimiter can be specified as a character string or in hex (e.g. X'7C' for the pipe). If the records are of fixed length, the POSITION option can be used on each field to specify where it starts and ends. Additionally, an optionally enclosed by clause can be specified if the data is quoted, like optionally enclosed by '"'.
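
A minimal fixed-width sketch (the file, table, and column names are made up for illustration):

load data
infile my_fixed_width.dat
truncate
into table my_staging_table
(
  cust_id    position(1:8)   char,
  cust_name  position(9:38)  char,
  load_date  position(39:46) date 'YYYYMMDD'
)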

Occasionally, records in the source file have fewer fields than the control file defines - trailing nullcols allows the load to proceed, treating the missing trailing fields as null. For example, with (field1, field2, field3) defined, the record a|b loads with field3 set to null instead of being rejected.

SQLLDR can load into multiple tables in one run. For this, each into table clause can specify a condition, like below:

into table order_header  when recordid='H'  -- header records
into table order_detail  when recordid='L'  -- order details or line items
into table order_payment when recordid='P'  -- payment record

fields...
(
  recordid filler position(1) char,   -- position of the load identifier
  ...
)
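
Putting it together, a fuller sketch of the multi-table pattern (the order_* columns are hypothetical; position(1) on the filler field resets scanning to the start of the record for each into table clause):

load data
infile my_orders.dat
truncate
into table order_header  when recordid = 'H'
  fields terminated by "|" trailing nullcols
  (recordid filler position(1) char, order_id, order_date date 'DDMMYYYY')
into table order_detail  when recordid = 'L'
  fields terminated by "|" trailing nullcols
  (recordid filler position(1) char, order_id, line_no, item_code)
into table order_payment when recordid = 'P'
  fields terminated by "|" trailing nullcols
  (recordid filler position(1) char, order_id, pay_method, amount)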

Define the fields in order:

(
  field1      ,
  field2      NULLIF field2="NULL",
  skip_field3 filler,  -- filler - this field is skipped
  field4      DATE 'DDMMYYYY',  -- use the format specified
  field5      
)
A fuller example of field-level options:
(
field_time                DATE 'YYYYMMDDHH24MISS',
field_date                DATE 'DDMMYYYY',       
field_seq_number          recnum,            -- Rownum equivalent while loading the file
field_normal_column       ,
field_def_date            sysdate,  -- default to sysdate - not in file
field_constant_num        constant 0,  -- default value - not in file
field_constant_alpha      constant 'a', -- default value
field_skip_in_file        filler,      -- skip this field
field_call_user_function  "schema.pkg.function(:field_call_user_function)",
field_substr_function     "substr(:field_substr_function,1,8)",
field_using_decode        "decode(:field_using_decode, NULL, 'ABC', :field_using_decode)",
field_last_day_of_mth     "last_day(to_date(:field_last_day_of_mth,'DDMMYYYY'))" -- last day
)