Vertica

How to load filename while loading file in Vertica (bit 4)

This post is a part of Bit Series blog.

This is going to be a quick trick on how to load the file name to DB table, when loading the data from file to vertica using COPY command.

Many people have this requirement, where they need to load the file name as the source of data, when loading the data from file ,as a part of ETL operations.

Vertica’s COPY command is a mini ETL tool in itself, if your know all the tricks on using it. Its like a little Pandora’s box, which has many secrets of its own, ready to be explored.

So how do you load a file name , when loading data from that file in Vertica.

This can be done using the function “current_load_source()”, this function can only be used with a COPY command when loading the data from a file. this helps to load the data to a column which is meant for storing the file names.

Please refer the below example

## Showing the content of the file
[navin~]$ cat test_file_name.data
1
2
3
4
5
## Getting into Vertica console
[navin~]$ vsql
Welcome to vsql, the Vertica Analytic Database interactive terminal.

Type:  \h or \? for help with vsql commands
       \g or terminate with semicolon to execute query
       \q to quit

-- Creating a table for storing the incoming data.
navin=>  create table test_file_name_table (i int, filename varchar);
CREATE TABLE

-- Load the data from file using the function current_load_source()
navin=> copy test_file_name_table (i, filename as current_load_source()) from '/home/navin/test_file_name.data';
 Rows Loaded
-------------
           5
(1 row)

-- Validate the data in the file.
navin=> select * from test_file_name_table;
 i |             filename
---+-----------------------------------
 2 | /home/navin/test_file_name.data
 3 | /home/navin/test_file_name.data
 5 | /home/navin/test_file_name.data
 1 | /home/navin/test_file_name.data
 4 | /home/navin/test_file_name.data
(5 rows)

navin=>

Not only this, the function supports data loading from multiple files in a single COPY command too.

-- Truncating the table first
navin=> truncate table test_file_name_table;
TRUNCATE TABLE

-- Loading data from multiple files here using same function current_load_source()
navin=> copy test_file_name_table (i, filename as current_load_source()) from '/home/navin/test_file_name*.data';
 Rows Loaded
-------------
          10
(1 row)

-- Validating the records
navin=> select * from test_file_name_table;
 i  |              filename
----+------------------------------------
  2 | /home/navin/test_file_name.data
  3 | /home/navin/test_file_name.data
  5 | /home/navin/test_file_name.data
 67 | /home/navin/test_file_name2.data
  1 | /home/navin/test_file_name.data
  1 | /home/navin/test_file_name2.data
 32 | /home/navin/test_file_name2.data
  4 | /home/navin/test_file_name.data
 45 | /home/navin/test_file_name2.data
 89 | /home/navin/test_file_name2.data
(10 rows)

So will this work even from windows, Yes it will even from Windows machine as below

-- This command is executed from vsql in windows console
navin=> copy test_file_name_table (i, filename as current_load_source()) from
local 'D:\Navin\txt\copy_from_windows.txt'
navin-> ;
 Rows Loaded
-------------
           5
(1 row)


-- Validating the records
navin=> select * from test_file_name_table;
  i   |              filename
------+------------------------------------
    2 | /home/navin/test_file_name.data
    3 | /home/navin/test_file_name.data
    5 | /home/navin/test_file_name.data
   67 | /home/navin/test_file_name2.data
    4 | /home/navin/test_file_name.data
   45 | /home/navin/test_file_name2.data
   89 | /home/navin/test_file_name2.data
  345 | D:\Navin\txt\copy_from_windows.txt
    1 | /home/navin/test_file_name.data
    1 | /home/navin/test_file_name2.data
   32 | /home/navin/test_file_name2.data
  123 | D:\Navin\txt\copy_from_windows.txt
  678 | D:\Navin\txt\copy_from_windows.txt
  123 | D:\Navin\txt\copy_from_windows.txt
 3456 | D:\Navin\txt\copy_from_windows.txt
(15 rows)

This concludes the post on how to use current_load_source() function to load file name , when loading data into vertica.

Go ahead an explore the different possibilities with this function.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s