Spark Journal: Adding aliases for columns in bulk with the SELECT API

This is a continuation of my previous blog, which showed how to use aliases for column names when using the SELECT API on dataframes in Spark.
Exploring the same topic, I found a good way to handle another scenario: a dataframe with a large number of columns. In such cases, it's not feasible to write a SELECT command that lists each column manually.

Instead, I prefer a programmatic approach that is easier to maintain and keeps the code clean and readable.
The approach has three steps:
1. Define a Scala Map with the existing column names as keys and their aliases as values. This is Scala's default immutable Map.
2. Use the Map as a lookup: traverse the column names of the dataframe and match each one against the Map's keys.
3. Collect the resulting columns, aliased where a match was found, and pass them to the SELECT API using Scala's varargs type ascription (: _*).
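
The lookup in step 2 can be tried without Spark. Here is a minimal sketch on plain strings; the object name AliasLookup and the extra "age" column are made up for illustration. It shows that a column missing from the Map simply keeps its own name:

```scala
object AliasLookup {
  // Lookup: existing column name -> alias
  val colalias: Map[String, String] =
    Map("id" -> "unique id", "Name" -> "Actual Name")

  // Resolve the final name for each column; a column that is absent
  // from the Map falls back to its own name.
  def resolve(columns: Seq[String]): Seq[String] =
    columns.map(name => colalias.getOrElse(name, name))

  def main(args: Array[String]): Unit = {
    // "age" is a hypothetical column with no entry in the Map
    println(resolve(Seq("id", "Name", "age")))
    // prints: List(unique id, Actual Name, age)
  }
}
```

The Spark version applies the same decision to Column objects instead of strings.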

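import org.apache.spark.sql.functions.col
// Needed for toDF; assumes a SparkSession named spark (spark-shell provides both)
import spark.implicits._

val dataList = List((1,"abc"),(2,"def"))
val df = dataList.toDF("id","Name")

// Lookup: existing column name -> alias
val colalias : Map[String, String] = Map("id" -> "unique id", "Name" -> "Actual Name")

// Alias each column found in the Map; leave the others unchanged
val aliasedCols = df.columns.map(name => colalias.get(name) match {
  case Some(newname) => col(name).as(newname)
  case None => col(name)
})

// Expand the Array[Column] into varargs for select
df.select(aliasedCols: _*).show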

+---------+-----------+
|unique id|Actual Name|
+---------+-----------+
|        1|        abc|
|        2|        def|
+---------+-----------+
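
If the same renaming is needed in several places, the pattern can be shortened with Map.getOrElse and wrapped in a small helper. This is only a sketch: the names BulkAlias and withAliases are hypothetical, and the local SparkSession is an assumption for running it outside spark-shell.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object BulkAlias {
  // Alias every column that has an entry in `aliases`;
  // columns without an entry are aliased to their own name (no change).
  def withAliases(df: DataFrame, aliases: Map[String, String]): DataFrame =
    df.select(df.columns.map(c => col(c).as(aliases.getOrElse(c, c))): _*)

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumption, not from the post)
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("bulk-alias")
      .getOrCreate()
    import spark.implicits._

    val df = List((1, "abc"), (2, "def")).toDF("id", "Name")

    // Only "id" has an alias here, so "Name" is selected as-is
    withAliases(df, Map("id" -> "unique id")).show()

    spark.stop()
  }
}
```

Because the Map is passed in as a parameter, the same helper works for any dataframe, and the Map may cover only some of the columns.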

Next time you have such a task at hand and don't want to list every column by hand, use this approach to alias columns dynamically.
