SQL Server Integration Services – An Updated HDFS Task

This is an update to my previous post about loading data into HDFS. After using the component a few times, I realized that passing in a list of files to upload felt odd inside a Data Flow task.

So I have changed the component from a Destination to an SSIS Task. This means it is now used in the Control Flow of a package rather than inside a Data Flow task.

I have also made a few other changes:

  • Added a file type filter
  • Improved the UI design
  • Added the ability to create a new HDFS Connection Manager
  • Added a UI to the HDFS Connection Manager

This is what the component now looks like:

[Screenshot: HDFS Task UI]

The File Type Filter lets you specify which types of files should be uploaded from the specified source directory. This is useful if the directory contains a mixture of file types or you only want to upload a subset.
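To illustrate the idea behind the filter, here is a minimal Python sketch. It is not the component's actual code; the function name `select_files`, the glob-style pattern syntax (for example `*.csv`), and the case-insensitive matching are all assumptions made for the example.

```python
from fnmatch import fnmatch
from pathlib import Path


def select_files(source_dir, file_filter="*"):
    """Return files directly inside source_dir whose names match file_filter.

    Hypothetical helper: the glob-style filter and case-insensitive
    comparison are assumptions, not the task's documented behaviour.
    """
    pattern = file_filter.lower()
    matches = []
    for entry in sorted(Path(source_dir).iterdir()):
        # Skip subdirectories; only plain files are upload candidates.
        if entry.is_file() and fnmatch(entry.name.lower(), pattern):
            matches.append(entry)
    return matches
```

Calling `select_files(some_dir, "*.csv")` would return only the CSV files in that directory, leaving any other files untouched. The result is the list that would then be handed to whatever performs the upload.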

The update has been pushed to GitHub – https://github.com/kzhen/SSISHDFS.
