In SSIS, we normally process multiple files in a folder using
ForEach Loop Container. This container has a wildcard filter, where we can
specify the type of files to be processed.
But there can be cases, where folder may contain multiple
kinds of files and we have to process 2 or 3 kinds of files from them. During
such cases, single wildcard filter won’t help us out.
Example for the
above scenario:
Some time back we had a requirement, to pull 6 set of FTP
files daily. All these files were in zipped format.
All zipped files (except one), were in “.csv” format and
only one file was in “.txt” format.
Now, directly if we unzip and process files using ForEach,
it will try to load the data even from zipped files to a table. But this will
throw an error. To resolve this we have 3 options:
1.
Move .zip files to some other folder using
another ForEach
2.
Create 2 separate ForEach, one for .csv and
another for .txt
3.
Next option is to ignore the .zip files and
process only .txt and .csv files
Here is a
step-by-step process on how the third option can be achieved
Suppose, we had to process all the files which had a name
“SSIS” in it and all were in “.csv” format:
Inside ForEach Loop, under Files section we would have given
*SSIS*.csv, something
as below:
But now along with .csv file we need to process”.txt”
file which has a word SSIS in its filename.
i.e., we need to process both “.txt” and “.csv” files
which has a word SSIS in its filename
To loop such multiple kinds of files, we can use regular
expressions. Detail on how it can be used in SSIS is given below
Create 2 variables
1.
RegExFilter:
Data Type String; Value = ^.*\.(txt|csv)$
2.
PassesRegexFilter:
Data Type Boolean;
Using above 2 variables, we will check if the file which
will be processed is a csv or txt file using a Script Task.
Drag a Script task inside ForEachLoop container and add the
below code( C# code )
/*
Microsoft SQL
Server Integration Services Script Task
Write scripts
using Microsoft Visual C# 2008.
The
ScriptMain is the entry point class of the script.
*/
using System;
using
System.Data;
using
Microsoft.SqlServer.Dts.Runtime;
using
System.Windows.Forms;
using
System.Text.RegularExpressions; // Added
using
System.IO; // Added
namespace ST_8102847e64a74fb5ac4b523cfa570381.csproj
{
[System.AddIn.AddIn("ScriptMain",
Version = "1.0", Publisher = "", Description = "")]
public partial class ScriptMain :
Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{
#region VSTA generated code
enum ScriptResults
{
Success =
Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
Failure =
Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
};
#endregion
/*
The
execution engine calls this method when the task executes.
To
access the object model, use the Dts property. Connections, variables, events,
and
logging features are available as members of the Dts property as shown in the
following examples.
To
reference a variable, call
Dts.Variables["MyCaseSensitiveVariableName"].Value;
To
post a log entry, call Dts.Log("This is my log text", 999, null);
To
fire an event, call Dts.Events.FireInformation(99, "test", "hit
the help message", "", 0, true);
To
use the connections collection use something like the following:
ConnectionManager
cm = Dts.Connections.Add("OLEDB");
cm.ConnectionString
= "Data Source=localhost;Initial
Catalog=AdventureWorks;Provider=SQLNCLI10;Integrated Security=SSPI;Auto
Translate=False;";
Before
returning from this method, set the value of Dts.TaskResult to indicate success
or failure.
To
open Help, press F1.
*/
public void Main()
{
// Get the filename from the complete filepath
String FileName = Path.GetFileName(Dts.Variables["User::ImportFile"].Value.ToString());
// Create a regex object with the pattern from the SSIS
variable
Regex FileFilterRegex = new
Regex(Dts.Variables["User::RegExFilter"].Value.ToString());
// Check if it is match and return that value (boolean) to
the SSIS variable
Dts.Variables["User::PassesRegexFilter"].Value
= FileFilterRegex.IsMatch(FileName);
Dts.TaskResult = (int)ScriptResults.Success;
}
}
}
This code will check if the file is txt /csv. If yes then
the variable PassesRegexFilter is set to true.
After this task, add other logic to process the file. Prior to
these task logics, add a script task and connect them using an expression @[User::PassesRegexFilter] ==
True as below
The ForEach Loop looks something as below
Reference:
No comments:
Post a Comment