Extracting Data From a Flat File With SSIS
Charles Donaldson
January 3, 2006
What happened to DTS?
SQL Server Integration Services (formerly Data Transformation Services) provides
a comprehensive solution for transferring and transforming data between diverse
data sources. The architecture of SSIS has been redesigned to separate package-control
flow from data flow. Two engines have been introduced to handle these aspects of
data transformation: the SSIS Run-Time engine and the SSIS Data-Flow engine.
The SSIS Designer is hosted in Business Intelligence Development Studio, making
it possible to develop SSIS projects while disconnected from the server. You can
also develop SSIS solutions using SQL Server Management Studio.
Transformations
SSIS provides the following built-in transformations:
- Conditional Split transformation
- Multicast transformation
- Union-All, Merge, and Merge Join transformations
- Sort transformation
- Fuzzy Grouping transformation
- Lookup and Fuzzy Lookup transformations
- Percentage Sampling and Row Sampling transformations
- Copy/Map, Data Conversion, and Derived Column transformations
- Aggregation transformation
- Data Mining Model Training, Data Mining Query, Partition Processing, and Dimension
Processing transformations
- Pivot and UnPivot transformations
For this example I am importing sample client contact information from a flat
file into the ‘charlestest’ database on a SQL 2005 server.
To create a new Integration Services package to extract data from a flat file:
- Choose Start > All Programs > Microsoft
SQL Server 2005 > SQL Server Business Intelligence Development Studio.
- On the File menu, choose New > Project
to create a new Integration Services project.
- In the New Project dialog box, select Integration Services
Project
- In the Name box, change the default name to Importing Flat
Text File.
- Click OK.
An empty package called Package.dtsx will be created and added
to your project. We can rename this package in the Solution Explorer
toolbar; right-click on Package.dtsx and click Rename,
then rename the default package to importingcustomerdata.dtsx.
Next, we must add a Flat File connection manager to the package that we just created.
A Flat File connection manager allows a package to extract data from a flat file.
To add a Flat File connection manager:
- Right-click anywhere in the Connection Managers area, and then
click New Flat File Connection.
- On the Flat File Connection Manager Editor screen, for
Connection manager name, type Extract Customer Info.
- Click Browse.
- In the Open dialog box, browse to the flat text file you intend
to extract data from.
- To rename columns in the Flat File connection manager, click Advanced,
in the property pane change the ‘Name’ property for each column.
Next, we need to add an OLE DB Connection Manager to connect to the destination
database. An OLE DB Connection Manager allows a package to extract data from or
upload data into any OLE DB data source (SQL Database for our purposes). Using the
OLE DB Connection Manager, you can specify the server, the authentication method,
and the default database for the connection.
To add and configure an OLE DB Connection Manager:
- Right-click in the Connection Managers area, and then click
New OLE DB Connection.
- On the Configure OLE DB Connection Manager screen, click
New.
- For Server name, enter SQL05-1.orcsweb.com (or the appropriate
server name).
- In the Log on to the server group, verify that Use SQL
Sever Authentication is selected and enter the appropriate username and
password.
- In the Connect to a database group, choose Select
or enter a database name box, and then select ‘charlestest’.
- Choose Test Connection to verify the connection settings you have
specified are valid.
- Click OK.
- Click OK.
- In the Data Connections pane of the Server Explorer Configure
OLE DB Connection Manager screen, verify that sql05-1.orcsweb.com.charlestest
is selected.
- Click OK.
You can also specify ‘localhost’ as the server name and the connection manager will
connect to the default instance of local computer.
Now that we have created the connection managers for the source and destination
data, we now need to add a Data Flow task to our package. The Data Flow task encapsulates
the data flow engine that moves data between sources and destinations, and provides
the functionality for transforming, cleaning, and modifying data as it is moved.
The Data Flow task is where most of the work of an extract, transform, and load
(ETL) process actually occurs.
To add a Data Flow task:
- Click the Control Flow tab.
- In the Toolbox, expand Control Flow Items, and
drag a Data Flow Task onto the design surface of the Control
Flow tab.
- On the Control Flow design surface, right-click the newly added
Data Flow Task, click Rename, and change the name
to Export Customer info.
Next, you will add and configure a Flat File source for your package. A Flat File
source is a data flow component that uses metadata defined by a Flat File connection
manager to specify the format and structure of the data to be extracted from the
flat file by a transform process. The Flat File source can be configured to extract
data from a single flat file by using the file format definition provided by the
Flat File connection manager.
To add a Flat File Source component:
- Click on the Data flow tab. In the Toolbox, expand Data Flow Sources, and
then drag a Flat File Source onto the design surface of the
Data Flow tab.
- On the Data Flow design surface, right-click the newly added
Flat File Source, click Rename, and change the name to
Customer Info*.
- Double-click the Flat File source to open the
Flat File Source Editor box.
- In the Flat file connection manager box, select
Importing Customer Info.
- Click Columns and verify that the names of the columns are correct.
- Click OK.
*Alternatively, if the data source file has a header row, it will automatically
name the columns for you with the proper names from the header row.
This package now can extract data from the flat file source and transform that data
into a format that is compatible with the destination. The next task is to actually
load the transformed data into the destination. To load the data, we must add an
OLE DB destination to the data flow. The OLE DB destination can use a database table,
view, or an SQL command to load data into a variety of OLE DB-compliant databases.
To add and configure the OLE DB destination:
- Double-click on OLE DB Destination under DataFlow Destinations
in the Toolbox on
the right. Then Drag the green arrow from the Flat file source to the OLE DB Destination
as show below:

- Then double-click on the OLE DB Destination and in the
OLE DB Destination Editor
box, ensure that sql05-1.orcsweb.com
is selected in the OLE DB Connection manager
box.
- In the Name of the table or the view box select the name of the table that you are
importing the data into.

- Click Mappings.
- Verify that the input columns are mapped correctly to the destination columns.
- Click OK.

Your package is now complete and should look something like this:

It is time to test your package.
To run this package:
- Click on the Debug menu, click Start Debugging.
- After the package has completed running, on the Debug menu, click
Stop Debugging.
- The data in your flat text file has now been imported. Celebrate!
References provided by
www.microsoft.com
Charles Donaldson is an Internet Support Specialist at
ORCS Web, Inc. - a company that provides managed complex
hosting for clients who develop and deploy their applications on Microsoft Windows platforms.