70-448 (070-448) Exam prep, Part 1/3 - SSIS

This section covers about 30% of the exam.  The breakdown, roughly, seems to be SSIS (30%) / SSRS (30%), SSAS (40%).


Package Configurations

à make a package more dynamic – set options like server name, data source in a configuration file to make them easier to move from server to server.


à properties and variables are updated at run time

Can be stored in an XML config file, environment variable, registry, SQL table, or a parent variable inside a package.

If you have no file system access, a SQL server configuration would be required.

BIDs-created packages are stores as .dtsx files; these are XML-formatted files.

Using an environment variable as the pointer to an XML file = indirect configuration

Packages can share SQL Server configurations across servers

Parent package: use for variable-passing

Moving packages from devà prod, changing paths: use indirect config

Multiple entries for 1 SSIS config: XML or SQL server

Data sources are project-level; connections are package-level

Control Flow tasks in SQL Server 2008:



The ValidateExternalMetadata property on various SSIS objects determines if SSIS will automatically check for the existence of referenced objects during design.


Data Profiling Task:


Generate XML reports of data analysis inside an SSIS package

Analysis includes candidate key recommendations, column length/value distributions

SSIS variables


User-defined variables are defined by the developer.  System variables are built into SSIS.

SSIS System variables


VersionGUID: Unique Identifier for package version.

TaskID: Unique Identifier for Task

Can be set programmatically.


Scope: Package-level variables are available to everything inside the package.

Container-level variables are available to the container and everything inside it.

If the package calls another using the execute package task, variables can be passed to the child package.

Inside a script task, variables must be included in ReadOnlyVariables or ReadWriteVariables to be available inside the scripting engine.

Backup and restore of SSIS packages:

Both SSIS packages and SQL Server agent job information are stored in MSDB.  Backing up a copy of this database will protect your packages.  Some of the testing information incorrectly specifies that they are stored in the master database.

Control Flow

Checkpoints (package property)


Not to be confused with breakpoints, which are used for debugging,

Can be used to restart a package that fails part way through.  You can restart from the checkpoint.  It’s most often used in situations where a large file download or some other intensive task had occurred previously in the package; checkpoints can avoid running that task again.

CheckPointUsage property: IfExists: use the checkpoint; Always: package won’t run without a checkpoint; checkpoints cannot be used with

Package option/parameter TransactionOption: Required

Breakpoints: pause the control flow

Can be on any task, container, or inside control flow script


Transactions in SSIS packages

Great article here:


Transaction support is built in to SSIS.  The TransactionOption property exists at the package level, container level (e.g. For Loop, Foreach Loop, Sequence, etc.), as well as just about any Control Flow task (e.g. Execute SQL task, Data Flow task, etc.).  TransactionOption can be set to one of the following:

  • Required - if a transaction exists join it else start a new one
  • Supported - if a transaction exists join it (this is the default)
  • NotSupported - do not join an existing transaction

Note also that the SSIS package elements also have an IsolationLevel property with a default of Serializable.

MSDTC must be running.

Auditing tasks should be set to NotSupported so they won’t be rolled back as part of another transaction.

“Supported” makes no sense at the package level outside of nested packages, since it’s the top level and there couldn’t be another transaction running.  Required should be used at the package level to initialize a transaction.

Property Expressions

Update control flow properties in the midst of package execution, including connection properties


For example, If you have a directoryFile variable that operated on a list of files, you would –

Set the directoryFile variable property EvaluateAsExpression to TRUE

Loop through the files, setting the directoryFile variable to the next filename using an expression

Precedence evaluation options

Constraint (completion status of upstream task)


Expression and Constraint

Expression or Constraint                                                                                                            

SSIS precedence constraints:

Solid Green: Success

Red: Fail

Blue: Completion

Logical AND: Green (Solid)

Logical OR: Green (Dotted)

Data flow paths and errors

Error Output options: Fail transformation (entire data flow fails); Ignore Failure: Set error values to NULL but continue down green path; Redirect Row: sends rows down a different (Red) path

Different error outputs can be set based on the column the error occurs in

Control flow event handlers

SSIS Designer à Event handlers

Task + Event Handler pairing

OnError: Any error

OnExecStatusChanged: Any change in status for any objects (Success, Failed, in Process)

OnInformation: informational messages

OnPostExecute: task/container completion (success)

OnPostValidate: task/container validation (success)



OnProgress: any progress of task/container

OnQueryCancel: Manual cancelling of an Execute SQL task


OnVariableValueChanged: If RaiseChangeEvent property = TRUE, raised when variable changes



Container/Task-level property DisableEventHandlers controls event handler activations

Fuzzy grouping is a de-duplication tool that runs against a dataset.


Package protection levels


explained well here.

You can set with the designer or DTUTIL.

Command-Line / Executables DTUTIL, DTEXEC, DTEXECUI


 package management from the command line

Copy, Move, Delete, Exists (check), encrypt, decrypt, sign, fcreate (folders)


DTUTIL return codes: 0 = success, 1 = failure, 4 = package not found, 5 = package cannot load, 6 = bad syntax


Dtexec – execute packages from command line

DtexecUI – visual interface to help build dtexec commands



If a package configuration is not specified, you can apply it with DTEXEC/DTEXECUI at execution time.

DTEXEC for logging on an error: /DumpOnErr



Integration services roles/ Package Security on SQL Server


DB_SSISOPERATOR: view, execute, export packages

DB_SSISLTDUSER: execute, delete, change and export only packages that user imported

DB_SSISADMIN: all permissions

Windows administrator: view/stop packages


All roles exist in MSDB.  If a user needs more granular access, custom roles would be required.

Packages can be digitally signed to ensure authenticity.


The BlockSignatureStates registry entry controls how these signatures are handled.

0: No Administrative Restriction

1: Block Invalid Signatures (But not unsigned packages)

2: Block Invalid and untrusted signatures (not unsigned packages, but does block self-generated signatures)

3: Block Invalid and untrusted signatures and packages

Package Protection levels

DontSaveSensitive: no connection pwds saved; must used configuration for execution if SQL logins are used; window authentication doesn’t require saved pwds


EncryptAllWithUserKey: Can only be opened by same user on same computer as was developed



Rely on Server Storage – only applies to SQL-deployed packages

SSIS installation/configuration/high availability/monitoring

SSIS service – windows service that executes MsDtsSrv.exe

Connect via SSMS

Running packages: shows which are currently executing

Stored packages: shows packages in the package store in file system and deployed to MSDB database inside SQL Server (sysssispackages)

SSIS is not cluster-aware but can be configured as a cluster resource

Restartability rules apply as normal on a cluster (eg. Checkpoints)


Data Mining/Modeling:

Changing chart settings will not affect trends in the lift chart, but will affect trends in the profit chart [ From a .VCE – no context ]


Deployment manifests create a package deployment wizard, which simplifies deployment but lacks any flexibility.

From the SSIS Project properties page,  change “CreateDeploymentUtility” to TRUEthen build the project to create the deployment manifest.


SSIS LoggingSSIS Menu à Logging

Containers: Package, Container, or task-level logging

Log Provider Type: SQL Server tables (sysssislog), text files, EventLog, Profiler, XML

Log Events: OnError, etc.


Data Flow

OLEDB Destination vs SQL Server Destination:

SQL Server Destination uses shared memory and must connect to a data source on the same system as the package is running, making it poor for portability but potentially faster

For bulk processing, OLE DB destination adapters can be configured for “Fast load”.



Flat File Destination: Text/CSV


Error logging for specific tasks:

Decide what file type the destination will be (.DOC, .TXT, etc) – create a destination component of the appropriate type.

Configure the error output to flow to that destination.


Control Flow: MaximumErrorCount specifies max errors allowed before package failure



Data viewers


allow you to see the data inside a data flow as it moves from one component to the next, and then allow it to continue



Data Flow transformations



Audit: adds auditing columns


Cache transform: caches data for subsequent transformations (Using full cache mode)

Character Map: string operations, only works on string data types



Copy Column

Data conversion: data type conversions


Derived column – builds new columns based on expressions/other logic


Export column – moves data into files; for large objects


Import columns – moves LOBs into relational data model

Row Count



Conditional Split – moves data in one output or another (exclusively)

Lookup – matches the current data set in the flow to external lookups (SQL tables)

Merge: Combines two sorted data sets into one large dataset.  This isn’t a join and should have been called UNION or something to differentiate it from MERGE JOIN.

Union All: like merge - combines datasets, but can be unsorted and combine 2+ datasets – row count is total of all input data sets

The merge join transformation


is just a T-SQL join (left, right, inner)

It requires sorted data:


The easiest way is to set IsSorted = TRUE on any previous components.

Sort transformations can also be used.


To insert rows into a table but first check if they exist, a combination of MERGE JOIN and LOOKUP could be used.


Multi-Row (aggregations and stats, usually intensive)

Aggregate – SUM, COUNT, MIN, MAX

Percent sampling – takes a sampling of the input data set


Row sampling  - fixed sampling

Sort – sorting, allows removal of dupes



Data Prep (advanced operations)

The OLEDB Command transformation runs a sql statement for each row in a data flow.



Slowly changing dimension – dimension changes

Data mining query – data mining model predictions

Fuzzy grouping – column à row matching for data cleanup operations

Fuzzy Lookup – column à Dataset matching for data cleanup

Script Component

Term extraction – looks for keywords

Term Lookup – looks for keywords based on an existing dataset



Lookup transformation caching options in SQL 2008


Full Cache (Default): entire reference set is pulled at pre-execute of data flow.  Most memory consumption, highest initial startup time.  Lookup operations fast.

Partial Cache: New rows are compared to existing cache; if they don’t exist, added to cache

No Cache: Hit DB for every row



For Loop: Uses an integer as the count

For Each: Iterates over a number of objects, like files


Extracting data from multiple files:

Use a ForEach Loop Container.

Add a variable at the package level.

Create a flat file connection manager; use an expression for the connection string.





Assemblies / Custom Components


It’s possible to develop custom data flow components


for SSIS packages. 

Questions about this functionality seem well beyond the scope of a general BI exam, but some are included.

Custom assemblies must be moved into the BIDS folder (for example, C:\Program Files\Microsoft SQL Server\100\DTS\ to be visible to BIDS.

In more detail,


1)      Create strong name key

2)      Sign + Build

3)      Deploy into BIDS folder

4)      Deploy into GAC (probably C:\Windows\Assembly), close/reopen bids

5)      In Bids/Toolbox items, select custom assembly