More Reasons to Adopt Biml

SSIS packages aren’t backwards compatible – that means if you start building packages targeting SQL Server 2012 and then find out you also need to support an earlier version (like what happened to me this week), you will need to start over. However, if you used Biml to generate your packages you could spin up an instance of SQL Server 2008 in Azure, load your Biml project in BIDS and regenerate your packages targeting the older version.

You can change quickly – I recently found myself in a situation where I had developed Extract, Transform and Load packages, and when it came time to test them I realized they were wrong. Data needed to be loaded as soon as it had been transformed, because downstream Transforms depended on that data already being loaded. Here’s a picture:

1-Before

What I had to do was combine the Transform and Load packages into a single one. I could have gone through each and every transform (there were 30+ of them) copying, pasting, joining up and fixing any problems, but that would have taken a while and been a very boring task. Instead I made a very minor change to my Biml package and then regenerated the package.

2-After

You can refactor quickly – this is in a similar vein to the point above. In another project I did my development against SQL Server 2012; when it came time to deploy, I found out the client site I had to deploy to was still running SQL Server 2008. That meant I had to say goodbye to the Project Deployment model, along with the nice and easy configuration experience you get with 2012, and go back to using XML configuration files – this was a bit of a pain to figure out. Given that I had about 40 packages, the effort of going through each one to enable Package Configurations would have been immense (and boring); instead, after I figured out how to do it in Biml, all I needed to do was regenerate the packages.

Drag and drop doesn’t scale – it works fine for small ad-hoc packages that you end up throwing away; however, I strongly hold the view that if you’re doing any serious development, building packages by hand just isn’t going to scale. This is especially true when all you’re really doing is implementing design patterns. I don’t remember the last time I dragged and dropped something in SSDT BI, and I have been doing a lot of SSIS development recently!

In conclusion, the main reason you should adopt Biml is that your time scales better. Everything I have outlined above I could have done by hand, but it would have taken a while and been boring, and I probably would have made mistakes along the way too. Instead, by using Biml to create the packages I can concentrate on more important things (like writing this blog post, automating our database deployment or automating our release process).

BIML – Passing Variables to Child Packages

A project I’ve been working on recently has consisted of lots of little packages which are called from a single coordinating package. I started off developing against SQL Server 2012, and thankfully used BIML to generate all my packages. The main controller package passes a variable down to the sub-packages, which is then subsequently used when executing a stored proc. This all worked nicely until I was told that it needed to work against SQL Server 2008 😦 Thanks to the ease with which BIML allows changes to be made I had it fixed and working again within minutes.

This post will show how to pass variables from a parent package to a child package for both SQL Server 2012 and SQL Server 2008.

SQL Server 2012

This is actually really easy for SQL Server 2012, thanks to the introduction of package parameters. The below BIML will generate two SSIS packages.

Parent – This package has a variable defined as ParentVar; the Execute Package task will pass this variable as the value for the ChildParam parameter defined in the Child package.

Child – This package has a single parameter defined, ChildParam, which will have its value filled in by the Parent package.

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <Packages>
    <Package ConstraintMode="Linear" Name="Parent">
      <Variables>
        <Variable DataType="String" Name="ParentVar">FromParentPackage</Variable>
      </Variables>
      <Tasks>
        <ExecutePackage Name="Execute Child">
          <ParameterBindings>
            <ParameterBinding VariableName="User.ParentVar" Name="ChildParam"></ParameterBinding>
          </ParameterBindings>
          <ExternalProjectPackage Package="Child.dtsx"></ExternalProjectPackage>
        </ExecutePackage>
      </Tasks>
    </Package>
    <Package ConstraintMode="Linear" Name="Child">
      <Parameters>
        <Parameter DataType="String" Name="ChildParam"></Parameter>
      </Parameters>
      <Tasks>
        <Container ConstraintMode="Linear" Name="DoStuff"></Container>
      </Tasks>
    </Package>
  </Packages>
</Biml>

SQL Server 2008

In SQL Server 2008 we don’t have the luxury of package parameters so we have to fall back to Package Configurations. To pass a variable to a child package we need to define a “Parent package variable” configuration, which tells the Child package to expect a given variable from the Parent package and which local variable it should be mapped to.

Once again the following BIML will generate two packages.

Parent – This package has the variable defined and set, as well as an Execute Package task which will execute the Child package.

Child – This package has a Package Configuration which will map the ParentVar variable to the local ChildVar variable. You can see the Package Configurations by right-clicking anywhere in the package control flow and then clicking Package Configurations; this will bring up the following:

Package-configurations

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <Packages>
    <Package ConstraintMode="Linear" Name="Parent">
      <Variables>
        <Variable DataType="String" Name="ParentVar">FromParentPackage</Variable>
      </Variables>
      <Tasks>
        <ExecutePackage Name="Execute Child">
          <Package PackageName="Child"></Package>
        </ExecutePackage>
      </Tasks>
    </Package>
    <Package ConstraintMode="Linear" Name="Child">
      <Variables>
        <Variable DataType="String" Name="ChildVar" InheritFromPackageParentConfigurationString="ParentVar"></Variable>
      </Variables>
      <Tasks>
        <Container ConstraintMode="Linear" Name="DoStuff"></Container>
      </Tasks>
    </Package>
  </Packages>
</Biml>

(To use this sample, make sure that the project you generate the packages in is set to the “Package Deployment Model”)

Biml From the Trenches – Convention over Configuration

Biml, from what I have observed, is still in its infancy. Don’t get me wrong, what can actually be done with Biml is impressive; however it seems to me that how we are using it and how we are talking about it still need to mature before it can be fully and effectively embraced by the community at large.

There is a growing number of people in the SQL Server community contributing excellent guidance and driving forward its maturity, as well as BimlScript.com, which has an ever-growing set of examples to learn from. I firmly believe that the future of Biml is bright; if you don’t believe me, check out this interview with its creator Scott Currie.

What I would like to see more of is people talking about implementing design patterns in Biml; let’s have fewer examples and more theory.

One area where there is still little guidance is convention-based Biml.

Talk to any experienced software developer and they will tell you all about “Convention over Configuration” and how it makes their life easier, but if you’re a BI developer this might be a brand new concept to you.

Wikipedia has a great description:

Convention over configuration is a software design paradigm which seeks to decrease the number of decisions that developers need to make, gaining simplicity, but not necessarily losing flexibility.

And

The phrase essentially means a developer only needs to specify unconventional aspects of the application.

In my mind there are a few different types of conventions:

  1. Those that require strict adherence
  2. Those that can be overridden
  3. Those that are on/off by default

In this post I’m going to give two examples of the first type, those I class as “strict adherence required” from a recent ETL project I was working on. I’ll cover the other types in subsequent posts.

NB: I’m deliberately not including any Biml or screenshots of SSIS packages in this post. The reason for this is that there are numerous ways to implement design patterns, so rather than showing you how I do it, I want to introduce the concepts and get you thinking about it.

Use a convention to Extract data

What I want to achieve: I want to move data from a source system into my target database, but only a subset of tables in the source system.

My convention: For each table in the Import schema of my Target database, there is a table with the same name and structure in the source system. Then, for each table in the Import schema, generate a Data Flow task which will read from the source table and copy the data into the target table.

The advantage: When I need to start importing more data from the source system, all I need to do is create a new table in my Import schema that matches the source and then regenerate the extract package. No configuration is needed; I don’t need to maintain a list of tables which need to be imported – the presence of the table in the Import schema is enough. Simple.

Use a convention to Transform data

The ETL project that I was working on already had a number of queries written which would transform the source system data to match our table structure. We had a number of tables in the Stage schema into which the transformed data needed to go before being loaded into the live tables.

What I want to achieve: Transform data from the Import schema tables and load into tables in the Stage schema

My convention: For each table in the Stage schema, there is a corresponding stored procedure with a name like: [Stage table name]_Transform

The advantage: Once again, when I need to start transforming more data only two things need to be done: create a new table in the Stage schema and create the transform stored procedure. Then I just regenerate my SSIS package; no changes required.

What’s the point?

Hopefully you’ve figured it out, but the point of “convention over configuration”, or “coding to a convention”, is to reduce the number of decisions that you need to make during the development effort. This allows you to spend more time on the hard stuff and lets the computer take care of the rest.

BIML Tip – Use LinqPad to write C#

To see any real benefit of using BIML you will need to combine it with C# to generate dynamic packages. The only problem is, unless you fork out $1000’s for MIST, your only alternative is BIDSHelper. This is a great tool, but one of its biggest drawbacks is that you lose IntelliSense for C#. Lately I’ve been using LinqPad to prototype and test C# code which I then paste into my BIML files.

In this tip I’m going to show you how you can use LinqPad to speed up your BIML Scripting experience.

Just a quick note before we get started: to get the added benefit of IntelliSense in LinqPad you have to fork out $39USD for the premium edition; however, as you will see, this is definitely a worthwhile expense. Even if you don’t purchase the premium version you can still follow along, you’ll just miss out on the auto-complete. (I have not been paid for this endorsement.)

Setting Up LinqPad

You can download LinqPad from http://www.linqpad.net/

Once you have installed and launched LinqPad you will see an empty query window.

1-main-query-window

To be able to use the BIML engine and Varigence extensions we’ll need to add references to the required assemblies and their namespaces. To do that, press F4; this will bring up the Query Properties window.

The first thing you will need to do is add the assembly references. Click the Browse button and locate BimlEngine.dll (for me this is in C:\Program Files (x86)\BIDS Helper 2012). I also added a reference to WindowsBase.dll, which you can do by clicking the Add button and then searching for that assembly.

2-additional-references

The next thing to do is set up the namespaces: click the “Additional Namespace Imports” tab, and paste the following:


System.Data
System.Data.SqlClient
Varigence.Hadron.CoreLowerer.SchemaManagement
Varigence.Hadron.Extensions.SchemaManagement
Varigence.Languages.Biml
Varigence.Languages.Biml.Connection
Varigence.Languages.Biml.Table
Varigence.Hadron.Extensions

3-additional-namespace-imports

Click OK to close the Query Properties window.

Language Modes in LinqPad

LinqPad gives you a few different options for writing C#.

4-language-modes

 

I’ve found the most useful modes for writing C# to be used with BIML are either the “C# Program” or “C# Statements” mode.

The “C# Program” mode is useful when you want to write different methods which can be reused in your BIML, whereas “C# Statements” mode is useful when you just want to write some C# to paste directly into your BIML; you can’t break this into different methods.

Writing C# – Example 1

In this first example I’ll show you how you could use LinqPad to write some very simple C# code that will use the Varigence extension methods to query the metadata of your database.

This is useful in BIML when you want to generate either a number of different packages that perform the same operations over a series of tables, or multiple sequences in a single package over a series of tables.

Set LinqPad to use “C# Statements” mode and paste in the following code:

string connectionstring = "Data Source=.;Initial Catalog=AdventureWorksDW2012;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;";

var targetConnection = SchemaManager.CreateConnectionNode("SchemaProvider", connectionstring); 

IEnumerable<AstTableNode> sourceTables = targetConnection.ImportDB("dbo", "Dim%", ImportOptions.ExcludeViews).TableNodes; 

sourceTables.ToList().ForEach(x=>x.Name.Dump());

This will simply use the SchemaManager and ImportDB extension methods provided by Varigence to dump the names of all the tables in the AdventureWorksDW2012 dbo schema that start with Dim.

5-example1-1

 

 

The next thing to do is to make this a bit more useful. Paste the following code:


string connectionstring = "Data Source=.;Initial Catalog=AdventureWorksDW2012;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;";

var targetConnection = SchemaManager.CreateConnectionNode("SchemaProvider", connectionstring);

IEnumerable<AstTableNode> sourceTables = targetConnection.ImportDB("dbo", "Dim%", ImportOptions.ExcludeViews).TableNodes;

foreach (var table in sourceTables)
{
	table.Name.Dump();
	table.GetColumnList().Dump();
}

This will now loop over all the tables and this time dump the table name and then the column list.

6-example1-2

 

These two snippets demonstrate how to get started using LinqPad.

We can now take the second snippet and plug it into a BIML file:


<!--
<#@ import namespace="System.Data" #>
<#@ import namespace="Varigence.Hadron.CoreLowerer.SchemaManagement" #>
<#@ import namespace="Varigence.Hadron.Extensions.SchemaManagement" #>

<#
	string connectionstring = "Data Source=.;Initial Catalog=AdventureWorksDW2012;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;";

	var targetConnection = SchemaManager.CreateConnectionNode("SchemaProvider", connectionstring); 

	IEnumerable<AstTableNode> sourceTables = targetConnection.ImportDB("dbo", "Dim%", ImportOptions.ExcludeViews).TableNodes;
#>
-->

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
	<Connections>
		<OleDbConnection ConnectionString="Data Source=.;Initial Catalog=AdventureWorksDW2012;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;" Name="Source"></OleDbConnection>
		<OleDbConnection ConnectionString="Data Source=.;Initial Catalog=MyDW;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;" Name="Target"></OleDbConnection>
	</Connections>
	<Packages>
		<Package ConstraintMode="Linear" Name="Extract">
			<Connections>
				<Connection ConnectionName="Source"></Connection>
				<Connection ConnectionName="Target"></Connection>
			</Connections>
			<Tasks>
			<!--
			<# foreach(var table in sourceTables)
			{
			#>
			-->
			<ExecuteSQL ConnectionName="Target" Name="Truncate Import Table <#=table.Name#>">
				<DirectInput>
					TRUNCATE TABLE Import.<#=table.Name#>
				</DirectInput>
			</ExecuteSQL>
				<Dataflow Name="Copy data into Source <#=table.Name#>">
					<Transformations>
						<OleDbSource ConnectionName="Source" Name="Source - <#=table.Name#>">
							<DirectInput>
								SELECT
									<#=table.GetColumnList()#>
								FROM
									<#=table.Name#>
							</DirectInput>
						</OleDbSource>
						<OleDbDestination ConnectionName="Target" Name="Target - <#=table.Name#>">
							<ExternalTableOutput Table="Import.<#=table.Name#>"></ExternalTableOutput>
						</OleDbDestination>
					</Transformations>
				</Dataflow>
			<!--
			<#
			}
			#>
			-->
			</Tasks>
		</Package>
	</Packages>
</Biml>

This BIML script will then generate a package which will use the AdventureWorksDW2012 database as a source database and copy all the data from each of the Dim tables into an equivalent table in the Import schema of the MyDW database.

Writing C# – Example 2

In this example I will show you how you can use LinqPad to write reusable methods which can then be used in your BIML files.

The example I will use is the GetSqlServerExtendedProperties method provided by Dave Stein (b | t); it simply queries the extended properties on a given table.

Set LinqPad to use “C# Program” mode and paste in the following code:


AstOleDbConnectionNode targetConnection;

void Main()
{
	targetConnection = SchemaManager.CreateConnectionNode("SchemaProvider", "Data Source=.;Initial Catalog=AdventureWorksDW2012;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;"); 

	var sourceTables = targetConnection.ImportDB("dbo", "Dim%", ImportOptions.ExcludeViews).TableNodes; 

	foreach (var element in sourceTables)
	{
		GetSqlServerExtendedProperties(element);
	}

	foreach (var element in sourceTables)
	{
		var skip = element.Annotations.FirstOrDefault (a => a.Tag.Equals("ETL_Should_Skip"));

		if (skip != null && skip.Text.Equals("1"))
		{
			//skip the table as it is marked to skip
			continue;
		}
	}
}

AstNode GetSqlServerExtendedProperties(AstTableNode table)
{
	AstNode returnValue = (AstNode)table;

	var query = string.Format("SELECT name, value FROM fn_listextendedproperty(NULL, 'schema', '{0}', 'table', '{1}', NULL, NULL)", table.SchemaName, table.Name);

	var extendedProperties = ExternalDataAccess.GetDataTable(targetConnection.ConnectionString, query);

	foreach (DataRow ep in extendedProperties.Rows)
	{
		returnValue.AddAnnotation(AnnotationType.Tag, ep["value"].ToString(), ep["name"].ToString());
	}

	return returnValue;
}

This snippet will query the extended properties of all the tables in the dbo schema that start with Dim and then loop through each of them. If it finds an extended property called ETL_Should_Skip with a value of 1, it will skip that table and go to the next one in the sequence.

This could be useful, for example, if you wanted to control which tables go into your package without hard coding the list or having to add a “control table” to your database.

The BIML would look like this:

<!--
<#@ import namespace="System.Data" #>
<#@ import namespace="Varigence.Hadron.CoreLowerer.SchemaManagement" #>
<#@ import namespace="Varigence.Hadron.Extensions.SchemaManagement" #>

<#+
AstOleDbConnectionNode conn;
#>

<#
	string connectionstring = "Data Source=.;Initial Catalog=AdventureWorksDW2012;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;";

	conn = SchemaManager.CreateConnectionNode("SchemaProvider", connectionstring); 

	IEnumerable<AstTableNode> sourceTables = conn.ImportDB("dbo", "Dim%", ImportOptions.ExcludeViews).TableNodes;
#>
-->

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
	<Connections>
		<OleDbConnection ConnectionString="Data Source=.;Initial Catalog=AdventureWorksDW2012;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;" Name="Source"></OleDbConnection>
		<OleDbConnection ConnectionString="Data Source=.;Initial Catalog=MyDW;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;" Name="Target"></OleDbConnection>
	</Connections>
	<Packages>
		<Package ConstraintMode="Linear" Name="Extract">
			<Connections>
				<Connection ConnectionName="Source"></Connection>
				<Connection ConnectionName="Target"></Connection>
			</Connections>
			<Tasks>
			<!--
			<# 

			foreach (var element in sourceTables)
			{
				GetSqlServerExtendedProperties(element);
			}

			foreach(var table in sourceTables)
			{
				var skip = table.Annotations.FirstOrDefault (a => a.Tag.Equals("ETL_Should_Skip"));

				if (skip != null && skip.Text.Equals("1"))
				{
					//skip the table as it is marked to skip
					continue;
				}

			#>
			-->
			<ExecuteSQL ConnectionName="Target" Name="Truncate Import Table <#=table.Name#>">
				<DirectInput>
					TRUNCATE TABLE Import.<#=table.Name#>
				</DirectInput>
			</ExecuteSQL>
			<Dataflow Name="Copy data into Source <#=table.Name#>">
				<Transformations>
					<OleDbSource ConnectionName="Source" Name="Source - <#=table.Name#>">
						<DirectInput>
							SELECT
							<#=table.GetColumnList()#>
							FROM
							<#=table.Name#>
						</DirectInput>
					</OleDbSource>
					<OleDbDestination ConnectionName="Target" Name="Target - <#=table.Name#>">
						<ExternalTableOutput Table="Import.<#=table.Name#>"></ExternalTableOutput>
					</OleDbDestination>
				</Transformations>
			</Dataflow>
			<!--
			<#
			}
			#>
			-->
			</Tasks>
		</Package>
	</Packages>
</Biml>

<!--
<#+
AstNode GetSqlServerExtendedProperties(AstTableNode table)
{
	AstNode returnValue = (AstNode)table;

	var query = string.Format("SELECT name, value FROM fn_listextendedproperty(NULL, 'schema', '{0}', 'table', '{1}', NULL, NULL)", table.SchemaName, table.Name);

	var extendedProperties = ExternalDataAccess.GetDataTable(conn.ConnectionString, query);

	foreach (DataRow ep in extendedProperties.Rows)
	{
		returnValue.AddAnnotation(AnnotationType.Tag, ep["value"].ToString(), ep["name"].ToString());
	}

	return returnValue;
}
#>
-->

Conclusion

If you find yourself writing lots of C# code within your BIML then it is definitely worth trying out LinqPad to see if it makes your life easier.

 

BIML – Parameter Binding with SSIS 2012

For any SSIS project that contains more than a few packages, chances are you have a master (or coordinator) package that controls the execution of the various packages within your project. In this post I’ll show you how to use Package Parameters with BIML.

When I first started searching around for how to do this, I came across this sample on the BIMLScript.com site: http://bimlscript.com/Snippet/Details/73

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
 <Packages>
   <Package Name="PackageParameters" ConstraintMode="Parallel">
     <Parameters>
       <Parameter Name="PackageParamStr" DataType="String">Test</Parameter>
     </Parameters>
     <Variables>
       <Variable Name="TestPackageParam" DataType="String" EvaluateAsExpression="true">@[$Package::PackageParamStr]</Variable>
     </Variables>
   </Package>
 </Packages>
</Biml>

In this sample the value of the parameter is assigned to a package variable, which is then used within the package. Unfortunately I couldn’t get this working, so I had to keep looking for another solution. Trying to use the parameter directly proved to be an exercise in experimentation. After a few different attempts it turned out to be very simple; here is an example of using a Package Parameter with an OLE DB Source:

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
 <Packages>
  <Package Name="PackageParameters" ConstraintMode="Parallel">
   <Parameters>
    <Parameter Name="PackageParamStr" DataType="String">Test</Parameter>
   </Parameters>
   <Tasks>
    <Dataflow Name="DFT - Move some data around">
     <Transformations>
      <OleDbSource ConnectionName="Source" Name="SRC - Client">
       <DirectInput>EXEC ETL.GetSomeData @sp_param1 = ?</DirectInput>
       <Parameters>
        <Parameter VariableName="PackageParamStr" Name="@sp_param1" />
       </Parameters>
      </OleDbSource>
     </Transformations>
    </Dataflow>
   </Tasks>
  </Package>
 </Packages>
</Biml>

The Parameter element inside the OleDbSource shows how to use the package parameter as a parameter for the stored procedure being called by the OLE DB Source.

Why Bother with Biml?

Also check out an updated post on – More reasons you should adopt Biml

If you haven’t heard of Biml before, it stands for Business Intelligence Markup Language. Basically it allows you to describe Microsoft Business Intelligence objects (SSIS packages, SSAS cubes) in XML, which is then used to generate the actual objects.

At first reading you would be forgiven for thinking that this is just going to add another step in an already long winded process to build a data warehouse. That is what I used to think until I saw it in action a few weeks ago at SQLRally.

What does it look like?

Here is a trivial example which will just generate two project level connection managers, and one SSIS package which has a single Execute SQL task.

1-what-it-looks-like
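In case the screenshot doesn’t come through, a minimal sketch of the kind of Biml being described might look like the following (the connection strings, database names and task here are placeholders I’ve made up, not the ones from the screenshot):

```xml
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <Connections>
    <!-- Two project-level connection managers -->
    <OleDbConnection Name="Source" ConnectionString="Data Source=.;Initial Catalog=SourceDB;Provider=SQLNCLI11.1;Integrated Security=SSPI;"></OleDbConnection>
    <OleDbConnection Name="Target" ConnectionString="Data Source=.;Initial Catalog=TargetDB;Provider=SQLNCLI11.1;Integrated Security=SSPI;"></OleDbConnection>
  </Connections>
  <Packages>
    <Package ConstraintMode="Linear" Name="Demo">
      <Tasks>
        <!-- A single Execute SQL task -->
        <ExecuteSQL ConnectionName="Target" Name="Run A Query">
          <DirectInput>SELECT 1</DirectInput>
        </ExecuteSQL>
      </Tasks>
    </Package>
  </Packages>
</Biml>
```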

As you can see, the XML is very expressive which makes it very easy to read and understand. After running the Biml generation tool the following appears (as if by magic) in the Integration Services project.

2-solution-explorer

And the package itself looks like this:

3-the-product

Mixing C# and Biml

What makes Biml really powerful is the ability to mix C# with the Biml code to generate dynamic packages. Take another trivial example:

4-biml-with-c-sharp

You can see the C# code on lines 7, 8 and 19 of the screenshot. This Biml script will loop through the <Package> element five times and in turn create five packages.
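As a rough sketch (this is not the exact code from the screenshot, and the package name here is made up), the kind of BimlScript loop being described looks something like this, with a C# for-loop wrapped around the Package element:

```xml
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <Packages>
    <# for (int i = 1; i <= 5; i++) { #>
    <!-- each iteration of the loop emits another Package element -->
    <Package Name="DynamicPackage<#=i#>" ConstraintMode="Linear"></Package>
    <# } #>
  </Packages>
</Biml>
```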

5-solution-with-dyn-packages

So why bother?

In addition to embedding trivial looping mechanisms, you can get a lot more creative and sophisticated. For example, you can execute SQL queries and then loop over the results. This would allow you to generate dynamic packages based on some metadata store.

At SQLRally Davide Mauri went into a lot of depth on implementing design patterns with Biml to automate the monkey work, and I think he is absolutely right! He has very helpfully posted his slides and demo code on his blog, and I would highly recommend that you have a read through as it is very educational.

What I’m working on

The shop I’m currently working for stores a lot of data in SalesForce, so I’m looking at how I could implement some of the design patterns Davide has shown, but with SalesForce instead of SQL Server as the source system.

Stay tuned! (And let me know if this would be of interest to you).

References

http://bimlscript.com – A great resource of tutorials, from getting started to in-depth

http://www.varigence.com/Products/Biml/Capabilities – These are the original developers, they have their own IDE

http://bidshelper.codeplex.com – This handy addon for BIDS / SSDT BI enables you to compile your Biml into SSIS packages. If you are a Microsoft BI developer you should really already be using this!

SQLRally Nordic – My Reflections

This week I attended SQLRally Nordic edition, which was held in Stockholm. First, I must say how grateful I am to my boss for sending me. Second, thanks to all the Swedish people who spoke English to me; I had some really nice conversations, most of which started off with “You’re Australian, but you live in London?”.

Monday: Pre-Con: Data Warehouse Modeling – Making the Right Choices

This workshop was the main reason I came to SQLRally. It was an entire day talking about data warehouse modeling! Both Davide and Thomas gave a very thorough overview of the entire process end-to-end. This part of the description really stood out to me before the event:

In this workshop, Thomas Kejser and Davide Mauri will share all the information they learned since they started working with data warehouses, giving you the guidance and tips you need to start your BI project in the best way possible―avoiding errors, making implementation effective and efficient, paving the way for a winning Agile approach, and helping you define how your team should work so that your BI solution will stand the test of time

I was particularly interested in hearing about how to take an agile approach to building a data warehouse, and in future posts I’m going to cover more of what I learnt.

To summarize their approach:

  • Implement BI Design Patterns (e.g. loading staging tables, dimensions and facts, these are all well known and repeatable)
  • Automate everything, after you’ve standardized your solution with the above design patterns. Davide showed some really excellent demos of how to use BIML to create a bunch of SSIS packages based on meta data
  • Focus on the “business” processes – after automating all the “technical” processes (the repeatable and well known parts) you will have more time to focus on the parts which are business specific

Their key mantra throughout the whole day was: avoid the monkey work.

Tuesday’s Top Session: Index Impact Analysis (with Klaus Aschenbrenner)

It is a classic problem that everyone who works with SQL Server faces: how do I measure the impact of adding an index?

In this session Klaus showed how easy it is to:

  • Capture a workload of real transactions on a production OLTP database
  • Use this workload to establish a baseline in a test environment
  • Use the baseline to measure the actual impact of adding an index

His approach (using the Distributed Replay feature of SQL Server) is definitely one that anyone who was listening to him could reproduce. This is something that I’m going to be looking into further when I’m back in the office!

Wednesday’s Top Session: Automating Your DW Development (with Davide Mauri)

This was a more in-depth session on BIML (Business Intelligence Markup Language) and BI design patterns. During Monday’s pre-con Davide talked at a high level about design patterns and how to automate their implementation, but he only gave a few simple examples.

In this session he showed a number of different design patterns and then demoed how they could be implemented and automated using BIML, as well as another custom tool which he developed specifically for creating packages which deal with Slowly Changing Dimensions.

The other thing that made this session stand out was the presentation style. Davide is a very energetic speaker; he spoke very quickly and still didn’t finish all of his slide deck, yet there was no point when I thought “come on, hurry up”. 10/10 in my book.

In Summary

I can definitely say that my expectations were met and exceeded. The pre-con was excellent; I learnt a lot of very practical things which I can start doing straight away, as well as a good amount of theory to back up the “why” behind doing these things.

The sessions I attended were varied, but overall the quality was excellent. There was one session I ended up in which was way over my head, and in hindsight I wish I had recognized this in the first 5 minutes and gone to another session. My takeaway from this is to actually take note of the session level before it starts.