Improving .Net Performance

Over the past couple of weeks we’ve noticed that our Reporting service is terribly slow.

splunk-dblogger

The above chart is part of our APM (Application Performance Monitoring) solution built on top of MiniProfiler and Splunk (see this blog post for more details). Our reporting service works by processing messages which are queued up by various other services onto a queue in RabbitMQ. What this chart is showing is that the reporting service is working at a very constant and slow pace. It will eventually get through the backlog of messages, but it takes it a few hours each morning while the CPU sits at around 100%.

Instead of the chart above, what I’d rather see is a big spike in activity that is sustained for a (much) shorter period of time!

In this post I’m going to show how I used the built-in tooling in Visual Studio 2013 to find and diagnose the bottlenecks in our code and then visualize the improvement in performance using Splunk and MiniProfiler.

Finding and fixing performance bottlenecks

I’ve known for a long time that Visual Studio has built-in performance sampling tools, but have rarely used them. I found Sasha Goldshtein’s (blog|Twitter) Pluralsight course on Measuring .Net Performance an excellent overview of how to get started with them.

To measure the performance I first queued up a significant number of messages for the reporting service to process, then started the Visual Studio Profiler.

Profiling CPU Usage

I started by setting the project I wanted to profile as the “Startup project”. Then from the Analyze menu, I selected Performance and Diagnostics, which brings up the following window:

1-performance-and-diagnostics

I then selected Performance Wizard under Available Tools and clicked the Start button. This brings up a dialog where you can select a few options: I selected CPU sampling as the profiling method and the Reporting Service project as the profiling target, then clicked Finish. The Reporting service started and Visual Studio began profiling; after about 5 minutes I stopped the profiling.

Viewing Profiler Reports and fixing bottlenecks

This is what the first profiling report looked like

2-first-perf-report

As you can see from the Hot Path, the Infrastructure.PubSub.EzNQ.BusItemTypeNameSerializer.DeSerialize method gets called a lot and is potentially very slow! The good news here is that this is our code, which means I can click on this row and Visual Studio will take me to this method, and even highlight the expensive code.

3-first-hot-path

The class in question is an implementation of an interface which is part of EasyNetQ (a very nice .Net API for RabbitMQ which all of our services use). The interface ITypeNameSerializer lets us customize our queue names and tells EasyNetQ what type it should deserialize a message into.

As shown above, there is a clear bottleneck in this code. The fix turned out to be pretty straightforward: I created the typesCache Dictionary at the class level and then updated the DeSerialize method to:

public Type DeSerialize(string serializedName)
{
  var typeName = serializedName.Split('_')[1];

  if (typesCache.ContainsKey(typeName))
  {
    return typesCache[typeName];
  }
  else
  {
    var type = AppDomain.CurrentDomain.GetAssemblies()
                .SelectMany(a => a.GetTypes())
                .FirstOrDefault(t => t.Name == typeName);

    typesCache[typeName] = type;

    return type;
  }
}
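For reference, the typesCache referred to above is just a dictionary declared at the class level, along these lines (not synchronized, so this assumes messages are consumed on a single thread):

private readonly Dictionary<string, Type> typesCache = new Dictionary<string, Type>();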

After making these changes I ran the profiler again to see what difference they had made. Here are the profile results for the second run through:

4-second-hot-path

As you can see, the Infrastructure.PubSub.EzNQ.BusItemTypeNameSerializer.DeSerialize method doesn’t get a mention, so it looks like the fix was successful. But now a call to DataAccess.SimpleDataHelper.GetDatabase in the DbReporter class is in the hot path.

Clicking on the ReportingService.DbReporter.WriteReportToDb row brings up the code for that method:

5-writereporttodb

This C# code probably has any DBA wriggling in their seat right about now (sorry!). This method writes each message that it processes into a corresponding table in our database using the likes of C# dynamic and SimpleData. It’s pretty powerful and very neat, as we don’t have to write lots of code to persist all the different reporting events to the database.

Sadly there isn’t much that can really be done about the .Insert call. However, after reading through the SimpleData documentation I realized that I don’t need to continually make calls to .GetDatabase; I can do this once in the constructor and SimpleData will take care of opening and closing the connections as needed.
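As a rough sketch of that change (hypothetical names; the real DbReporter resolves the target table from the message type), the database handle is now created once in the constructor and reused by every insert:

public class DbReporter
{
  private readonly dynamic db;

  public DbReporter(string connectionString)
  {
    // Open the Simple.Data database handle once; it manages the underlying
    // connections for each operation.
    db = Simple.Data.Database.OpenConnection(connectionString);
  }

  public void WriteReportToDb(dynamic message)
  {
    // Hypothetical table name; the real code picks the table matching the message type.
    db.ReportEvents.Insert(message);
  }
}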

After making these two fixes the profiler results from a third run didn’t indicate anything specific that could be “fixed”, so it was time to push it up to our testing environment and do some real testing.

Comparing Performance

As I mentioned above, our custom APM solution is built on top of Splunk and MiniProfiler. There are a number of advantages that we get from this that we wouldn’t get from the likes of New Relic or Stackify. One advantage is that we can add our own metadata to all the performance data we generate.

We put a Version number into all the performance data that we generate; this comes from the NuGet version number generated by TeamCity as part of our build process. Because that Version number is there, comparing performance between versions becomes trivial.

In Splunk I created the following dashboard, which allows me or any of my fellow developers at DrDoctor to select an Application and Transaction and then a baseline and comparison version.

comparing-performance

Judging by the results shown above, it’s working at roughly twice the speed. I’d have hoped for it to be a bit faster than that, but still, I’m not complaining!

Unexpected consequences

Our Reporting service does two things: it writes all IReportItem messages which it processes to their respective tables in the database, and it writes all events into a log file for Splunk to index.

While doing some local testing I noticed that even writing to a log file was slow! The maximum speed I saw was ~50 messages/second just to write to a log file! After making the fixes above the processing speed went up to 5,000 messages/second! Not insignificant, that’s for sure.

6-queue speed


Send logs to Splunk with log4net UdpAppender

Last week I deployed a new API hosted as an Azure Web App. We wanted some reporting events to come out of it into our Splunk instance so we could keep an eye on whether it is working as expected. I started off by using the Splunk C# SDK as it looked nice and easy to add into our app.

A very trivial example of using the Splunk SDK would look like this:

public class SplunkLogger
{
	private readonly Splunk.Service service;
	private readonly Splunk.Receiver receiver;

	public SplunkLogger()
	{
		this.service = new Splunk.Service("myawesomesplunkinstance", 8089, "https");
		this.service.Login("", "");
		this.receiver = new Splunk.Receiver(service);
	}

	public void Log(string msg)
	{
		var args = new Splunk.ReceiverSubmitArgs
		{
			Source = "mysource",
			SourceType = "mysource-event",
			Index = "myindex"
		};

		receiver.Submit(args, msg);
	}
}

The Splunk SDK should have worked, but since there was A LOT of traffic at different times throughout the day the vast majority of the HTTP requests to the Splunk API just timed out.


System.Net.WebException: The operation has timed out
  at System.Net.HttpWebRequest.GetRequestStream(TransportContext& context)
  at System.Net.HttpWebRequest.GetRequestStream()
  at Splunk.HttpService.Send(String path, RequestMessage request)
  at Splunk.Service.Send(String path, RequestMessage request)
  at Splunk.Receiver.Submit(String indexName, Args args, String data)
  at Splunk.Receiver.Submit(Args args, String data)
  at Splunk.Receiver.Submit(ReceiverSubmitArgs args, String data)

After I discovered that it constantly timed out, I started to look around for an alternative to the Splunk REST API endpoint.

I’ve always noticed that Splunk lets you send event data to it via TCP/UDP but have never had cause or reason to use it. However, given that our API was hosted as an Azure Web App and logging to the file system wasn’t an option, I thought this would be the perfect time to try it out. As it turns out, I was not disappointed.

Setting up Splunk

The first thing you will need to do is configure your Splunk instance to listen out for events from a Udp port.

  1. From the Splunk settings, select Data Inputs 1-data-inputs
  2. Click “Add new” UDP input 2-add-new-udp
  3. This will bring up the Add Data wizard; enter the port you want Splunk to listen on (this will also need to go into your log4net config) 3-add-data-wizard
  4. The next page, “Input Settings”, lets you (optionally) specify various things about the data you are going to be sending. There are two particularly useful settings here: the sourcetype and index 4-input-settings
  5. Click Review and then Submit

Splunk is now listening on that UDP port.

Important Note: Make sure that all firewall ports between your producer and Splunk are open to allow UDP traffic on the specified port! Otherwise it will silently fail.
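If you want a quick sanity check that events are actually arriving before wiring up log4net, a few lines of UdpClient code will do (the host and port here are just the examples used elsewhere in this post):

using (var client = new System.Net.Sockets.UdpClient())
{
  var payload = System.Text.Encoding.UTF8.GetBytes("test event sent over UDP");
  client.Send(payload, payload.Length, "myawesomesplunkinstance", 9977);
}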

Introducing the log4net UdpAppender

The first step to getting this working from your application is adding the log4net NuGet package to your project. Next, open your log4net configuration and add the following <appender>:

<appender name="SplunkAppender" type="log4net.Appender.UdpAppender">
  <remoteAddress value="<your-splunk-server>" />
  <remotePort value="<your-udp-port>" />
  <layout type="log4net.Layout.PatternLayout">
    <ConversionPattern value="%m%n" />
  </layout>
</appender>
<root>
  <level value="DEBUG"></level>
  <appender-ref ref="SplunkAppender"></appender-ref>
</root>

Now from your C# code you use log4net as you normally would:

var logger = log4net.LogManager.GetLogger("SplunkLogger");
logger.Info("logged to splunk!");

And you should start seeing your events in Splunk.

If you need to verify that log4net is sending events over UDP you can use Wireshark. Here I have set it up to capture packets from my wifi adapter and monitor port 9977.

5-wireshark

Measuring Application Performance with Mini Profiler and Splunk


At DrDoctor we use New Relic for performance monitoring of our web applications. However, the web only makes up a small part of our overall system. We make extensive use of a messaging-based system and have lots of micro-services which communicate over a messaging bus.

After adding New Relic to our Windows services we soon realized that they don’t really do a good job of supporting them. The main reason for this, as far as I can see from reading their docs, is that New Relic depends on an active HttpContext for profiling applications.

Having used New Relic to pinpoint the performance problems in our web app, I started to think about what it would take to build something similar but for a Windows Service. This blog post is showing off a proof-of-concept that I’m currently developing.

Introducing Mini Profiler

Mini Profiler is a lightweight C# library written by the folks at StackExchange. It was designed to show performance metrics for web applications, but is flexible enough that I was able to plumb it into our Windows services. Doing this is pretty easy:

MiniProfiler.Settings.ProfilerProvider = new StackExchange.Profiling.SingletonProfilerProvider(); //note: not thread safe
MiniProfiler.Settings.Storage = new Log4NetStorage(container.Resolve<ILogger>()); //this is the castle-windsor wrapper interface
MiniProfiler.Settings.SqlFormatter = new SimpleSqlFormatter();

The next step was to create my own custom storage mechanism which would output the results of a profiling session to a log file. This was not entirely trivial, as the profiler session is a hierarchy of steps. Here is how I implemented the Save method of IStorage:

public void Save(MiniProfiler profiler)
{
    var timings = new Stack<Timing>();

    if (profiler.Root.HasChildren)
    {
        var children = profiler.Root.Children;
        for (var i = children.Count - 1; i >= 0; i--) timings.Push(children[i]);
    }
    else
    {
        timings.Push(profiler.Root);
    }

    var intro = string.Format(@"{{""Timestamp"":""{0}"",""Application"":""{1}"",""Status"":""Starting"",""TransactionId"":""{2}"",""Transaction"":""{3}""}}",
                    profiler.Started.ToString(DATE_FORMAT),
                    applicationName,
                    profiler.Root.Id,
                    profiler.Root.Name);

    logger.Info(intro);

    decimal runningDuration = 0;

    while (timings.Count > 0)
    {
        var timing = timings.Pop();
        var name = timing.Name;

        decimal customTimingDuration = 0;

        StringBuilder sb = new StringBuilder();

        if (timing.HasCustomTimings)
        {
            foreach (var item in timing.CustomTimings)
            {
                customTimingDuration = item.Value.Sum(v => v.DurationMilliseconds ?? 0);

                decimal customOffset = runningDuration;

                foreach (var item2 in item.Value)
                {
                    customOffset += item2.DurationMilliseconds ?? 0;
                    logger.Info(string.Format(@"{{""Timestamp"":""{0}"",""Application"":""{1}"",""Status"":""Processing"",""TransactionId"":""{2}"",""Transaction"":""{3}"",""Action"":""{4}"",""CommandString"":""{5}"",""Duration"":""{6}""}}",
                        profiler.Started.AddMilliseconds(Convert.ToDouble(customOffset)).ToString(DATE_FORMAT),
                        applicationName,
                        profiler.Root.Id,
                        profiler.Root.Name,
                        string.Format("{0}/{1}", name, item.Key),
                        item2.CommandString,
                        item2.DurationMilliseconds));
                }
            }
        }

        var offset = Convert.ToDouble(timing.DurationWithoutChildrenMilliseconds - customTimingDuration);
        var msg = string.Format(@"{{""Timestamp"":""{0}"",""Application"":""{1}"",""Status"":""Processing"",""TransactionId"":""{2}"",""Transaction"":""{3}"",""Action"":""{4}"",""Duration"":""{5}""}}",
            profiler.Started.AddMilliseconds(Convert.ToDouble(runningDuration)).ToString(DATE_FORMAT),
            applicationName,
            profiler.Root.Id,
            profiler.Root.Name,
            name,
            offset);

        logger.Info(msg);
        if (sb.Length > 0)
        {
            logger.Info(sb.ToString());
        }
        if (timing.HasChildren)
        {
            var children = timing.Children;
            for (var i = children.Count - 1; i >= 0; i--) timings.Push(children[i]);
        }

        runningDuration += timing.DurationWithoutChildrenMilliseconds + customTimingDuration;
    }

    var end = string.Format(@"{{""Timestamp"":""{0}"",""Application"":""{1}"",""Status"":""Finished"",""TransactionId"":""{2}"",""Transaction"":""{3}""}}",
                    profiler.Started.AddMilliseconds(Convert.ToDouble(runningDuration)).ToString(DATE_FORMAT),
                    applicationName,
                    profiler.Root.Id,
                    profiler.Root.Name);

    logger.Info(end);

}

This isn’t particularly pretty, and I’ll be the first to admit that using string.Format to create JSON is really bad practice, but I wanted something that would be lightweight and work with log4net, which is our current logging library. Really what I should do is adopt semantic logging and use something like Serilog. But I’ll save that for another day.
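For what it’s worth, here is a rough sketch of what the “Starting” event above might look like with Serilog (I haven’t wired this in; the console sink and the sample values are just placeholders):

using System;
using Serilog;

Log.Logger = new LoggerConfiguration()
    .WriteTo.Console()
    .CreateLogger();

// The same fields that Save() formats by hand above, as a structured event
Log.Information("{Application} {Status} {TransactionId} {Transaction}",
    "ApplicationA", "Starting", Guid.NewGuid(), "Processor/MessageA");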

The next step was to create my own ISqlFormatter, as the ones that come with Mini Profiler include a lot more detail than I wanted to record. This was very trivial:

public class SimpleSqlFormatter : ISqlFormatter
{
    public string FormatSql(string commandText, List<SqlTimingParameter> parameters)
    {
        return commandText;
    }
}

With these two classes written plumbing it into my application became very easy and I soon had performance results being outputted as JSON into a log file, which I could then import into Splunk.

With the plumbing done, it’s time to actually instrument our code. With Mini Profiler what you do is call MiniProfiler.Start() and then .Stop() at the beginning and end of a session. In addition to this, you can call .Step() anywhere in between, and this creates a kind of sub-profiler, if for example you wanted to explicitly measure the performance of a particular part of your code.

When using Mini Profiler you will want to break your code into individual sessions, e.g. on a website this would be a single web request, the idea being that you want to be able to measure and compare performance over time.

As mentioned above, the services that I want to add this to all communicate over a messaging bus. So the logical place to call .Start() and .Stop() is in the class that is invoked when a new message arrives on the bus. I also wrapped our database connection code with the Mini Profiler instrumentation code so that it can measure time spent executing SQL and what that SQL was.
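As a rough sketch (hypothetical message and handler names), the message consumer ends up looking something like this:

using StackExchange.Profiling;

public class ReportItemConsumer
{
    public void Consume(object message)
    {
        var profiler = MiniProfiler.Start("Processor/" + message.GetType().Name);
        try
        {
            using (profiler.Step("Dispatcher"))
            {
                // ... dispatch the message to the appropriate report writer ...
            }
        }
        finally
        {
            // Stop flushes the completed session to IStorage (the Log4NetStorage above)
            MiniProfiler.Stop();
        }
    }
}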

Logger Output

After wiring everything up and running through a few test cases, this is a sample of the events that were produced:

{"Timestamp":"05/12/15 13:41:50.22 Z","Application":"ApplicationA","Status":"Starting","TransactionId":"2c9d0f76-cbca-4d4f-b736-48d09a3b75c1","Transaction":"Processor/MessageA"}
{"Timestamp":"05/12/15 13:41:50.24 Z","Application":"ApplicationA","Status":"Processing","TransactionId":"2c9d0f76-cbca-4d4f-b736-48d09a3b75c1","Transaction":"Processor/MessageA","Action":"Dispatcher/sql","CommandString":"select ######### from ##### WHERE (###### = @p1 AND ##### = @p2)","Duration":"18.9"}
{"Timestamp":"05/12/15 13:41:50.22 Z","Application":"ApplicationA","Status":"Processing","TransactionId":"2c9d0f76-cbca-4d4f-b736-48d09a3b75c1","Transaction":"Processor/MessageA","Action":"Dispatcher","Duration":"669.8"}
{"Timestamp":"05/12/15 13:41:50.93 Z","Application":"ApplicationA","Status":"Processing","TransactionId":"2c9d0f76-cbca-4d4f-b736-48d09a3b75c1","Transaction":"Processor/MessageA","Action":"EasyNetQ/Publish/MessageB","Duration":"34"}
{"Timestamp":"05/12/15 13:41:50.96 Z","Application":"ApplicationA","Status":"Processing","TransactionId":"2c9d0f76-cbca-4d4f-b736-48d09a3b75c1","Transaction":"Processor/MessageA","Action":"EasyNetQ/Publish/MessageC","Duration":"4.3"}
{"Timestamp":"05/12/15 13:41:50.96 Z","Application":"ApplicationA","Status":"Processing","TransactionId":"2c9d0f76-cbca-4d4f-b736-48d09a3b75c1","Transaction":"Processor/MessageA","Action":"EasyNetQ/Publish/MessageD","Duration":"5.2"}
{"Timestamp":"05/12/15 13:41:50.97 Z","Application":"ApplicationA","Status":"Finished","TransactionId":"2c9d0f76-cbca-4d4f-b736-48d09a3b75c1","Transaction":"Processor/MessageA"}

Building Splunk Dashboards

Since the data is being output as JSON it is really easy to import into Splunk, so I’m not going to cover that here. With the data in Splunk it’s time to turn the raw data into some useful charts.

The first one is a stacked area chart, which shows the total time spent for a transaction over time. High values on this chart aren’t necessarily indicative of performance problems, as it could just be that we were experiencing high throughput of a certain transaction.

1-transaction-overview

A variation on the chart above would be to show the average duration instead of the total time, that would be more indicative of performance problems.

The next chart shows the total time spent on processing for each different transaction in the selected time period. Once again this chart isn’t necessarily indicative of a performance problem, but it does show where our system spends most of its time.

2-total-time

The next chart shows the average duration of a transaction; this, in the context of the previous chart, could start to highlight potential problem areas. A variation on this chart would be to show it stacked with the number of times the transaction was executed.

3-average-duration

The next chart shows the maximum time it took to execute a SQL statement. This is very high level but still useful for spotting SQL queries that might need to be tuned. Averages would probably be a better value to show here.

4-slowest-queries

The last chart is a stacked area chart of time spent on the different actions that were measured, over time. This is more useful when looked at in the context of a single transaction type.

5-transaction-details

As an added bonus, Splunk lets you click on any of the charts and drill through to the raw events.

4-event-drilldown

And here is what the dashboard looks like with everything in place. Note also the filters at the top of the page, so I can drill into the different applications and individual transactions – they also cross filter 🙂

6-dashboard

This is still very much a proof-of-concept; I’m hoping to get this running in production in the next couple of weeks. When it’s running in production I’ll post a follow-up.

Storing certificates more securely when using Google APIs, OAuth 2.0 and .Net

Recently, I’ve been replacing bit.ly with the Google URL shortener service for a .Net project. It’s a pretty simple service, and the documentation shows very clearly how to get going.

Google have 3 different ways of authenticating using OAuth 2.0 (details); the method I am using is as a “Service Account”. Their code sample is thus:

String serviceAccountEmail = "SERVICE_ACCOUNT_EMAIL_HERE";
var certificate = new X509Certificate2(@"key.p12", "notasecret", X509KeyStorageFlags.Exportable);

ServiceAccountCredential credential = new ServiceAccountCredential(
  new ServiceAccountCredential.Initializer(serviceAccountEmail)
  {
    Scopes = new[] { PlusService.Scope.PlusMe }
  }.FromCertificate(certificate));

// Create the service.
var service = new UrlshortenerService(new BaseClientService.Initializer()
{
  HttpClientInitializer = credential,
  ApplicationName = "API Sample",
});

The key thing that I want to draw your attention to is line 2: this implies (to me) that you are probably going to keep the certificate in source control along with your code. This isn’t good, as it could quite easily lead to mistakenly making it public (e.g. pushing the project to GitHub).

A better solution would be to attempt to load the certificate from the local certificate store on the machine.

So that means that line 2 above would be replaced with the following:

string certificateIssuerName = "blahblahblah123456.apps.googleusercontent.com";

X509Store store = new X509Store(StoreLocation.LocalMachine);
store.Open(OpenFlags.ReadOnly);
var cert = store.Certificates.Find(X509FindType.FindByIssuerName, certificateIssuerName, false)[0];
store.Close();

Things to know:

  1. NB: Very important: you need to make sure you call the Open method on the store, otherwise you will find that calling the Find method never returns a certificate.
  2. When importing the certificate into your certificate store, make sure you include the private key, otherwise your code won’t work.

Accessing Google APIs through a proxy with .Net

Recently I’ve been replacing the use of Bit.ly as a link shortener with Google’s link shortener service (goo.gl) in one of our .Net projects.

It’s pretty easy to get going with their handy .Net NuGet package and sample code, however what wasn’t obvious was how to pass the HTTP requests through a proxy. After a bit of digging around, it turned out to be pretty easy:

The first thing that needs to be done is to create our own ProxySupportedHttpClientFactory class, which inherits from the Google APIs HttpClientFactory class and overrides the CreateHandler method to return a different handler:

public class ProxySupportedHttpClientFactory : HttpClientFactory
{
  protected override HttpMessageHandler CreateHandler(CreateHttpClientArgs args)
  {
    var proxy = new WebProxy("http://proxyserver:8080", true, null, null);

    var webRequestHandler = new WebRequestHandler()
    {
      UseProxy = true,
      Proxy = proxy,
      UseCookies = false
    };

    return webRequestHandler;
  }
}

The next step is to then use our new ProxySupportedHttpClientFactory when creating the UrlshortenerService:


var credentials = ....

var linkService = new Google.Apis.Urlshortener.v1.UrlshortenerService(new BaseClientService.Initializer()
{
  HttpClientInitializer = credentials,
  ApplicationName = "MyURLThingy",
  HttpClientFactory = new ProxySupportedHttpClientFactory()
});

And that’s it.

No More Interruptions – Integrating Codealike and HipChat

I’ve recently started using Codealike, a service that tracks various metrics while I’m coding. The data it collects is then presented in a bunch of really useful ways to help determine when I’m being most productive, as well as the places in our code base where I spend most of my time, and various other things.

One of the metrics they calculate as part of this process is how “focused” you are, and from this they determine whether or not someone should interrupt you – they have three different levels: No Activity, Can Interrupt and Cannot Interrupt.

One of the worst things that can happen is being interrupted while you’re “in the zone” or as Codealike put it “on fire”.

1-may-i-interrupt

Codealike does provide a webpage that you could put on a display in your office which shows your current status (you can view my current status); however, whilst I have two monitors at work, I tend to use both of them.

We use HipChat at work as our IM of choice, so during the last week I’ve started to set my status to Do Not Disturb when Codealike thinks I’m in the zone.

Introducing: Codealike IM Updater

To remove the interruption of having to update my status in HipChat when I’m in the zone, I set out to build the Codealike IM Updater. It’s a simple application that periodically checks my status according to Codealike and updates my status on HipChat accordingly.

2-im-updater

As you can see, it is very simple – all the user needs to do is supply their Codealike username, their HipChat API token and email address, and how they want to map the different levels from Codealike to HipChat. Optionally, they can also specify a message that is displayed next to their name in HipChat.

The code is available on GitHub and if anyone would like to extend it to work with other IM services then I will happily accept pull requests.

Nancy Testing – Razor View with Referenced Models

For the last few weeks I’ve been working with Nancy (an awesome web framework for .Net); the core goal of the framework is the super-duper-happy path:

The goal of the framework is to stay out of the way as much as possible and provide a super-duper-happy-path to all interactions.

This means that everything in Nancy is set up with sensible defaults and conventions, instead of making you jump through hoops and go through configuration hell just to get up and running. With Nancy you can go from zero to website in a matter of minutes. Literally.
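To give a flavour of that, a module like the following (Nancy 1.x syntax) plus a host package is all a “hello world” site needs:

using Nancy;

public class HelloModule : NancyModule
{
    public HelloModule()
    {
        // Nancy discovers modules automatically; this wires up GET /
        Get["/"] = _ => "Hello from Nancy!";
    }
}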

From the experience that I’ve had with Nancy to date, I would say it easily lives up to this goal for both development and testing. It is very easy to get started writing tests and most of what you need to know is covered in the docs.

However, on Friday I started experiencing a problem with some of our existing tests which check that the Razor view has been rendered correctly – by counting the number of rows in a <table>.

The cause of the failing test was that I changed the ViewModel to include a List of objects which live outside of the Web App assembly. To get my Nancy project to work when deployed I had to update the web.config file to include a list of assemblies for Nancy to reference when rendering the Razor views.

This morning I came back to this problem afresh and found the solution 😀 Reading through the Razor View Engine docs on GitHub, there is a mention of the IRazorConfiguration:

You can specify assemblies and default namespaces that Razor needs to use whilst compiling the views by bootstrapping your own IRazorConfiguration implementation, thus removing the need to add the @using statements to each view. This step is totally optional if you don’t require additional references or namespaces in your view.

After trying various things I came across two different ways to get my tests to pass:

1. Update the App.Config file

This is the simple and easy solution; unfortunately I discovered it after the harder one below. On Friday I made the mistake of updating the web.config file which was in my Tests project, and my tests kept failing – no surprise there.

What I should have done was update my App.Config file; after I did this all the tests passed.

Add the following section to the configSections group:

<section name="razor" type="Nancy.ViewEngines.Razor.RazorConfigurationSection, Nancy.ViewEngines.Razor" />

Then add the following to the end of the config file just before </configuration>:

<razor>
  <assemblies>
    <add assembly="Domain" />
    <add assembly="Some.OtherAssembly" />
  </assemblies>
</razor>

With that simple change my view tests passed. I guess the web.config file was added when I added the Nancy NuGet package.

2. Implement the IRazorConfiguration interface

The slightly harder solution is to implement the IRazorConfiguration interface and reference this when setting up the ConfigurableBootstrapper – the advantage of this is that you have finer control over what is going on; the downside is that you will need to include this everywhere you are using the ConfigurableBootstrapper.

You will need to create a RazorConfigurationTestingStub which implements the IRazorConfiguration, mine looked like this:

internal class RazorConfigurationTestingStub : IRazorConfiguration
{

    public bool AutoIncludeModelNamespace
    {
        get { return true; }
    }

    public IEnumerable<string> GetAssemblyNames()
    {
        return new List<string>() { "Domain", "Some.OtherAssembly" };
    }

    public IEnumerable<string> GetDefaultNamespaces()
    {
        return Enumerable.Empty<string>();
    }
}

As you can see the list of Assembly names is the same list which was included in the App.Config file (in my Tests project) and the same list which is included in the web.config file (in my Nancy Web App project).

When setting up the ConfigurableBootstrapper you will need to include the stub as the implementation for the IRazorConfiguration dependency:

var bootstrapper = new ConfigurableBootstrapper(cfg =>
{
    cfg.ApplicationStartup((container, pipelines) =>
        {
            CookieBasedSessions.Enable(pipelines);
        });
    cfg.ViewFactory<TestingViewFactory>();
    cfg.RootPathProvider(new TestRootPathProvider());
    cfg.Module<Module.UnderTest>();
    cfg.Dependency<IThing>(thingInstance);
    cfg.Dependency<IRazorConfiguration>(new RazorConfigurationTestingStub());
});

After I did this all the tests passed.

BIML Tip – Use LinqPad to write C#

To see any real benefit of using BIML you will need to combine it with C# to generate dynamic packages. The only problem is, unless you fork out $1000s for MIST, your only alternative is BIDSHelper. This is a great tool, but one of the biggest problems is that you lose IntelliSense for C#. Lately I’ve been using LinqPad to prototype and test C# code which I then paste into my BIML files.

In this tip I’m going to show you how you can use LinqPad to speed up your BIML Scripting experience.

Just a quick note before we get started: to get the added benefit of IntelliSense in LinqPad you have to fork out $39 USD for the premium edition; however, as you will see, this is definitely a worthwhile expense. Even if you don’t purchase the premium version you can still follow along, you’ll just miss out on the auto-complete. (I have not been paid for this endorsement.)

Setting Up LinqPad

You can download LinqPad from http://www.linqpad.net/

Once you have installed and launched LinqPad you will see an empty query window.

1-main-query-window

To be able to use the BIML engine and Varigence extensions we’ll need to add a reference to the required assemblies and their namespaces. To do that, press F4; this will bring up the Query Properties window.

The first thing that you will need to do is add the assembly references. Click the Browse button and locate BimlEngine.dll (for me this is in C:\Program Files (x86)\BIDS Helper 2012). I also added a reference to WindowsBase.dll which you can do by clicking the Add button then searching for that assembly.

2-additional-references

The next thing to do is set up the namespaces: click the “Additional Namespace Imports” tab, and paste the following:


System.Data
System.Data.SqlClient
Varigence.Hadron.CoreLowerer.SchemaManagement
Varigence.Hadron.Extensions.SchemaManagement
Varigence.Languages.Biml
Varigence.Languages.Biml.Connection
Varigence.Languages.Biml.Table
Varigence.Hadron.Extensions

3-additional-namespace-imports

Click OK to close the Query Properties window.

Language Modes in LinqPad

LinqPad gives you a few different options of writing C#.

4-language-modes

 

I’ve found the most useful modes to write C# to be used with BIML is either the “C# Program” or “C# Statements” mode.

The “C# Program” mode is useful when you want to write different methods which can be reused in your BIML, whereas “C# Statements” mode is useful when you just want to write some C# to paste directly into your BIML; you can’t break this into different methods.

Writing C# – Example 1

In this first example I’ll show you how you could use LinqPad to write some very simple C# code that will use the Varigence extension methods to query the metadata of your database.

This is useful in BIML when you want to generate either a number of different packages to perform over a series of tables, or multiple sequences in a single package over a series of tables.

Set LinqPad to use “C# Statements” mode and paste in the following code:

string connectionstring = "Data Source=.;Initial Catalog=AdventureWorksDW2012;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;";

var targetConnection = SchemaManager.CreateConnectionNode("SchemaProvider", connectionstring); 

IEnumerable<AstTableNode> sourceTables = targetConnection.ImportDB("dbo", "Dim%", ImportOptions.ExcludeViews).TableNodes; 

sourceTables.ToList().ForEach(x=>x.Name.Dump());

This will simply use the SchemaManager and ImportDB extension methods provided by Varigence to dump the names of all the tables in the AdventureWorksDW2012 dbo schema that start with Dim.

5-example1-1

 

 

The next thing to do is make this a bit more useful, paste the following code:


string connectionstring = "Data Source=.;Initial Catalog=AdventureWorksDW2012;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;";

var targetConnection = SchemaManager.CreateConnectionNode("SchemaProvider", connectionstring);

IEnumerable<AstTableNode> sourceTables = targetConnection.ImportDB("dbo", "Dim%", ImportOptions.ExcludeViews).TableNodes;

foreach (var table in sourceTables)
{
	table.Name.Dump();
	table.GetColumnList().Dump();
}

This will now loop over all the tables and this time dump the table name and then the column list.

6-example1-2

 

These two snippets demonstrate how to get started using LinqPad.

We can now take the second snippet and plug it into a BIML file:


<!--
<#@ import namespace="System.Data" #>
<#@ import namespace="Varigence.Hadron.CoreLowerer.SchemaManagement" #>
<#@ import namespace="Varigence.Hadron.Extensions.SchemaManagement" #>

<#
	string connectionstring = "Data Source=.;Initial Catalog=AdventureWorksDW2012;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;";

	var targetConnection = SchemaManager.CreateConnectionNode("SchemaProvider", connectionstring); 

	IEnumerable<AstTableNode> sourceTables = targetConnection.ImportDB("dbo", "Dim%", ImportOptions.ExcludeViews).TableNodes;
#>
-->

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
	<Connections>
		<OleDbConnection ConnectionString="Data Source=.;Initial Catalog=AdventureWorksDW2012;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;" Name="Source"></OleDbConnection>
		<OleDbConnection ConnectionString="Data Source=.;Initial Catalog=MyDW;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;" Name="Target"></OleDbConnection>
	</Connections>
	<Packages>
		<Package ConstraintMode="Linear" Name="Extract">
			<Connections>
				<Connection ConnectionName="Source"></Connection>
				<Connection ConnectionName="Target"></Connection>
			</Connections>
			<Tasks>
			<!--
			<# foreach(var table in sourceTables)
			{
			#>
			-->
			<ExecuteSQL ConnectionName="Target" Name="Truncate Import Table <#=table.Name#>">
				<DirectInput>
					TRUNCATE TABLE Import.<#=table.Name#>
				</DirectInput>
			</ExecuteSQL>
				<Dataflow Name="Copy data into Source <#=table.Name#>">
					<Transformations>
						<OleDbSource ConnectionName="Source" Name="Source - <#=table.Name#>">
							<DirectInput>
								SELECT
									<#=table.GetColumnList()#>
								FROM
									<#=table.Name#>
							</DirectInput>
						</OleDbSource>
						<OleDbDestination ConnectionName="Target" Name="Target - <#=table.Name#>">
							<ExternalTableOutput Table="Import.<#=table.Name#>"></ExternalTableOutput>
						</OleDbDestination>
					</Transformations>
				</Dataflow>
			<!--
			<#
			}
			#>
			-->
			</Tasks>
		</Package>
	</Packages>
</Biml>

This BIML script will then generate a package which will use the AdventureWorksDW2012 database as a source database and copy all the data from each of the Dim tables into an equivalent table in the Import schema of the MyDW database.

Writing C# – Example 2

In this example I will show you how you could use LinqPad to write reusable methods which can then be used in your BIML files.

The example that I will use is the GetSqlServerExtendedProperties method, which was provided by Dave Stein (b | t); it simply queries the extended properties on a given table.

Set LinqPad to use “C# Program” mode and paste in the following code:


AstOleDbConnectionNode targetConnection;

void Main()
{
	targetConnection = SchemaManager.CreateConnectionNode("SchemaProvider", "Data Source=.;Initial Catalog=AdventureWorksDW2012;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;"); 

	var sourceTables = targetConnection.ImportDB("dbo", "Dim%", ImportOptions.ExcludeViews).TableNodes; 

	foreach (var element in sourceTables)
	{
		GetSqlServerExtendedProperties(element);
	}

	foreach (var element in sourceTables)
	{
		var skip = element.Annotations.FirstOrDefault (a => a.Tag.Equals("ETL_Should_Skip"));

		if (skip != null && skip.Text.Equals("1"))
		{
			//skip the table as it is marked to skip
			continue;
		}
		else
		{

		}
	}
}

AstNode GetSqlServerExtendedProperties(AstTableNode table)
{
	AstNode returnValue = (AstNode)table;

	var query = string.Format("SELECT name, value FROM fn_listextendedproperty(NULL, 'schema', '{0}', 'table', '{1}', NULL, NULL)", table.SchemaName, table.Name);

	var extendedProperties = ExternalDataAccess.GetDataTable(targetConnection.ConnectionString, query);

	foreach (DataRow ep in extendedProperties.Rows)
	{
		returnValue.AddAnnotation(AnnotationType.Tag, ep["value"].ToString(), ep["name"].ToString());
	}

	return returnValue;
}

This snippet will query the extended properties of all the tables in the dbo schema that start with Dim and then loop through each of them. If it finds one called ETL_Should_Skip (lines 16 and 18) then it will skip that table and go to the next one in the sequence.

This could be useful for example if you wanted to control which tables should go into your package without having this hard coded, or having to add a “control table” to your database.

The BIML would look like this:

<!--
<#@ import namespace="System.Data" #>
<#@ import namespace="Varigence.Hadron.CoreLowerer.SchemaManagement" #>
<#@ import namespace="Varigence.Hadron.Extensions.SchemaManagement" #>

<#+
AstOleDbConnectionNode conn;
#>

<#
	string connectionstring = "Data Source=.;Initial Catalog=AdventureWorksDW2012;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;";

	conn = SchemaManager.CreateConnectionNode("SchemaProvider", connectionstring); 

	IEnumerable<AstTableNode> sourceTables = conn.ImportDB("dbo", "Dim%", ImportOptions.ExcludeViews).TableNodes;
#>
-->

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
	<Connections>
		<OleDbConnection ConnectionString="Data Source=.;Initial Catalog=AdventureWorksDW2012;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;" Name="Source"></OleDbConnection>
		<OleDbConnection ConnectionString="Data Source=.;Initial Catalog=AdventureWorksDW2012;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;" Name="Target"></OleDbConnection>
	</Connections>
	<Packages>
		<Package ConstraintMode="Linear" Name="Extract">
			<Connections>
				<Connection ConnectionName="Source"></Connection>
				<Connection ConnectionName="Target"></Connection>
			</Connections>
			<Tasks>
			<!--
			<# 

			foreach (var element in sourceTables)
			{
				GetSqlServerExtendedProperties(element);
			}

			foreach(var table in sourceTables)
			{
				var skip = table.Annotations.FirstOrDefault (a => a.Tag.Equals("ETL_Should_Skip"));

				if (skip != null && skip.Text.Equals("1"))
				{
					//skip the table as it is marked to skip
					continue;
				}

			#>
			-->
			<ExecuteSQL ConnectionName="Target" Name="Truncate Import Table <#=table.Name#>">
				<DirectInput>
					TRUNCATE TABLE dbo.<#=table.Name#>
				</DirectInput>
			</ExecuteSQL>
			<Dataflow Name="Copy data into Source <#=table.Name#>">
				<Transformations>
					<OleDbSource ConnectionName="Source" Name="Source - <#=table.Name#>">
						<DirectInput>
							SELECT
							<#=table.GetColumnList()#>
							FROM
							<#=table.Name#>
						</DirectInput>
					</OleDbSource>
					<OleDbDestination ConnectionName="Target" Name="Target - <#=table.Name#>">
						<ExternalTableOutput Table="Import.<#=table.Name#>"></ExternalTableOutput>
					</OleDbDestination>
				</Transformations>
			</Dataflow>
			<!--
			<#
			}
			#>
			-->
			</Tasks>
		</Package>
	</Packages>
</Biml>

<!--
<#+
AstNode GetSqlServerExtendedProperties(AstTableNode table)
{
	AstNode returnValue = (AstNode)table;

	var query = string.Format("SELECT name, value FROM fn_listextendedproperty(NULL, 'schema', '{0}', 'table', '{1}', NULL, NULL)", table.SchemaName, table.Name);

	var extendedProperties = ExternalDataAccess.GetDataTable(conn.ConnectionString, query);

	foreach (DataRow ep in extendedProperties.Rows)
	{
		returnValue.AddAnnotation(AnnotationType.Tag, ep["value"].ToString(), ep["name"].ToString());
	}

	return returnValue;
}
#>
-->

Conclusion

If you find yourself writing lots of C# code which you’re using within BIML then it is definitely worth trying out LinqPad to see if it makes your life easier.

 

Debugging Custom SSIS Components

In the past I have written extensively about how to build custom components for SQL Server Integration Services. Those posts have always been focused on the ‘happy path’; if you’re not familiar with this phrase, it refers to the path through your application that works exactly as expected. Often in development we have to deal with the sad path, when things aren’t working as we would like or expect.

In this post I will show two techniques that can be used to help with debugging custom SSIS components.

Attaching the Visual Studio Debugger

You will need to have Visual Studio open with the source code for the custom component and have SSDT open with the package that uses the custom component.

From Visual Studio select the Debug menu then Attach to Process

4-debug-menu

This will bring up the following window showing a list of available processes for debugging:

5-attach-debugger-to

In the list of available processes select the instance of devenv.exe that corresponds to your SSIS solution, then click Attach. Once attached any breakpoints that you define will be hit, from where you can debug your component.

This method works just fine for when you want to debug the setup and configuration of your package, that is, anything the users of your component will be doing prior to executing the package.

The disadvantage of this method is that once you actually start executing the package the debugger will detach itself. The reason for this is that execution of the package happens within a different process. Continue reading for a better method.

Launching the Debugger from in Code

Another way to automatically start debugging is to add the following line of code:

Debugger.Launch();

Into the particular method that you want to debug or into the constructor.
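For example, in a task’s constructor it might look something like this (a sketch with the rest of the class omitted, and wrapped in #if DEBUG as recommended at the end of this post):

using System.Diagnostics;

public class HdfsTask // the custom task class; base class and members omitted for brevity
{
    public HdfsTask()
    {
#if DEBUG
        // Prompts to attach a debugger whenever SSDT creates an instance of the task
        Debugger.Launch();
#endif
    }
}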

1-adding-debugger-launch

Here you can see I’ve added the Debugger.Launch() code into the constructor for my HDFS Task Component. Now when I open a package in SSDT that uses this component, or if I drag a new instance of the task into an existing package, the following window will appear:

2-jit-debugger

Here I can select how I want to debug the application. During development of custom components I always have an instance of Visual Studio open with my component solution, and you can see it listed as the third option in the image above. By selecting this option and clicking Yes, I will be taken back to that instance of Visual Studio, which will then attach itself to my instance of SSDT and break execution.

3-back-in-visual-studio

At this point I can set any breakpoints where I want to break future execution. From here press F5 to continue. Visual Studio will remain attached to SSDT and any breakpoints you’ve defined will interrupt execution and bring you back here.

The disadvantage of this method is that the “Do you want to debug this application” window will keep appearing at various points during package design unless you attach the debugger.

The advantage of using this method is that once you actually start executing your package you will again be prompted to attach the debugger, which means you will be able to debug the component during package execution.

Make sure you either remove this line of code before shipping it for actual use or wrap it in the #if DEBUG / #endif compiler directives as shown above, and remember to use the Release configuration for production use!

Accessing SalesForce using the Partner API and C#

SalesForce exposes a number of Web Service APIs which can be used to access objects in an organization (see here for details). In this post I’m going to show how you can use the Partner API from a C# application.

The Partner API is designed for those who want to develop applications which are agnostic to the objects in a given SalesForce account. This means that it will work for multiple organizations, which is in contrast to the Enterprise API which only works for a single organization and gives a strongly typed interface into that single organization.

I used the Partner API to build an application which dynamically generates SQL Server Integration Services packages based on SalesForce metadata. You can read more about it here.

This post is for other people who are interested in accessing SalesForce via C# to retrieve metadata about objects and perform queries.

Getting Setup

If you don’t have access to SalesForce you can sign up for a developer account at http://developer.force.com/

Once you have an account, sign in to SalesForce and click on Setup in the top left.

Then expand the Develop menu and click API.

1-api-menu

This will then show a list of API WSDLs that can be downloaded. The one that we are interested in is the Partner API.

2-partner-api

Right click on the Generate Partner WSDL link and save it to your computer as PartnerAPI.xml

Open Visual Studio and create a new Console application, I’ve called mine SalesForcePartnerDemo.

Right click on your project and select Add Service Reference, then click Advanced, then click Add Web Reference.

This will bring up the Add Web Reference window. In the URL text box enter the file path of the Partner API WSDL file that you downloaded, give it the name SFPartnerAPI, then click Add Reference.

Visual Studio will then generate code which can be used to access the API as well as provide some classes which we can use.

Logging Into SalesForce

To log in to SalesForce you will need your email address, password and security token. (Resetting your Security Token)

The following code snippet will perform the login action:

SFPartnerAPI.SforceService partnerApi = new SFPartnerAPI.SforceService();

SFPartnerAPI.LoginResult loginResult = partnerApi.login("emailaddress", "passwordsecuritytoken");

partnerApi.SessionHeaderValue = new SFPartnerAPI.SessionHeader();
partnerApi.SessionHeaderValue.sessionId = loginResult.sessionId;
partnerApi.Url = loginResult.serverUrl;

When users are using this application they will need to enter their username and then their password combined with the security token.
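In other words, with hypothetical variable names, the second argument to login is simply the two values concatenated:

var loginResult = partnerApi.login(username, password + securityToken);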

Listing SalesForce Objects

The next thing to do is to list all of the objects that are in the logged-in user’s SalesForce account:

var accountObjects = partnerApi.describeGlobal();

foreach (var item in accountObjects.sobjects)
{
  Console.WriteLine(string.Format("{0} ({1})", item.name, item.label));
}

This snippet will retrieve all of the objects in SalesForce and write to the console the underlying name of the object and the label (or display name).

3-account-objects

Retrieving metadata about a specific object is not much more complicated:

var opportunityObject = partnerApi.describeSObject("Opportunity");

foreach (var item in opportunityObject.fields)
{
  Console.WriteLine(string.Format("{0} ({1}) - {2}", item.name, item.label, item.type));
}

foreach (var item in opportunityObject.childRelationships)
{
  Console.WriteLine(string.Format("{0}", item.relationshipName));
}

This snippet will list details on each of the fields within the object and any child relationships that have been defined.

4-opportunity-fields

5-opportunity-childrelationships

Querying Objects

The final thing to do is to write a query and see the result. For my example I’m going to query the Opportunity object and return all of the objects that are “Won”.

string query = "SELECT Id, Name, StageName, Amount FROM Opportunity WHERE IsWon = True";

var queryResult = partnerApi.query(query);

foreach (var item in queryResult.records)
{
  Console.WriteLine(string.Format("Id: {0}, Name: {1}, StageName: {2}, Amount: {3}", item.Any[0].InnerText, item.Any[1].InnerText, item.Any[2].InnerText, item.Any[3].InnerText));
}

Issuing queries is very straightforward once you know what column names to use; that is where retrieving the metadata of an object comes in handy.

6-query-results

(NB: This is data from my developer account – not real!)

Wrap Up

As you can see using the Partner API is very straightforward and exposes everything that you need to get started building data driven applications which access SalesForce.