ASP.NET Core Web API Performance - Throughput for Upload and Download

After working with the new ASP.NET Core server Kestrel and the HttpClient for a while in a number of projects, I ran into some performance issues. Actually, it was a throughput issue.
It took me some time to figure out whether the server or the client was responsible for the problem. And the answer is: both.

Here are some hints to get more out of your web applications and Web APIs.

The code for my test server and client are on GitHub: https://github.com/PawelGerr/AspNetCorePerformance

In the following sections we will download and upload data using different schemes, storages and parameters measuring the throughput.

Download data via HTTP

Nothing special, we download a 20 MB file from the server using the default FileStreamResult:

[HttpGet("Download")]
public IActionResult Download()
{
    return File(new MemoryStream(_bytes), "application/octet-stream");
}

The throughput on my machine is 140 MB/s.
For the next test we are using a CustomFileResult with an increased buffer size of 64 KB and suddenly get a throughput of 200 MB/s.
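
The CustomFileResult used here is part of the test project on GitHub. As a rough illustration of the idea, a result with a configurable copy buffer could look like the following sketch (a simplified stand-in, not the exact implementation from the repository):

public class CustomFileResult : ActionResult
{
    private readonly Stream _stream;
    private readonly string _contentType;
    private readonly int _bufferSize;

    public CustomFileResult(Stream stream, string contentType, int bufferSize = 64 * 1024)
    {
        _stream = stream;
        _contentType = contentType;
        _bufferSize = bufferSize;
    }

    public override async Task ExecuteResultAsync(ActionContext context)
    {
        context.HttpContext.Response.ContentType = _contentType;

        using (_stream)
        {
            // copy to the response body with a larger buffer than the MVC default
            await _stream.CopyToAsync(context.HttpContext.Response.Body, _bufferSize);
        }
    }
}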

Upload multipart/form-data via HTTP

ASP.NET Core introduces a new type, IFormFile, that enables us to receive multipart/form-data without any manual work. For that we create a new model with a property of type IFormFile and use this model as an argument of a Web API method.

public class UploadMultipartModel
{
    public IFormFile File { get; set; }
    public int SomeValue { get; set; }
}

-------------

[HttpPost("UploadMultipartUsingIFormFile")]
public async Task<IActionResult> UploadMultipartUsingIFormFile(UploadMultipartModel model)
{
    var bufferSize = 32 * 1024;
    var totalBytes = await Helpers.ReadStream(model.File.OpenReadStream(), bufferSize);

    return Ok();
}

-------------

public static async Task<int> ReadStream(Stream stream, int bufferSize)
{
    var buffer = new byte[bufferSize];

    int bytesRead;
    int totalBytes = 0;

    do
    {
        bytesRead = await stream.ReadAsync(buffer, 0, bufferSize);
        totalBytes += bytesRead;
    } while (bytesRead > 0);

    return totalBytes;
}

Using IFormFile to transfer 20 MB we get a pretty bad throughput of 30 MB/s. Luckily, there is another means to read the content of a multipart/form-data request: the MultipartReader.
With the new reader we are able to improve the throughput to 350 MB/s.

[HttpPost("UploadMultipartUsingReader")]
public async Task<IActionResult> UploadMultipartUsingReader()
{
    var boundary = GetBoundary(Request.ContentType);
    var reader = new MultipartReader(boundary, Request.Body, 80 * 1024);

    var valuesByKey = new Dictionary<string, string>();
    MultipartSection section;

    while ((section = await reader.ReadNextSectionAsync()) != null)
    {
        var contentDispo = section.GetContentDispositionHeader();

        if (contentDispo.IsFileDisposition())
        {
            var fileSection = section.AsFileSection();
            var bufferSize = 32 * 1024;
            await Helpers.ReadStream(fileSection.FileStream, bufferSize);
        }
        else if (contentDispo.IsFormDisposition())
        {
            var formSection = section.AsFormDataSection();
            var value = await formSection.GetValueAsync();
            valuesByKey.Add(formSection.Name, value);
        }
    }

    return Ok();
}

private static string GetBoundary(string contentType)
{
    if (contentType == null)
        throw new ArgumentNullException(nameof(contentType));

    var elements = contentType.Split(' ');
    var element = elements.First(entry => entry.StartsWith("boundary="));
    var boundary = element.Substring("boundary=".Length);

    boundary = HeaderUtilities.RemoveQuotes(boundary);

    return boundary;
}

Uploading data via HTTPS

In this use case we will upload 20 MB using different storages (memory vs file system) and different schemes (http vs https).

The code for uploading data:

var stream = readFromFs
    ? (Stream) File.OpenRead(filePath)
    : new MemoryStream(bytes);

var bufferSize = 4 * 1024; // default

using (var content = new StreamContent(stream, bufferSize))
{
    using (var response = await client.PostAsync("Upload", content))
    {
        response.EnsureSuccessStatusCode();
    }
}

Here are the throughput numbers:

  • HTTP + Memory: 450 MB/s
  • HTTP + File System: 110 MB/s
  • HTTPS + Memory: 300 MB/s
  • HTTPS + File System: 23 MB/s

Sure, the file system is not as fast as memory, but my SSD is not so slow that it should manage just 23 MB/s ... so let's increase the buffer size instead of using the default value of 4 KB.

  • HTTPS + Memory + 64 KB: 300 MB/s
  • HTTPS + File System + 64 KB: 200 MB/s
  • HTTPS + File System + 128 KB: 250 MB/s

With a bigger buffer size we get huge improvements when reading from slow storage like the file system.

Another hint: Setting the Content-Length on the client yields better overall performance.
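
On the client this boils down to two small changes to the upload snippet from above (a sketch; the explicit Content-Length mainly matters when the framework cannot determine the length itself):

var bufferSize = 64 * 1024; // bigger read buffer instead of the 4 KB default

using (var content = new StreamContent(stream, bufferSize))
{
    // announce the size up front instead of leaving the decision to the framework
    content.Headers.ContentLength = stream.Length;

    using (var response = await client.PostAsync("Upload", content))
    {
        response.EnsureSuccessStatusCode();
    }
}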

Summary

When I started to work on the performance issues my first thought was that Kestrel was to blame because it has not had enough time to mature yet. I even tried to place IIS in front of Kestrel so that IIS is responsible for the HTTPS stuff and Kestrel for the rest. The improvements were not worth mentioning. After adding a bunch of trace logs, measuring time on the client and the server, and switching between schemes and storages, I realized that the (mature) HttpClient was causing issues as well, and that one of the major problems was the default values, such as the buffer size.

 


Entity Framework Core Migrations: Assembly Version Mismatch

If you have switched your .NET Core project from xproj to csproj (MSBuild) and updated the NuGet packages, then you may run into an issue when executing some of the dotnet ef commands.

I got the following error after executing dotnet ef migrations list:

Could not load file or assembly 'Microsoft.EntityFrameworkCore, Version=1.1.0.0, Culture=neutral, PublicKeyToken=adb9793829ddae60' or one of its dependencies. The located assembly's manifest definition does not match the assembly reference. (Exception from HRESULT: 0x80131040)

The problem is that some of my (3rd party) dependencies are using version 1.1.0 and the others version 1.1.1. In a classic .NET 4.6 project we use assembly redirects to solve this kind of problem, and in .NET Core we do the same ...

Just create an app.config file with the following content:

<?xml version="1.0" encoding="utf-8"?>

<configuration>
    <runtime>
        <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
            <dependentAssembly>
                <assemblyIdentity name="Microsoft.EntityFrameworkCore" culture="neutral" publicKeyToken="adb9793829ddae60" />
                <bindingRedirect oldVersion="0.0.0.0-1.1.1.0" newVersion="1.1.1.0" />
            </dependentAssembly>
            <dependentAssembly>
                <assemblyIdentity name="Microsoft.EntityFrameworkCore.Relational" culture="neutral" publicKeyToken="adb9793829ddae60" />
                <bindingRedirect oldVersion="0.0.0.0-1.1.1.0" newVersion="1.1.1.0" />
            </dependentAssembly>
            <dependentAssembly>
                <assemblyIdentity name="Microsoft.Extensions.Logging.Abstractions" culture="neutral" publicKeyToken="adb9793829ddae60" />
                <bindingRedirect oldVersion="0.0.0.0-1.1.1.0" newVersion="1.1.1.0" />
            </dependentAssembly>
        </assemblyBinding>
    </runtime>
</configuration>

If you are still getting errors then make sure you have the following items in your csproj file:

<ItemGroup>
    <PackageReference Include="Microsoft.EntityFrameworkCore.Tools" Version="1.1.0">
        <PrivateAssets>All</PrivateAssets>
    </PackageReference>
    <DotNetCliToolReference Include="Microsoft.EntityFrameworkCore.Tools.DotNet" Version="1.0.0" />
</ItemGroup>

 


Strongly-typed Configuration for .NET Core - with full Dependency Injection support

Configuration is one of the most prominent cornerstones in software systems, especially in distributed systems. And it has been a point of discussion in .NET for quite some time.

In one of our projects we have built a solution that lets different applications in different companies exchange data, even though they are behind firewalls, using the open source Relay Server. To our surprise, one of the features our customer was most amazed about was the library I developed to make configuration easier to handle.

In Thinktecture.Configuration I have taken these ideas, generalized them and added new features.

The basic idea is that .NET developers should be able to deal with configuration data by just using arbitrary classes.

The library consists of 3 main components: IConfigurationLoader, IConfigurationProvider and IConfigurationSelector. The IConfigurationLoader loads data from storage, e.g. JSON from the file system. The IConfigurationProvider uses the IConfigurationSelector to select the correct piece of data, e.g. a JSON property, and to convert this data into the requested configuration.

Architecture

In short, the features of the lib are:

  • The configuration (i.e. the type being used by the application)
    • should be type-safe
    • can be an abstract type (e.g. an interface)
    • doesn't need property setters or other kinds of metadata just to be deserializable
    • can be, and can have properties that are, virtually as complex as needed
    • can be injected via DI (dependency injection)
    • can have dependencies that are injected via DI (i.e. via constructor injection)
    • can be changed by "overrides"
  • The usage of the configuration in a developer's code base should be free of any types from the configuration library
  • Extending a configuration with new properties or changing its structure should be a one-liner

Use Cases

In this post I'm going to show the capabilities of the library by illustrating them with a few examples. In these concrete examples I'm using a JSON file containing the configuration values (with Newtonsoft.Json in the background) and Autofac for DI.

But the library is not limited to these. There are hints at the end of this post if you want to use a different DI framework, load from storage other than the file system, or not use JSON at all.

The first use case is a bit lengthy to explain the basics. The others will just point out specific features.

Want to see the code? Go to Thinktecture.Configuration

NuGet: Install-Package Thinktecture.Configuration.JsonFile.Autofac

1. One file containing one or more configurations

Shown features in this example:

  • one JSON file
  • multiple configurations
  • a configuration doesn't have to be at the root of the JSON file
  • a configuration has dependencies known by DI
  • a configuration gets injected into a component like any other dependency

We start with 2 simple configuration types.

public interface IMyConfiguration
{
    string Value { get; }
}

public interface IOtherConfiguration
{
    TimeSpan Value { get; }  
}

The configuration IMyConfiguration is required by our component MyComponent.

public class MyComponent
{
    public MyComponent(IMyConfiguration config)
    {
    }
}

Configuration file configuration.json:

{
    "My":
    {
        "Config": { "value": "content" }
    },
    "OtherConfig": { "value": "00:00:05" }
}

Now let's set up the code in the executing assembly and configure DI to make MyComponent resolvable along with IMyConfiguration.

var builder = new ContainerBuilder();
builder.RegisterType<MyComponent>().AsSelf();

// IFile is required by JsonFileConfigurationLoader to access the file system
// For more info: https://www.nuget.org/packages/Thinktecture.IO.FileSystem.Abstractions/
builder.RegisterType<FileAdapter>().As<IFile>().SingleInstance();

// register so-called configuration provider that operates on "configuration.json"
builder.RegisterJsonFileConfigurationProvider("./configuration.json");

// register the configuration.
// "My.Config" is the (optional) path into the config JSON structure because our example configuration does not start at the root of the JSON
builder.RegisterJsonFileConfiguration<MyConfiguration>("My.Config")
    .AsImplementedInterfaces() // i.e. as IMyConfiguration
    .SingleInstance(); // The values won't change in my example

// similarly for IOtherConfiguration
builder.RegisterJsonFileConfiguration<OtherConfiguration>("OtherConfig")
    .AsImplementedInterfaces();

var container = builder.Build();

The concrete types MyConfiguration and OtherConfiguration are, as is often the case when working with abstractions, used by DI only. Apart from that, these types won't show up anywhere else. The type MyConfiguration has a dependency on IFile that gets injected during deserialization.

public class MyConfiguration : IMyConfiguration
{
    public string Value { get; set; }

    public MyConfiguration(IFile file)
    {
        ...
    }
}

public class OtherConfiguration : IOtherConfiguration
{
    public TimeSpan Value { get; set; }  
}

The usage is nothing special:

// IMyConfiguration gets injected into MyComponent
var component = container.Resolve<MyComponent>();

// we can resolve IMyConfiguration directly if we want to
var config = container.Resolve<IMyConfiguration>();

2. Nesting

Shown features in this use case:

  • one of the properties of a configuration type is a complex type
  • complex property type can be instantiated by Newtonsoft.Json or DI
  • complex property can be resolved directly if required

In this example IMyConfiguration has a property that is not of a simple type. The concrete types implementing IMyConfiguration and IMyClass consist of property getters and setters only and are thus left out for brevity.

public interface IMyConfiguration
{
    string Value { get; }
    IMyClass MyClassValue { get; }
}

public interface IMyClass
{
    int Value { get; }  
}

The JSON file looks as follows:

{
    "Value": "content",
    "MyClassValue": { "Value": 42 }
}

Having a complex property we can decide whether the type IMyClass is going to be instantiated by Newtonsoft.Json or DI.

With just the following line the type IMyClass is not introduced to the configuration library and is going to be instantiated by Newtonsoft.Json.

builder.RegisterJsonFileConfiguration<MyConfiguration>()
    .AsImplementedInterfaces()
    .SingleInstance();

With the following line we introduce the type to the config lib and DI but the instance of IMyClass cannot be resolved directly.

builder.RegisterJsonFileConfigurationType<MyClass>();

Should IMyClass be resolvable directly, then we can either use the instance created along with IMyConfiguration or let a new instance be created.

// option 1: use the property of IMyConfiguration
builder.Register(context => context.Resolve<IMyConfiguration>().MyClassValue)
    .AsImplementedInterfaces()
    .SingleInstance();

// option 2: let a new instance be created
builder.RegisterJsonFileConfiguration<MyClass>("MyClassValue")
    .AsImplementedInterfaces()
    .SingleInstance();

3. Multiple JSON files

The configurations can be loaded from more than one file.

Configuration types are

public interface IMyConfiguration
{
    string Value { get; }
}

public interface IOtherConfiguration
{
    TimeSpan Value { get; }  
}

File myConfiguration.json

{
    "Value": "content"
}

File otherConfiguration.json

{
    "Value": "00:00:05"
}

Having two files, we need a means to distinguish between them when registering the configurations. In this case we use RegisterKeyedJsonFileConfigurationProvider, which returns a key that will be passed to RegisterJsonFileConfiguration.

var providerKey = builder.RegisterKeyedJsonFileConfigurationProvider("myConfiguration.json");
builder.RegisterJsonFileConfiguration<MyConfiguration>(providerKey)
    .AsImplementedInterfaces()
    .SingleInstance();

var otherKey = builder.RegisterKeyedJsonFileConfigurationProvider("otherConfiguration.json");
builder.RegisterJsonFileConfiguration<OtherConfiguration>(otherKey)
    .AsImplementedInterfaces()
    .SingleInstance();

4. Overrides

A configuration can be assembled from one base configuration and one or more overrides.

In this case we have two config files: one containing the default values of the configuration and the other containing the values to override.

Default values come from baseConfiguration.json

{
    "Value":
    {
        "InnerValue_1": 1,
        "InnerValue_2": 2
    }
}

InnerValue_2 will be changed by the overrides.json

{
    "Value":
    {
        "InnerValue_2": 3
    }
}

The configuration 

public interface IMyConfiguration
{
    IInnerConfiguration Value { get; }  
}

public interface IInnerConfiguration
{
    int InnerValue_1 { get; }
    int InnerValue_2 { get; }
}

To specify overrides we need to provide more than one file path when registering the configuration provider. The overrides are applied in the same order they are passed to RegisterJsonFileConfigurationProvider.

builder.RegisterJsonFileConfigurationProvider("baseConfiguration.json", "overrides.json");
builder.RegisterJsonFileConfiguration<MyConfiguration>()
    .AsImplementedInterfaces()
    .SingleInstance();
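
Resolving the configuration now yields the merged values; a quick sketch of what to expect:

var config = container.Resolve<IMyConfiguration>();

// config.Value.InnerValue_1 == 1   (from baseConfiguration.json)
// config.Value.InnerValue_2 == 3   (overridden by overrides.json)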

5. Extension of the configuration

Let's add a property to IInnerConfiguration from previous paragraph.

public interface IInnerConfiguration
{
    int InnerValue_1 { get; }
    int InnerValue_2 { get; }
    string NewValue { get; }
}

Add the corresponding property to the JSON file baseConfiguration.json

{
    "Value":
    {
        "InnerValue_1": 1,
        "InnerValue_2": 2,
        "NewValue": "content"
    }
}

That's it.

Working with different frameworks, storages and data models

Using another DI framework

To use a DI framework other than Autofac, use the package Thinktecture.Configuration.JsonFile instead of Thinktecture.Configuration.JsonFile.Autofac and implement the interface IJsonTokenConverter using your favorite DI framework. The converter has just one method: TConfiguration Convert<TConfiguration>(JToken token).
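
A minimal converter could look like the following sketch; it uses plain Newtonsoft.Json deserialization and leaves out the DI-specific features (like constructor injection into the configuration types) that the Autofac-based converter provides:

public class PlainJsonTokenConverter : IJsonTokenConverter
{
    public TConfiguration Convert<TConfiguration>(JToken token)
    {
        // hypothetical minimal implementation: no dependency injection,
        // just plain Newtonsoft.Json deserialization of the selected token
        return token == null ? default(TConfiguration) : token.ToObject<TConfiguration>();
    }
}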

Load JSON from other media

To load JToken instances from storage other than the file system, just implement the interface IConfigurationLoader<JToken,JToken>. For example, if the JSON configuration is stored in a database, then inject the database context or a data access layer and select the corresponding data rows.

Use different data models

If you are using a data model other than JSON, then reference the package Thinktecture.Configuration and implement the interfaces IConfigurationLoader<TRawDataIn,TRawDataOut>, IConfigurationProvider<TRawDataIn,TRawDataOut> and IConfigurationSelector<TRawDataIn,TRawDataOut>. It sounds like a lot of work, but if you look into the code of the corresponding JSON-based classes you will see that they are pretty small and trivial.

Some final words...

Although configuration is an important part of software development, it is not the most exciting one. Therefore, a software developer may be inclined to take shortcuts and work with hardcoded values. Thinktecture.Configuration gives you the means to work with plain .NET types without thinking too much about how to load and parse the values. This saves time and improves the reusability of the components and the software architecture.


ASP.NET Core with IIS: Setup Issues

If you are planning to run an ASP.NET Core application with IIS, then this blog post might be worth a glance.

These are a few issues I ran into ...

1. Targets in .xproj-file

If the project was started with RC1 or an earlier version of .NET Core, then check for the correct targets. Open the .xproj file and search for the following line

<Import Project="$(VSToolsPath)\DotNet\Microsoft.DotNet.targets" 
        Condition="'$(VSToolsPath)' != ''" />

and replace it with

<Import Project="$(VSToolsPath)\DotNet.Web\Microsoft.DotNet.Web.targets" 
        Condition="'$(VSToolsPath)' != ''" />

2. The process path in web.config

If you get a 502 after starting the web application, then take a look into the Windows Event Viewer. One of the errors you will probably see is:

Application 'MACHINE/WEBROOT/APPHOST/YOUR-APP with physical root 'C:\webapp\publish\' created process with commandline '"dotnet" WebApp.Selfhost.dll' but either crashed or did not reponse or did not listen on the given port '28236', ErrorCode = '0x800705b4'

This error means that IIS is unable to start your app using the command dotnet. To remedy this issue open web.config and change the processPath from dotnet to C:\Program Files\dotnet\dotnet.exe.

<?xml version="1.0" encoding="utf-8"?>
<configuration>
    <system.webServer>
        <handlers>
            <add name="aspNetCore" path="*" verb="*" modules="AspNetCoreModule" resourceType="Unspecified" />
        </handlers>
        <aspNetCore processPath="C:\Program Files\dotnet\dotnet.exe"
            arguments=".\WebApp.Selfhost.dll"
            stdoutLogEnabled="false"
            stdoutLogFile=".\logs\stdout"
            forwardWindowsAuthToken="false" />
    </system.webServer>
</configuration>

3. When to call UseIISIntegration

If you are still getting a 502, then a possible cause may be that your application is listening on a different port than expected. This can happen if one of your configuration keys is Port. In that case your web application is listening on this port instead of the dynamically generated one.

The configuration of the WebHostBuilder causing the error can look as follows:

var hostBuilder = new WebHostBuilder()
    .UseConfiguration(myConfig) // inserts config with key "Port"
    .UseIISIntegration()    // uses previously inserted port "by mistake"
    .UseKestrel()
    .UseStartup<Startup>();

To cure that, just change the order of the calls, because as of .NET Core 1.1 the listening URL will no longer be overwritten when running with IIS.

var hostBuilder = new WebHostBuilder()
    .UseIISIntegration()
    .UseConfiguration(myConfig)
    .UseKestrel()
    .UseStartup<Startup>();

 


(ASP).NET Core Dependency Injection: Disposing

After several years of using the same Dependency Injection (DI) framework, like Autofac, you may have a good understanding of how your components implementing the interface IDisposable are going to be disposed.

With the NuGet package Microsoft.Extensions.DependencyInjection the new .NET Core framework brings its own DI framework. It is not as powerful as the others, but it is sufficient for simple constructor injection. Nonetheless, even if you don't need the advanced features, you have to be aware of how the components are destroyed by this framework.

Let's look at a concrete example. Given are 2 classes, a ParentClass and a ChildClass:

public class ParentClass : IDisposable
{
	public ParentClass(ChildClass child)
	{
		Console.WriteLine("Parent created.");
	}

	public void Dispose()
	{
		Console.WriteLine("Parent disposed.");
	}
}

public class ChildClass : IDisposable
{
	public ChildClass()
	{
		Console.WriteLine("Child created");
	}

	public void Dispose()
	{
		Console.WriteLine("Child disposed.");
	}
}

At first we are using Autofac to resolve ParentClass:

var builder = new ContainerBuilder();
builder.RegisterType<ParentClass>().AsSelf();
builder.RegisterType<ChildClass>().AsSelf();
var container = builder.Build();

Console.WriteLine("== Autofac ==");
var parent = container.Resolve<ParentClass>();

container.Dispose();

With Autofac we are getting the following output:

== Autofac ==
Child created
Parent created.
Parent disposed.
Child disposed.

And now we are using .NET Core DI:

var services = new ServiceCollection();
services.AddTransient<ParentClass>();
services.AddTransient<ChildClass>();
var provider = services.BuildServiceProvider();

Console.WriteLine("== .NET Core ==");
var parent = provider.GetRequiredService<ParentClass>();

((IDisposable) provider).Dispose();

The output we get is:

== .NET Core ==
Child created
Parent created.
Child disposed.
Parent disposed.

Comparing the outputs we see that Autofac destroys the outer component (i.e. ParentClass) first and then the inner component (i.e. ChildClass). The .NET Core DI does not honor the dependency hierarchy and destroys the components in the same order they were created.

Most of the time the behavior of the .NET Core DI is not a problem because the components just free their internal resources and are done. But in some cases the outer component has to do something during disposal, like unregistering from the inner component. If the inner component has not been disposed yet, all works fine; if it has already been disposed, we get an ObjectDisposedException.
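
A contrived sketch of such a case (hypothetical classes, not the ParentClass/ChildClass from above): the outer component has to deregister itself from the inner one when it is disposed.

public class InnerComponent : IDisposable
{
	private bool _isDisposed;

	public void Unregister(object listener)
	{
		if (_isDisposed)
			throw new ObjectDisposedException(nameof(InnerComponent));

		// remove the listener ...
	}

	public void Dispose()
	{
		_isDisposed = true;
	}
}

public class OuterComponent : IDisposable
{
	private readonly InnerComponent _inner;

	public OuterComponent(InnerComponent inner)
	{
		_inner = inner;
	}

	public void Dispose()
	{
		// Autofac disposes the inner component after the outer one, so this call succeeds.
		// The .NET Core DI disposes the inner component first, so this call throws.
		_inner.Unregister(this);
	}
}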

If you start a new project with .NET Core, I suggest staying with the DI framework you are familiar with, unless it is just a sample application.

PS: Further information on how to switch from the .NET Core DI to other frameworks in an ASP.NET Core application: Replacing the default services container and ASP.NET Core with Autofac
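
For reference, a rough sketch of how Autofac can be plugged into an ASP.NET Core application using the Autofac.Extensions.DependencyInjection package (see the linked articles for the details):

public IServiceProvider ConfigureServices(IServiceCollection services)
{
	services.AddMvc();

	var builder = new ContainerBuilder();
	builder.Populate(services); // take over the registrations made on IServiceCollection
	builder.RegisterType<ChildClass>().AsSelf();
	builder.RegisterType<ParentClass>().AsSelf();

	var container = builder.Build();

	// returning an IServiceProvider makes ASP.NET Core resolve everything from Autofac
	return new AutofacServiceProvider(container);
}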


.NET Abstractions - It's not just about testing!

With the introduction of .NET Core we got a framework that works not just on Windows, but on Linux and macOS as well. One of the best parts of .NET Core is that the APIs stayed almost the same compared to the old .NET, meaning developers can use their .NET skills to build cross-platform applications. The bad part is that the static types and classes without abstractions are still there as well.

A missing abstraction like an interface or an abstract base class means that the developers are unable to change the behavior of their own components by injecting new implementations into them - and with static types it is even worse, you can't inject them at all. Luckily, most of the time we don't have to and don't want to change all the behaviors of all components we use unless we want to unit test a component. To be able to unit test one, and only one, component we have to provide it with dependencies that are completely under our control. An abstraction serves this purpose.

More and more of our customers demand unit tests, and some of them are using .NET Core to be able to run their applications on Windows and Linux. Unfortunately, there are no abstractions available that support .NET Core, or they don't follow the design decisions I would like to work with.

Inspired by SystemWrapper and System.IO.Abstractions I decided to create Thinktecture.Abstractions with certain opinionated design decisions in mind.

Design decisions

Interfaces vs abstract classes

Both interfaces and abstract classes have pros and cons when it comes to creating an abstraction. By implementing an interface, we can be sure that there is no code running besides ours. Furthermore, a class can implement more than one interface. With base classes we don't have that much flexibility, but we are able to define members with different visibility and can implement implicit/explicit cast operators.

For Thinktecture.Abstractions I chose interfaces because of the flexibility and transparency. For example, if I started using base classes I could be inclined to use internal members, preventing others from having access to some of the code. This approach would ruin the whole purpose of the project. Here is another example: imagine we are implementing a new stream. Because we are using an interface, the new stream can be both a Stream and an IStream. That means we don't even need to convert this stream back and forth when working with it. This would be impossible with a base class.

Example:

public class MyStream : Stream, IStream
{
    ...
}

Same signature

The abstractions have the same signatures as the .NET types. The return type, which by definition is not part of the signature, is always an abstraction.

Example:

public interface IStringBuilder
{
    ...
    IStringBuilder Append(bool value);
}

Additionally, methods with concrete types as arguments have overloads using abstractions; otherwise the developer would be forced to make an unnecessary conversion just to pass a variable to the method.

Example:

public interface IMemoryStream : IStream
{
    ...

    void WriteTo(IStream stream);
    void WriteTo(Stream stream);
}

Don't use reserved namespaces

The namespaces System.* and Microsoft.* should not be used to prevent collisions with types from the .NET team.

Conversion to abstraction

Conversion must not change the behavior or raise any exceptions. Using an extension method, we are able to convert a type without raising a NullReferenceException even if the .NET instance is null. For easy usage, the extension methods for all types are in the namespace Thinktecture.

Example:

Stream stream = null;
IStream streamAbstraction = stream.ToInterface(); // streamAbstraction is null

Conversion back to .NET type

The abstractions offer a method to get the underlying .NET type back so it can be used with other .NET classes and 3rd party components. The conversion must not raise any errors.

Example:

IStream streamAbstraction = ...
Stream stream = streamAbstraction.ToImplementation();

some3rdPartyComponent.Do(stream);

Support for .NET Standard Library (.NET Core)

The abstractions should not just support the traditional full-blown frameworks like .NET 4.5 and 4.6 but .NET Standard Library (.NET Core) as well.

Structure mirroring

The assemblies with abstractions are as small as the underlying .NET assemblies, i.e. Thinktecture.IO.Abstractions contains interfaces for types from System.IO only. Otherwise the abstractions would impose many more dependencies than the application actually needs.

The .NET Standard version supported by an abstraction assembly is equal to the version supported by the underlying .NET assembly, e.g. Thinktecture.IO.Abstractions and System.IO both support .NET Standard 1.0.

Inheritance mirroring

The inheritance hierarchy of the interfaces is the same as that of the concrete types. For example, DirectoryInfo derives from FileSystemInfo, and likewise the interface IDirectoryInfo extends IFileSystemInfo.
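
A small sketch of what this mirroring looks like; the members shown here are illustrative only, the real interfaces mirror the full surface of FileSystemInfo and DirectoryInfo:

public interface IFileSystemInfo
{
    string FullName { get; }
    bool Exists { get; }
}

public interface IDirectoryInfo : IFileSystemInfo
{
    IEnumerable<IDirectoryInfo> EnumerateDirectories();
}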

Adapters (Wrappers)

The adapters are classes that make .NET types compatible with the abstractions. Usually there is no need to use them directly, besides setting up dependency injection in the composition root. The adapters are shipped with the abstractions, i.e. Thinktecture.IO.Abstractions contains both IStream and StreamAdapter. Moving the adapters into their own assembly could be considered cleaner but would not be pragmatic, because the extension method ToInterface() uses the adapter and it is virtually impossible to write components without the need to convert a .NET type to an abstraction.

Example:

// using the adapter directly
Stream stream = ...;
IStream streamAbstraction = new StreamAdapter(stream);

// preferred way
IStream streamAbstraction = stream.ToInterface();

No change in behavior

The adapters must not change the behavior of the invoked method or property nor raise any exception unless this exception is coming from the underlying .NET type.

Static members and constructor overloads

For easier use of adapters, they should provide the same static members and constructor overloads as the underlying type.

Example:

public class StreamAdapter : IStream
{
    public static readonly IStream Null;
    ...
}

public class FileStreamAdapter : IFileStream
{
    public FileStreamAdapter(string path, FileMode mode) { ... }
    public FileStreamAdapter(FileStream fileStream) { ... }
    ...
}

Override methods of Object

The methods Equals, GetHashCode and ToString should be overridden and the calls delegated to the underlying .NET type. These methods are often used for comparison in collections like Dictionary<TKey, TValue>; otherwise the adapter would change (or rather break) the behavior.
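
A sketch of this delegation for a stream adapter (the remaining IStream members are omitted for brevity):

public class StreamAdapter // : IStream, remaining members omitted
{
    private readonly Stream _implementation;

    public StreamAdapter(Stream implementation)
    {
        if (implementation == null)
            throw new ArgumentNullException(nameof(implementation));

        _implementation = implementation;
    }

    public override bool Equals(object obj)
    {
        return _implementation.Equals(obj);
    }

    public override int GetHashCode()
    {
        return _implementation.GetHashCode();
    }

    public override string ToString()
    {
        return _implementation.ToString();
    }
}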

Missing parts (?)

Factories, Builders

The Thinktecture.Abstractions assemblies are designed to be as lean as possible without introducing new components. Factories and builders can (and should) be built on top of these abstractions.
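
For example, a hypothetical factory built on top of the abstractions that hides the concrete adapter from the consuming code:

public interface IFileStreamFactory
{
    IFileStream Open(string path, FileMode mode);
}

public class FileStreamFactory : IFileStreamFactory
{
    public IFileStream Open(string path, FileMode mode)
    {
        return new FileStreamAdapter(path, mode);
    }
}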

Mocks

There is no need for me to provide any mocks because there are very powerful libraries like Moq that can be used for testing.
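
For example, a small sketch using Moq and the IFile abstraction mentioned earlier (assuming it mirrors System.IO.File and therefore has an Exists method):

// arrange a fake file system for the component under test
var file = new Mock<IFile>();
file.Setup(f => f.Exists("configuration.json")).Returns(true);

// pass file.Object to the component that depends on IFile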

Enhancements

In the near future there will be further abstractions, e.g. for HttpClient, and components that are built on top of the abstractions and offer an improved API or behavior.

Summary

Working with abstractions gives us the possibility to decide which implementations should be used in our applications. Furthermore, it is easier (or possible in the first place - think of static classes) to provide and use new implementations, compose them and derive from them. When it comes to testing, we could do without abstractions, but then we would test more than just one component, leading to more complex tests that are rather integration tests than unit tests. Integration tests are slower and more difficult to set up because they may need access to the file system, the network or a database. Another (unnecessary) challenge would be to isolate the integration tests from each other because, in general, they run in parallel.


Entity Framework: Prevent redundant JOINs - watch your LINQ !

Fetching one record from a collection using navigational properties in Entity Framework may lead to unnecessary JOINs. To show the problem we need two tables: Products and Prices.

EF Blog - Redundant Joins - DB Schema

The query shown below is fetching products along with their first price.

var products = ctx.Products
      .Select(p => new
      {
          p.Name,
          FirstPriceStartdate = p.Prices.OrderBy(price => price.Startdate).FirstOrDefault().Startdate,
          FirstPriceValue = p.Prices.OrderBy(price => price.Startdate).FirstOrDefault().Value,
      })
      .ToList();

Looks simple.
Let's look at the SQL statement, or rather the execution plan.

EF Blog - Redundant Joins - Before Subselect

The table Prices is JOINed twice because of the two occurrences of the expression "p.Prices.OrderBy(...).FirstOrDefault()". Entity Framework doesn't recognize that these expressions are identical, but we can help it. Just use a sub-select.

var products = ctx.Products
       .Select(p => new
       {
           Product = p,
           FirstPrice = p.Prices.OrderBy(price => price.Startdate).FirstOrDefault()
       })
      .Select(p => new
      {
          p.Product.Name,
          FirstPriceStartdate = p.FirstPrice.Startdate,
          FirstPriceValue = p.FirstPrice.Value,
      })
      .ToList();

That's it, the table Prices is JOINed only once now.

EF Blog - Redundant Joins - After Subselect

In a complex query you may need multiple sub-selects to select a navigational property of another navigational property. In that case please write an email to your colleagues or add a comment so the other developers understand what's going on, otherwise your funny looking query will be refactored pretty soon :)


Entity Framework: High performance querying trick using SqlBulkCopy and temp tables

Implementing database access with Entity Framework is pretty convenient, but sometimes the query performance can be very poor. Especially using navigational properties to load collections leads to significantly longer execution times and more I/O. To see the impact of the loading of a collection we have to take a look into profiling tools like SQL Server Profiler.

Let's look at the following use case, which was extrapolated from a customer project. We have three tables: Products, Suppliers and Prices, the last one containing an entire price history.

Blog - EF - Using SqlBulkCopy and temp tables - DB

We want to select all products with their suppliers and future prices according to a filter criteria. The easiest approach is to use the navigational properties.

using (var ctx = new Entities())
{
    var products = ctx.Products
        .Where(p => p.Name.Contains("chocolate"))
        .Select(p => new FoundProduct()
        {
            Id = p.Id,
            Name = p.Name,
            FuturePrices = p.Prices
                .Where(price => price.Startdate > DateTime.Today),
            Suppliers = p.Suppliers
                .Where(s => s.IsDeliverable)
        })
        .ToList();
}

For this simple looking query, depending on the complexity of the data and the amount of data in the database, the execution can take a while. There are multiple reasons why the database won't like this query. Entity Framework has to make huge JOINs, concatenations and sortings to fetch the products, prices and suppliers at once, thus the result set is much bigger than when fetching the collections separately. Furthermore, it is more difficult to find optimal indexes because of the JOINs, the confusing execution plan and the suboptimal SQL statements Entity Framework has to generate to fulfill our demands.

If you have been using EF you may be wondering why you haven't had this problem before. The answer is that you didn't notice it because the tables or the result sets have been small. Just assume an unoptimized query takes 200 ms and an optimized one 20 ms. Although one query is 10 times faster than the other, both response times are considered 'fast' - and this often leads to the assumption that the query is perfect. In reality, though, the database needs much more resources to perform the unoptimized query. That doesn't mean we have to change all our EF queries that use navigational properties - be selective. Use profiling tools to decide which queries should be tuned and which not.

Let's look at the execution plan of the query from above to get an idea which operator consumes the most resources. Half of the resources are needed for sorting the data although we don't have any order-by clause in our query! The problem is that the data must have a special sort order so that Entity Framework is able to process (materialize) the SQL result correctly.

Blog - EF - Using SqlBulkCopy and temp tables - Execution Plan

So, let's assume the result set is pretty big, the query takes too long and the profiling tool shows hundreds of thousands of reads needed to get our products.
The first approach would be to split the query: first we load the products, then the suppliers and the prices.

using(var ctx = new Entities())
{
    var products = ctx.Products
        .Where(p => p.Name.Contains("chocolate"))
        .Select(p => new FoundProduct()
        {
            Id = p.Id,
            Name = p.Name
        })
        .ToList();

    var productIds = products.Select(p => p.Id);

    var futurePricesLookup = ctx.Prices
        .Where(p => productIds.Contains(p.ProductId))
        .Where(p => p.Startdate > DateTime.Today)
        .ToLookup(p => p.ProductId);

    var suppliersLookup = ctx.Suppliers
        .Where(s => productIds.Contains(s.ProductId))
        .Where(s => s.IsDeliverable)
        .ToLookup(p => p.ProductId);

    foreach(var product in products)
    {
        product.FuturePrices = futurePricesLookup[product.Id];
        product.Suppliers = suppliersLookup[product.Id];
    }   
}

Now we are going to the database 3 times, but the result sets are a lot smaller, easier to profile and easier to find optimal indexes for. In a project of one of our customers the reads went from 500k down to 2k and the duration from 3 sec to 200 ms just by splitting the query.

For comparison using our simplified example with 100 products and 10k prices:

  • The original query needs 300 ms and has 240 reads
  • The split queries need (1 + 14 + 1) = 16 ms and have (2 + 115 + 4) = 121 reads

 

This approach performs very well as long as the number of product IDs we use in the Where statement stays small, say < 50. But that isn't always the case.
Especially when implementing a data exporter we have to be able to handle thousands of IDs. Using that many parameters will slow down the query significantly. But what if we could insert all product IDs into a temporary table using SqlBulkCopy? With bulk copy there is almost no difference whether there are 100 IDs to insert or 10k. At first we want to create a few classes and methods to be able to bulk insert IDs of type Guid using just a few lines of code. The usage will look like this:

private const string TempTableName = "#TempTable";

using(var ctx = new Entities())
{
    // fetch products and the productIds

    RecreateTempTable(ctx);
    BulkInsert(ctx, null, TempTableName, () => new TempTableDataReader(productIds));

    // here come the queries for prices and suppliers
}

Before copying the IDs we need to create a temp table.

private void RecreateTempTable(Entities ctx)
{
    ctx.Database.ExecuteSqlCommand($@"
        IF(OBJECT_ID('tempdb..{TempTableName}') IS NOT NULL)
            DROP TABLE {TempTableName};

        CREATE TABLE {TempTableName}
        (
            Id UNIQUEIDENTIFIER NOT NULL PRIMARY KEY CLUSTERED
        );
    ");
}

The bulk insert is encapsulated in a generic method to be able to use it with all kinds of data. The class BulkInsertDataReader<T> is a base class of mine that makes it easy to implement the interface IDataReader. The class can be found on GitHub: BulkInsertDataReader.cs

private void BulkInsert<T>(Entities ctx, DbContextTransaction tx, 
    string tableName, Func<BulkInsertDataReader<T>> getDatareader)
{
    SqlConnection sqlCon = (SqlConnection)ctx.Database.Connection;
    SqlTransaction sqlTx = (SqlTransaction)tx?.UnderlyingTransaction;

    using (SqlBulkCopy bulkCopy = new SqlBulkCopy(sqlCon, 
        SqlBulkCopyOptions.Default, sqlTx))
    {
        bulkCopy.DestinationTableName = tableName;
        bulkCopy.BulkCopyTimeout = (int)TimeSpan.FromMinutes(10).TotalSeconds;

        using (var reader = getDatareader())
        {
            foreach (var mapping in reader.GetColumnMappings())
            {
                bulkCopy.ColumnMappings.Add(mapping);
            }

            bulkCopy.WriteToServer(reader);
        }
    }
}

Using the generic BulkInsertDataReader we implement a data reader for inserting Guids.

public class TempTableDataReader : BulkInsertDataReader<Guid>
{
    private static readonly IReadOnlyCollection<SqlBulkCopyColumnMapping> _columnMappings;

    static TempTableDataReader()
    {
        _columnMappings = new List<SqlBulkCopyColumnMapping>()
        {
            new SqlBulkCopyColumnMapping(1, "Id"),
        };
    }

    public TempTableDataReader(IEnumerable<Guid> guids)
        : base(_columnMappings, guids)
    {
    }

    public override object GetValue(int i)
    {
        switch (i)
        {
            case 1:
                return Current;
            default:
                throw new ArgumentOutOfRangeException("Unknown index: " + i);
        }
    }
}

Now we have all IDs in a temporary table. Let’s rewrite the query from above to use JOINs instead of the method Contains.

using(var ctx = new Entities())
{
    // fetch products and the productIds
    // create temp table and insert the ids into it

    var futurePricesLookup = ctx.Prices
        .Join(ctx.TempTable, p => p.ProductId, t => t.Id, (p, t) => p)
        .Where(p => p.Startdate > DateTime.Today)
        .ToLookup(p => p.ProductId);

    var suppliersLookup = ctx.Suppliers
        .Join(ctx.TempTable, s => s.ProductId, t => t.Id, (s, t) => s)
        .Where(s => s.IsDeliverable)
        .ToLookup(p => p.ProductId);

    // set properties FuturePrices and Suppliers like before
}

Here the question comes up: where does the entity set TempTable come from when you are using the database-first approach? The answer is that we need to edit the edmx file manually to introduce the temp table to Entity Framework. For that, open the edmx file in an XML editor and copy the EntityContainer, EntityType and EntityContainerMapping elements to the right places as shown below.

Remark: Entity Framework supports a so-called DefiningQuery, which we use to define the temp table, but the EF designer of Visual Studio doesn't support this feature. The consequence is that some of the sections we define manually will be deleted after an update of the EF model. In this case we need to revert these changes.

<edmx:Edmx Version="3.0">
    <edmx:Runtime>
        <!-- SSDL content -->
        <edmx:StorageModels>
            <Schema Namespace="Model.Store" Provider="System.Data.SqlClient">
                <EntityContainer Name="ModelStoreContainer">
                    <EntitySet Name="TempTable" EntityType="Self.TempTable">
                        <DefiningQuery>
                            SELECT #TempTable.Id
                            FROM #TempTable
                        </DefiningQuery>
                    </EntitySet>
                </EntityContainer>
                <EntityType Name="TempTable">
                    <Key>
                        <PropertyRef Name="Id" />
                    </Key>
                    <Property Name="Id" Type="uniqueidentifier" Nullable="false" />
                </EntityType>
            </Schema>
        </edmx:StorageModels>
        <!-- CSDL content -->
        <edmx:ConceptualModels>
            <Schema Namespace="Model" Alias="Self">
                <EntityContainer Name="Entities" annotation:LazyLoadingEnabled="true">
                    <EntitySet Name="TempTable" EntityType="Model.TempTable" />
                </EntityContainer>
                <EntityType Name="TempTabled">
                    <Key>
                        <PropertyRef Name="Id" />
                    </Key>
                    <Property Name="Id" Type="Guid" Nullable="false" />
                </EntityType>
            </Schema>
        </edmx:ConceptualModels>
        <!-- C-S mapping content -->
        <edmx:Mappings>
            <Mapping Space="C-S">
                <EntityContainerMapping StorageEntityContainer="ModelStoreContainer" CdmEntityContainer="Entities">
                    <EntitySetMapping Name="TempTable">
                        <EntityTypeMapping TypeName="Model.TempTable">
                            <MappingFragment StoreEntitySet="TempTable">
                                <ScalarProperty Name="Id" ColumnName="Id" />
                            </MappingFragment>
                        </EntityTypeMapping>
                    </EntitySetMapping>
                </EntityContainerMapping>
            </Mapping>
        </edmx:Mappings>
    </edmx:Runtime>
</edmx:Edmx>

That’s it. Now we are able to copy thousands of records into a temp table very fast and use this data for JOINs.


Mimicking $interpolate: An Angular 2 interpolation service

In an Angular 1 application we created for one of our customers, we used the $interpolate service to build a simple templating engine. The user was able to create snippets with placeholders within the web application and use these message fragments to compose an email to reply to a support request.

In Angular 2 there is no such service as $interpolate - but that is not a problem because we have abstract syntax tree (AST) parsers to build our own interpolation library. Let's build a component that takes a format string (with placeholders) and an object whose properties are used to replace the placeholders. The usage looks like this:

// returns 'Hello World!'
interpolation.interpolate('Hello {{place.holder}}!', { place: { holder: 'World' } });

First we need to inject the parser from Angular 2, and we need to create a lookup to cache our interpolations.

constructor(parser: Parser) {
    this._parser = parser;
    this._textInterpolations = new Map<string, TextInterpolation>();
}

The class TextInterpolation is just a container for the parts of a format string. To get the interpolated string we need to call the method interpolate. The example from above has 2 parts:

  • String 'Hello '
  • Property getter for {{place.holder}}

 

class TextInterpolation {
    private _interpolationFunctions: ((ctx: any)=>any)[];

    constructor(parts: ((ctx: any) => any)[]) {
        this._interpolationFunctions = parts;
    }

    public interpolate(ctx: any): string {
        return this._interpolationFunctions.map(f => f(ctx)).join('');
    }
}

Before we can create our TextInterpolation we need to parse the format string to get an AST.

let ast = this._parser.parseInterpolation(text, null);

if (!ast) {
    return null;
}

if (ast.ast instanceof Interpolation) {
    textInterpolation = this.buildTextInterpolation( ast.ast);
} else {
    throw new Error(`The provided text is not a valid interpolation. Provided type ${ast.ast.constructor && ast.ast.constructor['name']}`);
}

The AST of type Interpolation has 2 collections: one with strings and the other with expressions. Our interpolation service supports property accessors only, i.e. no method calls or other operators.

private buildTextInterpolation(interpolation: Interpolation): TextInterpolation {
    let parts: ((ctx: any) => any)[] = [];

    for (let i = 0; i < interpolation.strings.length; i++) {
        let string = interpolation.strings[i];

        if (string.length > 0) {
            parts.push(ctx => string);
        }

        if (i < interpolation.expressions.length) {
            let exp = interpolation.expressions[i];

            if (exp instanceof PropertyRead) {
                var getter = this.buildPropertyGetter(exp);
                parts.push(this.addValueFormatter(getter));
            } else {
                throw new Error(`Expression of type ${exp.constructor && exp.constructor.name} is not supported.`);
            }
        }
    }

    return new TextInterpolation(parts);
};

The strings don't need any special handling but the property getters do. The first part of the special handling happens in the method buildPropertyGetter, which fetches the value of the property (and of sub-properties) of an object.

private buildPropertyGetter(exp: PropertyRead): ((ctx: any) => any) {
    var getter: ((ctx: any) => any);

    if (exp.receiver instanceof PropertyRead) {
        getter = this.buildPropertyGetter(exp.receiver);
    } else if (!(exp.receiver instanceof ImplicitReceiver)) {
        throw new Error(`Expression of type ${exp.receiver.constructor && (exp.receiver).constructor.name} is not supported.`);
    }

    if (getter) {
        let innerGetter = getter;
        getter = ctx => {
            ctx = innerGetter(ctx);
            return ctx && exp.getter(ctx);
        };
    } else {
        getter = <(ctx: any)=>any>exp.getter;
    }

    return ctx => ctx && getter(ctx);
}

The second part of the special handling is done in addValueFormatter, which returns an empty string when the value returned by the property getter is null or undefined, because these values would otherwise not be formatted as an empty string but as the strings 'null' and 'undefined', respectively.

private addValueFormatter(getter: ((ctx: any) => any)): ((ctx: any) => any) {
    return ctx => {
        var value = getter(ctx);

        if (value === null || _.isUndefined(value)) {
            value = '';
        }

        return value;
    }
}

The interpolation service including unit tests can be found on GitHub: angular2-interpolation


.NET Core: Lowering the log level of 3rd party components

With the new .NET Core framework and libraries we have got an interface called Microsoft.Extensions.Logging.ILogger to be used for writing log messages. Various 3rd party and built-in components make very good use of it. To see how much is being logged just create a simple Web API using Entity Framework (EF) and the Kestrel server and in a few minutes you will get thousands of log messages.

The downside of such a well-known interface is that the log level chosen by the 3rd party developers may not fit the software using it. For example, Entity Framework uses the log level Information for logging the generated SQL queries. For the EF developers this is a good choice because the SQL query is an important piece of information for them - but for our customers using EF this information is for debugging purposes only.

Luckily, it is very easy to change the log level of a specific logging source (EF, Kestrel etc.). For that we need a simple proxy that implements the interface ILogger. The proxy changes the log level to Debug in the methods Log and IsEnabled and calls the corresponding method of the real logger with the new parameters.

public class LoggerProxy : ILogger
{
	private readonly ILogger _logger;

	public LoggerProxy(ILogger logger)
	{
		if (logger == null)
			throw new ArgumentNullException(nameof(logger));

		_logger = logger;
	}

	public void Log(LogLevel logLevel, int eventId, object state, 
		Exception exception, Func<object, Exception, string> formatter)
	{
		if (logLevel > LogLevel.Debug)
			logLevel = LogLevel.Debug;

		_logger.Log(logLevel, eventId, state, exception, formatter);
	}

	public bool IsEnabled(LogLevel logLevel)
	{
		if (logLevel > LogLevel.Debug)
			logLevel = LogLevel.Debug;

		return _logger.IsEnabled(logLevel);
	}

	public IDisposable BeginScopeImpl(object state)
	{
		return _logger.BeginScopeImpl(state);
	}
}

To inject the LoggerProxy we have to create another proxy that implements the interface Microsoft.Extensions.Logging.ILoggerFactory. The method we are interested in is CreateLogger, which gets a category name as a parameter. The category name may be the name of the class requesting the logger or the name of the assembly. In this method we let the real logger factory create a logger and, if the logger is for Entity Framework, we return our LoggerProxy wrapping the real logger.

public class LoggerFactoryProxy : ILoggerFactory
{
	private readonly ILoggerFactory _loggerFactory;
	
	public LogLevel MinimumLevel
	{
		get { return _loggerFactory.MinimumLevel; }
		set { _loggerFactory.MinimumLevel = value; }
	}

	public LoggerFactoryProxy(ILoggerFactory loggerFactory)
	{
		if (loggerFactory == null)
			throw new ArgumentNullException(nameof(loggerFactory));

		_loggerFactory = loggerFactory;
        }

	public ILogger CreateLogger(string categoryName)
	{
		var logger = _loggerFactory.CreateLogger(categoryName);

		if (categoryName.StartsWith("Microsoft.Data.Entity.", StringComparison.OrdinalIgnoreCase))
			logger = new LoggerProxy(logger);

		return logger;
        }

	public void AddProvider(ILoggerProvider provider)
	{
		_loggerFactory.AddProvider(provider);
	}

	public void Dispose()
        {
		_loggerFactory.Dispose();
	}
}

Finally, we need to register the factory proxy with the dependency injection container.

public void ConfigureServices(IServiceCollection services)
{
	var factory = new LoggerFactoryProxy(new LoggerFactory());
	services.AddInstance(factory);
}

From now on, the log messages coming from Entity Framework will be logged with the log level Debug.