Why Build Procedural Web Services?

#programming

I've been working in web development for a long time. I started in PHP back when you would use it to actually pre-process HTML. I moved (as many did) from PHP to a more structured backend language. In my case that language was C#, specifically WebForms. Quite a while later, as dotnet evolved into core, we got things like built-in dependency injection, which has (at this point) solidified the design patterns used when developing C# web APIs.

You Probably Aren't Using OOP

During my thirteen-year career building enterprise and business software, a large chunk of it in C#, I don't think I ever once actually wrote object oriented code in the object oriented language. It seems strange to think about, but let me describe what I mean.

In general we tend to build real-time web services using a design like this:

interface layer -> business logic layer -> data/service layer

To get from one layer to another you use dependency injection. This layer separation (which can sometimes be taken even farther) exists to make it easier to unit test your application. If you operate in a system that looks like this, you are not writing object oriented code. The point of object oriented code is to derive logic and functionality through inheritance. If you create an ObjectLogic class that holds the generic methods to operate on a whole family of objects, and your interface layer doesn't need to know or care about the specifics of any single object since it operates on the parent, then sure, you can call yourself object oriented.

The vast, vast majority of the time that isn't the case. Your routing layer is normally context aware of the logic layer and of where to direct requests into that layer. Ironically, I could call what's being done in go and rust much more object oriented: when you derive functionality from structs, you literally can't see the original objects behind the traits or interfaces. They just can't really be called object oriented languages, since they can't inherit from objects, only from shapes.
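To make that concrete, here's a minimal go sketch (the types here are hypothetical, not from any project mentioned in this post). The consuming function sees only the interface shape; the concrete struct behind it is invisible:

type Store interface {
	Save(name string) error
}

// SaveAll derives its behavior purely from the Store shape. It can't
// see or name the concrete type behind it (postgres, in-memory, whatever).
func SaveAll(s Store, names []string) error {
	for _, n := range names {
		if err := s.Save(n); err != nil {
			return err
		}
	}
	return nil
}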

Now, that doesn't mean there aren't object oriented patterns in C#. ODBC and ADO.NET are truly object oriented, as you generally don't get to see the specifics, only the abstractions. It's just that most people aren't using those patterns to build applications.

Let's look at an example of what this can look like. Say your service handles operations against movies. You expose a few basic endpoints: search, get single, create/update, delete. Your interface layer accepts requests, assigns them to the appropriate routes based on path and method, parses the request for each route, and sends it to the movies logic class. That class would be injected as a transient into the interface layer, meaning each request gets a new instance of the logic class.

The logic class would contain all the logic needed for the CRUD actions on movies. Maybe it does some business logic like authorization validation for writes, maybe even for reads. Once that's done it would send the request down to the data layer through another transient. The data layer would likely be generalized for the data source. If there's enough business logic to require more than one data layer, that logic should go in your logic layer, even if it needs to be its own class.

Inside the data layer you would likely have a singleton transport class that handles connection pooling, but that's normally hidden beneath the abstractions.

These layers make the application significantly easier to unit test. The more chaotic your function call graph, the more difficult testing is, especially when trying to mock HTTP calls or database functions. Anything called through another class is something you can DI-swap during testing. I would argue that this layered system with DI exists solely to make it easier to hit higher code coverage. Personally I think that goal is pointless, and therefore the design is pointless.

When working with an interface layer, you may have something that looks like this.

var agentGroup = app.MapGroup("/agents");
agentGroup.WithTags("Agents");

agentGroup.MapGet("", async (AgentLogic agentLogic) => await agentLogic.GetAgents());

agentGroup.MapGet(
    "{id}",
    async (AgentLogic agentLogic, [FromRoute] Guid id) => await agentLogic.GetAgent(id)
);

agentGroup.MapDelete(
    "{id}",
    async (AgentLogic agentLogic, [FromRoute] Guid id) => await agentLogic.DeleteAgent(id)
);

agentGroup.MapPut(
    "",
    async (AgentLogic agentLogic, [FromBody] Agent agent) =>
        await agentLogic.PutAgent(agent)
);

This is code that aligns with my REST-ish style. More importantly, there are people who would want to unit test it. Unit testing this would effectively be unit testing dotnet core itself; there is no logic here, only some DI and transport parsing. Hilariously, testing it can be counterintuitive: in your unit tests you never actually set up any DI, so the application can still fail at runtime if the container isn't configured properly.

The data layer is similar. In my case I don't have a proper data layer since the example project uses Dapper, but it has enough to prove the negative.

public async Task<Agent> PutAgent(Agent agent)
{
    await using var connection = await dataSource.OpenConnectionAsync();

    // Upsert: insert, or update name/config when the id already exists.
    var sql =
        @"INSERT INTO Agent
(id, name, config) VALUES
(@Id, @Name, @Config)
ON CONFLICT(id) DO UPDATE SET name = excluded.name, config = excluded.config;";

    // The only business logic here: generate an Id when one isn't provided.
    if (agent.Id == null)
    {
        agent.Id = Guid.NewGuid();
    }

    await connection.ExecuteAsync(sql, agent);

    return agent;
}

Were I to have a data layer, that layer would do nothing but wrap the ExecuteAsync method; having it would be pointless in this case. I would unit test this function, but what I would test is the business logic of "if I don't provide an Id, will one be generated for me?" I would not unit test that the SQL is correct or that Dapper is working correctly. Dapper has its own unit tests.

Before I continue, I want to add: just because I don't think you should unit test this stuff doesn't mean you shouldn't test it at all. I just think end-to-end and integration tests are far superior methods, especially when a data source is involved.

When working in procedural languages you generally don't have inversion of control (IOC) containers, and with no IOC you can't do traditional dependency injection. So how do you unit test procedural code?

The language I'm going to use for this example is Golang, but Rust has a similar workflow using traits. Both of these languages have a concept of thread-safe "mostly global" state, Rust's Arc being the easiest to call out. Go's equivalents are a lot less obvious, but quite a few standard library types call out their safety. An example from the database/sql docs:

The returned DB is safe for concurrent use by multiple goroutines and maintains its own pool of idle connections.
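To sketch what that means in practice (the driver and connection string here are placeholders), you open the pool once at startup and hand the same *sql.DB to everything; database/sql does the pooling for you:

import (
	"database/sql"

	_ "github.com/lib/pq" // placeholder driver; registers itself via init
)

// openDB runs once at startup. The returned *sql.DB is a connection
// pool, safe to share across every goroutine handling requests.
func openDB(connStr string) (*sql.DB, error) {
	db, err := sql.Open("postgres", connStr)
	if err != nil {
		return nil, err
	}
	// Fail fast if the database is unreachable before serving traffic.
	if err := db.Ping(); err != nil {
		return nil, err
	}
	return db, nil
}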

What you shouldn't do is add every struct with business logic to global state; that would balloon your state. Instead, remember what I said earlier about DI: since most of the dependencies are transient (created during each request), you can just create any of those as you need them. Unless you need to attach state to those dependencies, I wouldn't even hang their functions off a struct, since you can always do that later.

Instead, what you should put in your state is the stuff that should be global singletons: the database transport, any config or data that shouldn't change, or a global cache. That list isn't exhaustive but it should paint the picture. Anything that you can't (or shouldn't) just recreate per-request should be part of your state.

state := routes.AppState{
  Db:             db,
  SearchTemplate: searchTemplate,
  PageTemplate:   pageTemplate,
  PostTemplate:   postTemplate,
}

http.HandleFunc("GET /", state.HomeHandler)
http.HandleFunc("GET /page/{id}", state.PageHandler)
http.HandleFunc("GET /post/{id}", state.PostHandler)

In my case I have the database in the state, along with the go templates. Caching the parsed templates isn't a huge CPU/memory saving, but I prefer not to be wasteful. If different groups of routes have different global state requirements, you can define separate state objects and use struct embedding to compose them from the common pieces (at least in go).
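As a sketch of that composition (these names are made up for illustration), a shared core embeds into each group's state:

import (
	"database/sql"
	"html/template"
	"log/slog"
)

// CoreState holds the singletons every route group needs.
type CoreState struct {
	Db *sql.DB
}

// BlogState embeds the core and adds blog-only singletons.
type BlogState struct {
	CoreState
	PostTemplate *template.Template
}

// AdminState shares the same core with different extras.
type AdminState struct {
	CoreState
	Audit *slog.Logger
}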

Now that you have injectable state in your routes, you can make things testable. This is where interfaces (or traits) come in. Here's the interface I used for the Db field.

type Data interface {
	Select(dest interface{}, query string, args ...interface{}) error
	Get(dest interface{}, query string, args ...interface{}) error
}

Go has implicit (structural) interfaces, like TypeScript: any struct with the required shape can be used for that property. I took the function signatures directly from sqlx.DB, which means I can swap that field for whatever I want as long as it adheres to the interface. Rust's traits perform the same function but have to be implemented explicitly; less convenient, same idea. That flow means I can unit test whatever I want without really needing an IOC container.
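For example, inside the routes package a test can hand the state a fake that satisfies Data, no container required. This is a sketch: it assumes a handler like HomeHandler degrades to an empty page when the query fails, which may not match your handlers:

import (
	"errors"
	"net/http"
	"net/http/httptest"
	"testing"
)

// fakeData satisfies the Data interface with canned results.
type fakeData struct {
	err error
}

func (f *fakeData) Select(dest interface{}, query string, args ...interface{}) error {
	return f.err
}

func (f *fakeData) Get(dest interface{}, query string, args ...interface{}) error {
	return f.err
}

func TestHomeHandlerSurvivesDbError(t *testing.T) {
	state := AppState{Db: &fakeData{err: errors.New("boom")}}

	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	state.HomeHandler(rec, req)

	// Assumption: the handler renders an empty page instead of a 500.
	if rec.Code != http.StatusOK {
		t.Fatalf("expected 200, got %d", rec.Code)
	}
}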

So why would you want to do this?

For me the answer is simple, but for others it can be more complicated.

When you are developing large-scale real-time APIs in dotnet core, you can run into runaway performance issues that are non-trivial to fix. I've personally been bitten by the thread scaling issue tons of times: in C#, green threads can only scale up or down by one per second. This is an issue if you immediately slam the service with tons of requests that fan out. The first time I got hit with it, we had a single endpoint that fanned out into something like nine calls and was hit once per page load. This would cause the app to choke itself out trying to resolve tasks.

There is a way to mitigate that issue where you start the app with a higher minimum number of worker threads, but it's a mitigation, not a real fix. I've had tons of issues over time with how dotnet wants you to deal with threads versus how it documents using them. I've had cases where wrapping something that should already be in a green thread in another green thread fixed deadlocking threads. Why did that work? I wish literally anyone could tell me.

There are also performance implications of the DI system itself. Transients have a cost, both in creation and in existence. You may not care about performance in the abstract, but actually seeing it in real time can be eye opening. We had an application that we needed to rewrite in a new language because we couldn't support the old one (there was literally a single person who could work on it).

We ran two experiments: the first was a naive implementation written by me in go (quite a while ago); the second was written in dotnet by the maintainer of the original version. The dotnet version had two issues. The first was a DI problem with one of our dependencies that blew up performance to the point where we were seeing request durations of over a second. Once that was fixed, the remaining issue was that it was literally ten times slower than the go version.

This wasn't a terribly complicated system: unpack a binary format and send some data to a C library to process. Dotnet in that case literally took ten times longer in the best case. Our load test target was 3,000 requests per second, and while we could hit that, it took tons of resources to do so. The go version was so efficient we had to scale out the istio ingress pods because we CPU-capped them (this was when istio was still beta). We got to around 18,000 requests per second pretty easily.

This is something I find concerning about load test comparisons where people claim C# can be fast. It can be, sure, but it takes significantly more CPU and memory than procedural languages. While really basic tests might return quickly, the moment you try to do something even remotely complicated, dotnet's performance tanks into the ground.

You might be thinking "well, I don't need that kind of performance." That may be true, but you should also think about cost. When using resource bin-packing or serverless, every bit of RAM and CPU costs money. If you can take 50% more time to build applications, but those applications only use half the CPU and RAM, then you can save a ton of money over time (assuming you have clients). My website, when it was written in rust, used a solid 10 MB of RAM over thousands of requests per second. I don't think I've ever seen a dotnet app work properly under 0.5 CPUs and 1 GB of RAM, and even with those resources dotnet would struggle.

I used to think that the slowness was worth it for developer speed. When we first did that comparison we chose the dotnet version because of the cost of teaching developers, and since we didn't require more throughput, just using dotnet was cheaper. We also didn't really want to migrate an application from one language basically nobody could support to another.

Given the benefit of time, I don't think we were wrong back then, but I also don't think we were right. Golang is insanely easy to learn and get productive with. The only thing I'd have called a major issue at the time was the lack of generics, and that has since been resolved. In rust's case that was never really the issue; the issue with rust is that async rust is actually crazy hard to do in a performant way (though to be fair, doing it badly will still be faster than C#).

In the past I would have said I'd prefer dotnet because of the DI system. It does in fact make unit testing a ton easier, so if you've got complicated business logic you're just shifting complexity around: either complexity in writing the code or complexity in writing the tests. After actually building projects in rust and go, I've changed my mind. Nowadays I'd rather put business logic in pure functions so I can test it without any DI bootstrapping. IOC has a cost that can't easily be fixed, and functions are crazy easy to test. That comes with some complexity in my code, but personally I feel it's the cheaper cost.
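As an illustration, here's the "generate an Id if one isn't provided" rule from the Dapper example earlier, pulled out as a pure go function (the Agent type is hypothetical, using the google/uuid package). The test needs nothing but values:

import (
	"testing"

	"github.com/google/uuid"
)

type Agent struct {
	ID   uuid.UUID
	Name string
}

// EnsureID is pure business logic: no database, no transport, no container.
func EnsureID(a Agent) Agent {
	if a.ID == uuid.Nil {
		a.ID = uuid.New()
	}
	return a
}

func TestEnsureIDGeneratesWhenMissing(t *testing.T) {
	got := EnsureID(Agent{Name: "test"})
	if got.ID == uuid.Nil {
		t.Fatal("expected an ID to be generated")
	}
}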

A benefit of rust and go specifically (not procedural languages in general) is their philosophy of errors as values. This has the major benefit of forcing you to react to potential issues. The whole if err != nil thing may seem tedious, but it leads to significantly more resilient software. I've shipped bad SQL on a request and had the page still work fine because I handled the error properly. That page had no content, but it didn't become a bad user experience with an arcane error page.
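In go that flow looks something like this (a sketch reusing the AppState fields from above; the Post type and query are made up). The query error is a value handled at the call site, and the page still renders:

import (
	"log"
	"net/http"
)

type Post struct {
	ID    int
	Title string
}

func (s *AppState) PostsHandler(w http.ResponseWriter, r *http.Request) {
	var posts []Post
	// The error is a value we have to look at; bad SQL surfaces right here.
	if err := s.Db.Select(&posts, "SELECT id, title FROM post"); err != nil {
		log.Printf("loading posts: %v", err)
		posts = nil // degrade to an empty page instead of a 500
	}
	if err := s.PageTemplate.Execute(w, posts); err != nil {
		log.Printf("rendering posts: %v", err)
	}
}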

In languages like C# it's common to have exception-handling middleware to globally catch whatever comes up the stack. The problem is that by then it's far too late to react; that's you giving up and shrugging as errors make it to your clients. You might not even know where exceptions can be thrown, since most of the framework doesn't require notifying consumers of your methods that they might throw. In go or rust, unless it's a panic, you are literally forced to handle the error or pass it up the stack. And if a library panics when you're using it properly, find another library.

Fin

Should you go out and rewrite everything you have? No, probably not. I would, however, look at your greenfield projects and see if you can build something slightly less risky there. If that works, then sure, slow-roll a rewrite. Start with go; it's a lot easier to get into since the language is real simple and the concurrency features are incredible. Rust is crazy fast but significantly more difficult. I'd look at rust first for APIs directly used by clients, since that's where you need the most safety and stability, but if you are fanning out a lot it might be worth using go for its ease in threading.

