Search This Blog

Friday, August 11, 2017

Shared Client Libraries - A Microservices Anti-Pattern, a study into why?


As I continue my adventures with demystifying Microservices for myself, I keep hearing of this anti-pattern, 'Shared Service Client' or 'Shared Binary Client' or even 'Client Library'. Sam Newman in his book about Microservices says, "I've spoken to more than one team who has insisted that creating libraries for your services is an essential part of creating services in the first place". I must admit that I have seen and implemented this pattern right through my career. The driver for generating a client has been DRY (Don't repeat yourself).  The creation of Service Clients and making them available for consumption by other applications and services is a prevalent pattern. An example is shown below where a shareable Notes-client is created by a Team as part of the project for the Notes Web Service. They might also provide a DTO (Data Transfer Object) library representing exchanged payload.

This BLOG is my investigation into why this concept of a Service Client or a Client library causes heart burn to Microservice advocates.

Service Clients over time

Feel free to skip this section totally, it's primarily me reminiscing.

In my university days, when we used to work on network assignments, the interaction between a server and client would primarily involve creating a Socket, connecting to the server and sending/receiving data. There really was no explicit contract that someone wishing to communicate with the server could use to generate a client. The remote invocation however was portable across language/platform. Any client that can open a socket could invoke the server method.

I subsequently worked with Java and RMI. One would take a stub, then generate client and server code from it. Callers higher in the stack would invoke 'methods' using the client which would then run across the wire and be invoked on the server. From a client's perspective, it was as if the call was happening in the local JVM. Magical! A contract of sorts was established. Heterogeneous clients did not really have a place in this universe unless you got dirty with JNI. Another model existed at roughly the same time with CORBA and ORBs. The IDL or Interface Definition Language served to provide a contract by which client and server could be generated for a language and platform that could generate corresponding code from the IDL. Heterogeneous Clients could thrive in this ecosystem. Later down the chain came WS*, SOAP, Axis, Auto WSDL to Whatever generation tools like RAD (Rational Application Developer) that created a Client for a SOAP service.I had used a tool called Rational Application Developer which was a 4 C/D install which would fail often but when it worked, you could click on a WSDL and see it generate these magical code snippets for the client.

The WS* bubble was followed by the emergence of an Architecture style known as REST and its poster child, HTTP. Multiple representation types were a major selling point. To define Resources and their supported operations, there was WADL. Think of WADL as a poor cousin of WSDL that even the W3C preferred to not standardize. REST like it not, many a time is used for making remote procedure calls than walking a Hypermedia graph via Links as prescribed by Richardson Maturity model and HATEOS. Many developers of a REST service create a Client library that communicates via HTTP to the REST resources and due to the simplicity of REST, writing the client code was not arduous and did not require generation tools. REST itself was a contract. WADL if you want to stretch it and HATEOS if you want to argue about it made their ways to provide something like an IDL. Heterogeneous clients thrived with REST. The point to note is IDL seems to have taken a back seat here. Enter GRPC, a high performance open source framework from google. Has an IDL via .proto files. Supports Heterogeneous/Polyglot clients. It's just fascinating how IDL's are here to stay.

Reasons why Service Clients get a bad name

To facilitate the discussion of Service Clients and why they do not work, consider the following Customer Service Application written in Java that uses Service clients for Notes, Order management, Product Search and Credit Card. These Clients are provided by individual service maintainers.

The Not-So-Thin Service Client or The Very "Opinionated Client"

Service Clients many times start serving a higher purpose than what they were originally designed for. Consider for example the following hypothetical Notes Client:

The Notes Client seems to do a lot of things:
  • Determining strategy around whether to call the Notes Service or fall back to the Database
  • Determining whether to invoke calls in parallel or serial
  • Providing a Request Cache and caching data
  • Translating the response from the service into something the client will understand and absorb
  • Providing functionality to Audit note creation (stretching here:-))
  • Providing Configuration defaults around Connection pools, Read Timeouts and other client properties
The argument is "Who but the service owner knows best how to interact with the service and the controls to provide?". There are a few things with this design that are happening:
  • It does not account for non-Java clients
  • Bundles in a bunch of logic into the client around Notes (yes stretching it here with a Notes example :-))
  • Imposes a Request Cache on the Consumer. Resource imposition.
  • Assumes that it knows best regarding how you would like to invoke the service from a serial/parallel perspective
  • Makes resource decisions for you regarding threads etc
  • Provides defaults that it thinks are best without making you even think of your SLAs

Logic Creep into the Client

I have seen this happen a few times where business logic creeps into the binary client. This is just bad design that really negates the benefits of the services architecture. Sometimes the logic is around things like fall backs, counters, request caching etc. While these make sense to abstract away from a DRY perspective, they introduce a coupling or dependency on the client from the consumer's perspective that making changes might could require all consumers having to update. There are some clients that have 'modes' of operation and logic around what will be done in a given mode. These can be painful if they needed to change.

Resource Usage

The library provides a Request Cache to cache request data. It will use threads, memory, connection pools and many other resources while performing its job. The Service consumer might not need a multi-threaded connection pool, it might only need to obtain Notes every once in a while and does not even need to keep the connection alive. The Service consumer might not even care about paralleling calls to obtain data, it might be fine with serial. The consumer is however automatically opted-in to these invading resource usages.

Black Box

The Service Client is not something the Service Consumer wrote. On one hand service consumers care grateful that they have a library that takes away the effort around remote communication  but on the other hand, they have absolutely no control over what the library is doing. When something goes wrong, the consumer team needs to start debugging the Service Client logic and  other details, something they never owned or have in depth knowledge about. Phew!

Overloaded Responsibility

The Service owners perspective is really to generate a service that fulfills the business purpose. Their job is not to provide clients for the multitude of consumers that might want to talk to them. Writing client code to provide resiliency, monitoring and logging on communicating with the service just does not appear to be the responsibility for the service owner.

Hydra Client and Stack Invasion

Consider the following figure for this discussion:
A service client often brings along with it dependencies on libraries that the service maintainers have chosen for their stack. A consumer of the Service client are often forced to deal with managing out dependencies that would probably have never wanted in their stack but are forced to absorb them. As an example, consider a team that is invested in Spring and using Spring MVC and RestTemplate as their stack but have to suffer dependency management hell because they need to interface with a team whose Service Client is written using Jersey and need to deal with dependencies that might not be compatible. Multiply this with other Service Clients and the dependencies they bring in or go ahead and run with a Jersey 1 and Jersey 2 Client dependent libraries in your classpath to experience first hand how it would feel :-).

Heterogeneity and Polyglot Clients

"First you lose true technology heterogeneity. The library typically has to be in the same language, or at the very least run the same platform" - Sam Newman

Expecting the team generating a service to have to maintain service clients for different languages and platforms is a maintenance hell and simply not scalable. If they do not provide a client, then the ability to support a polyglot environment is stifled. When service clients start having business logic in them, it becomes really hard for service consumers not using the service team provided client to generate their own due to the complexity of logic involved. This can be a serious problem as it curbs innovation and technological advances.

Strong Coupling

The Service Client when done wrong can lead to a strong coupling between the service and its consumer. This coupling would mean that making even the smallest of changes to the service would involve significant upgrade effort from the consumers. What this does is stymie innovation on the service and taxes the service consumers into some upgrade cycle advocated by the service providers.

Benefits of the Service Client

The primary benefits that Service clients provide is DRY (Don't repeat yourself). A team that develops a service might quite well develop a client to test their service.  They might also provide integration tests with multiple versions of binary clients to ensure backward compatibility of their APIs. Their responsibility though ends there really, they really should have no obligation to distribute this client to others but the question we find a service consumer asking is; If this client is already created by the service owners and is available for free, why should I need to build one?

Remember that the Service creators are knowledgeable around the expected SLA of their service end points and the client they provide has meaningful defaults and optimizations for the same. A Service owner might augment the client library with things like Circuit Breakers, Bulkheading, Metrics, Caching and Defaults that will make the Client work really well with the service.

If a companies stack is very opinionated, for example, a non polyglot java and Spring Boot stack where every team uses RestTemplate for HTTP invocation with some discipline around approved versions of third party jars considering backward compatibility etc, then this model could very well hum along.

Client Generation

To facilitate the choice for service consumers to generate their own client, an IDL needs to be made available.  Tools like Swagger Code Gen and GRPC gen are great examples where these could be made available.  The auto-generated clients are good but as a word of caution, they aren't always optimized for high load and/or might need customization for sending custom company specific data anyways, diminishing their value as generated clients. Examples of company specific data might be things like custom trace identifiers, security info etc etc. Generated client code also will not have Circuit Breaking, Bulkheading and Monitoring features available on it effectively requiring each service consumer to write their own and with the tools available for them.

Parting Thoughts and Resources

We want to promote a loosely coupled architecture with distributed systems, i.e., between services and their consumers. This enables changes on the service to evolve without impacting consumers. A Service itself must NOT be platform/language biased and support a polyglot consumer ecosystem.

If Service owners were to design their services with the mentality that they have no control around the stack of the consumer apart from a transport protocol, it promotes better design of the service where logic does not creep into the client.

Service Clients, if created by Service owners must be 'thin' libraries devoid of any business logic that can safely be absorbed by consumers. If the client libraries are doing too much, you ought to re-think your service responsibilities and if needed consider introducing an intermediary service. I would also recommend that if a Service client is created for consumption by consumers, then limit the third party dependencies used by the client to prevent the Hydra Client effect.

It is also important to note that if a Services team creates a Client, that they do not force the consumers to use it. If the choice to use a different stack exists, then the Consumer is empowered to use the client generated by the Service maintainer or develop one independently. To a great degree service consumer teams maintaining their own clients keeps the service maintainers in check regarding compatibility and prevents logic creeping into client tier.
  • Ben Christiansen  has a great presentation on the Distributed Monolith that is must see for anyone interested in this subject. 
  • Read the Microservices book by Sam Newman. Pretty good data!
Keep following these postings as we walk into the next generation of services and tools to enable them!

If you have thoughts for or against Service Clients, please do share in the comments. 

No comments: