David R. Heffelfinger

  Ensode Technology, LLC


Mitigating Risks of a Microservices Architecture

In the “Demystifying Microservices for Java EE Developers” guide I wrote for Payara, I list several advantages and disadvantages of a microservices architecture, but I don’t go into how to mitigate or eliminate the risks the disadvantages present. In this post I’ll go over each of these disadvantages and how to reduce the risk it presents.

Additional operational overhead

When developing an application as a series of microservices, there will be some operational overhead since, instead of deploying one application into production, several small applications (i.e. your microservices) will need to be deployed. To decrease the costs of the operational overhead, a few approaches can be taken.

Deploy to the cloud

When deploying to the cloud, the cloud vendor can provide some of their resources to help with operational tasks for your application, freeing your team from having to perform these tasks. Additionally, most cloud vendors provide elasticity, meaning that your applications can be scaled up or down on demand as their load increases or decreases. This helps with the scalability of your application as a whole.

Implement a DevOps approach

Your development team assists with operational tasks such as deployment and monitoring.

Use an automated deployment tool

There are several tools on the market that can automate deployments, some free and open source, some commercial. Bamboo and Jenkins are two examples that are popular in the Java world. Puppet, a configuration management tool, is another example; it is popular in Linux environments and with teams working in languages such as Python or Ruby.

Use an Application Performance Management (APM) Tool

Monitoring performance becomes a lot harder if you deploy your application as a series of independent modules (microservices). An application performance management tool can help with this. Some examples include AppDynamics, New Relic, Vector and Prometheus.

Use a log aggregation tool

When deploying an application as a series of microservices, instead of having a single log file to monitor, we typically have several, maybe hundreds, of log files to monitor. Doing this by hand is not practical. To mitigate this risk, a log aggregation tool such as Splunk, Graylog or Loggly should be used.

Increased Debugging Complexity

When debugging an application following a microservices architecture, it isn’t always obvious which of your microservices is causing the problem. A single user action (e.g. saving data on an HTML form) could trigger invocations to several microservices, making it harder to pinpoint the cause of the issue.

Additional Tooling

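Log aggregation tools and performance management tools can help here too, since they bring the logs and metrics of all your microservices together in one place, making it easier to see which service is failing.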

Implement a policy of collective code ownership

“Collective code ownership” means that nobody owns code; ownership is shared across all teams and all developers in your organization. Suppose a user reports a problem to your team, but it turns out that the problem is not with your code, but with another service that your code depends on. With a policy of collective code ownership, your team can fix the problem themselves instead of waiting for the other team to get around to it.

Correlation Identifiers

When a request first enters your system, generate a correlation id and pass it along with every microservice invocation that the request triggers. Have each microservice include the correlation id in its log entries. This will make it easier to trace microservice invocations when going through log files.
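As a rough illustration of the idea, here is a minimal sketch in plain Java. All names in it are hypothetical: the X-Correlation-Id header name is a common convention rather than a standard, the two "services" are just static methods standing in for real network calls, and the in-memory list stands in for an aggregated log.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical names throughout; real services would use HTTP calls and a logging framework.
class CorrelationDemo {
    // Assumed header name; pick whatever your organization standardizes on.
    static final String HEADER = "X-Correlation-Id";

    // Stands in for an aggregated log shared by all services.
    static final List<String> LOG = new ArrayList<>();

    // Reuse the id supplied by the caller, or start a new one at the edge of the system.
    static String resolveCorrelationId(String incoming) {
        return (incoming == null || incoming.isEmpty())
                ? UUID.randomUUID().toString()
                : incoming;
    }

    // Every log entry is prefixed with the correlation id.
    static void log(String correlationId, String service, String message) {
        LOG.add("[" + correlationId + "] " + service + ": " + message);
    }

    // First service invoked: resolves the id and forwards it downstream.
    static String orderService(Map<String, String> requestHeaders) {
        String id = resolveCorrelationId(requestHeaders.get(HEADER));
        log(id, "order-service", "saving order");

        Map<String, String> outgoingHeaders = new HashMap<>();
        outgoingHeaders.put(HEADER, id);
        inventoryService(outgoingHeaders); // would be a network call in practice
        return id;
    }

    // Downstream service: logs with the id it received instead of creating its own.
    static void inventoryService(Map<String, String> requestHeaders) {
        String id = resolveCorrelationId(requestHeaders.get(HEADER));
        log(id, "inventory-service", "reserving stock");
    }
}
```

Searching the aggregated log for a single correlation id then returns the entries from every service that took part in handling that request.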

Distributed Transactions

Distributed transactions happen when we start a transaction and, while that transaction is in progress, an invocation to a microservice over the network takes place. Typically we want to avoid distributed transactions, as there is a high probability they will time out and roll back.

Implement your microservices as atomic units

Commit all your transactions before making invocations across the network.

Use compensating transactions

If you commit a transaction that depends on a call to a microservice, and the call fails, then initiate a compensating transaction to revert the changes made by the original transaction.
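The two approaches above can be combined. The following is a minimal sketch under stated assumptions: the PaymentClient interface is a hypothetical stand-in for a remote microservice, and the in-memory map stands in for your local database, with each put representing a separately committed local transaction.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names; the map stands in for a database and each put for a committed transaction.
class CompensationDemo {
    // Stands in for a remote microservice invocation that may fail.
    interface PaymentClient {
        void charge(String orderId) throws Exception;
    }

    // Order id -> status.
    static final Map<String, String> ORDERS = new HashMap<>();

    static boolean placeOrder(String orderId, PaymentClient payments) {
        // 1. Local transaction, committed before any call goes over the network.
        ORDERS.put(orderId, "PLACED");
        try {
            // 2. Remote invocation, made with no transaction in progress.
            payments.charge(orderId);
            return true;
        } catch (Exception e) {
            // 3. Compensating transaction: a new local transaction that reverts step 1.
            ORDERS.put(orderId, "CANCELLED");
            return false;
        }
    }
}
```

Note that the compensating transaction does not roll anything back in the database sense; it is a new transaction that undoes the effect of the first one, which is why the order ends up marked as cancelled rather than disappearing.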

Susceptibility to the fallacies of distributed computing

L. Peter Deutsch came up with the Fallacies of Distributed Computing. Microservices, being inherently distributed, are susceptible to these fallacies.

Implement the circuit breaker design pattern

Modeled after an electrical circuit breaker, the circuit breaker design pattern works as follows: your code attempts to make an invocation over the network and, if the call fails, retries a predetermined number of times. If the invocation does not succeed after repeated attempts, the circuit breaker trips, and your code can handle the failure gracefully (how to do this depends on your specific application requirements). While the breaker is tripped, further calls fail immediately instead of waiting on the unresponsive service.
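To make the pattern more concrete, here is a minimal, hand-rolled sketch. It is not taken from any particular library; the class, its threshold and its cool-down period are assumptions made for illustration. It counts consecutive failed calls rather than retrying inside a single call, and once the threshold is reached it short-circuits to the fallback until the cool-down has elapsed. In a real application you would more likely use an existing fault tolerance library than write your own.

```java
import java.util.function.Supplier;

// Illustrative only: a hand-rolled breaker, not the API of any real library.
class CircuitBreaker {
    private final int failureThreshold; // consecutive failures before tripping
    private final long openMillis;      // how long to stay tripped before a trial call
    private int consecutiveFailures = 0;
    private boolean open = false;
    private long openedAt = 0L;

    CircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    synchronized <T> T call(Supplier<T> remoteCall, Supplier<T> fallback) {
        if (open) {
            if (System.currentTimeMillis() - openedAt < openMillis) {
                // Tripped: fail fast without touching the network.
                return fallback.get();
            }
            // Cool-down elapsed: let one trial call through.
            open = false;
        }
        try {
            T result = remoteCall.get();
            consecutiveFailures = 0; // a success resets the count
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                // Too many failures in a row: trip the breaker.
                open = true;
                openedAt = System.currentTimeMillis();
            }
            return fallback.get();
        }
    }

    synchronized boolean isOpen() {
        return open;
    }
}
```

A caller would wrap each remote invocation, for example breaker.call(() -> client.fetch(), () -> cachedValue), where client and cachedValue are whatever your application uses. The fallback is where the graceful handling mentioned above takes place.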

Use a load balancer

A load balancer such as nginx or F5 can distribute the load across several instances of your services. Most load balancers provide failover capabilities as well.

Deploy to the cloud

The elasticity most cloud providers offer, which lets your services scale up or down as their load changes, will help mitigate susceptibility to the fallacies of distributed computing.


« July 2017 »

© David R. Heffelfinger