How to Make a Perforated Monolith

My first face-to-face encounter with distributed applications was at a company processing large amounts of data. The company's core product was a distributed application, and my initial, naive inclination was to run the application locally so that I could understand it better. I didn't want to scale it on my laptop. I didn't want to test a heavy load on my laptop. I simply wanted to step through the process of a piece of data as it entered the system, was processed, and arrived at its final destination. It took a few days before the realization hit: running this distributed application locally was simply not possible.

The application was composed of many clusters of services running on premises. Provisioning another instance of the application required a request to operations to perform the arduous tasks of setting up new networks, servers, and configurations, many of them involving manual changes on live virtual machines. With the understanding that testing was necessary, the company had created a stubbed implementation of the important APIs that simulated the application response, for access by integration tests. It was essentially a huge mock of all the services—a shadow of the real implementation for providing expected outputs, not for furthering understanding of the system, and certainly not to enhance the development of new features.

Nowadays we have containerization, and frameworks such as Docker Compose allow us to create a recipe for Docker to start up a bunch of containers on our local machine. Docker creates its own little network, and the various distributed pieces spin up and talk to each other. But a question kept nagging me: “why?” If you take all the core logic in all the services, separated from the necessary connectivity overhead, the codebase is hardly bigger than one of the traditional stand-alone applications everyone used to create a few decades ago. Code size is not the impediment to running a distributed system standalone—it is the unwieldy connectivity arrangement that assumes the components must be separated at deployment time, not that they merely can.

To me containers and Kubernetes are implementation details: the implementing technology, not the architecture. Instead of implementing all the cloud plumbing up front and then trying to reproduce it in a local, single-machine deployment, why can't one simply create a well-designed application where Service A simply calls Service B, run it locally, and then split it up when deploying to the cloud? This led to my idea of a “perforated monolith” (a term inspired by Jason Katzer) and my just-released Flange Cloud™ Java framework, which makes this idea a reality.

Fabled Evolution from Monolith to Microservices

Hardly anyone you talk to nowadays promotes creating a monolith. (Except, of course, the recognized experts, as we'll see below.) Like the straw specter of the “waterfall model” as opposed to “agile development”, the dreaded epithet “monolith application” is invoked as an argument against any design that doesn't include days of lost productivity creating Terraform templates and provisioning Kubernetes: “But surely you don't want to build a monolith, do you?”

The fabled story of evolution to microservices usually goes like this: Once upon a time there was an application. After several years of iteration, the application had become simply too big and too hard to manage. The poor team (there was just one team) could hardly add a single new feature without things breaking all over the place, and deploying a new version took weeks. The application looked like this:

One of the developers proposed that perhaps the more pressing problem was not that the application was a monolith, but that it was a mess: it did not cleanly separate concerns, it wasn't well-modularized along domain lines, and it could use a good dose of refactoring (perform daily for a month and then see the doctor again). Nevertheless the team leader had read all the latest blog posts about how microservices were better than monoliths, and made the decision to revamp the application deployment paradigm, yielding the following:

Figure: A mess of microservices.

At this point the application was no less of a mess; but now it required several teams to manage the separate services. Teams now needed to implement RESTful APIs (or at least something purporting to be a RESTful API) between the services. A team needed to design and run some sort of service discovery (itself a cluster for redundancy). Kubernetes needed to be deployed and managed or nothing would run. (But with the new architecture, one of the new teams could use Python if it wanted, the team leader pointed out!)

“When Should You Use a Monolith?”

A microservices architecture undeniably brings some important benefits:

🎉 Services can be independently developed.
🎉 Services can be independently scaled.
🎉 Services can be independently deployed.
😐 Services can be written in Python/Node.js.
😐 Services can communicate using RESTful APIs.

There are two considerations, however, that often get glossed over, dismissed as irrelevant (“because what's the alternative—a monolith?”), or left out of the discussion altogether. The first is that microservices, especially those implemented using containers, bring a ton of drawbacks; here are just a few:

😱 Implement service discovery.
😱 Deploy load balancers and auto-scaling rules.
😱 Install sidecars.
😱 Manage a Kubernetes cluster.

The other consideration is that many of the benefits of a microservices architecture can be gained without containers—and now that Flange Cloud is released, these benefits can be achieved even within a monolith!

Everything in software design is a tradeoff, and the system architect must determine whether the benefits of microservices outweigh the drawbacks. Let's start with a rule of thumb by an expert in cloud-native architectures, Jason Katzer, in his description of “When Should You Use a Monolith?”:

If you anticipate the size of your development team to stay under 15 people for the next 5 years, and expect to have less than 10 million active users, you may want to keep things easy by sticking with the monolith. — Jason Katzer, Learning Serverless, Chapter 2. Microservices (O'Reilly, 2020) [emphasis added]

“When Might Microservices Be a Bad Idea?”

The related question is when one should avoid microservices. Again I defer to the experts, in this case Sam Newman, perhaps one of the most recognized authorities in microservices design. He places the reasons for “When Might Microservices Be a Bad Idea?” into four categories:

Unclear Domain
Startups
Customer-Installed and Managed Software
Not Having a Good Reason!

Doing microservices just because everyone else is doing it is a terrible idea. — Sam Newman, Monolith to Microservices (O'Reilly, 2020)

The Istio service mesh provides a case study: although Istio was initially designed using a microservices architecture, the Istio team later regretted its decision and migrated back to a monolithic structure. The Monolith Strikes Back: Why Istio Migrated From Microservices to a Monolithic Architecture explains that “… although the Istio team was aware of Sam Newman’s microservice recommendations, they didn’t give appropriate weight to the guidance …”. In particular the team didn't realize the pain that microservices would impose on their customers who had to deploy Istio using microservices.

In the end, Istio switched back to a monolith architecture, but importantly retained a modularized design:

… internally Istio still maintains the logical separation between some of its original control plane components and … each capability is exposed as a discrete API. — The Monolith Strikes Back: Why Istio Migrated From Microservices to a Monolithic Architecture

A “Perforated” Monolith

Istio's experience supports the fundamental realization that led to my inventing Flange Cloud: a loosely-coupled, modular design is orthogonal to whether the application uses a “monolithic” or “microservices” deployment. If I create a well-designed application, with clearly separated concerns and cleanly separated modules, I should be able to run the application locally or split out the components as distributed services.

A couple of days before I made the Flange framework public, I discovered that Jason Katzer (cited above) had promoted this exact precept in a section of his Learning Serverless book. He describes a situation in which he proposed to a client the radical idea of creating a monolith, but maintaining a modular design—“perforated” so to speak, so as to facilitate splitting out the modules into microservices if needed:

My recommendation took all of the best of microservices while avoiding the downsides. The functionality would be rewritten as if it were a microservice. There would be a clear separation of concerns and a well-defined and specified contract between the two components; as a cherry on top, it would be perforated for future separation when it would inevitably be required to be split out of the monolith. [emphasis added]

…

You can build your monolith with the patterns of microservices but without their plumbing and overhead. … Take all of the principles espoused in this book: clean separation of concerns, loosely coupled and highly independent services, and consistent interfaces. Keep these in the same monolithic app, and never compromise on these rules. The result will be a monolith that is baked to perfection and ready to be carved up later. — Learning Serverless, Chapter 2. Microservices (O'Reilly, 2020)

The word Jason chose, “perforated”, aligns precisely with the philosophy of Flange Cloud; I immediately adopted the adjective as part of the term “perforated monolith” to describe the type of application one can build with Flange. However in his description above, Jason gives no indication that “splitting … the monolith” could be anything but the painful, error-prone drudgework it has always been historically. Flange Cloud changes all that: Flange allows you to keep your well-architected, perforated monolith—automatically “splitting” your monolith and deploying it transparently to the cloud as cloud-native components. Flange will add the “plumbing” as needed.

You can run and test your perforated monolith on your machine as much as you like.
You can tell Flange to deploy your perforated monolith to the cloud, where you can run it directly from your cloud platform.
You can even tell Flange to run your application driver component locally, yet have it transparently access the distributed components “across the perforations” to the cloud.

Before diving into how to use Flange, let's take a moment to examine what a “perforated monolith” looks like in reality.

Minding the Dependencies

One of the fundamental rules that allows Flange Cloud to split apart a monolith is that service interfaces must be in separate projects from the service implementations. Surely every modern developer has been taught to program to an interface, not an implementation, as stressed by Erich Gamma and others. Yet many monolith applications in the field using existing application frameworks conflate the service interface and implementation, or place both in the same project.

This practice undoubtedly has arisen because no one conceived that another service might invoke the service locally. Yet in Flange Cloud this is not only a possibility but the default development pattern. Service A needs to be able to invoke Service B without being tied to the implementation of Service B. While the implementations of Service A and Service B might both be running in the same JVM locally, after Flange Cloud deployment it's likely the implementation of Service B will not be a part of Service A's deployment. Thus the interface for Service B must be in a separate project, producing a separate artifact, so that Service A is no longer coupled to Service B's implementation and can be deployed separately, as shown in the following diagram.

Figure: Placing service interfaces in separate projects.
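As a rough sketch of this separation, the following example marks in comments which artifact each type would belong to. All names here (`GreetingService`, `GreetingServiceImpl`, `WelcomeService`, and the `greeting-api` and `greeting-impl` project names) are hypothetical and chosen only for illustration; none of them come from Flange. The types are nested in one class only so that the example is self-contained.

```java
import java.util.Objects;
import java.util.concurrent.CompletableFuture;

final class InterfaceSeparationDemo {

  // Would live in its own artifact, e.g. a hypothetical `greeting-api` project.
  interface GreetingService {
    CompletableFuture<String> getGreeting();
  }

  // Would live in a separate implementation artifact, e.g. `greeting-impl`.
  static final class GreetingServiceImpl implements GreetingService {
    @Override
    public CompletableFuture<String> getGreeting() {
      return CompletableFuture.completedFuture("Hello, World!");
    }
  }

  // A consumer in yet another artifact; it would compile against `greeting-api` only.
  static final class WelcomeService {
    private final GreetingService greetingService;

    WelcomeService(GreetingService greetingService) {
      this.greetingService = Objects.requireNonNull(greetingService);
    }

    CompletableFuture<String> welcome(String name) {
      return greetingService.getGreeting()
          .thenApply(greeting -> greeting + " Welcome, " + name + ".");
    }
  }

  public static void main(String[] args) {
    // Only this wiring code knows which implementation is in use.
    WelcomeService welcomeService = new WelcomeService(new GreetingServiceImpl());
    System.out.println(welcomeService.welcome("Ada").join());
  }
}
```

Because `WelcomeService` refers only to the interface, the implementation behind it could be swapped for a remote proxy without touching the consumer's code.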

Minding the Network at the Perforations

A perforated monolith must be designed to be mindful that a network could potentially be present in the “perforations”, the divisions between the services. As is famously explained in the “Fallacies of Distributed Computing”, the network doesn't work like a local method invocation: the network has latency and overall is not reliable.

The problem with older technologies such as Java Remote Method Invocation (RMI) is not that they remotely invoke methods. Indeed when a service in a traditional microservices architecture invokes another service over a RESTful API, it is similarly remotely invoking a method of sorts. This reality is even more evident when invoking cloud functions such as AWS Lambda: proprietary data is sent and received in binary streams (typically, but not necessarily, encoded as JSON) to invoke what are essentially disassociated, one-off method instances.

The difference between traditional RMI and cloud function invocation is the invocation pattern: the connection behaves differently, so the client must perform extra duties, such as:

Setting and detecting a timeout on the reply.
Promoting fault-tolerance by providing fallback data on timeouts.
Detecting and handling errors coming from a broken connection (or broken service).

Instead of objecting that the network behaves differently, so local invocations can't be reprovisioned as-is in the cloud, Flange makes the bold prescription of simply designing local invocations as if they were remote to begin with!

Minding an Unreliable Network

Modern asynchronous and reactive programming frameworks have already been evolving in this direction. JavaScript developers daily use the Promise object for asynchronous communication, even for local invocations. The analogous construct to a JavaScript Promise in Java is the CompletableFuture<T> class.

Consider a MessageService interface that has a method getGreeting(), which returns some String ("Hello, World!", perhaps?). Here is how one might invoke that method from another module in a monolith using the old-style RMI paradigm, without being mindful of the network:

String greeting = messageService.getGreeting();
System.out.println(greeting);

But what if the monolith module were mindful that the service it is invoking might be accessed across the network? In this case, instead of returning a String, the MessageService.getGreeting() method would return a CompletableFuture<String>. The calling module would be mindful of the network, even though the invocation might occur locally:

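//assumes `import static java.util.concurrent.TimeUnit.SECONDS;`
try {
  messageService.getGreeting()  //`CompletableFuture<String>`
      .completeOnTimeout("Hi, Everybody!", 5, SECONDS)  //fallback data if no reply within 5 seconds
      .thenAccept(System.out::println)
      .join();  //wait for the result; a failure surfaces here as a `CompletionException`
} catch(CompletionException completionException) {
  Throwable cause = completionException.getCause();
  //TODO handle the error
}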

This improved approach adds a timeout, and even provides default data to use if a timeout occurs. It also handles errors, which might include the connection breaking. (There are various flexible options for handling errors with CompletableFuture<T>.)
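One of those other options is to handle the error without blocking, by attaching a fallback with `exceptionally(…)`. The following is a minimal, self-contained sketch using only the standard `CompletableFuture` API (Java 9 or later). The `MessageService` interface here is a local stand-in for the one discussed above, and the three lambda "services" and the fallback strings are hypothetical, chosen only to simulate a prompt reply, a hung connection, and a broken connection.

```java
import static java.util.concurrent.TimeUnit.MILLISECONDS;

import java.io.IOException;
import java.util.concurrent.CompletableFuture;

final class NetworkMindfulDemo {

  /** Local stand-in for the `MessageService` interface discussed in the text. */
  interface MessageService {
    CompletableFuture<String> getGreeting();
  }

  /** Invokes the service with a timeout fallback and a non-blocking error fallback. */
  static CompletableFuture<String> greetingOrFallback(MessageService messageService) {
    return messageService.getGreeting()
        .completeOnTimeout("Hi, Everybody!", 200, MILLISECONDS) //fallback data if no reply in time
        .exceptionally(throwable -> "Greetings unavailable."); //fallback data if the call fails
  }

  public static void main(String[] args) {
    MessageService prompt = () -> CompletableFuture.completedFuture("Hello, World!");
    MessageService stalled = () -> new CompletableFuture<>(); //never completes, like a hung connection
    MessageService broken =
        () -> CompletableFuture.<String>failedFuture(new IOException("connection reset"));

    System.out.println(greetingOrFallback(prompt).join()); //prints "Hello, World!"
    System.out.println(greetingOrFallback(stalled).join()); //prints "Hi, Everybody!"
    System.out.println(greetingOrFallback(broken).join()); //prints "Greetings unavailable."
  }
}
```

The caller's code is identical in all three cases, which is the point: it does not need to know whether the service is local or on the other side of a network.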

Another improvement to make to the system would be to add a circuit breaker. However it's interesting to note that circuit-breaker functionality is not so much a concern of the invoking application or of the service being invoked, but rather a concern of the connection itself. That is, circuit-breaker functionality should be able to be defined for the “perforation” between the services. And as Flange Cloud is “managing the perforations”™, Flange Cloud will eventually be able to insert circuit breaker functionality, bulkhead functionality, caching, and other connection concerns automatically, without needing to pollute the application logic!
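To make the idea of a connection-level concern more concrete, here is a deliberately minimal circuit breaker written as a wrapper around an asynchronous call. This is purely an illustrative sketch and not how Flange Cloud implements or will implement the feature; the class name `SimpleCircuitBreaker` and its parameters are hypothetical. It also omits what a production circuit breaker would need, such as a timed "half-open" state to retry the service: once open, this one stays open.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical, deliberately minimal circuit breaker; not part of Flange Cloud. */
final class SimpleCircuitBreaker<T> {

  private final Supplier<CompletableFuture<T>> remoteCall;
  private final int failureThreshold;
  private final T fallback;
  private final AtomicInteger consecutiveFailures = new AtomicInteger();

  SimpleCircuitBreaker(Supplier<CompletableFuture<T>> remoteCall, int failureThreshold, T fallback) {
    this.remoteCall = remoteCall;
    this.failureThreshold = failureThreshold;
    this.fallback = fallback;
  }

  /** The circuit is open once the failure streak reaches the threshold. */
  boolean isOpen() {
    return consecutiveFailures.get() >= failureThreshold;
  }

  CompletableFuture<T> invoke() {
    if(isOpen()) { //open circuit: answer immediately without touching the connection
      return CompletableFuture.completedFuture(fallback);
    }
    return remoteCall.get().handle((value, error) -> {
      if(error != null) {
        consecutiveFailures.incrementAndGet();
        return fallback;
      }
      consecutiveFailures.set(0); //any success resets the failure streak
      return value;
    });
  }

  public static void main(String[] args) {
    AtomicInteger attempts = new AtomicInteger();
    SimpleCircuitBreaker<String> breaker = new SimpleCircuitBreaker<String>(() -> {
      attempts.incrementAndGet();
      return CompletableFuture.<String>failedFuture(new IllegalStateException("service down"));
    }, 2, "Hi, Everybody!");
    for(int i = 0; i < 5; i++) {
      breaker.invoke().join();
    }
    System.out.println("attempts made: " + attempts.get()); //prints "attempts made: 2"
  }
}
```

Note that neither the caller nor the failing service contains any circuit-breaker logic; the wrapper sits between them, which is where a framework managing the connection could place it.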

Minding the Marshalling

An additional requirement of an invocation that has the potential to be remote is to send and receive data that is able to be marshalled across a connection, even though in a local deployment it will still be passed as a reference. (“Marshalling” refers to serializing and deserializing an object so that it can be passed over a network connection.) Passing an unmarshallable object with references to the local system, for example, cannot be done in Flange Cloud; it's not something you can do in traditional microservices, either. Remember that the idea of a “perforated monolith” is to design local modules as if they were microservices. This prohibits trying to pass the following sort of data, for example:

searchService.findText(FileSystems.getDefault(), "needle"); //prohibited!

Not only is a Java FileSystem unable to be marshalled across a connection, it has an inordinate number of local references—not to mention that it references a system resource, the local file system, which might not be present on the other side of the “perforation”.

Java recently introduced the record type, which is an immutable value class that is ideal for transferring data across a connection. (See Baeldung's Java 14 Record Keyword for a quick and understandable example of a Person record.) Java has many other appropriate data objects which Flange Cloud can marshal just fine, including Java collections such as List<E>, the modern Java date/time classes, Locale, and even Optional<T>.
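As an illustration, the following sketch defines a service whose parameters and return value consist only of data: a record, a `List<E>`, a `LocalDate`, and an `Optional<T>`. The `Person` record, the `PersonService` interface, and its `findOldest(…)` method are hypothetical names invented for this example, and the code assumes Java 16 or later, where records are a standard language feature.

```java
import java.time.LocalDate;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;

final class MarshallableDataDemo {

  /** Immutable value type: only data, no references to local system resources. */
  record Person(String name, LocalDate birthDate) {}

  /** Hypothetical service interface using only data-carrying parameter and return types. */
  interface PersonService {
    CompletableFuture<Optional<Person>> findOldest(List<Person> people);
  }

  /** Local implementation; nothing here relies on sharing object identity with the caller. */
  static final class PersonServiceImpl implements PersonService {
    @Override
    public CompletableFuture<Optional<Person>> findOldest(List<Person> people) {
      //the earliest birth date belongs to the oldest person
      return CompletableFuture.completedFuture(
          people.stream().min(Comparator.comparing(Person::birthDate)));
    }
  }

  public static void main(String[] args) {
    PersonService personService = new PersonServiceImpl();
    List<Person> people = List.of(
        new Person("Grace", LocalDate.of(1906, 12, 9)),
        new Person("Ada", LocalDate.of(1815, 12, 10)));
    personService.findOldest(people).join()
        .map(Person::name)
        .ifPresent(System.out::println); //prints "Ada"
  }
}
```

Because records compare by value, a copy of a `Person` that has been serialized and deserialized is still equal to the original, so code on either side of the connection behaves the same way.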

Flange Cloud™: “Build a monolith. Deploy cloud native.”

If you follow these principles and build a well-designed monolith application with “perforations”, allowing it to be later split apart into microservices, how then do you go about doing the actual splitting? The answer is Flange Cloud.

This morning I released the first version of Flange Cloud™, a framework to transparently transform and deploy a well-designed monolith application to a distributed, elastic, highly-scalable cloud-native application. The overall Flange™ framework is a collection of lightweight libraries meant to replace older, bloated frameworks unsuitable for cloud deployment. The Flange suite of projects will include those for dependency injection, configuration, and internationalization. The Flange Cloud project in particular provides a means to split out a perforated monolith and deploy it to the cloud.

The following is a general conceptual outline for using Flange Cloud. For more extensive instructions see the Flange Cloud Quick Start guide.

Service Interface

Annotate a service interface such as CalendarService with @CloudFunctionApi. This tells Flange that the service will be invoked as a cloud function (e.g. Lambda in AWS). Use marshallable parameter types and return a CompletableFuture<T>.

@CloudFunctionApi
public interface CalendarService {

  CompletableFuture<Optional<String>> findHolidayName(LocalDate date);
…

Service Implementation

Annotate a service implementation such as CalendarServiceImpl with @CloudFunctionService. This tells Flange that the service will be implemented as a cloud function (e.g. Lambda in AWS). If the service implementation in turn depends on another service such as OtherService, add an annotation @ServiceConsumer(OtherService.class) indicating the interface of the other service. You will probably also want to inject the other service interface in the constructor, as shown in the example.

@CloudFunctionService
@ServiceConsumer(OtherService.class)
public class CalendarServiceImpl implements CalendarService {

  private final OtherService otherService;

  public CalendarServiceImpl(OtherService otherService) {
    this.otherService = requireNonNull(otherService);
  }

  @Override
  public CompletableFuture<Optional<String>> findHolidayName(LocalDate date) {
    if(date.getDayOfYear() == 1) {
      return CompletableFuture.completedFuture(Optional.of("New Year"));
    }
    //TODO check for other holidays
    return CompletableFuture.completedFuture(Optional.empty());
…

Flange Cloud Application

You can create an application “driver” program to invoke your functions for testing locally or invoking functions in the cloud. Flange Cloud provides an extremely lightweight FlangeCloudApplication interface to facilitate creating an application that recognizes command-line arguments for invoking deployed cloud functions, just as deployed Flange services invoke other Flange services in the cloud. The interface provides a convenient FlangeCloudApplication.start(…) method to execute your application from the main(…) entrypoint. The following example shows a simple application that implements FlangeCloudApplication and uses a CalendarService that may be local or in the cloud.

@ServiceConsumer(CalendarService.class)
public class HolidayApp implements FlangeCloudApplication {

  private final CalendarService calendarService;

  public HolidayApp(CalendarService calendarService) {
    this.calendarService = requireNonNull(calendarService);
  }

  @Override
  public void run() {
    //TODO access `calendarService`
  }

  public static void main(String[] args) {
    FlangeCloudApplication.start(HolidayApp.class, args);
  }

}

Executing the application locally or in the cloud is explained in a section below.
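As one possibility for the `//TODO access calendarService` step in `run()`, the following sketch shows how a driver might consume the future returned by `findHolidayName(…)`. To stay self-contained it uses a plain stand-in for the `CalendarService` interface, without any Flange annotations or Flange classes; the `describe(…)` helper and the lambda implementation are hypothetical and exist only for this illustration.

```java
import java.time.LocalDate;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;

final class HolidayLookupDemo {

  /** Plain stand-in for the `CalendarService` interface shown earlier, without Flange annotations. */
  interface CalendarService {
    CompletableFuture<Optional<String>> findHolidayName(LocalDate date);
  }

  /** Formats the lookup result; the caller cannot tell whether the service ran locally or remotely. */
  static CompletableFuture<String> describe(CalendarService calendarService, LocalDate date) {
    return calendarService.findHolidayName(date)
        .thenApply(holiday -> holiday
            .map(name -> date + " is " + name + ".")
            .orElse(date + " is not a holiday."));
  }

  public static void main(String[] args) {
    //local stand-in mirroring the New Year check in `CalendarServiceImpl`
    CalendarService calendarService = date -> CompletableFuture.completedFuture(
        date.getDayOfYear() == 1 ? Optional.of("New Year") : Optional.<String>empty());

    System.out.println(describe(calendarService, LocalDate.of(2024, 1, 1)).join()); //prints "2024-01-01 is New Year."
    System.out.println(describe(calendarService, LocalDate.of(2024, 1, 2)).join()); //prints "2024-01-02 is not a holiday."
  }
}
```

In a real driver the body of `run()` would call the injected `calendarService` in the same way, and would typically add a timeout and error handling as described in “Minding an Unreliable Network”.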

Building a Flange Project

Individual Maven projects may be built together as aggregates of a composite project, or may be developed independently as separate projects. Flange doesn't care if you store your entire application in a single source code repository or in multiple repositories.

Performing a build and preparing for deploying is done in typical Maven fashion:

mvn clean package

Local Application Execution, Local Service Invocation

To execute your application locally, invoking your services locally, simply run your application as you would normally, either from the command line or within your IDE. It's a monolith. It runs as you would expect it to. Feel free to use your debugger and step through all the services as they invoke each other locally, within the same JVM.

Deploy to the Cloud

Use the cloud deploy command of the Flange command-line interface (CLI) script flange.sh to deploy your services to the cloud. You may invoke the Flange CLI in a service implementation Maven project to deploy an individual service to the cloud, or invoke the Flange CLI in the root directory of an aggregate Maven project to deploy all services to the cloud. The following example assumes you have set up a deployment environment named dev, as described in the documentation.

flange.sh cloud deploy dev

Once you deploy your services to the cloud, they automatically know how to invoke other cloud services they depend on. In order to manually invoke a deployed Flange cloud function service, you can create a local application serving as a “driver”, as explained above. (The other option for manually invoking a deployed Flange cloud function service is to use a low-level cloud invocation, which requires knowledge of the underlying Flange Cloud marshalling technique and which is outside the scope of this overview. Future versions of Flange Cloud will add support for API gateways and static web front-ends, all of which will be able to run locally or be deployed to the cloud.)

Local Application Execution, Cloud Service Invocation

You can also execute your application locally, but have it invoke your deployed services in the cloud (assuming you have already deployed the services as explained above). This requires that your application use the FlangeCloudApplication facility explained previously. Execute it using the Flange CLI bin/flange.sh exec. Indicate the aws platform using the --flange-platform option, and specify the deployment environment using the --flange-env option. The following example shows how to execute one of the applications from the published examples, invoking services deployed on the AWS platform in the dev deployment environment.

flange.sh exec -- \
    dev.flange.example.cloud.hellouser_faas.app.HelloUserFaasApp \
    --flange-platform aws --flange-env dev

Flange Cloud Future

The first official Flange version, Flange 0.1.0, was released today. Flange is in early, rapid development and will soon gain many capabilities. Currently Flange Cloud supports deployment of services as Cloud Functions (FaaS) on the Amazon Web Services (AWS) cloud platform. Eventually support for Google Cloud and Microsoft Azure is envisioned.

Flange Cloud initially deploys services as cloud functions. Upcoming versions will allow services to be deployed as containerized microservices, expose service interfaces as RESTful APIs, and support event-driven architectures using queues, all with cloud-native support while still being able to run locally as a monolith for development.

A Flange Cloud Introduction Presentation is available online that provides an overview of many of these topics in slideshow format. For more extensive instructions, see the Flange Cloud Quick Start guide.