HATEOAS in the context of microservices

Worked example: a retail product platform

Suppose that we have a tech company called Products-R-Us. It provides a platform for promoting retail products: people can create product catalogs to manage the products they sell, manage stock inventory, and facilitate payments to sell their stock. We can easily imagine that this would involve at least three different (micro)services, one for each responsibility:

  • CatalogService – create and manage products and catalogs
  • InventoryService – manage stock inventory
  • PaymentService – manage payments

Products-R-Us exposes a RESTful API for each of these services so that developers can build their own tools and UI around the platform, and the company is fairly successful with this business strategy.

They soon hire data scientists to help expand their business using machine learning models. The goal is to build a product recommendations engine: clients can find out what products tend to be bought by people who bought product X. This results in a new microservice: RecommendationsService. This too will be exposed as a REST HTTP service.

So the question is this: how much data should we be exposing on our endpoints? Take GET /recommendations/products/:product_id. The endpoint will be used in a context where the client will present the recommended product to the customer. So on its face, we should be including things like:

  • product_name
  • thumbnail
  • price
  • merchant

But there are many use cases that would require us to fetch information from other microservices:

  • A client wanting to only recommend products that are in stock might ask to include in_stock as a property from the InventoryService.
  • A client wanting to show the available ways to pay for the product (e.g. debit card, PayPal, payment plan) may want payment_options as a property from the PaymentService.
  • A client wanting to only recommend products that have good ratings may want to include average_rating and total_reviews as properties on each response object (with data from the CatalogService).
  • In the future we may have a DiscountsService which calculates applicable discounts to products offered on the platform. A client wanting to recommend products with said discounts may want a discounts property. Would we need to add this on later?

And so on, and so forth. If we implemented each of these features, not only would the microservice be bloated, but it would also have an incredibly brittle contract. The data it returns would depend largely on downstream services, which can change for reasons that have nothing to do with the recommendations engine. A change in the way that stock is presented by the InventoryService could break the recommendations engine, even though these two concepts are entirely unrelated to each other!

With HATEOAS, we can keep this coupling loose and let our response format be fairly lean. If we simply have references to the various resources, then we can let them vary independently, and allow clients to simply crawl the links for the data that they need.

// GET /recommendations/products/1
// Content-Type: application/vnd.productsrus+json
{
    "results": [
        {
            "product_id": "2",
            "relevance_score": 0.9,
            "_links": {
                "product": "/products/2",
                "reviews": "/reviews/products/2",
                "stock": "/inventory/products/2",
                "recommendations": "/recommendations/products/2"
            }
        },
        {
            "product_id": "70",
            "relevance_score": 0.4,
            "_links": {
                "product": "/products/70",
                "reviews": "/reviews/products/70",
                "stock": "/inventory/products/70",
                "recommendations": "/recommendations/products/70"
            }
        }
    ],
    "_links": {
        "products": "/products?id=2&id=70",
        "reviews": "/reviews/products?id=2&id=70",
        "stock": "/inventory/products?id=2&id=70"
    }
}

A client integrating this API would hit the endpoint to get the list of recommendations, and then follow the _links to build up any information that is necessary for enhancing their recommendation or presenting data in their view layer. If they want the full product information for the nth product returned, they visit results[n]._links.product. If they want the product information for all of the products returned, rather than hitting each individual product link, they can visit the top-level _links.products to get a collection of just the relevant products.
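The link-following described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference client: the `fetch` function and the `FAKE_API` table are hypothetical stand-ins for real HTTP calls, and the sample values (such as the product name) are made up for the example.

```python
from typing import Any, Callable, Dict

Json = Dict[str, Any]

# Hypothetical in-memory stand-in for the HTTP layer: maps a URL to a JSON body.
# The product name and stock flag are invented sample data.
FAKE_API: Dict[str, Json] = {
    "/recommendations/products/1": {
        "results": [
            {
                "product_id": "2",
                "relevance_score": 0.9,
                "_links": {
                    "product": "/products/2",
                    "stock": "/inventory/products/2",
                },
            }
        ],
        "_links": {"products": "/products?id=2"},
    },
    "/products/2": {"product_id": "2", "product_name": "Kettle"},
    "/inventory/products/2": {"product_id": "2", "in_stock": True},
}


def fetch(url: str) -> Json:
    """Return the JSON body for a URL (a real client would issue an HTTP GET)."""
    return FAKE_API[url]


def follow(resource: Json, rel: str, get: Callable[[str], Json] = fetch) -> Json:
    """Follow the link named `rel` on a resource instead of building the URL by hand."""
    return get(resource["_links"][rel])


recommendations = fetch("/recommendations/products/1")
first = recommendations["results"][0]
product = follow(first, "product")
stock = follow(first, "stock")
print(product["product_name"], stock["in_stock"])  # Kettle True
```

The client only ever names a relation ("product", "stock"); it never assembles a downstream URL itself.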

Notice the benefits that we get here:

  • Adding references to more resources is simply a case of adding more _links to the response. So if I want to add payment_options in the future, I just add a link to that resource in the PaymentService or CatalogService (whichever is most relevant).
  • There is a clear separation between data that is crucial to the service and metadata that is potentially useful (but very use-case specific).
  • Because the client just follows the links to get to the desired resources, you have more flexibility to evolve the links if you need to in the future. Note that it is okay to evolve the API, and this is a particularly forgiving way of doing so.
  • We’re giving the client the whole proverbial "fishing rod": by hitting the relevant microservice for their desired information, they get everything they need when they need it, without us having to directly support it in this service. Crucially: if a downstream service has a new requirement resulting in adding a property to a resource, that change doesn’t ripple through to this service.
  • Also, a client that is primarily concerned with the recommendations API will not be affected by changes to the URL structure of the downstream services. This is why we don’t simply leave the product_id as a seed and tell the client to read the documentation to find other services: the links abstract yet another detail of the system away from the client, making it easier to use. Clients that call the downstream services directly, using hard-coded URLs, won’t be so lucky, but this still reduces some of the risk involved in changing your links if you need to.
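To make the first benefit concrete, here is a rough server-side sketch in Python of how the recommendations service might build each entry. Everything here is an assumption for illustration: the `LINK_TEMPLATES` table, the `build_result` helper, and in particular the payment options path, which the article does not specify.

```python
from typing import Dict

# Hypothetical link templates, keyed by relation name. These four mirror the
# sample response; adding a relation later is one more entry.
LINK_TEMPLATES: Dict[str, str] = {
    "product": "/products/{id}",
    "reviews": "/reviews/products/{id}",
    "stock": "/inventory/products/{id}",
    "recommendations": "/recommendations/products/{id}",
}


def build_result(product_id: str, relevance_score: float,
                 templates: Dict[str, str] = LINK_TEMPLATES) -> dict:
    """Build one recommendation entry: core data plus links, no downstream data."""
    return {
        "product_id": product_id,
        "relevance_score": relevance_score,
        "_links": {rel: t.format(id=product_id) for rel, t in templates.items()},
    }


before = build_result("2", 0.9)

# Later: expose payment options. The path below is invented for this sketch.
extended = dict(LINK_TEMPLATES, payment_options="/payments/products/{id}/options")
after = build_result("2", 0.9, extended)

print(sorted(after["_links"]))
```

Existing links are unchanged in `after`, so a client that only reads "product" or "stock" keeps working when the new relation appears.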

This is only possible, however, if the underlying microservices provide a suitable level of querying for these desired resources. Given that we’re just talking about filtering by one or more IDs, this isn’t too demanding, and your clients will thank you for making your services easy to query!
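Filtering by several IDs, as in the collection links of the sample response, needs nothing more than a repeated query parameter. A small sketch using Python's standard `urllib.parse` module follows; the helper names `collection_url` and `requested_ids` are made up for this example.

```python
from typing import List
from urllib.parse import parse_qs, urlencode, urlsplit


def collection_url(base_path: str, ids: List[str]) -> str:
    """Build a collection link such as /products?id=2&id=70."""
    return f"{base_path}?{urlencode({'id': ids}, doseq=True)}"


def requested_ids(url: str) -> List[str]:
    """Server side: read the repeated `id` parameter back out of the URL."""
    return parse_qs(urlsplit(url).query).get("id", [])


url = collection_url("/products", ["2", "70"])
print(url)                 # /products?id=2&id=70
print(requested_ids(url))  # ['2', '70']
```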

Furthermore, if it is absolutely necessary that an endpoint return all of the data in one go, then a composite service can be built which aggregates the data together into a single endpoint (by following the links and combining the data together before returning the response). This effectively means building a facade to hide the complexity of the link crawling and composition; however, we gain the benefit of keeping our original service small, stable and flexible.
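One possible shape for such a composite service is sketched below in Python. It assumes, purely for illustration, that each downstream collection endpoint returns a `{"results": [...]}` body whose items carry a `product_id`; the article does not define those formats. The `compose` function and the `CANNED` responses are hypothetical.

```python
from typing import Any, Callable, Dict, List

Json = Dict[str, Any]


def compose(recommendations_url: str, rels: List[str],
            get: Callable[[str], Json]) -> List[Json]:
    """Fetch recommendations, follow the collection links, merge by product_id."""
    recs = get(recommendations_url)
    merged = {r["product_id"]: {k: v for k, v in r.items() if k != "_links"}
              for r in recs["results"]}
    for rel in rels:
        # Assumption: each downstream collection returns {"results": [...]}
        # whose items carry a product_id.
        for item in get(recs["_links"][rel])["results"]:
            if item["product_id"] in merged:
                merged[item["product_id"]].update(item)
    return list(merged.values())


# Hypothetical canned responses standing in for the real services.
CANNED: Dict[str, Json] = {
    "/recommendations/products/1": {
        "results": [
            {"product_id": "2", "relevance_score": 0.9, "_links": {}},
            {"product_id": "70", "relevance_score": 0.4, "_links": {}},
        ],
        "_links": {
            "products": "/products?id=2&id=70",
            "stock": "/inventory/products?id=2&id=70",
        },
    },
    "/products?id=2&id=70": {"results": [
        {"product_id": "2", "product_name": "Kettle"},
        {"product_id": "70", "product_name": "Toaster"},
    ]},
    "/inventory/products?id=2&id=70": {"results": [
        {"product_id": "2", "in_stock": True},
        {"product_id": "70", "in_stock": False},
    ]},
}

combined = compose("/recommendations/products/1", ["products", "stock"],
                   CANNED.__getitem__)
print(combined[0])
```

The facade makes one request per relation rather than one per product, because it uses the collection-level links; the recommendations service itself stays unchanged.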