Worked example: a retail product platform
Suppose that we have a tech company called Products-R-Us. It provides a platform for promoting retail products: people can create product catalogs to manage the products they sell, manage stock inventory and facilitate payments to sell their stock. We can easily imagine that this would involve at least three different (micro)services, one for each responsibility:
- CatalogService – create and manage products and catalogs
- InventoryService – manage stock inventory
- PaymentService – manage payments
Products-R-Us exposes a RESTful API for each of these services so that developers can build their own tools and UIs around the platform, and the company is fairly successful with this business strategy.
They soon hire data scientists to help expand their business using machine learning models. The goal is to build a product recommendations engine: clients can find out what products tend to be bought by people who bought product X. This results in a new microservice: RecommendationsService. This too will be exposed as a REST HTTP service.
So the question is this: how much data should we be exposing on our endpoints? Take GET /recommendations/products/:product_id. The endpoint will be used in a context where the client will present the recommended product to the customer. So on its face, we should be including things like:
- product_name
- thumbnail
- price
- merchant
But there are many use cases that would require us to fetch information from other microservices:
- A client wanting to only recommend products that are in stock might ask to include in_stock as a property from the InventoryService.
- A client wanting to show the available ways to pay for the product (e.g. debit card, PayPal, payment plan, etc.) may want payment_options as a property from the PaymentsService.
- A client wanting to only recommend products that have good ratings may want to include average_rating and total_reviews as properties on each response object (with data from the CatalogService).
- In the future we may have a DiscountsService which calculates applicable discounts to products offered on the platform. A client wanting to recommend products with said discounts may want a discounts property. Would we need to add this on later?
And so on, and so forth. If we implemented each of these features, not only would the microservice be bloated, but it would also have an incredibly brittle contract. The data it returns would depend largely on downstream services, which can change for reasons that have nothing to do with the recommendations engine. A change in the way that stock is presented by the InventoryService would completely break the recommendations engine, even though these two concepts are entirely unrelated to each other!
With HATEOAS, we can keep this coupling loose and let our response format be fairly lean. If we simply have references to the various resources, then we can let them vary independently, and allow clients to simply crawl the links for the data that they need.
// GET /recommendations/products/1
// Content-Type: application/vnd.productsrus+json
{
  "results": [
    {
      "product_id": "2",
      "relevance_score": 0.9,
      "_links": {
        "product": "/products/2",
        "reviews": "/reviews/products/2",
        "stock": "/inventory/products/2",
        "recommendations": "/recommendations/products/2"
      }
    },
    {
      "product_id": "70",
      "relevance_score": 0.4,
      "_links": {
        "product": "/products/70",
        "reviews": "/reviews/products/70",
        "stock": "/inventory/products/70",
        "recommendations": "/recommendations/products/70"
      }
    }
  ],
  "_links": {
    "products": "/products?id=2&id=70",
    "reviews": "/reviews/products?id=2&id=70",
    "stock": "/inventory/products?id=2&id=70"
  }
}
A client integrating this API would hit the endpoint to get the list of recommendations, and then follow the _links to build up whatever information they need to enhance their recommendations or present data in their view layer. If they want the full product information for the nth product returned, they visit results[n]._links.product. If they want the product information for all of the products returned, rather than hitting each individual product link, they can visit _links.products to get a collection of just the relevant products.
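As a rough illustration of that client flow, here is a minimal Python sketch. The `fetch` callable, the helper names and the product bodies (`product_name` values and the shape of the collection response) are assumptions made up for this example; only the link relations and URLs come from the response above. A dictionary stands in for the network so the sketch runs on its own.

```python
from typing import Any, Callable

# Hypothetical transport: takes a URL path and returns the decoded JSON body.
# A real client would wrap an HTTP library call here.
Fetch = Callable[[str], Any]


def product_for(recommendations: dict, n: int, fetch: Fetch) -> Any:
    """Follow the per-item `product` link of the nth recommendation."""
    return fetch(recommendations["results"][n]["_links"]["product"])


def all_products(recommendations: dict, fetch: Fetch) -> Any:
    """Follow the top-level `products` link to load every product at once."""
    return fetch(recommendations["_links"]["products"])


# Stand-in responses so the sketch runs without a network (bodies are invented).
FAKE_RESPONSES = {
    "/recommendations/products/1": {
        "results": [
            {"product_id": "2", "relevance_score": 0.9,
             "_links": {"product": "/products/2"}},
            {"product_id": "70", "relevance_score": 0.4,
             "_links": {"product": "/products/70"}},
        ],
        "_links": {"products": "/products?id=2&id=70"},
    },
    "/products/2": {"id": "2", "product_name": "Kettle"},
    "/products?id=2&id=70": [
        {"id": "2", "product_name": "Kettle"},
        {"id": "70", "product_name": "Toaster"},
    ],
}

fetch = FAKE_RESPONSES.__getitem__
recs = fetch("/recommendations/products/1")
first = product_for(recs, 0, fetch)      # one request for one product
everything = all_products(recs, fetch)   # one request for the whole set
```

Note that the client never assembles a URL itself; it only reads the link it was given, which is what lets the server change those URLs later.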
Notice the benefits that we get here:
- Adding references to more resources is simply a case of adding more _links to the response. So if I want to add payment_options in the future, I just give a link to that resource in the PaymentsService or CatalogService (whichever is most relevant).
- There is a clear separation between data that is crucial to the service and metadata that is potentially useful (but very use-case specific).
- Because the client just follows the links to get to the desired resources, you have more flexibility to evolve the links if you need to in the future. Note that it is okay to evolve the API, and this is a particularly forgiving way of doing so.
- We’re giving the client the whole proverbial "fishing rod": by hitting the relevant microservice for their desired information, they get everything they need when they need it, without us having to directly support it in this service. Crucially: if a downstream service has a new requirement resulting in adding a property to a resource, that change doesn’t ripple through to this service.
- Also, a client that is primarily concerned with the recommendations API will not be affected by changes to the URL structure of the downstream services. This is why we don’t simply leave the product_id as a seed and tell the client to read the documentation to find other services. Providing the links abstracts yet another detail of the system from the client, making it easier to use. Clients that do rely on their own knowledge of those services won’t be so lucky, but this still reduces some of the risk involved in changing your links if you need to.
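To make the first benefit concrete, here is a small Python sketch of how the recommendations service might assemble each result from a table of link templates. The table, the `build_result` helper and the `payment_options` URL are hypothetical; only the first four relations mirror the example response.

```python
# Hypothetical link templates, one per relation. These four mirror the example
# response; the service's real routing may look different.
LINK_TEMPLATES = {
    "product": "/products/{id}",
    "reviews": "/reviews/products/{id}",
    "stock": "/inventory/products/{id}",
    "recommendations": "/recommendations/products/{id}",
}


def build_result(product_id, relevance_score, templates=LINK_TEMPLATES):
    """Build one entry of `results`: core data plus links, nothing embedded."""
    return {
        "product_id": product_id,
        "relevance_score": relevance_score,
        "_links": {
            rel: template.format(id=product_id)
            for rel, template in templates.items()
        },
    }


# Supporting a new relation is one extra template. The URL below is invented
# purely for illustration.
EXTENDED_TEMPLATES = {
    **LINK_TEMPLATES,
    "payment_options": "/payments/products/{id}/options",
}

before = build_result("2", 0.9)
after = build_result("2", 0.9, EXTENDED_TEMPLATES)
```

Every link present in `before` is unchanged in `after`, so a client that ignores relations it does not recognise keeps working when the new one appears.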
This is only possible, however, if the underlying microservices provide a suitable level of querying for these desired resources. Given that we’re just talking about filtering by a single ID or multiple IDs, this isn’t too demanding, and your clients will thank you for making your services easy to query!
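For the multi-ID case, a repeated `id` query parameter (as in `/products?id=2&id=70`) can be produced and read back with Python's standard library alone. The two helper names below are assumptions for this sketch, not part of any existing service.

```python
from typing import List, Sequence
from urllib.parse import parse_qs, urlencode, urlsplit


def collection_link(base_path: str, ids: Sequence[str]) -> str:
    """Build a link such as /products?id=2&id=70 from a list of IDs."""
    # doseq=True expands the list into one `id=` pair per element.
    return f"{base_path}?{urlencode({'id': list(ids)}, doseq=True)}"


def requested_ids(url: str) -> List[str]:
    """Read the repeated `id` parameter back out on the receiving service."""
    return parse_qs(urlsplit(url).query).get("id", [])


link = collection_link("/products", ["2", "70"])
ids = requested_ids(link)
```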
Furthermore, if it is absolutely necessary that an endpoint return all of the data in one go, then a composite service can be built which aggregates the data together into a single endpoint (by following the links and combining the data together before returning the response). This effectively means building a facade to hide the complexity of the link crawling and composition; however, we gain the benefit of keeping our original service small, stable and flexible.
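A composite service of that kind could be sketched roughly as follows. Everything here is an assumption for illustration: the `compose` function, the injected `fetch` transport, the choice of relations to embed and the downstream response bodies. It follows the per-item links one at a time for simplicity; a real facade might prefer the top-level collection links to cut down on requests.

```python
from typing import Any, Callable, Sequence

# Hypothetical transport: takes a URL path and returns the decoded JSON body.
Fetch = Callable[[str], Any]


def compose(product_id: str, fetch: Fetch,
            relations: Sequence[str] = ("product", "stock")) -> list:
    """Return recommendations with the chosen linked resources embedded."""
    recs = fetch(f"/recommendations/products/{product_id}")
    composed = []
    for item in recs["results"]:
        merged = {
            "product_id": item["product_id"],
            "relevance_score": item["relevance_score"],
        }
        for rel in relations:
            link = item["_links"].get(rel)
            if link is not None:  # skip relations the service did not offer
                merged[rel] = fetch(link)
        composed.append(merged)
    return composed


# Stand-in responses so the sketch runs without a network (bodies are invented).
FAKE_RESPONSES = {
    "/recommendations/products/1": {
        "results": [
            {"product_id": "2", "relevance_score": 0.9,
             "_links": {"product": "/products/2",
                        "stock": "/inventory/products/2"}},
        ],
    },
    "/products/2": {"product_name": "Kettle"},
    "/inventory/products/2": {"in_stock": True},
}

combined = compose("1", FAKE_RESPONSES.__getitem__)
```

The recommendations service itself is untouched by this; the facade is the only component that needs to know how the pieces fit together.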