Monday, 19 September 2016

How to measure the mean performance of bulk APIs?

Implementing bulk APIs obscures the metrics reported by an APM (e.g. NewRelic): callers of a given API can send an arbitrary number of parameters in one request, so its metrics depend on the size of each request's workload.

In this scenario, is it a good approach to parallelize the requests on the client side, so that the server-side API only ever receives a single parameter?

With HTTP/2, this should cost almost nothing in terms of request overhead.

Example of scenario

Bulk API request

  1. Client needs to fetch IDs 1,2,3,4,5 from the server API (e.g. a cars API).
  2. Client makes one HTTP GET request to https://server_url/api/cars?ids=1,2,3,4,5
  3. Server gathers all the necessary data and responds with a single HTTP response.
  4. Client parses the whole response body and does what needs to be done.

In this approach, the APM reports a mean response time for my API, but it varies with the size of the request payload.
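To illustrate why that mean is hard to interpret, here is a minimal sketch. It assumes a made-up cost model (a fixed overhead plus a cost per requested ID); the constants and function names are hypothetical and are not measurements from any real API or APM.

```python
from statistics import mean
from typing import Sequence

# Hypothetical cost model, not measured data: each bulk request pays a
# fixed overhead plus a cost for every requested ID.
FIXED_OVERHEAD_MS = 20.0
PER_ITEM_MS = 5.0


def bulk_request_ms(id_count: int) -> float:
    """Simulated duration of one bulk request carrying id_count IDs."""
    return FIXED_OVERHEAD_MS + PER_ITEM_MS * id_count


def mean_request_ms(batch_sizes: Sequence[int]) -> float:
    """Mean duration per request, which is what a per-endpoint mean reports."""
    return mean(bulk_request_ms(n) for n in batch_sizes)


def mean_per_item_ms(batch_sizes: Sequence[int]) -> float:
    """Total simulated time divided by the total number of IDs served."""
    total_ms = sum(bulk_request_ms(n) for n in batch_sizes)
    return total_ms / sum(batch_sizes)


small_batches = [1, 2, 1, 3]        # callers asking for a few IDs
large_batches = [50, 80, 100, 60]   # same endpoint, much larger requests

print(mean_request_ms(small_batches))   # 28.75
print(mean_request_ms(large_batches))   # 382.5
print(mean_per_item_ms(small_batches), mean_per_item_ms(large_batches))
```

Under this assumed model the endpoint's code is identical in both cases, yet the per-request mean differs by more than a factor of ten purely because of the batch sizes.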

Non-bulk API requests

  1. Client needs to fetch IDs 1,2,3,4,5 from the server API (e.g. a cars API).
  2. Client multiplexes the work and makes 5 concurrent requests to the server:
  3. Client does an HTTP GET to https://server_url/api/cars/1
  4. Client does an HTTP GET to https://server_url/api/cars/2
  5. Client does an HTTP GET to https://server_url/api/cars/3
  6. Client does an HTTP GET to https://server_url/api/cars/4
  7. Client does an HTTP GET to https://server_url/api/cars/5
  8. Once all requests have returned, the client parses the responses, merges the data and does what needs to be done.

In this approach, the APM reports an exact (or nearly exact) mean response time for my API, because the payload size is always the same.
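The client side of this approach could look roughly like the sketch below. It uses asyncio from the standard library, and `fetch_car` is a hypothetical stand-in that simulates the network call with a short sleep; a real client would issue the GET through an HTTP library, ideally over a single HTTP/2 connection.

```python
import asyncio


async def fetch_car(car_id: int) -> dict:
    """Hypothetical stand-in for GET https://server_url/api/cars/<car_id>."""
    await asyncio.sleep(0.01)  # simulated network latency, not a real request
    return {"id": car_id}


async def fetch_cars(car_ids):
    # Start every single-ID request concurrently and wait for all of them.
    # asyncio.gather returns the results in the same order as the inputs.
    responses = await asyncio.gather(*(fetch_car(i) for i in car_ids))
    # Merge step: index the individual responses by ID.
    return {car["id"]: car for car in responses}


cars = asyncio.run(fetch_cars([1, 2, 3, 4, 5]))
print(cars)
```

The merge step is the price of this design: the client has to reassemble what the bulk endpoint would have returned in one body.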

Considerations

Consideration 1: In the server-side code, delegating the API request to an inner tier and monitoring each individual request at the point where the data is fetched may be a solution (e.g. single-row selects run in parallel against a database). That amounts to parallelizing the requests internally while still exposing a single bulk API. However, this question is about API design.
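As a rough sketch of what Consideration 1 describes, the bulk handler below fans out to a per-ID loader on a thread pool and times each lookup separately. `load_car` and the `item_timings_ms` list are hypothetical placeholders for a real database query and for whatever custom metric the APM accepts.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a per-item metric that would be reported to the APM.
item_timings_ms = []


def load_car(car_id):
    """Hypothetical single-row lookup; a real one would query the database."""
    return {"id": car_id}


def timed_load(car_id):
    # Time each lookup on its own, so the metric has a constant workload
    # regardless of how many IDs the bulk request carried.
    start = time.perf_counter()
    car = load_car(car_id)
    item_timings_ms.append((time.perf_counter() - start) * 1000.0)
    return car


def handle_bulk_request(car_ids):
    # Fan out internally; pool.map returns results in input order.
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(timed_load, car_ids))


cars = handle_bulk_request([1, 2, 3, 4, 5])
print(len(item_timings_ms), "per-item timings recorded")
```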

Consideration 2: I guess this scenario only applies to the HTTP GET method, since a POST request (e.g. a form upload using multipart/form-data) has a variable workload in most use cases.

Is this a good approach?
