
The story of how not to design an API

I once helped a friend who needed to integrate data on vacant and occupied accommodation from a property management system into his client's site. To my delight, the system had an API. Unfortunately, it was very poorly designed.


I decided to write this article not to criticize the system in question, but to describe the mistakes made in the design of its API and to suggest ways to correct them.

Situation overview


The organization in question used the Beds24 system to manage its accommodation. Availability information was synchronized with various booking systems (such as Booking, AirBnB and others). The organization was developing a site and wanted its search to display only rooms that were free during the specified period and had sufficient capacity. The task looked very simple, since Beds24 provides an API for integration with other systems. In practice, it turned out that the developers of this API made a lot of mistakes when designing it. I propose to analyze these mistakes, identify the specific problems, and talk about how to approach API design in such situations.

Problem number 1: request body format


Since the client only needs to know whether, say, a hotel room is free or occupied, the only API endpoint we are interested in is /getAvailabilities. And although a call to this endpoint merely reads availability data, it has to be sent as a POST request, because the author of the API decided to accept filters in the form of a JSON request body. Here is a list of the possible request parameters with example values:

    {
        "checkIn": "20151001",
        "lastNight": "20151002",
        "checkOut": "20151003",
        "roomId": "12345",
        "propId": "1234",
        "ownerId": "123",
        "numAdult": "2",
        "numChild": "0",
        "offerId": "1",
        "voucherCode": "",
        "referer": "",
        "agent": "",
        "ignoreAvail": false,
        "propIds": [
            1235,
            1236
        ],
        "roomIds": [
            12347,
            12348,
            12349
        ]
    }

Let's go over this JSON object and talk about what's wrong here.

  1. Dates (checkIn, lastNight and checkOut) are in YYYYMMDD format. There is no good reason not to use the ISO 8601 format (YYYY-MM-DD) when converting dates to strings, since it is a widely used standard for representing dates. It is familiar to many developers, and many JSON libraries and date parsers accept it out of the box. In addition, the lastNight field appears to be redundant, since there is a checkOut field, which is always exactly one day later than the date specified in lastNight. Given these shortcomings, I suggest that when designing similar APIs you always use standard date representations and do not burden API users with redundant data.
  2. All identifier fields, as well as numAdult and numChild, are numeric but are represented as strings. There is no apparent reason for representing them as strings.
  3. There are two pairs of fields here: roomId and roomIds, and propId and propIds. The roomId and propId fields are redundant, since identifiers can already be passed in the roomIds and propIds arrays. There is also a type problem: roomId is a string, while the roomIds array must contain numeric identifiers. This can lead to confusion and parsing problems, and it also means that the server handles some identifiers as strings and others as numbers, even though they represent the same data.

I would suggest that API developers try not to complicate the lives of those who will use their APIs by making design mistakes like the ones described above. Namely, strive for standard data formatting, avoid redundancy, and do not use different data types to represent the same kind of entity. And do not represent everything indiscriminately as strings.
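To make these recommendations concrete, here is a minimal Python sketch that converts a request in the format shown above into a cleaner shape: ISO 8601 dates, numeric identifiers, and a single list per kind of identifier. The function names and the target layout are my own assumptions for illustration; they are not part of the Beds24 API.

```python
from datetime import datetime, timedelta


def to_iso_date(compact: str) -> str:
    """Convert a YYYYMMDD string to an ISO 8601 date (YYYY-MM-DD)."""
    return datetime.strptime(compact, "%Y%m%d").date().isoformat()


def merge_ids(single, many) -> list:
    """Merge a scalar id (possibly a string) and a list of ids into one sorted list of ints."""
    ids = {int(value) for value in (many or [])}
    if single not in (None, ""):
        ids.add(int(single))
    return sorted(ids)


def normalize_request(legacy: dict) -> dict:
    """Build a cleaner request from the legacy one (hypothetical target shape)."""
    return {
        "checkIn": to_iso_date(legacy["checkIn"]),
        # lastNight is dropped: it is always checkOut minus one day.
        "checkOut": to_iso_date(legacy["checkOut"]),
        "roomIds": merge_ids(legacy.get("roomId"), legacy.get("roomIds")),
        "propIds": merge_ids(legacy.get("propId"), legacy.get("propIds")),
        "numAdult": int(legacy.get("numAdult", 0)),
        "numChild": int(legacy.get("numChild", 0)),
    }


def last_night(check_out_iso: str) -> str:
    """Derive lastNight from checkOut for callers that still need it."""
    day = datetime.strptime(check_out_iso, "%Y-%m-%d").date()
    return (day - timedelta(days=1)).isoformat()
```

With a request of this shape, the server works with one type per field and the client never has to keep lastNight and checkOut in sync by hand.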

Problem number 2: response body format


As already mentioned, we are only interested in the /getAvailabilities endpoint. Let's look at what its response looks like and talk about the flaws in how it is built. Remember that when calling the API, we want a list of identifiers of rooms that are free for a given period and can accommodate a given number of people. Below is an example of a request body and an example of what the API returns in response.

Here is the request:

    {
        "checkIn": "20190501",
        "checkOut": "20190503",
        "ownerId": "25748",
        "numAdult": "2",
        "numChild": "0"
    }

Here is the response:

    {
        "10328": {
            "roomId": "10328",
            "propId": "4478",
            "roomsavail": "0"
        },
        "13219": {
            "roomId": "13219",
            "propId": "5729",
            "roomsavail": "0"
        },
        "14900": {
            "roomId": "14900",
            "propId": "6779",
            "roomsavail": 1
        },
        "checkIn": "20190501",
        "lastNight": "20190502",
        "checkOut": "20190503",
        "ownerId": 25748,
        "numAdult": 2
    }

Let's talk about the problems with this response.

  1. In the response body, ownerId and numAdult have suddenly become numbers, although in the request they had to be specified as strings.
  2. The list of rooms is presented as properties of the response object, keyed by room identifier (roomId). It would be logical to expect such data to be returned as an array. For us, this means that to get a list of available rooms, you need to iterate over the entire object, checking that each nested object has certain properties, such as roomsavail, and ignoring fields like checkIn and lastNight. Then you need to check the value of the roomsavail property: if it is greater than 0, the corresponding room is available for booking.
  3. Now let's look at the roomsavail property itself. It appears in the response body in two forms: "roomsavail": "0" and "roomsavail": 1. See the pattern? If the room is occupied, the value is a string. If it is free, the value becomes a number. This can cause many problems in statically typed languages, where the same property should not take values of different types.

In view of the above, I suggest that developers use arrays of JSON objects to represent data sets, rather than awkward key-value constructions like the one considered here. In addition, make sure that the fields of homogeneous objects do not contain data of different types. A properly formatted server response might look like the one below. Note that with this structure the room information contains no duplicate data.

    {
        "properties": [
            {
                "id": 4478,
                "rooms": [
                    {
                        "id": 10328,
                        "available": false
                    }
                ]
            },
            {
                "id": 5729,
                "rooms": [
                    {
                        "id": 13219,
                        "available": false
                    }
                ]
            },
            {
                "id": 6779,
                "rooms": [
                    {
                        "id": 14900,
                        "available": true
                    }
                ]
            }
        ],
        "checkIn": "2019-05-01",
        "lastNight": "2019-05-02",
        "checkOut": "2019-05-03",
        "ownerId": 25748,
        "numAdult": 2
    }
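The difference in client code is easy to see in a short Python sketch. The first function does what the current response format forces on us; the second works with the array-based format proposed above. Both function names are mine, and the second one assumes the proposed layout, which the real API does not return.

```python
def available_rooms_legacy(response: dict) -> list:
    """Extract available room ids from the key-value response format."""
    room_ids = []
    for value in response.values():
        # Skip scalar fields such as checkIn or ownerId; rooms are nested objects.
        if not isinstance(value, dict) or "roomsavail" not in value:
            continue
        # roomsavail arrives as "0" (string) or 1 (number), so coerce it first.
        if int(value["roomsavail"]) > 0:
            room_ids.append(int(value["roomId"]))
    return room_ids


def available_rooms_proposed(response: dict) -> list:
    """Extract available room ids from the proposed array-based format."""
    return [
        room["id"]
        for prop in response["properties"]
        for room in prop["rooms"]
        if room["available"]
    ]
```

The second version needs no type checks and no coercion, because every field has exactly one type and rooms are not mixed with request metadata.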

Problem number 3: error handling


Here is how error handling is organized in this API: the system responds with code 200 to every request, even if an error has occurred. This means that the only way to distinguish a normal response from an error response is to parse the response body and check for the presence of an error or errorCode field. The API provides only the following 6 error codes.


API Beds24 Error Codes

I suggest that everyone reading this avoid returning a response with code 200 (request processed successfully) when something went wrong during processing. Doing so is justified only if the framework on which you build the API requires it. Returning appropriate response codes lets API clients know in advance whether they need to parse the response body and how to do so (that is, whether to parse it as a normal response or as an error object).

In our case, the API can be improved in two ways: either return a specific HTTP code in the 400-499 range for each of the 6 possible errors (the better option), or return status code 500, which at least lets the client know, before parsing the response body, that it contains error information.
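Here is a minimal Python sketch of the first option. The error names in the table are hypothetical placeholders of mine, not the actual Beds24 error codes; the point is only that each application error maps to an HTTP status and that the client can branch on the status before touching the body.

```python
import json
from http import HTTPStatus

# Hypothetical application error names; the real Beds24 codes are not reproduced here.
ERROR_STATUS = {
    "invalid_request": HTTPStatus.BAD_REQUEST,
    "unauthorized": HTTPStatus.UNAUTHORIZED,
    "forbidden": HTTPStatus.FORBIDDEN,
    "not_found": HTTPStatus.NOT_FOUND,
    "too_many_requests": HTTPStatus.TOO_MANY_REQUESTS,
}


def build_error_response(error: str, message: str) -> tuple:
    """Server side: return (status_code, body) for an application error."""
    # Unknown errors fall back to 500 so they are never reported as success.
    status = ERROR_STATUS.get(error, HTTPStatus.INTERNAL_SERVER_ERROR)
    body = json.dumps({"error": error, "message": message})
    return int(status), body


def handle_response(status_code: int, body: str) -> dict:
    """Client side: decide how to interpret the body from the status code alone."""
    payload = json.loads(body)
    if 200 <= status_code < 300:
        return payload
    raise RuntimeError(f"API error {status_code}: {payload.get('error')}")
```

The fallback to 500 corresponds to the second, weaker option: even an unmapped error is at least never reported as a success.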

Problem number 4: "instructions"


The following are the "instructions" for using the API, taken from the project documentation:

Please read the following instructions when using the API.

  1. Calls to the API should be designed so that the minimum amount of data is sent and received.
  2. API calls are executed one at a time. You must wait for the current call to the API to complete before making the next call.
  3. If you need to make multiple calls to the API, leave a pause of several seconds between them.
  4. API calls should not be made too often; keep them to the minimum needed to solve the client's tasks.
  5. Excessive use of the API within a 5-minute period will result in your account being blocked without further notice.
  6. We reserve the right to block access to the system for customers who, in our opinion, overuse the API. This is done at our discretion and without additional notice.

While points 1 and 4 look quite reasonable, I cannot agree with the others. Let's consider them.

  1. Item 2. If you are developing a REST API, it is expected to be stateless. The independence of API calls from previous calls is one of the reasons REST has found widespread use in cloud applications. If a module does not maintain state, it can easily be redeployed after a failure, and systems built from such modules scale easily as the load changes. When designing a RESTful API, make sure that it is stateless and that its users do not have to worry about things like executing only one request at a time.
  2. Item 3. This item looks rather strange and ambiguous. I cannot tell why it was written, but I get the impression that while processing a request the system performs certain actions, and that “distracting” it with another request sent at the wrong moment can disrupt its work. In addition, the phrase “several seconds” does not tell us exactly how long the pause between successive requests should be.
  3. Items 5 and 6. They refer to “excessive use of the API”, but give no criteria for what counts as excessive. Is it 10 requests per second? Or maybe 1? Besides, some web projects handle huge amounts of traffic. If access to an API they depend on can be cut off without a clear reason and without notice, their administrators will most likely decide not to use that API at all. If you ever write such instructions, use clear wording and put yourself in the place of the users who will have to work with your system guided by them.
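One way to replace vague wording like “excessive use” is to publish an explicit limit and enforce it in code. Below is a rough Python sketch of a fixed-window limiter. The class name and the concrete numbers are my assumptions; only the 5-minute window echoes the instructions quoted above.

```python
import time


class RateLimiter:
    """Fixed-window limiter: at most `limit` calls per `window` seconds per client."""

    def __init__(self, limit: int, window: float, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._windows = {}  # client id -> (window start, calls made)

    def check(self, client_id: str) -> tuple:
        """Return (allowed, retry_after_seconds) for one incoming call."""
        now = self.clock()
        start, count = self._windows.get(client_id, (now, 0))
        if now - start >= self.window:
            start, count = now, 0  # the previous window has expired
        if count >= self.limit:
            # Tell the client exactly how long to wait instead of blocking the account.
            return False, start + self.window - now
        self._windows[client_id] = (start, count + 1)
        return True, 0.0
```

A server using something like this could answer a rejected call with status 429 (Too Many Requests) and a Retry-After header, so that clients know both the rule and the remaining waiting time.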

Problem number 5: documentation


This is what the API documentation looks like.


API Beds24 documentation

The only problem with this documentation is its appearance. It would look much better if it were well formatted. To show what such documentation could look like, I made the following version of it using Dillinger, spending less than two minutes on it. In my opinion, it looks much better than the original.


Improved documentation

It is a good idea to use dedicated tools to create such materials. For simple documents like the one above, a regular markdown file is quite enough. If the documentation is more complex, it is best to use tools like Swagger or Apiary.

By the way, if you want to take a look at the documentation for the API Beds24 - look here .

Problem number 6: security


The documentation for all API endpoints states the following:

To use these functions, access to the API must be enabled. This is done in the menu SETTINGS → ACCOUNT → ACCOUNT ACCESS.

However, in reality anyone can access this API and, using some of the calls, obtain information from it without providing any credentials. This applies, for example, to requests for the availability of particular accommodations. Another part of the documentation says the following:

Most JSON methods require an API key to access the account. The API access key can be set using SETTINGS → ACCOUNT → ACCOUNT ACCESS.

In addition to the confusing explanation of authentication, it turns out that users must create the API access key themselves (by manually filling in the corresponding field; there is no way to generate a key automatically). The key must be between 16 and 64 characters long. Allowing users to invent their own API keys can lead to highly insecure keys that are easy to guess. Problems related to key contents are also possible, since anything can be entered in the key field. In the worst case, this could open the service to SQL injection or a similar attack. When designing an API, do not let users create access keys themselves. Instead, generate keys for them automatically. The user should not be able to change the contents of such a key, but should be able to generate a new one when necessary, which invalidates the old key.
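As an illustration, here is a small Python sketch of server-side key generation and rotation built on the standard secrets module. The function names and the in-memory dictionary standing in for a database are my assumptions. Storing only a hash of the key is an extra precaution that I have added; it is not something the recommendation above requires.

```python
import hashlib
import hmac
import secrets


def generate_api_key() -> str:
    """Create a random, URL-safe API key from 32 bytes of OS-level randomness."""
    return secrets.token_urlsafe(32)


def hash_key(api_key: str) -> str:
    """Hash the key for storage so that leaked storage does not expose usable keys."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def verify_key(presented: str, stored_hash: str) -> bool:
    """Compare hashes in constant time to avoid leaking information through timing."""
    return hmac.compare_digest(hash_key(presented), stored_hash)


def rotate_key(store: dict, account_id: str) -> str:
    """Issue a new key for the account; overwriting the hash invalidates the old key."""
    new_key = generate_api_key()
    store[account_id] = hash_key(new_key)
    return new_key
```

Because the key is produced by the server from a fixed alphabet, it is both hard to guess and free of unexpected characters, which removes the two risks described above.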

In the case of requests that require authentication, we see another problem: the authentication token must be sent as part of the request body. Here is how the documentation describes it.


An example of authentication in the Beds24 API

If the authentication token is sent in the request body, the server has to parse the body before it can get to the key. Only then can it extract the key, perform authentication, and decide whether to fulfill the request. If authentication succeeds, there is no extra load on the server, since the body would have had to be parsed anyway. But if authentication fails, valuable processor time has been spent parsing the request body for nothing. It would be better to send the authentication token in a request header using something like the Bearer authentication scheme. With this approach, the server needs to parse the request body only if authentication succeeds. Another reason to use a standard scheme like Bearer is that most developers are already familiar with such schemes.
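The order of operations described here can be sketched in Python as follows. The handler is framework-independent and hypothetical: headers are assumed to arrive as a plain dictionary with an Authorization key, and a single valid token stands in for a real credential lookup.

```python
import hmac
import json


def extract_bearer_token(headers: dict):
    """Return the token from an 'Authorization: Bearer <token>' header, or None."""
    value = headers.get("Authorization", "")
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return token.strip()


def handle_request(headers: dict, raw_body: bytes, valid_token: str) -> tuple:
    """Authenticate from the header first; parse the body only on success."""
    token = extract_bearer_token(headers)
    if token is None or not hmac.compare_digest(
        token.encode("utf-8"), valid_token.encode("utf-8")
    ):
        # The body is never parsed for unauthenticated requests.
        return 401, {"error": "unauthorized"}
    filters = json.loads(raw_body)
    return 200, {"received": filters}
```

Note that a request with a bad token is rejected even if its body is not valid JSON, because the body is never looked at.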

Problem number 7: performance


This problem is the last on my list, but that does not make it any less important. A request to the API in question takes a little more than a second to complete. In modern applications such delays may be unacceptable. So my advice to everyone who develops APIs is simple: do not forget about performance.

Results


Despite all the problems discussed here, the API in question did allow us to solve the project's tasks. But it took the developers quite a lot of time to understand the API and implement everything they needed. In addition, they had to write rather complex code to solve simple problems. If this API had been designed properly, the work would have been done faster and the resulting solution would have been simpler.

Therefore, I would like to ask everyone who designs APIs to think about how the users of their services will work with them. Make sure that the API documentation fully describes the API's capabilities and that it is clear and well formatted. Keep entity naming consistent, and make sure that the data your API returns or accepts is clearly structured, so that it is easy and convenient to work with. In addition, do not forget about security and correct error handling. If you take everything we have discussed into account when designing an API, you will not need to write anything like the strange “instructions” discussed above.

As already mentioned, this article is not intended to discourage readers from using Beds24 or any other system with a poorly designed API. My goal was to show examples of mistakes and ways to fix them, and thereby give recommendations that anyone can follow to improve the quality of their own work. I hope this article draws programmers' attention to the quality of the solutions they develop. And that means there will be more good APIs in the world.

Dear readers! Have you encountered poorly designed APIs?

Source: https://habr.com/ru/post/436888/