There is a store of access codes, each of which has two properties: an expiration time and a number of remaining confirmation attempts. Once either of these runs out, the code can no longer be used and must be removed from the database. I see two strategies for dealing with this:

  • Delete the code when it is requested from the database. Pros: no background tasks. Cons: race conditions (someone can fetch a record with retries = 1 at the same moment someone else is confirming the same code in parallel) and bulky code for what is simply reading a record.
  • Write a background worker that constantly scans the database for expired entries (a minimal sketch follows this list). Pros: reads proceed as usual (the validity checks stay, but the read path does not have to trigger deletion). Cons: some lag, the mere existence of background workers, and still the same race conditions.
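For illustration only, a minimal sketch of what the second strategy might look like on the JVM, assuming a relational store; the table and column names (access_code, expires_at, attempts_left) and the one-minute interval are assumptions, not part of the question:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.Timestamp;
    import java.time.Instant;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;
    import javax.sql.DataSource;

    public class ExpiredCodeCleaner {
        private final DataSource dataSource;
        private final ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();

        public ExpiredCodeCleaner(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        /** Starts the periodic cleanup; the interval is a tuning assumption. */
        public void start() {
            scheduler.scheduleWithFixedDelay(this::purgeExpired, 1, 1, TimeUnit.MINUTES);
        }

        private void purgeExpired() {
            // Delete codes that have either expired or exhausted their attempts.
            String sql = "DELETE FROM access_code WHERE expires_at < ? OR attempts_left <= 0";
            try (Connection conn = dataSource.getConnection();
                 PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setTimestamp(1, Timestamp.from(Instant.now()));
                ps.executeUpdate();
            } catch (Exception e) {
                // Cleanup is best-effort; validity is still checked on read.
                e.printStackTrace();
            }
        }
    }

Note that the cleanup only shrinks the table; the read path still has to check expiration and attempts itself, which is exactly the duplication the question complains about.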

Both strategies seem clumsy to me, and I cannot figure out how to solve the problem properly (although it feels like there should be a more obvious solution that minimizes the downsides). The application itself is distributed, but using a distributed-lock service is quite acceptable (though by itself that only protects against the race conditions).

What is the most appropriate strategy for removing such records?

  • This is usually done by a script that is started by the cron service. - Vanyamba Electronics
  • @VanyambaElectronics, that is just the second option, writing a background worker. - Grundy
  • @VanyambaElectronics the platform is Java, so isolating a scheduler thread is not a problem; the problem is that it creates a background load that is hard to estimate correctly and generally adds a background context to the application. - etki
  • An option borrowed from WordPress: perform limited maintenance during the next request. For example, check at most N other keys and delete them if necessary. - Sergiks
  • @Sergiks, we cannot lose records under any circumstances (in the case of AOF that scenario is likely); moreover, Redis specifically gets in the way of other things such as scalability (that is solved by something like Aerospike, but storing all data only in the main database is a policy decision). - etki

2 answers

Code verification should be performed in a transaction that also decrements the attempt counter. This avoids the race condition.

    Begin transaction
    Fetch the key
    No key? return FALSE
    Remaining-attempts counter greater than 1?
        No: delete the record and return TRUE
    Decrement the counter
    Return TRUE
    End transaction
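A rough JDBC sketch of those steps, assuming a relational database with row-level locking (for example SELECT ... FOR UPDATE, which also comes up in the comments below); the table and column names are illustrative, and an expiration check could be added to the same query:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public final class CodeVerifier {

        /** Returns true if the attempt was accepted; the last attempt deletes the row. */
        public static boolean verify(Connection conn, String code) throws Exception {
            conn.setAutoCommit(false);                       // begin transaction
            try {
                // Lock the row so a parallel confirmation waits for this one.
                PreparedStatement select = conn.prepareStatement(
                    "SELECT attempts_left FROM access_code WHERE code = ? FOR UPDATE");
                select.setString(1, code);
                ResultSet rs = select.executeQuery();
                if (!rs.next()) {                            // no key: return FALSE
                    conn.rollback();
                    return false;
                }
                if (rs.getInt(1) <= 1) {                     // last attempt: delete the record
                    PreparedStatement delete = conn.prepareStatement(
                        "DELETE FROM access_code WHERE code = ?");
                    delete.setString(1, code);
                    delete.executeUpdate();
                } else {                                     // otherwise just decrement
                    PreparedStatement update = conn.prepareStatement(
                        "UPDATE access_code SET attempts_left = attempts_left - 1 WHERE code = ?");
                    update.setString(1, code);
                    update.executeUpdate();
                }
                conn.commit();
                return true;
            } catch (Exception e) {
                conn.rollback();
                throw e;
            }
        }
    }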

If the database does not support locks, you can simulate the mechanism by writing a random value into an additional field as the key of an exclusive client session.
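One way that session-token idea could look, sketched here under the assumption of an extra lock_token column (all names are illustrative): the client first claims the record with a random value and only proceeds if exactly one row was updated.

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.util.UUID;

    public final class CodeClaimer {

        /** Tries to claim the code exclusively; returns the session token on success, null otherwise. */
        public static String claim(Connection conn, String code) throws Exception {
            String token = UUID.randomUUID().toString();
            PreparedStatement ps = conn.prepareStatement(
                "UPDATE access_code SET lock_token = ? WHERE code = ? AND lock_token IS NULL");
            ps.setString(1, token);
            ps.setString(2, code);
            // Exactly one updated row means this client now owns the record;
            // zero means another session holds it (or the code does not exist).
            return ps.executeUpdate() == 1 ? token : null;
        }

        /** Releases the claim after the confirmation attempt is finished. */
        public static void release(Connection conn, String code, String token) throws Exception {
            PreparedStatement ps = conn.prepareStatement(
                "UPDATE access_code SET lock_token = NULL WHERE code = ? AND lock_token = ?");
            ps.setString(1, code);
            ps.setString(2, token);
            ps.executeUpdate();
        }
    }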


An option that did not fit

Keep the codes in Redis with an expiration time set on each key, and keep the number of remaining uses and the expiration time in the value.

Expired keys will be deleted by Redis itself, or by a check on an expired request (in case Redis somehow fails and the value survives past its TTL). Also delete a key on its last valid use attempt, according to the counter.
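A sketch of this variant with the Jedis client (the key layout is an assumption, and for simplicity I keep only the attempt counter in the value and rely on the TTL for expiration): a small Lua script makes the decrement-or-delete atomic, so the last attempt cannot be spent twice by parallel requests.

    import java.util.Collections;
    import redis.clients.jedis.Jedis;

    public final class RedisCodeStore {

        // Atomic check-and-decrement: the last attempt deletes the key,
        // so two parallel confirmations cannot both consume it.
        private static final String VERIFY_SCRIPT =
            "if redis.call('EXISTS', KEYS[1]) == 0 then return 0 end " +
            "local left = redis.call('DECR', KEYS[1]) " +
            "if left <= 0 then redis.call('DEL', KEYS[1]) end " +
            "return 1";

        /** Stores a code with a TTL and a number of allowed confirmation attempts. */
        public static void save(Jedis jedis, String code, int attempts, int ttlSeconds) {
            jedis.setex("access_code:" + code, ttlSeconds, Integer.toString(attempts));
        }

        /** Returns true if the attempt was accepted, false if the code is gone or spent. */
        public static boolean verify(Jedis jedis, String code) {
            Object result = jedis.eval(VERIFY_SCRIPT,
                    Collections.singletonList("access_code:" + code),
                    Collections.emptyList());
            return Long.valueOf(1L).equals(result);
        }
    }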

Upd. Scaling is possible - see the proposed solution for distributed locks: distributed locks with Redis.

  • I described above (albeit not very fully) the variant where two consumers receive the same code simultaneously, so the last attempt is de facto used by two applications. It is an unlikely situation, but I want to build this component bulletproof from the start without ending up with an opaque architecture. - etki
  • The problem with scalability is not the distributed locks but the fact that this thing is single-threaded and does not support sharding out of the box, so you can hit its limits, and de facto it is a classic SPOF. Again, Aerospike solves all of these problems, but I do not want to store and process records in the caching layer. Besides, Redlock is an unreliable algorithm (the failure of a single server is enough for locks to start failing); for that there are Consul, etcd (essentially Paxos, VR, Raft) and so on. - etki
  • With locks on a normal SQL database there should not be such a problem (for example, Postgres has SELECT ... FOR UPDATE), provided the application does not hit a different server every time. With Redis, to ensure the last attempt cannot be used more than once, it is enough to check the return value of the decrement (or, as a last resort, Redis returns different results for deleting an existing record and for attempting to delete a non-existent one). Sharding Redis at the application level is not too painful to bolt on, though I would not vouch for replication. - NumminorihSF

After a long headache of my own making, I decided to go with cleanup background workers plus on-the-fly deletion where possible. This decision came about simply because the project involves paginated output, and filtering pages on the fly turns into a very wild mess and can (theoretically) stall the implementation. So we have a combined (and, I will not hide it, painful) solution, which is certainly not the magic one might hope for.

As for the ideal infrastructure, Sergiks is really right - all the pain disappears if stale records are filtered out at the level of the storage itself, be it Redis, Couchbase or Aerospike. In theory this layer could be reproduced by an additional tier of managers inside the application, but that is just as painful a procedure, so one can only hope that the practice of creating records with a TTL will become more widespread among databases.