Hello! I have a chat application: messages are stored in Python objects and then written to a database for persistence. The problem is that if I edit something on the server and decide to restart, all the in-memory data disappears, which is obviously bad.

The solution that immediately came to mind was a cache, perhaps Redis. But that is a whole separate service that would essentially duplicate the database. And it seems to me the load would be heavy, given the sending of "message read" events and persisting that status. So there would be lags when sending messages if there are many users.

Please tell me what to do in this case. Am I wrong, and Redis will actually work very quickly, hardly any slower than storing Python objects in the program's memory? Or is there some other way to solve this problem?

How do I use a cache correctly in this case?

  • It may be easier to intercept SIGTERM and flush the objects to the database when the process exits - Mike
  • The database will probably explode from such a massive flush. I think a cache is still the most sensible option. But can someone tell me how to apply it more intelligently? - iproger
  • Why do you need a separate cache? Apparently your hot data is already in memory, so treat that as the cache. You can synchronously write every change in your cache to the database and lose no data; you can flush on SIGTERM as @Mike suggested; or you can add a separate thread that runs forever, looks for dirty objects in the cache, and writes them to the database. Which option to choose depends on the load profile. With a cache in Redis you will suffer no less, especially if you give up synchronous writes to the database. - Pavel Gurkov
  • I built the cache because requests to Postgres were blocking, and the chat started failing with a large number of users. Don't you think hooking into SIGTERM would be a crutch? That is, at process exit, dump the data into the database (nothing will crash??), and on startup load everything back into memory? - iproger
  • If you use a driver such as psycopg2, you can use Momoko, an asynchronous wrapper for Tornado. It creates a connection pool that can be expanded for large request queues. Momoko works fine in asynchronous, thread-safe mode: even with long queries, the Tornado handler stays ready to accept other requests, and everything runs quite fast. In any case, though, Postgres, like any other relational DBMS, is a bad choice for implementing a chat under high load. - Dr.Paster
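Pavel Gurkov's third option above (a background thread that periodically flushes dirty objects) can be sketched roughly like this; the `MessageCache` class and the `write_batch` callback are hypothetical names, not part of the asker's code:

```python
import threading
import time

class MessageCache:
    """Hypothetical in-memory store of chat messages. 'Dirty' marks
    objects changed since the last database write."""

    def __init__(self):
        self._lock = threading.Lock()
        self._messages = {}   # message_id -> message data
        self._dirty = set()   # ids awaiting a database write

    def put(self, msg_id, data):
        with self._lock:
            self._messages[msg_id] = data
            self._dirty.add(msg_id)

    def pop_dirty(self):
        """Atomically return and clear the current batch of dirty messages."""
        with self._lock:
            batch = [(i, self._messages[i]) for i in self._dirty]
            self._dirty.clear()
            return batch

def flush_loop(cache, write_batch, interval=1.0):
    """Background thread body: periodically write dirty objects to the DB.
    write_batch would be, e.g., a batched INSERT/UPDATE via psycopg2."""
    while True:
        batch = cache.pop_dirty()
        if batch:
            write_batch(batch)
        time.sleep(interval)
```

Batching the writes this way is what keeps the database from "exploding": instead of one blocking query per message, each flush turns into a single bulk statement per interval.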

1 answer

  1. Add a SIGINT and SIGTERM handler;
  2. Bind a function that flushes the data to the database to that shutdown handler;
  3. Run at least 2 Tornado instances with Nginx as a load balancer.
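A minimal sketch of steps 1-2, assuming the in-memory state is a dict and `flush_to_db` is a hypothetical stand-in for the real batched database write:

```python
import signal
import sys

def flush_to_db(state):
    """Hypothetical stand-in: write all in-memory chat objects to the
    database (in reality, e.g. batched INSERTs via psycopg2)."""
    pass

def install_shutdown_handlers(state, flush=flush_to_db):
    """On SIGINT/SIGTERM, flush in-memory state to the database, then exit."""
    def handler(signum, frame):
        flush(state)    # drain memory before the process dies
        sys.exit(0)     # terminate cleanly once the flush is done
    signal.signal(signal.SIGTERM, handler)
    signal.signal(signal.SIGINT, handler)
```

In a real Tornado app you would also stop the HTTP server first (so no new requests arrive mid-flush) and only then stop the IOLoop.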

How it will work:

  • 2 or more instances of the application are launched;
  • when updating, restart the instances one at a time;
  • on shutdown, the instance first stops accepting new requests, then flushes its data from memory to the database.

The result is a smooth restart/update of the application, unnoticeable to users.

About the chat:

Redis is much more efficient and reliable for implementing the features of a simple messenger: online/offline status and personal/group messages, combined with WebSockets, will be very fast and have a large safety margin. If you have a lot of users, it is better to also hook the application up to RabbitMQ.
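The online/offline status and message fan-out mentioned above could look roughly like this with a redis-py-style client. This is a sketch under assumptions: `r` is taken to be a connected Redis client, and the key/channel names (`online:*`, `chat:*`) are invented for illustration:

```python
PRESENCE_TTL = 60  # seconds; the key expires if no heartbeat arrives in time

def mark_online(r, user_id, ttl=PRESENCE_TTL):
    """Refresh the user's online status; call this on every heartbeat/ping.
    SETEX stores the key with a TTL, so presence expires automatically."""
    r.setex(f"online:{user_id}", ttl, 1)

def is_online(r, user_id):
    """A user is online while their presence key has not yet expired."""
    return bool(r.exists(f"online:{user_id}"))

def send_message(r, room, payload):
    """Publish a message to a room channel; every app instance subscribed
    to that channel delivers it to its own WebSocket connections."""
    r.publish(f"chat:{room}", payload)
```

The pub/sub part is also what solves the multi-instance problem discussed in the comments below: instances do not share memory, but they all subscribe to the same Redis channels.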

  • But if I run 2 instances, the data will live in different processes, right? - iproger
  • Right. It is important at the design stage to think through the mechanisms for passing data between instances, and better still to avoid per-process, non-shared data storage altogether if possible. You can get by without third-party services for shared data only if you do not plan horizontal scaling at all. For example, how will your project work on 10 servers with a total of 150 simultaneously running instances? If you partition data at the instance level, that is a mess. Redis is the simplest thing that comes to the rescue here, and if there is a lot of data and you need something serious - LevelDB. - Dr.Paster
  • Thank you very much! I'll go with Redis! - iproger