The problem is this:
- Suppose there is a server A that can handle 10,000 client connections and listens for messages from server B.
- Messages are unique per client, that is, no message ever needs to be broadcast to everyone.
Horizontal scaling is required. The problem is that we do not know which instance of server A a given client is connected to, and therefore we do not know where to send the message from B.
Why did I decide that I need a separate topic per user rather than just a userId field on a shared topic? Because with a shared topic we lose the ability to scale out: if we have 30K clients and each server holds 10K of them, all 30K messages are still delivered to each of the 3 servers, and each server then has to filter them by userId.
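To make that arithmetic concrete, here is a rough sketch comparing the two delivery models. The function name `deliveries` and the one-message-per-client assumption are made up for illustration; the numbers are the ones from the example above.

```python
def deliveries(clients: int, servers: int, msgs_per_client: int = 1) -> dict:
    """Count message deliveries for a shared topic vs. per-client routing."""
    total = clients * msgs_per_client
    return {
        # Shared topic: every server receives every message and filters by userId.
        "shared_topic": total * servers,
        # Per-client routing: each message goes only to the server holding that client.
        "routed": total,
        # Deliveries that a server receives only to throw away.
        "discarded_on_shared_topic": total * (servers - 1),
    }


print(deliveries(clients=30_000, servers=3))
```

With a shared topic, the per-server load stays at 30K messages no matter how many servers are added, which is why adding servers does not help.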
What I have considered so far:
- RabbitMQ, Kafka: not suitable because of the need to create that many topics.
- Redis: pub/sub does not scale, because a publish sends the message to the whole cluster.
- Hazelcast: almost what I need. If I understood the docs correctly, only the subscription message is sent to all nodes, which is great when there are many messages per topic, but bad when there is a heavy stream of subscribe and unsubscribe messages and few actual messages.
Here, as I understand it, sharding is needed. So far I have settled on the following: when a client subscribes, the server writes the topic and its own address to Redis. The publisher looks up the server address for the topic and sends the message to that server directly over gRPC. So I rely on Redis's sharding mechanism. What I don't like is that this is reinventing the wheel: I need to track servers that have crashed or changed their address, which is a source of bugs.
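The lookup-then-send scheme described above could be sketched roughly as follows. This is only an illustration under assumptions: an in-memory dict stands in for Redis, a plain callback stands in for the gRPC call, and the names `TopicRegistry`, `publish`, and the `user:42` topic format are hypothetical. Each entry is stored as a lease with a TTL, so a crashed server drops out of the registry by itself; in real Redis this would correspond roughly to `SET topic address EX ttl` on subscribe and `GET topic` on publish.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TopicRegistry:
    """In-memory stand-in for the Redis mapping topic -> server address."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        # topic -> (server address, lease expiry time)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def subscribe(self, topic: str, server_address: str) -> None:
        """Called by a server A instance on subscribe, and again to renew the lease."""
        self._entries[topic] = (server_address, self._clock() + self._ttl)

    def unsubscribe(self, topic: str, server_address: str) -> None:
        """Remove the entry only if it still belongs to this server."""
        entry = self._entries.get(topic)
        if entry is not None and entry[0] == server_address:
            del self._entries[topic]

    def lookup(self, topic: str) -> Optional[str]:
        """Return the owning server's address, or None if unknown or expired."""
        entry = self._entries.get(topic)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[topic]  # lease expired: owner is presumed dead
            return None
        return address


def publish(registry: TopicRegistry, topic: str, payload: bytes,
            send: Callable[[str, str, bytes], None]) -> bool:
    """Publisher side (server B): look up the owner and send to it directly."""
    address = registry.lookup(topic)
    if address is None:
        return False  # nobody holds this client right now
    send(address, topic, payload)  # stand-in for the direct gRPC call
    return True


# Demo with a fake clock so lease expiry can be shown without sleeping.
now = [0.0]
registry = TopicRegistry(ttl_seconds=30, clock=lambda: now[0])
sent = []
registry.subscribe("user:42", "10.0.0.1:50051")
delivered = publish(registry, "user:42", b"hello",
                    lambda addr, topic, data: sent.append((addr, topic, data)))
now[0] = 31.0  # the server stopped renewing, so the lease has run out
delivered_after_expiry = publish(registry, "user:42", b"hello again",
                                 lambda addr, topic, data: sent.append((addr, topic, data)))
```

The lease only covers the crashed-server case. A server that comes back under a new address still has to re-register its topics, and the publisher still has to handle a send that fails between lease renewals.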
Maybe someone has faced a similar problem, and maybe there is some pub/sub system that solves the described task? Or maybe there are tools that help with sharding?