Please describe the database architecture for the messenger. As I imagine, there should be two tables - the first stores the users, the second stores the correspondence on the users' id
.
Closed due to the fact that the question is too common for the participants user194374, aleksandr barakin , Kromster , D-side , Nick Volynkin β¦ July 11 '16 at 12:28 .
Please correct the question so that it describes the specific problem with sufficient detail to determine the appropriate answer. Do not ask a few questions at once. See βHow to ask a good question?β For clarification. If the question can be reformulated according to the rules set out in the certificate , edit it .
- Depends on the requirements of the messenger. Maybe you need a table for friends. Maybe you need a table for stickers. Maybe you need a table for a photo. You can continue for a long time - Victor Evlampyev
- Yes, by the way, and how to store photos in a table? - lounah
- @Schepalin usually store only the path to them or generally only the name of the file β forming the path to them automatically. The images themselves are stored either on the hard disk or in some storage (S3, Swift, etc.) - cheops
1 answer
In the simplest case, you can actually use two tables, users users
and messages messages
, which may contain the following fields (in addition to these fields, additional ones may be added)
users ----- id - ΠΈΠ΄Π΅Π½ΡΠΈΡΠΈΠΊΠ°ΡΠΎΡ ΠΏΠΎΠ»ΡΠ·ΠΎΠ²Π°ΡΠ΅Π»Ρ name - ΠΈΠΌΡ ΠΏΠΎΠ»ΡΠ·ΠΎΠ²Π°ΡΠ΅Π»Ρ ... messages -------- id - ΠΈΠ΄Π΅Π½ΡΠΈΡΠΈΠΊΠ°ΡΠΎΡ ΡΠΎΠΎΠ±ΡΠ΅Π½ΠΈΡ to_id - Π²Π½Π΅ΡΠ½ΠΈΠΉ ΠΊΠ»ΡΡ Π΄Π»Ρ ΡΠ²ΡΠ·ΠΈ Ρ ΡΠ°Π±Π»ΠΈΡΠ΅ΠΉ users (ΡΠΎΠΎΠ±ΡΠ΅Π½ΠΈΠ΅ Π°Π΄ΡΠ΅ΡΠΎΠ²Π°Π½ΠΎ) from_id - Π²Π½Π΅ΡΠ½ΠΈΠΉ ΠΊΠ»ΡΡ Π΄Π»Ρ ΡΠ²ΡΠ·ΠΈ Ρ ΡΠ°Π±Π»ΠΈΡΠ΅ΠΉ users (ΡΠΎΠΎΠ±ΡΠ΅Π½ΠΈΠ΅ ΠΎΡΠΏΡΠ°Π²Π»Π΅Π½ΠΎ) content - ΡΠΎΠ΄Π΅ΡΠΆΠΈΠΌΠΎΠ΅ ΡΠΎΠΎΠ±ΡΠ΅Π½ΠΈΡ created_at - Π²ΡΠ΅ΠΌΡ ΡΠΎΠ·Π΄Π°Π½ΠΈΡ ΡΠΎΠΎΠ±ΡΠ΅Π½ΠΈΡ updated_at - Π²ΡΠ΅ΠΌΡ ΡΠ΅Π΄Π°ΠΊΡΠΈΡΠΎΠ²Π°Π½ΠΈΡ ΡΠΎΠΎΠ±ΡΠ΅Π½ΠΈΡ ...
However, in practice, the size of the table of messages
grows very quickly and when it reaches a volume of the order of several hundred megabytes, even to the indexed table, the queries can be slow. One way to optimize the layout is to take all the textual information of the message into a separate table of contents
over a one to one connection
messages -------- id - ΠΈΠ΄Π΅Π½ΡΠΈΡΠΈΠΊΠ°ΡΠΎΡ ΡΠΎΠΎΠ±ΡΠ΅Π½ΠΈΡ to_id - Π²Π½Π΅ΡΠ½ΠΈΠΉ ΠΊΠ»ΡΡ Π΄Π»Ρ ΡΠ²ΡΠ·ΠΈ Ρ ΡΠ°Π±Π»ΠΈΡΠ΅ΠΉ users (ΡΠΎΠΎΠ±ΡΠ΅Π½ΠΈΠ΅ Π°Π΄ΡΠ΅ΡΠΎΠ²Π°Π½ΠΎ) from_id - Π²Π½Π΅ΡΠ½ΠΈΠΉ ΠΊΠ»ΡΡ Π΄Π»Ρ ΡΠ²ΡΠ·ΠΈ Ρ ΡΠ°Π±Π»ΠΈΡΠ΅ΠΉ users (ΡΠΎΠΎΠ±ΡΠ΅Π½ΠΈΠ΅ ΠΎΡΠΏΡΠ°Π²Π»Π΅Π½ΠΎ) content_id - Π²Π½Π΅ΡΠ½ΠΈΠΉ ΠΊΠ»ΡΡ Π΄Π»Ρ ΡΠ°Π±Π»ΠΈΡΡ contents (ΡΠΎΠ΄Π΅ΡΠΆΠΈΠΌΠΎΠ΅ ΡΠΎΠΎΠ±ΡΠ΅Π½ΠΈΡ) created_at - Π²ΡΠ΅ΠΌΡ ΡΠΎΠ·Π΄Π°Π½ΠΈΡ ΡΠΎΠΎΠ±ΡΠ΅Π½ΠΈΡ updated_at - Π²ΡΠ΅ΠΌΡ ΡΠ΅Π΄Π°ΠΊΡΠΈΡΠΎΠ²Π°Π½ΠΈΡ ΡΠΎΠΎΠ±ΡΠ΅Π½ΠΈΡ ... contents -------- id - ΠΈΠ΄Π΅Π½ΡΠΈΡΠΈΠΊΠ°ΡΠΎΡ ΡΠΎΠΎΠ±ΡΠ΅Π½ΠΈΡ content - ΡΠΎΠ΄Π΅ΡΠΆΠΈΠΌΠΎΠ΅ ΡΠΎΠΎΠ±ΡΠ΅Π½ΠΈΡ
As a result, the size of the messages
table grows very slowly, requests to it are executed quickly, even taking into account the fact that you have to extract the contents from the contents
table.
- oneThank! So did, this is exactly the answer that I needed - lounah
- one@cheops sorry, but the takeaway is absolutely useless, in this particular case it makes no sense to put the messages in a separate table, and the size of the table also does not matter. you just get an extra join, although the message is almost always used with all attributes, including id, to, from, created, and updated is not needed if you canβt edit the message - strangeqargo
- @strangeqargo, did you work with such a table on MySQL at least 1Gb in size? Acceleration, as a rule, on the face (even taking into account indexation), when instead of 1Gb table you sort the table entries in 30Mb. In addition, it is not very reasonable to use JOIN here, it will be much faster if you use the additional IN query. I would not give an example if I did not perform such a refactoring in practice. Just to lay such an architecture at the beginning of the project is easier than when 1Gb of data will be accumulated. - cheops
- worked, slowing down to> terabyte values, but not on primary keys. and even with such sizes it is better to use sharding. and in the proposed version, you will have to recognize the id to filter, and then pull out the message. overoptimization, with additional request - strangeqargo
- To drive a terabyte in MySQL is quite problematic (especially if the entire range is used). MyISAM will die, and InnoDB begins to seriously slow down if it is not fully in RAM. However, when problems with speed occur, the above technique helps out and this is not over-optimization, it is generally quite a common technique. I will not argue, especially SO, thank God, poorly suited for disputes. - cheops