Please describe the database architecture for the messenger. As I imagine, there should be two tables - the first stores the users, the second stores the correspondence on the users' id .

Closed due to the fact that the question is too common for the participants user194374, aleksandr barakin , Kromster , D-side , Nick Volynkin ♦ July 11 '16 at 12:28 .

Please correct the question so that it describes the specific problem with sufficient detail to determine the appropriate answer. Do not ask a few questions at once. See β€œHow to ask a good question?” For clarification. If the question can be reformulated according to the rules set out in the certificate , edit it .

  • Depends on the requirements of the messenger. Maybe you need a table for friends. Maybe you need a table for stickers. Maybe you need a table for a photo. You can continue for a long time - Victor Evlampyev
  • Yes, by the way, and how to store photos in a table? - lounah
  • @Schepalin usually store only the path to them or generally only the name of the file β€” forming the path to them automatically. The images themselves are stored either on the hard disk or in some storage (S3, Swift, etc.) - cheops

1 answer 1

In the simplest case, you can actually use two tables, users users and messages messages , which may contain the following fields (in addition to these fields, additional ones may be added)

 users ----- id - ΠΈΠ΄Π΅Π½Ρ‚ΠΈΡ„ΠΈΠΊΠ°Ρ‚ΠΎΡ€ ΠΏΠΎΠ»ΡŒΠ·ΠΎΠ²Π°Ρ‚Π΅Π»Ρ name - имя ΠΏΠΎΠ»ΡŒΠ·ΠΎΠ²Π°Ρ‚Π΅Π»Ρ ... messages -------- id - ΠΈΠ΄Π΅Π½Ρ‚ΠΈΡ„ΠΈΠΊΠ°Ρ‚ΠΎΡ€ сообщСния to_id - внСшний ΠΊΠ»ΡŽΡ‡ для связи с Ρ‚Π°Π±Π»ΠΈΡ†Π΅ΠΉ users (сообщСниС адрСсовано) from_id - внСшний ΠΊΠ»ΡŽΡ‡ для связи с Ρ‚Π°Π±Π»ΠΈΡ†Π΅ΠΉ users (сообщСниС ΠΎΡ‚ΠΏΡ€Π°Π²Π»Π΅Π½ΠΎ) content - содСрТимоС сообщСния created_at - врСмя создания сообщСния updated_at - врСмя рСдактирования сообщСния ... 

However, in practice, the size of the table of messages grows very quickly and when it reaches a volume of the order of several hundred megabytes, even to the indexed table, the queries can be slow. One way to optimize the layout is to take all the textual information of the message into a separate table of contents over a one to one connection

 messages -------- id - ΠΈΠ΄Π΅Π½Ρ‚ΠΈΡ„ΠΈΠΊΠ°Ρ‚ΠΎΡ€ сообщСния to_id - внСшний ΠΊΠ»ΡŽΡ‡ для связи с Ρ‚Π°Π±Π»ΠΈΡ†Π΅ΠΉ users (сообщСниС адрСсовано) from_id - внСшний ΠΊΠ»ΡŽΡ‡ для связи с Ρ‚Π°Π±Π»ΠΈΡ†Π΅ΠΉ users (сообщСниС ΠΎΡ‚ΠΏΡ€Π°Π²Π»Π΅Π½ΠΎ) content_id - внСшний ΠΊΠ»ΡŽΡ‡ для Ρ‚Π°Π±Π»ΠΈΡ†Ρ‹ contents (содСрТимоС сообщСния) created_at - врСмя создания сообщСния updated_at - врСмя рСдактирования сообщСния ... contents -------- id - ΠΈΠ΄Π΅Π½Ρ‚ΠΈΡ„ΠΈΠΊΠ°Ρ‚ΠΎΡ€ сообщСния content - содСрТимоС сообщСния 

As a result, the size of the messages table grows very slowly, requests to it are executed quickly, even taking into account the fact that you have to extract the contents from the contents table.

  • one
    Thank! So did, this is exactly the answer that I needed - lounah
  • one
    @cheops sorry, but the takeaway is absolutely useless, in this particular case it makes no sense to put the messages in a separate table, and the size of the table also does not matter. you just get an extra join, although the message is almost always used with all attributes, including id, to, from, created, and updated is not needed if you can’t edit the message - strangeqargo
  • @strangeqargo, did you work with such a table on MySQL at least 1Gb in size? Acceleration, as a rule, on the face (even taking into account indexation), when instead of 1Gb table you sort the table entries in 30Mb. In addition, it is not very reasonable to use JOIN here, it will be much faster if you use the additional IN query. I would not give an example if I did not perform such a refactoring in practice. Just to lay such an architecture at the beginning of the project is easier than when 1Gb of data will be accumulated. - cheops
  • worked, slowing down to> terabyte values, but not on primary keys. and even with such sizes it is better to use sharding. and in the proposed version, you will have to recognize the id to filter, and then pull out the message. overoptimization, with additional request - strangeqargo
  • To drive a terabyte in MySQL is quite problematic (especially if the entire range is used). MyISAM will die, and InnoDB begins to seriously slow down if it is not fully in RAM. However, when problems with speed occur, the above technique helps out and this is not over-optimization, it is generally quite a common technique. I will not argue, especially SO, thank God, poorly suited for disputes. - cheops