Good day. Here I thought, I wonder how it is organized in social. networks (facebook, VK) DB connections between users?

It would seem, everything is simple: created the base of friendship , in it 3 fields

|user_id|recipient_id|status|

Where
user_id - user id
recipient_id - profile id that has anything to do with user user_id
And status , where you can put the "relationship" of these 2 users (0-submitted a friend request, waiting for confirmation, 1-friend, 2 blacklisted).

It would seem, and everything, but it seems to me completely irrational, since the growth of data in the database can be calculated approximately by the formula

r = x^2 , where r - number of lines, and Ń… - number of users. Absolutely irrational approach.

And how would you do?

The fact is that there is a need to do something similar in function, but I am afraid that Muscle will hang from the load.

Thanks in advance for the help and your options :)

  • Do you have a relationship between Id aspiring to every with every Otherwise, where is the square? It seems to me in social networks, on the contrary, with most Id (out of millions) dozens (and not millions) of connections. Actually, this is a question about the matrix representation. Discharged or not. - avp

3 answers 3

The question of how to “store” a set of friends is perhaps one of the easiest questions to answer when designing such an architecture. 200-300 records in the database (one for each added friend) per user - this is ridiculous if you compare them, for example, with the number of private messages or even "likes" of news.

In projects of this scale, optimization is carried out mostly with the help of a well-chosen architecture of the backend'а and the tools that are used for distributed data processing.

The question "how to combine Hadoop, PHP, memcached, Thrift and MySQL in the most efficient way" is much more serious than the question "how would you like to write your friends to the database so that it is faster" .

Moreover, imagine what means and time are invested in optimizing search algorithms, or, for example, finding and caching a list of possible friends. Or, for example, in writing a self-made high-performance database from scratch on C , as the developers of Vkontakte did.


I am ready to assume that your task is quietly solved using standard tools and MySQL performance. Do not think about the performance ahead of time, optimize only the hotspot'Ń‹ , and then if the performance for some reason does not suit.


If you're interested, you can read about the Facebook architecture here and here.

    The number of data is described by the formula S = K * X , where X is the number of users and K is the average number of friends.
    With a huge margin, let K be 500 (in reality, I think less than 50, because there are a lot of fakes, and just pages with a small number of friends). And then the contact (140M users) of such records is 70 billion.
    But no one says that they are stored in one table. For example, you can have 1000 tables for each of user_id% 1000, which gives the average size of the table K * X / 1000 * (let it be 10 bytes) - 750 MB, which, when scaling number of tables to 10,000, gives only 75 MB, which is not a lot

    • Let's start with the fact that servicing several tables consumes much more memory than one :)) In order for Muscle not to collapse nafik with huge loads - you need to use a cluster of sql servers + adexate caching (I, for example, Sphinx + Memcache) .... and you will be happy :) - Stanislav Komar
    • Tables on different servers ... And interactions between such tables are minimized ... - timka_s

    Everything turned out to be easier.

     $mQuery = $db->query("SELECT * FROM `friendship` WHERE `uid`='".$myid."' || `rid`='".$myid."'"); 

    Further, during the mysql_fetch_array I stupidly check: If the uid array is uid does not match my $myid , then this uid written into the array as a friend’s id , if it is the same, then the value of this array is written into the array as a friend’s id .

    It turns out to be nothing complicated :) You just need to think a little :)

    • 7
      Do you understand the difference between what you asked and what you wrote now?)) - timka_s