I recently wondered about the more efficient linking of tables in a database. For example:

First way

| tasks | |----------------------------| | id INT | // primary key | title VARCHAR(20) | // index field | description VARCHAR (2000) | | test_data | |----------------------------| | id INT | // primary key | id_task INT | // foreign key | input_data VARCHAR (200) | | output_data VARCHAR (200) | 

Second way

 | tasks | |----------------------------| | title VARCHAR(20) | // primary key | description VARCHAR (2000) | | test_data | |----------------------------| | id INT | // primary key | title_task VARCHAR(20) | // foreign key | input_data VARCHAR (200) | | output_data VARCHAR (200) | 

In this example, two tables are shown (tasks and test data for tasks). I wanted to finally understand the issue of binding, because in some books I saw that the binding was the first way, i.e. through additional id fields. In other training materials I found that the link goes directly to the field containing any information. In this regard, I would like to get a clear answer to the topic when it is better to use what and how, and why? I would also like to use the opinion about the id field in each table, how important is it? Thank you in advance...

  • one
    The connection in 1 case is not completely clear. The second way is the standard link 1 to many. - JavaJunior

1 answer 1

The second method uses the so-called "natural key" - this is the primary key, made up directly of user data. In the first case, a “surrogate key” is used, which, as a rule (conditionally in 99.9% of cases), is generated by the database engine in the form of successively increasing numbers, and has no other purpose than to be unique.

Natural keys seem to even have their own adepts, although in practice I have not come across them, I only saw a couple of times on the Internet. The main disadvantages of the natural key:

  • often the primary field will not be a field, but a set of fields. If you keep a list of people, it can be "last name", "name", "middle name" and "date of birth", or passport details ("series", "number", "issued by", "date of issue "), or both. And when you need to link two tables, you have to store all these columns in both tables, and then each time enumerate them in the query:

     ... t1.first_name = t2.first_name and t1.last_name = t2.last_name and ... < еще 8 полей > 
  • a set of fields that was unique could suddenly become non-unique. For example, you decided to make the primary key a set of fields "last name", "name", "middle name" and "date of birth". Wait for two people with the same set of name + date of birth will not be long, and then have to somehow get out.