To indicate which chromosome a row belongs to, our tables use either smallint, when the chromosome can simply be referred to by number, or varchar(n), when we have to deal with well-established letter symbols. For human, for example, we have to use varchar(2), because besides chromosomes 1-22 there are also the sex chromosomes (X, Y) and the mitochondrial one (MT).
It is claimed that using enum instead of varchar would be more efficient in this case, that is, it would reduce the time of SELECT queries. Is that so? What is the correct way to switch from varchar(2) to enum without destructive consequences? And will old queries in already compiled programs still work?
- "It is claimed that using enum instead of varchar would be more efficient in this case." Yes, undoubtedly. And since the number of values is small, none of them consists only of digits, and the values are not processed as strings, you also safely sidestep the main pitfalls of using ENUM. "Will old queries in already compiled programs still work?" You don't sort by this field, do you? Then there is no reason to expect problems... - Akina
- You save about 2 bytes per row. An enum with a small number of values is stored in 1 byte, while your varchar(2) usually takes 2-3 bytes (the value bytes plus 1 length byte). - Mike
- @Mike In the genomes of some bacteria we will save much more :) I'm more interested in time than in space. - Alexey Kozlov
- @Akina And if there is a unique key chr_left_right, i.e. chromosome number, left position, right position, will enum do no harm there? - Alexey Kozlov
- Experiment and measure the time. But time and space are related: if the file on disk is much larger, it simply takes longer to read :). Still, I doubt there is a big time difference between working with 1 byte and 3 bytes. Although comparing a 1-byte enum atomically is certainly a few processor cycles faster than reading the length of a varchar and then comparing that many bytes. - Mike
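
The byte counts and the sorting caveat discussed in the comments can be sketched in a few lines of Python. This is only a model of documented MySQL behaviour (an ENUM with up to 255 members takes 1 byte and sorts by its index in the definition list; a short VARCHAR takes 1 length byte plus the value bytes); it does not talk to a database. The definition order in `CHROMOSOMES` and all function names are assumptions made for illustration.

```python
# Model only: mimics documented MySQL ENUM/VARCHAR behaviour, no database involved.

# Hypothetical ENUM definition order for human chromosomes: '1'..'22', 'X', 'Y', 'MT'.
CHROMOSOMES = [str(n) for n in range(1, 23)] + ["X", "Y", "MT"]


def enum_bytes(n_values):
    """Bytes per ENUM value: 1 for up to 255 members, 2 for up to 65,535."""
    if not 1 <= n_values <= 65535:
        raise ValueError("an ENUM holds between 1 and 65,535 members")
    return 1 if n_values <= 255 else 2


def varchar_bytes(value):
    """Bytes per short VARCHAR value: 1 length byte plus the encoded characters."""
    return 1 + len(value.encode("utf-8"))


def enum_order(values):
    """Sort as an ENUM column would: by position in the definition list."""
    index = {v: i for i, v in enumerate(CHROMOSOMES, start=1)}
    return sorted(values, key=index.__getitem__)


def varchar_order(values):
    """Sort as a VARCHAR column would: plain string comparison."""
    return sorted(values)


sample = ["10", "2", "X", "MT", "1"]
print(enum_bytes(len(CHROMOSOMES)))             # 1
print(varchar_bytes("7"), varchar_bytes("MT"))  # 2 3
print(enum_order(sample))                       # ['1', '2', '10', 'X', 'MT']
print(varchar_order(sample))                    # ['1', '10', '2', 'MT', 'X']
```

With this definition order the enum happens to sort chromosomes in their natural order, whereas the varchar sorts them as strings. Any query or key such as chr_left_right whose consumers rely on the old string ordering would therefore see rows in a different order after the switch.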