I am studying computer architecture from the books by Tanenbaum and by Harris.

And still I do not understand many seemingly simple things. I have many questions, but almost all of them are connected in some way with the machine word. This is the most confusing topic for me.

Let me say up front that I understand there is a direct connection between the processor's bit width, the width of its registers, the size of the addressable memory, and the speed of the computer. What I do not understand is why.

  1. How does a processor's bit width affect its speed? Supposedly a 64-bit processor is faster than a 32-bit one. But why? I do not understand. An example: the instruction set of MIPS32 processors. There, the instruction encoding itself is such that every instruction is exactly 32 bits. So if you made the processor 64-bit, the upper 32 bits would simply have to be padded with zeros. How would that make the processor any faster?
  2. From Wikipedia:

    Processor width (machine word width). A machine word is a machine-dependent and platform-dependent value, measured in bits or bytes (or trits or trytes on ternary machines), equal to the width of the processor's registers.

    Why does the machine word have to equal the register width? Why can't we read data 64 bits at a time with 16-bit registers, for example?

  3. Now about the memory. Again from Wikipedia:

    A 64-bit register can store one of 2^64 = 18,446,744,073,709,551,616 values. A processor with 64-bit memory addressing can directly access 16 EB of memory.

    I understand that the amount of addressable memory depends on the number of bits in an address. But again, how is this related to registers? I see only one connection: if we are going to store addresses in registers, then the registers must have the same width as the addresses. But is it necessary to store addresses in registers?

  4. About OS bitness I think I understand, but I would like to clarify. As I understand it, the OS relates to all of this like so: a 64-bit OS runs on a 64-bit processor, a 32-bit OS on a 32-bit processor. That is, with the advent of the 64-bit processor, the appearance of a 64-bit OS was inevitable. Do I understand this correctly?
  5. This question is more of a historical one, but also very important to me. I always thought that the 64-bit processor appeared recently and was presented as a big breakthrough. But here is what Wikipedia says:

    The accuracy requirements of scientific computing grew, and in 1974 the first machine with a 64-bit word appeared: the Cray-1 supercomputer.

  6. What was the difficulty in creating a 64-bit processor? And what is the difficulty in creating, say, a 128-bit one? What does it all depend on? The width of the registers? And what is the difficulty in increasing the register width? What determines it?

  • Honestly, I haven't dug that deep, but on the fifth point: an architecture implies not only the processor's width but also a pile of related infrastructure (for example, the same SSE, or out-of-order execution of instructions), which plays an important role in processor speed. - etki

3 answers

The main job of a processor is not to transfer information but to transform it. A register is the same kind of memory as RAM, but with direct wiring to a heap of execution units that perform arithmetic and other operations on the data. There are a great many of these wires: each register bit gets its own large set of transistors for each specific operation. Hence the difficulty of increasing the number of bits: doubling the register width at least doubles the size of all the execution units, the die grows, and heat dissipation increases.

Look at any instruction set: every data transformation requires the participation of at least one register, and some operations happen exclusively in registers. In the x86 architecture you can add a memory operand to a register, but you cannot, for example, shift or multiply a memory cell, and you cannot add the values of two memory cells without first loading one of them into a register, because the execution unit for that operation has direct wiring only to the registers.

Q: Why can't we read data 64 bits at a time with 16-bit registers

We can, but read it into what, and why? Modern processors in fact do this: they fill the internal cache using the full bus width, and the registers are not involved. Fine, we have read 64 bits into the cache, and now we need to multiply them by 3, say. But we have a 16-bit register, so how do we multiply? Right: in parts, applying a bunch of extra transformations and spending precious cycles on them. So the width of the data bus is secondary; what matters is the width of the register. And that is what came to be called the machine word.

Q: But is it necessary to store addresses in registers?

Yes, of course. An instruction tells the processor, roughly, "take the data over there." And where is "there"? In its head? What would the instruction look like otherwise: "take the address located at the address at that address in the instruction..."? And if we need to work through a block of data sequentially (say, processing an array in a loop), that address must be incremented, i.e. we must perform an addition, which we can only do in a register.

By the way, instruction width and processor width are different things. In MIPS, all instructions are packed into 32 bits. The x86 platform has had variable-length instructions from time immemorial, from short single-byte ones to long monsters with a pile of prefixes. Processor width = register width = the maximum amount of data processed by one ordinary instruction (leaving SSE and the like aside; we mean the ordinary instructions that make up the bulk of the code).

Speed: who said that bit width plays the key role? Yes, it has an effect. The 64-bit processor and OS boom is an excellent example of marketing. 64-bit code is often slower than 32-bit code: if a program does not need to address more than 4 GB of memory but its code stores 64-bit addresses, the program's data can be twice as large. A bigger footprint takes longer to read into the cache and requires more memory; the race for gigabytes of RAM begins... Now the reverse process has even started: the x32 ABI is being actively developed, running code with 32-bit pointers in 64-bit mode.

But take RSA encryption, used in the same ubiquitous SSL. It requires complex calculations on very large numbers. Suppose we have no specialized processor instructions for it. Then, if the processor operates on 64-bit registers, it performs the calculation twice as fast, simply because in one cycle it can process twice as much information. The gain that wider registers give on computational problems with large numbers is hard to overestimate.

Q: The 64-bit OS runs on a 64-bit processor, the 32-bit OS runs on a 32-bit processor.

No. A 64-bit OS consists of 64-bit code that can address memory with 64-bit addresses. Of course, it can do that only on a 64-bit processor. The appearance of such OSes was indeed inevitable, although marketing played a significant role here too. 90% of people who "understand computers" think that addressing more than 4 GB of RAM on the Intel platform requires a 64-bit OS. Yes, Windows imposed such a restriction artificially, but Intel processors in 32-bit PAE mode can address up to 64 GB of RAM (though, true, a single process is limited to 4 GB). 32-bit Linux feels perfectly fine with such volumes.

As for the history and difficulty of building 128-bit registers... it comes down purely to price. On certain non-mass-market systems this was done long ago; the mass market simply had no need for it. And back then it cost a fortune, because, as we said at the beginning, every register bit means a pile of execution hardware, and with the production technology of the time it was hard to fit that many transistors on a die. Fully 128-bit processors are simply not needed, especially on the mass market: where would you even find more memory than the 16 EB that 64-bit addresses already cover? Meanwhile, all current Intel processors have 16 SSE registers of 128 bits; these are not general-purpose registers, they exist for computation. And modern Xeons, designed for serious number crunching, have 32 ZMM registers of 512 bits (see AVX)...

  • Thanks a lot, it became much clearer! But now I have this question: imagine we have designed our own instruction set architecture and all its instructions are exactly 32 bits, and we have the budget for 64-bit registers. Performance matters to us, so we decide to make them 64-bit. What do we do with the instructions? Fill the upper 32 bits with zeros? Or exploit the fact that they are exactly 32 bits and read two instructions at a time? Reading two instructions at a time would complicate the microarchitecture, right? - Alexander Elizarov
  • And another question: what if the budget only covers 16-bit registers, and the instructions do not fit into 16 bits? Read half an instruction at a time? - Alexander Elizarov
  • @Aleksandr Elizarov I took a quick look at the MIPS instruction set; things there are quite amusing precisely because of the rigid 32 bits per instruction. I saw an example of loading an address: it takes 2 instructions, the first loads the low half of the address, the second the high half. I.e. when the width is not enough, they invent a pair of instructions. Of course, fixed-size instructions are much easier to implement in hardware: when fetching an instruction you do not have to figure out how big it is. And where would you read two instructions into, anyway? Chips usually have an instruction queue that fills on its own, reading as much as the bus allows. - Mike
  • @Alexander Elizarov Actually, it seems to me that making the instruction wider than the registers is fine. The instruction itself is not loaded into a general-purpose register, so it can carry more room for the execution units. And it would look rather silly if we wanted to implement loading a constant into a register and could not load a 16-bit value into a 16-bit register in one operation... - Mike
  • @Aleksandr Elizarov Well, in principle it is of course a register, but such a special one that nobody requires it to be the same width as the general-purpose registers. Judging by the description of the first Intel chips, an opcode was loaded into such a register, and depending on it additional data was read from the instruction stream, e.g. addresses, and written into the registers of other execution units. - Mike

I see only one connection: if we are going to store addresses in registers, then the registers must have the same width as the addresses.

To confuse you completely: they do not have to.

For example, eight-bit processors can address 16 bits: a memory-access instruction takes its address not from a register but from a register pair. The 8086 processor, with 16-bit registers, can address 20 bits (base register << 4 + offset register).

The mentioned Cray-1 had 64-bit data registers and 24-bit address registers.

  • Purely out of curiosity: what were such perversions needed for? I can imagine how it is done, but why... - Risto
  • @Risto A pile of different reasons, among them, for example, the physical size of the chip/package (how many pins you can bring out of it), the physical size of the board (a wide bus is harder to route), the price (copper is expensive), and so on. - user58697

Everything is relative.

For example, parallelizing an algorithm can pay off on a processor with several cores, but only if the algorithm is parallelizable, i.e. if the task can be split into parallel threads.

The same goes for processor width. If the algorithm can be optimized for larger registers, then yes, it will run faster on x86_64 than on i386.

But if the algorithm itself cannot be optimized this way and its speed does not depend on the register size, you will get no speedup at all.