
Virtual machine performance comparison of 6 cloud platforms: Selectel, MCS, Ya.Oblako, Google Cloud, AWS and Azure

Recently I came across two articles from a corporate blog about clouds: one about Kubernetes, and the second an attempt to measure performance with a methodology that seemed dubious to me (spoiler: not without reason).

I have something to say about K8s as well, but let's talk about performance.

My distrust of the results came from several factors, the main ones being: no test run parameters were given, the number of iterations was not stated, the machine types were not specified, and there was no detailed configuration either. Dubious overall.
I mainly use Google Cloud and AWS (about ten years of combined working experience with them) and don't work with domestic cloud providers, but, as it happens, I have active accounts with Selectel, MCS, Yandex.Cloud and, after this test, Azure as well.

Fortunately, all these platforms are public, so whatever my intentions, anyone who wishes can go, repeat, and verify.

All this led to a thought: why not spend a couple of hundred rubles and an entire weekend to really, thoughtfully measure all six platforms and find out which of them gives the best performance relative to cost and in absolute terms with identical configurations, and along the way compare the global providers with the Russian ones.
And also, as it turned out, to clarify some "features" of resource allocation, and to remind myself and others that you cannot always, and not on every platform, get predictable performance for the same money.

The results cannot be called phenomenal, but in my opinion they are extremely curious.

Details below the cut for those interested.

Methodology


Virtual machines


For each cloud provider, three virtual machines with 4 CPUs, 8 GB of RAM, and a 50 GB system disk were launched one after another in different availability zones (if there are only two zones, then one machine goes to the first zone and two to the second).

The processor/instance type is the newest available, where there is a choice.

VM type: shared, with full core allocation.

Disk type: network SSD that can be re-attached to another VM.

Options for guaranteed IOPS allocation, or machines optimized for it, were not used, unless they are part of the standard terms of use and cannot be opted out of.

The default file system is ext4.

No manual system settings were made.

A series of tests was run on each of the machines, the totals for each machine were averaged.

The final performance of a platform is expressed as the arithmetic mean of the average test values of each of its virtual machines; the standard deviation is also given in the tables for those interested.
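To make the aggregation unambiguous, here is a small sketch of how the per-platform figures are derived, using the Stress-NG 1 CPU runs from the three Yandex.Cloud zone tables below as input: each VM's three iterations are averaged, and the platform score is the mean of those per-VM averages, with the sample standard deviation reported alongside.

```python
from statistics import mean, stdev

# Three iterations per VM (Stress-NG 1 CPU bogo-ops from the three
# Yandex.Cloud zones, as listed in the tables below)
vm_runs = [
    [10236, 10045, 10161],  # ru-central1-a
    [10634, 10848, 11870],  # ru-central1-b
    [10424, 10192, 10325],  # ru-central1-c
]

# Step 1: average the iterations of each VM
vm_avgs = [mean(runs) for runs in vm_runs]

# Step 2: the platform score is the mean of the per-VM averages;
# the sample standard deviation shows the spread between VMs
platform_avg = mean(vm_avgs)
platform_stdev = stdev(vm_avgs)

print(f"per-VM averages: {[round(a, 2) for a in vm_avgs]}")
print(f"platform: {platform_avg:.2f} ± {platform_stdev:.2f}")
```

Run on these inputs, it reproduces the Yandex.Cloud summary row (10526.11 ± 518.72).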

The operating system is Ubuntu 16.04 of the latest available patch level.

Cost calculation


The cost was calculated without any provider bonuses and without traffic costs, assuming the virtual machine runs non-stop for a full calendar month.
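As a sketch of this cost model (the hourly rate below is hypothetical, purely for illustration, not a quoted price): the monthly figure is simply the hourly rate times the hours in the month, with no traffic or bonuses.

```python
# Hypothetical hourly rate for a 4 CPU / 8 GB VM, just to illustrate the model
hourly_rate_rub = 5.27

# A VM running non-stop for a 30-day calendar month
hours_per_month = 30 * 24  # 720 hours

monthly_cost_rub = hourly_rate_rub * hours_per_month
print(f"{monthly_cost_rub:.2f} rub/month")
```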

Also, some platforms let you significantly reduce the cost of resources in exchange for certain restrictions (which you can skillfully live with).

For AWS these are Spot instances; for GCE, Preemptible instances. With a suitable application architecture they can be used successfully with no harm to it and real benefit to the wallet; this has been verified by me personally and by dozens of companies using both.
Selectel's disk types can be put in the same category. Although the main measurements used "Fast" disks, there is a significantly cheaper "Universal" type that is no speed champion but is fine for a huge number of tasks. Options using it were also taken into account in the final calculations.

Tests


To run the tests, the following script was written; all launch parameters can be seen in it:

Test script
```bash
#!/usr/bin/env bash

TIME=60

# Workload 70% read 30% write
cat > fio-rand.fio << EOL
[global]
name=fio-rand-RW
filename=fio-rand-RW
rw=randrw
rwmixread=70
rwmixwrite=30
bs=4K
direct=1
numjobs=1
time_based=1
runtime=${TIME}

[file1]
size=2G
iodepth=16
EOL

echo "Run FIO"
for i in {1..3}; do
    echo "$i iter:"
    fio fio-rand.fio | grep -E "(read|write|bw|iops|READ|WRITE)" | grep -v "Disk"
done

echo "Run stress-ng."
for i in {1,2,4}; do
    for z in {1..3}; do
        echo -n "$z iter. Stress-NG for $i CPU: "
        stress-ng --cpu $i --cpu-method matrixprod --metrics-brief -t $TIME 2>&1 | sed -n '6p' | awk '{print $5}'
    done
done

for i in {1,2,4}; do
    for z in {1..3}; do
        echo -n "$z iter. Sysbench CPU for $i thread(s): "
        sysbench --num-threads=$i --max-time=$TIME --test=cpu run 2>&1 | grep "total time:" | awk '{print $3}'
    done
done

for i in {1,2,4}; do
    for z in {1..3}; do
        echo -n "$z iter. Sysbench Memory for $i thread(s): "
        sysbench --num-threads=$i --max-time=$TIME --test=memory run 2>&1 | grep "Operations performed:"
    done
done
```


For all tests except Sysbench CPU, higher is better (Sysbench CPU reports elapsed time, so there lower is better).

The results of all launches were collected in Excel spreadsheets for further calculations.
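Before the spreadsheets, the raw logs have to be turned into numbers. A minimal sketch of pulling the fio IOPS figures out of the log format the test script produces (the sample lines are taken from the run log below):

```python
import re

# Two lines in the format the test script's fio output produces
log = """\
read : io=891652KB, bw=14861KB/s, iops=3715, runt= 60001msec
write: io=381908KB, bw=6365.3KB/s, iops=1591, runt= 60001msec
"""

# Capture the operation (read/write) and its IOPS value from each line
pattern = re.compile(r"^(read|write)\s*:.*?iops=(\d+)", re.MULTILINE)
iops = {op: int(val) for op, val in pattern.findall(log)}
print(iops)
```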
Well, it seems I've covered what I did; now to tell what came of it.

Testing


A reference machine, outside the competition.


Clouds are usually compared with ordinary bare-metal servers. I do not see much sense in this, since a cloud is not only, and not so much, raw computing power as, first of all, an ecosystem; nevertheless, I think many will still find such a comparison interesting. And in general, you have to compare with something close, familiar, and understandable.
I didn't have a bare-metal machine at hand, but there is a fairly new non-Dell workstation, also serving as a home server, with a well-known processor (E5-4650L @ 2.60GHz), a suitable amount of not-the-fastest DDR3 ECC memory (frankly, the slowest that was compatible at all) and a SmartBuy SSD bought 4 years ago and recently moved into this build.

Since all of this runs under FreeBSD 11.2, a suitable virtual machine was created with bhyve and the test was run there.

Startup log
```
Run FIO
1 iter:
read : io=891652KB, bw=14861KB/s, iops=3715, runt= 60001msec
bw (KB /s): min= 116, max=17520, per=100.00%, avg=15449.34, stdev=2990.83
write: io=381908KB, bw=6365.3KB/s, iops=1591, runt= 60001msec
bw (KB /s): min= 49, max= 7752, per=100.00%, avg=6620.06, stdev=1290.46
READ: io=891652KB, aggrb=14860KB/s, minb=14860KB/s, maxb=14860KB/s, mint=60001msec, maxt=60001msec
WRITE: io=381908KB, aggrb=6365KB/s, minb=6365KB/s, maxb=6365KB/s, mint=60001msec, maxt=60001msec
2 iter:
read : io=930228KB, bw=15504KB/s, iops=3875, runt= 60001msec
bw (KB /s): min= 5088, max=17144, per=99.98%, avg=15500.61, stdev=2175.23
write: io=398256KB, bw=6637.6KB/s, iops=1659, runt= 60001msec
bw (KB /s): min= 2064, max= 7504, per=100.00%, avg=6639.82, stdev=979.69
READ: io=930228KB, aggrb=15503KB/s, minb=15503KB/s, maxb=15503KB/s, mint=60001msec, maxt=60001msec
WRITE: io=398256KB, aggrb=6637KB/s, minb=6637KB/s, maxb=6637KB/s, mint=60001msec, maxt=60001msec
3 iter:
read : io=886780KB, bw=14779KB/s, iops=3694, runt= 60001msec
bw (KB /s): min= 1823, max=17248, per=100.00%, avg=15520.09, stdev=2453.59
write: io=379988KB, bw=6333.3KB/s, iops=1583, runt= 60001msec
bw (KB /s): min= 731, max= 7488, per=100.00%, avg=6647.33, stdev=1054.67
READ: io=886780KB, aggrb=14779KB/s, minb=14779KB/s, maxb=14779KB/s, mint=60001msec, maxt=60001msec
WRITE: io=379988KB, aggrb=6333KB/s, minb=6333KB/s, maxb=6333KB/s, mint=60001msec, maxt=60001msec
Run stress-ng.
1 iter. Stress-NG for 1 CPU: 12227
2 iter. Stress-NG for 1 CPU: 12399
3 iter. Stress-NG for 1 CPU: 12134
1 iter. Stress-NG for 2 CPU: 23812
2 iter. Stress-NG for 2 CPU: 23558
3 iter. Stress-NG for 2 CPU: 21254
1 iter. Stress-NG for 4 CPU: 39495
2 iter. Stress-NG for 4 CPU: 39876
3 iter. Stress-NG for 4 CPU: 42370
1 iter. Sysbench CPU for 1 thread(s): 11.0566s
2 iter. Sysbench CPU for 1 thread(s): 11.0479s
3 iter. Sysbench CPU for 1 thread(s): 11.0451s
1 iter. Sysbench CPU for 2 thread(s): 5.6159s
2 iter. Sysbench CPU for 2 thread(s): 5.5664s
3 iter. Sysbench CPU for 2 thread(s): 5.5407s
1 iter. Sysbench CPU for 4 thread(s): 2.8368s
2 iter. Sysbench CPU for 4 thread(s): 2.8801s
3 iter. Sysbench CPU for 4 thread(s): 2.8244s
1 iter. Sysbench Memory for 1 thread(s): Operations performed: 104857600 (2537704.01 ops/sec)
2 iter. Sysbench Memory for 1 thread(s): Operations performed: 104857600 (2536025.17 ops/sec)
3 iter. Sysbench Memory for 1 thread(s): Operations performed: 104857600 (2472121.34 ops/sec)
1 iter. Sysbench Memory for 2 thread(s): Operations performed: 104857600 (3182800.43 ops/sec)
2 iter. Sysbench Memory for 2 thread(s): Operations performed: 104857600 (3379413.65 ops/sec)
3 iter. Sysbench Memory for 2 thread(s): Operations performed: 104857600 (3306495.59 ops/sec)
1 iter. Sysbench Memory for 4 thread(s): Operations performed: 104857600 (4300089.71 ops/sec)
2 iter. Sysbench Memory for 4 thread(s): Operations performed: 104857600 (4163689.93 ops/sec)
3 iter. Sysbench Memory for 4 thread(s): Operations performed: 104857600 (4163996.47 ops/sec)
```


If you translate the results into a table view, you get the following:
| Test | Iter 1 | Iter 2 | Iter 3 | Average | Stdev |
|---|---|---|---|---|---|
| FIO READ IOPS | 3715.00 | 3875.00 | 3694.00 | 3761.33 | 99.00 |
| FIO WRITE IOPS | 1591.00 | 1659.00 | 1583.00 | 1611.00 | 41.76 |
| STRESS-NG 1 CPU | 12227.00 | 12399.00 | 12134.00 | 12253.33 | 134.45 |
| STRESS-NG 2 CPU | 23812.00 | 23558.00 | 21254.00 | 22874.67 | 1409.27 |
| STRESS-NG 4 CPU | 39495.00 | 39876.00 | 42370.00 | 40580.33 | 1561.56 |
| Sysbench CPU for 1 | 11.06 | 11.05 | 11.05 | 11.05 | 0.01 |
| Sysbench CPU for 2 | 5.62 | 5.57 | 5.54 | 5.57 | 0.04 |
| Sysbench CPU for 4 | 2.84 | 2.88 | 2.82 | 2.85 | 0.03 |
| Sysbench Mem t 1 | 2537704.01 | 2536025.17 | 2472121.34 | 2515283.51 | 37388.96 |
| Sysbench Mem t 2 | 3182800.43 | 3379413.65 | 3306495.59 | 3289569.89 | 99393.41 |
| Sysbench Mem t 4 | 4300089.71 | 4163689.93 | 4163996.47 | 4209258.70 | 78662.11 |

So much for the reference data; now for the actual provider test results.

Further on, I will not include the complete logs so as not to bloat the article, but I keep them; if you want a link, ask and I will share it, though all the data from them has been transferred to the tables.

Yandex.Oblako


Results for ru-central1-a zone:

Result table
| Test | Iter 1 | Iter 2 | Iter 3 | Average | Stdev |
|---|---|---|---|---|---|
| FIO READ IOPS | 554.00 | 543.00 | 545.00 | 547.33 | 5.86 |
| FIO WRITE IOPS | 237.00 | 232.00 | 233.00 | 234.00 | 2.65 |
| STRESS-NG 1 CPU | 10236.00 | 10045.00 | 10161.00 | 10147.33 | 96.23 |
| STRESS-NG 2 CPU | 19756.00 | 19479.00 | 20291.00 | 19842.00 | 412.77 |
| STRESS-NG 4 CPU | 18743.00 | 17906.00 | 18192.00 | 18280.33 | 425.43 |
| Sysbench CPU for 1 | 11.94 | 11.95 | 11.98 | 11.96 | 0.02 |
| Sysbench CPU for 2 | 7.19 | 7.23 | 6.16 | 6.86 | 0.61 |
| Sysbench CPU for 4 | 3.72 | 3.72 | 3.70 | 3.71 | 0.01 |
| Sysbench Mem t 1 | 2080442.66 | 2085059.55 | 2079872.00 | 2081791.40 | 2844.64 |
| Sysbench Mem t 2 | 2460594.62 | 2715142.01 | 2536824.57 | 2570853.73 | 130641.04 |
| Sysbench Mem t 4 | 2978385.59 | 2928369.70 | 3020014.59 | 2975589.96 | 45886.36 |


Results for ru-central1-b zone:

Result table
| Test | Iter 1 | Iter 2 | Iter 3 | Average | Stdev |
|---|---|---|---|---|---|
| FIO READ IOPS | 543.00 | 537.00 | 523.00 | 534.33 | 10.26 |
| FIO WRITE IOPS | 232.00 | 230.00 | 224.00 | 228.67 | 4.16 |
| STRESS-NG 1 CPU | 10634.00 | 10848.00 | 11870.00 | 11117.33 | 660.55 |
| STRESS-NG 2 CPU | 22109.00 | 20861.00 | 21020.00 | 21330.00 | 679.30 |
| STRESS-NG 4 CPU | 18964.00 | 19449.00 | 18992.00 | 19135.00 | 272.29 |
| Sysbench CPU for 1 | 11.30 | 11.35 | 11.34 | 11.33 | 0.03 |
| Sysbench CPU for 2 | 5.87 | 5.88 | 5.89 | 5.88 | 0.01 |
| Sysbench CPU for 4 | 3.56 | 3.55 | 3.54 | 3.55 | 0.01 |
| Sysbench Mem t 1 | 2190808.15 | 2197111.57 | 2197600.12 | 2195173.28 | 3788.20 |
| Sysbench Mem t 2 | 2442631.19 | 2433028.20 | 2415710.66 | 2430456.68 | 13643.25 |
| Sysbench Mem t 4 | 3010239.12 | 3168720.68 | 3088677.50 | 3089212.43 | 79242.13 |


Results for ru-central1-c zone:

Result table
| Test | Iter 1 | Iter 2 | Iter 3 | Average | Stdev |
|---|---|---|---|---|---|
| FIO READ IOPS | 541.00 | 551.00 | 558.00 | 550.00 | 8.54 |
| FIO WRITE IOPS | 232.00 | 236.00 | 239.00 | 235.67 | 3.51 |
| STRESS-NG 1 CPU | 10424.00 | 10192.00 | 10325.00 | 10313.67 | 116.41 |
| STRESS-NG 2 CPU | 19637.00 | 20330.00 | 19585.00 | 19850.67 | 415.93 |
| STRESS-NG 4 CPU | 28884.00 | 28477.00 | 28750.00 | 28703.67 | 207.42 |
| Sysbench CPU for 1 | 11.67 | 11.64 | 11.68 | 11.67 | 0.02 |
| Sysbench CPU for 2 | 6.02 | 6.05 | 7.06 | 6.38 | 0.59 |
| Sysbench CPU for 4 | 3.40 | 3.40 | 3.40 | 3.40 | 0.00 |
| Sysbench Mem t 1 | 2131168.41 | 2130201.75 | 2142809.68 | 2134726.61 | 7016.81 |
| Sysbench Mem t 2 | 2777100.50 | 2592860.27 | 2226863.89 | 2532274.89 | 280076.82 |
| Sysbench Mem t 4 | 2834838.09 | 2935298.85 | 2753443.73 | 2841193.56 | 91093.99 |


Summary results:
| Test | Average | Avg min | Avg max | Stdev | StDev% |
|---|---|---|---|---|---|
| FIO READ IOPS | 543.89 | 534.33 | 550.00 | 8.38 | 1.5% |
| FIO WRITE IOPS | 232.78 | 228.67 | 235.67 | 3.66 | 1.6% |
| STRESS-NG 1 CPU | 10526.11 | 10147.33 | 11117.33 | 518.72 | 4.9% |
| STRESS-NG 2 CPU | 20340.89 | 19842.00 | 21330.00 | 856.61 | 4.2% |
| STRESS-NG 4 CPU | 22039.67 | 18280.33 | 28703.67 | 5786.99 | 26.3% |
| Sysbench CPU for 1 | 11.65 | 11.33 | 11.96 | 0.31 | 2.7% |
| Sysbench CPU for 2 | 6.37 | 5.88 | 6.86 | 0.49 | 7.7% |
| Sysbench CPU for 4 | 3.55 | 3.40 | 3.71 | 0.16 | 4.5% |
| Sysbench Mem t 1 | 2137230.43 | 2081791.40 | 2195173.28 | 56732.39 | 2.7% |
| Sysbench Mem t 2 | 2511195.10 | 2430456.68 | 2570853.73 | 72533.45 | 2.9% |
| Sysbench Mem t 4 | 2968665.32 | 2841193.56 | 3089212.43 | 124154.35 | 4.2% |

I want to draw special attention to one remarkable fact.

With all cores of the virtual machines fully loaded in zones A and B, total performance is LOWER than with only two of the four cores loaded.

Moreover, I took additional machines in one of the zones and ran the test on them: the problem did not go away.

I assume this is a technical problem related to the hardware characteristics of the machines used as hypervisors and how those are accounted for when allocating resources (a similar case comes to mind). Or perhaps something else; I cannot look inside and do not want to guess too much.

Hopefully, the folks from Yandex.Cloud will read this article and do something about it, and if we are really lucky, they will explain what it is. As it stands, it is somewhat insulting and occasionally harmful: many applications use the core count to decide how many threads to run.

Mail.RU Cloud (MCS)


Mail.ru has only two availability zones, so two of the tests were run on different machines in the same zone.

Results for the Moscow-East zone (the first VM):

Results table
| Test | Iter 1 | Iter 2 | Iter 3 | Average | Stdev |
|---|---|---|---|---|---|
| FIO READ IOPS | 487.00 | 538.00 | 534.00 | 519.67 | 28.36 |
| FIO WRITE IOPS | 209.00 | 231.00 | 229.00 | 223.00 | 12.17 |
| STRESS-NG 1 CPU | 7359.00 | 6567.00 | 7022.00 | 6982.67 | 397.46 |
| STRESS-NG 2 CPU | 14144.00 | 14916.00 | 13137.00 | 14065.67 | 892.08 |
| STRESS-NG 4 CPU | 21381.00 | 21199.00 | 21032.00 | 21204.00 | 174.55 |
| Sysbench CPU for 1 | 15.54 | 16.20 | 14.98 | 15.57 | 0.61 |
| Sysbench CPU for 2 | 7.30 | 7.70 | 7.53 | 7.51 | 0.20 |
| Sysbench CPU for 4 | 4.02 | 4.09 | 3.79 | 3.96 | 0.16 |
| Sysbench Mem t 1 | 1117493.99 | 1161261.85 | 1423941.92 | 1234232.59 | 165744.17 |
| Sysbench Mem t 2 | 1819474.62 | 1692128.17 | 1668347.81 | 1726650.20 | 81262.88 |
| Sysbench Mem t 4 | 2357943.97 | 2379492.56 | 2312976.14 | 2350137.56 | 33938.38 |


Results for the Moscow-East zone (the second VM):

Test results
| Test | Iter 1 | Iter 2 | Iter 3 | Average | Stdev |
|---|---|---|---|---|---|
| FIO READ IOPS | 475.00 | 509.00 | 472.00 | 485.33 | 20.55 |
| FIO WRITE IOPS | 205.00 | 218.00 | 204.00 | 209.00 | 7.81 |
| STRESS-NG 1 CPU | 6953.00 | 7030.00 | 7127.00 | 7036.67 | 87.19 |
| STRESS-NG 2 CPU | 14623.00 | 13945.00 | 13523.00 | 14030.33 | 554.94 |
| STRESS-NG 4 CPU | 27022.00 | 27184.00 | 27670.00 | 27292.00 | 337.23 |
| Sysbench CPU for 1 | 14.88 | 13.44 | 14.45 | 14.26 | 0.74 |
| Sysbench CPU for 2 | 6.89 | 7.13 | 6.69 | 6.90 | 0.22 |
| Sysbench CPU for 4 | 3.52 | 3.49 | 3.68 | 3.57 | 0.10 |
| Sysbench Mem t 1 | 1129165.42 | 1238462.80 | 1344025.16 | 1237217.79 | 107435.28 |
| Sysbench Mem t 2 | 1904396.37 | 1740914.98 | 1733216.87 | 1792842.74 | 96684.92 |
| Sysbench Mem t 4 | 2416702.17 | 2437844.98 | 2384159.80 | 2412902.32 | 27043.55 |


Results for the zone "Moscow-North":

Test results
| Test | Iter 1 | Iter 2 | Iter 3 | Average | Stdev |
|---|---|---|---|---|---|
| FIO READ IOPS | 510.00 | 647.00 | 613.00 | 590.00 | 71.34 |
| FIO WRITE IOPS | 218.00 | 277.00 | 262.00 | 252.33 | 30.66 |
| STRESS-NG 1 CPU | 9657.00 | 9742.00 | 9867.00 | 9755.33 | 105.63 |
| STRESS-NG 2 CPU | 19251.00 | 20069.00 | 19677.00 | 19665.67 | 409.12 |
| STRESS-NG 4 CPU | 39020.00 | 38665.00 | 38461.00 | 38715.33 | 282.88 |
| Sysbench CPU for 1 | 12.45 | 12.53 | 12.66 | 12.55 | 0.11 |
| Sysbench CPU for 2 | 6.25 | 6.20 | 6.22 | 6.22 | 0.02 |
| Sysbench CPU for 4 | 3.18 | 3.16 | 3.16 | 3.17 | 0.01 |
| Sysbench Mem t 1 | 2003899.51 | 1990350.38 | 1974380.86 | 1989543.58 | 14775.85 |
| Sysbench Mem t 2 | 1990419.20 | 2022621.53 | 1934822.52 | 1982621.08 | 44415.93 |
| Sysbench Mem t 4 | 2337084.52 | 2227633.06 | 2021779.21 | 2195498.93 | 160090.01 |


Summary results:

| Test | Average | Avg min | Avg max | Stdev | StDev% |
|---|---|---|---|---|---|
| FIO READ IOPS | 531.67 | 485.33 | 590.00 | 53.36 | 10.0% |
| FIO WRITE IOPS | 228.11 | 209.00 | 252.33 | 22.11 | 9.7% |
| STRESS-NG 1 CPU | 7924.89 | 6982.67 | 9755.33 | 1585.44 | 20.0% |
| STRESS-NG 2 CPU | 15920.56 | 14030.33 | 19665.67 | 3243.41 | 20.4% |
| STRESS-NG 4 CPU | 29070.44 | 21204.00 | 38715.33 | 8890.10 | 30.6% |
| Sysbench CPU for 1 | 14.13 | 12.55 | 15.57 | 1.52 | 10.7% |
| Sysbench CPU for 2 | 6.88 | 6.22 | 7.51 | 0.64 | 9.3% |
| Sysbench CPU for 4 | 3.57 | 3.17 | 3.96 | 0.40 | 11.2% |
| Sysbench Mem t 1 | 1486997.99 | 1234232.59 | 1989543.58 | 435219.81 | 29.3% |
| Sysbench Mem t 2 | 1834038.01 | 1726650.20 | 1982621.08 | 132864.82 | 7.2% |
| Sysbench Mem t 4 | 2319512.93 | 2195498.93 | 2412902.32 | 111890.39 | 4.8% |

Interestingly, the four-thread performance degradation problem is absent here, and it seems that honest (though rather weak) cores are provided.

Also, the "North" zone uses much more powerful processors than the "East" zone; the performance difference at full load reaches a factor of two. For the same money. Draw your own conclusions.

Selectel


Testing it produced very interesting results. In absolute terms, it provides the most powerful 4-core machines of all the providers tested.

Results for the zone "Moscow - Berzarina-1":

Test results
| Test | Iter 1 | Iter 2 | Iter 3 | Average | Stdev |
|---|---|---|---|---|---|
| FIO READ IOPS | 2319.00 | 2294.00 | 2312.00 | 2308.33 | 12.90 |
| FIO WRITE IOPS | 998.00 | 986.00 | 995.00 | 993.00 | 6.24 |
| STRESS-NG 1 CPU | 11320.00 | 11038.00 | 10936.00 | 11098.00 | 198.91 |
| STRESS-NG 2 CPU | 23164.00 | 22093.00 | 22558.00 | 22605.00 | 537.04 |
| STRESS-NG 4 CPU | 43879.00 | 44118.00 | 44086.00 | 44027.67 | 129.74 |
| Sysbench CPU for 1 | 12.01 | 11.96 | 11.97 | 11.98 | 0.02 |
| Sysbench CPU for 2 | 6.01 | 5.99 | 5.99 | 6.00 | 0.02 |
| Sysbench CPU for 4 | 3.01 | 3.00 | 3.00 | 3.00 | 0.01 |
| Sysbench Mem t 1 | 2158876.40 | 2162098.22 | 2158738.03 | 2159904.22 | 1901.32 |
| Sysbench Mem t 2 | 2413547.34 | 2340801.67 | 2569554.40 | 2441301.14 | 116874.54 |
| Sysbench Mem t 4 | 2858920.38 | 2935705.54 | 2714476.62 | 2836367.51 | 112325.57 |


Results for the zone "Moscow - Berzarina-2":

Results table
| Test | Iter 1 | Iter 2 | Iter 3 | Average | Stdev |
|---|---|---|---|---|---|
| FIO READ IOPS | 1735.00 | 1729.00 | 1724.00 | 1729.33 | 5.51 |
| FIO WRITE IOPS | 745.00 | 742.00 | 740.00 | 742.33 | 2.52 |
| STRESS-NG 1 CPU | 18231.00 | 18462.00 | 18518.00 | 18403.67 | 152.13 |
| STRESS-NG 2 CPU | 36965.00 | 36495.00 | 37006.00 | 36822.00 | 283.93 |
| STRESS-NG 4 CPU | 74272.00 | 74428.00 | 74218.00 | 74306.00 | 109.05 |
| Sysbench CPU for 1 | 11.22 | 11.17 | 11.15 | 11.18 | 0.03 |
| Sysbench CPU for 2 | 5.60 | 5.60 | 5.60 | 5.60 | 0.00 |
| Sysbench CPU for 4 | 2.83 | 2.81 | 2.81 | 2.82 | 0.01 |
| Sysbench Mem t 1 | 2396762.92 | 2405750.19 | 2394240.05 | 2398917.72 | 6050.06 |
| Sysbench Mem t 2 | 1980511.45 | 2079328.96 | 1968664.26 | 2009501.56 | 60761.74 |
| Sysbench Mem t 4 | 2283159.05 | 2271698.71 | 2299665.98 | 2284841.25 | 14059.32 |


Results for the SPB - Dubrovka-1 zone:

Results table
| Test | Iter 1 | Iter 2 | Iter 3 | Average | Stdev |
|---|---|---|---|---|---|
| FIO READ IOPS | 2550.00 | 2618.00 | 2666.00 | 2611.33 | 58.29 |
| FIO WRITE IOPS | 1096.00 | 1126.00 | 1147.00 | 1123.00 | 25.63 |
| STRESS-NG 1 CPU | 10801.00 | 10512.00 | 11175.00 | 10829.33 | 332.41 |
| STRESS-NG 2 CPU | 21418.00 | 21642.00 | 23179.00 | 22079.67 | 958.62 |
| STRESS-NG 4 CPU | 44183.00 | 44557.00 | 43012.00 | 43917.33 | 806.03 |
| Sysbench CPU for 1 | 11.97 | 11.99 | 11.99 | 11.99 | 0.01 |
| Sysbench CPU for 2 | 5.99 | 5.99 | 6.00 | 5.99 | 0.01 |
| Sysbench CPU for 4 | 3.02 | 3.00 | 3.00 | 3.01 | 0.01 |
| Sysbench Mem t 1 | 2159958.70 | 2162062.66 | 2158540.58 | 2160187.31 | 1772.13 |
| Sysbench Mem t 2 | 2430650.73 | 2512678.85 | 2417945.57 | 2453758.38 | 51420.53 |
| Sysbench Mem t 4 | 3171660.68 | 3018827.14 | 3343661.47 | 3178049.76 | 162511.39 |


Summary table with the results:
| Test | Average | Avg min | Avg max | Stdev | StDev% |
|---|---|---|---|---|---|
| FIO READ IOPS | 2216.33 | 1729.33 | 2611.33 | 448.14 | 20.2% |
| FIO WRITE IOPS | 952.78 | 742.33 | 1123.00 | 193.49 | 20.3% |
| STRESS-NG 1 CPU | 13443.67 | 10829.33 | 18403.67 | 4297.59 | 32.0% |
| STRESS-NG 2 CPU | 27168.89 | 22079.67 | 36822.00 | 8363.96 | 30.8% |
| STRESS-NG 4 CPU | 54083.67 | 43917.33 | 74306.00 | 17513.14 | 32.4% |
| Sysbench CPU for 1 | 11.72 | 11.18 | 11.99 | 0.46 | 4.0% |
| Sysbench CPU for 2 | 5.86 | 5.60 | 6.00 | 0.23 | 3.9% |
| Sysbench CPU for 4 | 2.94 | 2.82 | 3.01 | 0.11 | 3.7% |
| Sysbench Mem t 1 | 2239669.75 | 2159904.22 | 2398917.72 | 137912.86 | 6.2% |
| Sysbench Mem t 2 | 2301520.36 | 2009501.56 | 2453758.38 | 252972.39 | 11.0% |
| Sysbench Mem t 4 | 2766419.51 | 2284841.25 | 3178049.76 | 450693.81 | 16.3% |

As I have already said, of all those tested, this provider offers the highest-performing machines across 4 threads. But here, too, there is a quirk: again, for the same money we get performance that differs by almost a factor of two; compare the Berzarina-2 results with the rest.

I would also like to note the very fast disks at a reasonable price, the best among the three domestic providers tested. At the same time, the machine with the fastest processor had the slowest disk of the three.

It turns out to be a kind of lottery, though even if you are unlucky, everything is still very, very decent.

Google cloud


The GCE test results brought no particular surprises.

Everything is completely predictable, homogeneous and in general corresponds to the declared.

Results for europe-west1-b zone:

Test results
| Test | Iter 1 | Iter 2 | Iter 3 | Average | Stdev |
|---|---|---|---|---|---|
| FIO READ IOPS | 924.00 | 910.00 | 888.00 | 907.33 | 18.15 |
| FIO WRITE IOPS | 396.00 | 391.00 | 380.00 | 389.00 | 8.19 |
| STRESS-NG 1 CPU | 14237.00 | 14137.00 | 14094.00 | 14156.00 | 73.37 |
| STRESS-NG 2 CPU | 28576.00 | 28419.00 | 28544.00 | 28513.00 | 82.96 |
| STRESS-NG 4 CPU | 29996.00 | 29880.00 | 29449.00 | 29775.00 | 288.22 |
| Sysbench CPU for 1 | 12.63 | 12.66 | 12.67 | 12.65 | 0.02 |
| Sysbench CPU for 2 | 6.52 | 6.41 | 6.38 | 6.44 | 0.08 |
| Sysbench CPU for 4 | 3.35 | 3.56 | 3.56 | 3.49 | 0.12 |
| Sysbench Mem t 1 | 2055240.49 | 2056617.63 | 2054720.94 | 2055526.35 | 980.13 |
| Sysbench Mem t 2 | 1377683.73 | 1346931.63 | 1397680.79 | 1374098.72 | 25563.81 |
| Sysbench Mem t 4 | 2279937.89 | 2275427.56 | 2278615.94 | 2277993.80 | 2318.63 |


Results for europe-west1-c zone:

Test results
FIO READ IOPS946.00995.00984.00975.0025.71
FIO WRITE IOPS406.00428.00422.00418.6711.37
STRESS-NG 1 CPU14256.0014250.0014423.0014309.6798.20
STRESS-NG 2 CPU28875.0029057.0029256.0029062.67190.56
STRESS-NG 4 CPU30317.0030462.0029478.0030085.67531.23
Sysbench CPU for 112.5212.4912.6112.540.06
Sysbench CPU for 26.286.306.316.290.02
Sysbench CPU for 43.383.573.523.490.10
Sysbench Mem t 12085832.842066794.242086303.392079643.4911130.26
Sysbench mem t 21368168.111535725.511710618.591538170.74171238.33
Sysbench mem t 42375534.542307610.222386046.892356397.2242576.47


Results for europe-west1-d zone:

Test results
| Test | Iter 1 | Iter 2 | Iter 3 | Average | Stdev |
|---|---|---|---|---|---|
| FIO READ IOPS | 885.00 | 910.00 | 943.00 | 912.67 | 29.09 |
| FIO WRITE IOPS | 379.00 | 390.00 | 405.00 | 391.33 | 13.05 |
| STRESS-NG 1 CPU | 14254.00 | 14230.00 | 14008.00 | 14164.00 | 135.63 |
| STRESS-NG 2 CPU | 28262.00 | 28321.00 | 28473.00 | 28352.00 | 108.86 |
| STRESS-NG 4 CPU | 29615.00 | 29312.00 | 29138.00 | 29355.00 | 241.39 |
| Sysbench CPU for 1 | 12.61 | 12.65 | 12.66 | 12.64 | 0.03 |
| Sysbench CPU for 2 | 6.37 | 6.35 | 6.35 | 6.36 | 0.01 |
| Sysbench CPU for 4 | 3.43 | 3.56 | 3.55 | 3.52 | 0.07 |
| Sysbench Mem t 1 | 2050031.60 | 2068677.64 | 2052707.70 | 2057138.98 | 10081.96 |
| Sysbench Mem t 2 | 1228313.90 | 1530374.73 | 1345581.79 | 1368090.14 | 152283.14 |
| Sysbench Mem t 4 | 2335035.15 | 2420871.72 | 2361505.39 | 2372470.75 | 43956.33 |


Summary table with the results:
| Test | Average | Avg min | Avg max | Stdev | StDev% |
|---|---|---|---|---|---|
| FIO READ IOPS | 931.67 | 907.33 | 975.00 | 37.62 | 4.0% |
| FIO WRITE IOPS | 399.67 | 389.00 | 418.67 | 16.50 | 4.1% |
| STRESS-NG 1 CPU | 14209.89 | 14156.00 | 14309.67 | 86.50 | 0.6% |
| STRESS-NG 2 CPU | 28642.56 | 28352.00 | 29062.67 | 372.63 | 1.3% |
| STRESS-NG 4 CPU | 29738.56 | 29355.00 | 30085.67 | 366.69 | 1.2% |
| Sysbench CPU for 1 | 12.61 | 12.54 | 12.65 | 0.06 | 0.5% |
| Sysbench CPU for 2 | 6.36 | 6.29 | 6.44 | 0.07 | 1.1% |
| Sysbench CPU for 4 | 3.50 | 3.49 | 3.52 | 0.01 | 0.4% |
| Sysbench Mem t 1 | 2064102.94 | 2055526.35 | 2079643.49 | 13482.64 | 0.7% |
| Sysbench Mem t 2 | 1426786.53 | 1368090.14 | 1538170.74 | 96508.32 | 6.8% |
| Sysbench Mem t 4 | 2335620.59 | 2277993.80 | 2372470.75 | 50549.23 | 2.2% |

There is nothing to even comment on.

Performance with 4 threads hardly differs from 2, but it does not degrade either.

In general, each core is quite productive and stacks up well against a core of the out-of-competition reference machine, which itself can hardly be called weak.

The disks are no stars, but for most tasks they will be more than enough.

The only thing worth mentioning is the excellent homogeneity. Each of the machines differs in performance by no more than the measurement error, which gives excellent predictability and ease of planning.

AWS


The market leader. Its test somewhat surprised me, since it exhibits the same problem that showed up with Yandex.Cloud.

Although I have worked with it for quite a long time, I somehow never got around to investigating the performance difference between full and partial load, so the results were to some extent a surprise to me.

For testing, the c5.xlarge type was used, as the cheapest one matching the requirements.

Results for the eu-central-1a zone:

Test results
| Test | Iter 1 | Iter 2 | Iter 3 | Average | Stdev |
|---|---|---|---|---|---|
| FIO READ IOPS | 1839.00 | 1976.00 | 2083.00 | 1966.00 | 122.31 |
| FIO WRITE IOPS | 789.00 | 850.00 | 895.00 | 844.67 | 53.20 |
| STRESS-NG 1 CPU | 21422.00 | 21722.00 | 21736.00 | 21626.67 | 177.38 |
| STRESS-NG 2 CPU | 43305.00 | 43331.00 | 43197.00 | 43277.67 | 71.06 |
| STRESS-NG 4 CPU | 40876.00 | 40884.00 | 40888.00 | 40882.67 | 6.11 |
| Sysbench CPU for 1 | 8.77 | 8.77 | 8.77 | 8.77 | 0.00 |
| Sysbench CPU for 2 | 4.40 | 4.40 | 4.40 | 4.40 | 0.00 |
| Sysbench CPU for 4 | 2.52 | 2.52 | 2.52 | 2.52 | 0.00 |
| Sysbench Mem t 1 | 3063495.18 | 3064238.67 | 3063452.11 | 3063728.65 | 442.21 |
| Sysbench Mem t 2 | 1848705.16 | 1841708.24 | 1751938.22 | 1814117.21 | 53962.11 |
| Sysbench Mem t 4 | 2413033.89 | 2249609.19 | 2299986.20 | 2320876.43 | 83691.15 |


Results for the zone eu-central-1b:

Test results
| Test | Iter 1 | Iter 2 | Iter 3 | Average | Stdev |
|---|---|---|---|---|---|
| FIO READ IOPS | 1723.00 | 1988.00 | 2101.00 | 1937.33 | 194.03 |
| FIO WRITE IOPS | 739.00 | 855.00 | 903.00 | 832.33 | 84.32 |
| STRESS-NG 1 CPU | 21785.00 | 21733.00 | 21741.00 | 21753.00 | 28.00 |
| STRESS-NG 2 CPU | 43370.00 | 43323.00 | 40351.00 | 42348.00 | 1729.61 |
| STRESS-NG 4 CPU | 40857.00 | 40864.00 | 40916.00 | 40879.00 | 32.23 |
| Sysbench CPU for 1 | 8.77 | 8.77 | 8.77 | 8.77 | 0.00 |
| Sysbench CPU for 2 | 4.39 | 4.40 | 4.39 | 4.39 | 0.00 |
| Sysbench CPU for 4 | 2.52 | 2.52 | 2.52 | 2.52 | 0.00 |
| Sysbench Mem t 1 | 3065227.23 | 3065688.95 | 3063830.23 | 3064915.47 | 967.78 |
| Sysbench Mem t 2 | 2032840.35 | 1987864.46 | 1968489.39 | 1996398.07 | 33013.31 |
| Sysbench Mem t 4 | 2684716.32 | 2654257.87 | 2618592.53 | 2652522.24 | 33096.05 |


Results for the eu-central-1c zone:

Test results
| Test | Iter 1 | Iter 2 | Iter 3 | Average | Stdev |
|---|---|---|---|---|---|
| FIO READ IOPS | 1761.00 | 2003.00 | 2108.00 | 1957.33 | 177.95 |
| FIO WRITE IOPS | 756.00 | 861.00 | 906.00 | 841.00 | 76.97 |
| STRESS-NG 1 CPU | 21632.00 | 21708.00 | 21615.00 | 21651.67 | 49.52 |
| STRESS-NG 2 CPU | 43247.00 | 43236.00 | 43283.00 | 43255.33 | 24.58 |
| STRESS-NG 4 CPU | 39931.00 | 39359.00 | 40835.00 | 40041.67 | 744.20 |
| Sysbench CPU for 1 | 8.77 | 8.77 | 8.77 | 8.77 | 0.00 |
| Sysbench CPU for 2 | 4.40 | 4.40 | 4.40 | 4.40 | 0.00 |
| Sysbench CPU for 4 | 2.52 | 2.52 | 2.52 | 2.52 | 0.00 |
| Sysbench Mem t 1 | 3064343.66 | 3064434.20 | 2998820.16 | 3042532.67 | 37856.17 |
| Sysbench Mem t 2 | 2235882.60 | 2088501.51 | 2166875.91 | 2163753.34 | 73740.15 |
| Sysbench Mem t 4 | 2870035.79 | 2813221.50 | 2771999.66 | 2818418.98 | 49224.29 |


Summary table of results:
| Test | Average | Avg min | Avg max | Stdev | StDev% |
|---|---|---|---|---|---|
| FIO READ IOPS | 1953.56 | 1937.33 | 1966.00 | 14.70 | 0.8% |
| FIO WRITE IOPS | 839.33 | 832.33 | 844.67 | 6.33 | 0.8% |
| STRESS-NG 1 CPU | 21677.11 | 21626.67 | 21753.00 | 66.90 | 0.3% |
| STRESS-NG 2 CPU | 42960.33 | 42348.00 | 43277.67 | 530.41 | 1.2% |
| STRESS-NG 4 CPU | 40601.11 | 40041.67 | 40882.67 | 484.50 | 1.2% |
| Sysbench CPU for 1 | 8.77 | 8.77 | 8.77 | 0.00 | 0.0% |
| Sysbench CPU for 2 | 4.40 | 4.39 | 4.40 | 0.00 | 0.1% |
| Sysbench CPU for 4 | 2.52 | 2.52 | 2.52 | 0.00 | 0.1% |
| Sysbench Mem t 1 | 3057058.93 | 3042532.67 | 3064915.47 | 12594.10 | 0.4% |
| Sysbench Mem t 2 | 1991422.87 | 1814117.21 | 2163753.34 | 174871.16 | 8.8% |
| Sysbench Mem t 4 | 2597272.55 | 2320876.43 | 2818418.98 | 253330.90 | 9.8% |

As I said above - the results surprised me.

Yes, I understand that the problem manifests itself only under certain types of load (it is not visible in Sysbench), but given the results of the other platforms, this is clearly not a problem with the test but a performance limitation.

In AWS's defense, I can say that when creating a machine it lets you disable Hyper-Threading, which at least helps avoid the performance drop in some applications.

Otherwise, the disks do not guarantee this level of performance, but they support Burst to smooth out load spikes, so if you need to read/write relatively large amounts quickly but not very often (say, once every few minutes), everything will be fine.

Also, the homogeneity of the results is simply excellent, everything is predictable and without surprises.

Azure


Initially I did not want to include it in the test, because I had never really worked with it and did not even have an account there. But on reflection I decided to test it anyway, even at the cost of a bill, which I paid.

Right away I want to explain that the region was chosen on the principle of "somewhere in Europe," and the machine type fully matches the conditions (4 processors, 8 GB of memory).
In the first iteration of the test it was A4 v2, marked "General purpose," with which this article was first published. Experts in the comments explained what I did wrong: Azure has machine types where a slower one can cost more than a faster one, and you will not find this out without reading the documentation or googling. After that, the results were updated for the F4s type.

Results for France-Central-1:

Test results
| Test | Iter 1 | Iter 2 | Iter 3 | Average | Stdev |
|---|---|---|---|---|---|
| FIO READ IOPS | 1066.00 | 1102.00 | 1038.00 | 1068.67 | 32.08 |
| FIO WRITE IOPS | 457.00 | 473.00 | 445.00 | 458.33 | 14.05 |
| STRESS-NG 1 CPU | 9470.00 | 10059.00 | 10759.00 | 10096.00 | 645.30 |
| STRESS-NG 2 CPU | 20424.00 | 20502.00 | 20940.00 | 20622.00 | 278.14 |
| STRESS-NG 4 CPU | 39039.00 | 39294.00 | 39141.00 | 39158.00 | 128.35 |
| Sysbench CPU for 1 | 10.32 | 10.42 | 10.50 | 10.42 | 0.09 |
| Sysbench CPU for 2 | 5.35 | 5.35 | 5.33 | 5.35 | 0.01 |
| Sysbench CPU for 4 | 2.77 | 2.78 | 2.76 | 2.77 | 0.01 |
| Sysbench Mem t 1 | 2449793.14 | 2467589.35 | 2456056.19 | 2457812.89 | 9027.22 |
| Sysbench Mem t 2 | 2370286.78 | 2388077.81 | 2299377.92 | 2352580.84 | 46925.93 |
| Sysbench Mem t 4 | 2697042.08 | 2625447.20 | 2707918.64 | 2676802.64 | 44806.37 |


Results for France-Central-2:

Test results
| Test | Iter 1 | Iter 2 | Iter 3 | Average | Stdev |
|---|---|---|---|---|---|
| FIO READ IOPS | 1037.00 | 1104.00 | 1102.00 | 1081.00 | 38.12 |
| FIO WRITE IOPS | 445.00 | 473.00 | 473.00 | 463.67 | 16.17 |
| STRESS-NG 1 CPU | 10159.00 | 10360.00 | 10452.00 | 10323.67 | 149.84 |
| STRESS-NG 2 CPU | 21027.00 | 20025.00 | 20415.00 | 20489.00 | 505.08 |
| STRESS-NG 4 CPU | 39530.00 | 40927.00 | 40170.00 | 40209.00 | 699.32 |
| Sysbench CPU for 1 | 10.39 | 9.95 | 9.91 | 10.08 | 0.27 |
| Sysbench CPU for 2 | 5.09 | 5.13 | 5.19 | 5.14 | 0.05 |
| Sysbench CPU for 4 | 2.69 | 2.75 | 2.66 | 2.70 | 0.04 |
| Sysbench Mem t 1 | 2568336.75 | 2450640.64 | 2567906.16 | 2528961.18 | 67827.92 |
| Sysbench Mem t 2 | 2401273.88 | 2362027.64 | 2372950.76 | 2378750.76 | 20255.79 |
| Sysbench Mem t 4 | 2740927.62 | 2787787.19 | 2770497.39 | 2766404.07 | 23696.44 |


Results for France-Central-3:

Test results
| Test | Iter 1 | Iter 2 | Iter 3 | Average | Stdev |
|---|---|---|---|---|---|
| FIO READ IOPS | 1436.00 | 830.00 | 1136.00 | 1134.00 | 303.00 |
| FIO WRITE IOPS | 614.00 | 355.00 | 487.00 | 485.33 | 129.51 |
| STRESS-NG 1 CPU | 10834.00 | 10326.00 | 10763.00 | 10641.00 | 275.10 |
| STRESS-NG 2 CPU | 21505.00 | 21108.00 | 21428.00 | 21347.00 | 210.53 |
| STRESS-NG 4 CPU | 42194.00 | 41540.00 | 41427.00 | 41720.33 | 414.08 |
| Sysbench CPU for 1 | 9.87 | 9.75 | 9.79 | 9.80 | 0.06 |
| Sysbench CPU for 2 | 5.04 | 5.05 | 5.13 | 5.08 | 0.05 |
| Sysbench CPU for 4 | 2.67 | 2.65 | 2.67 | 2.66 | 0.01 |
| Sysbench Mem t 1 | 2622263.24 | 2616326.80 | 2632668.25 | 2623752.76 | 8271.93 |
| Sysbench Mem t 2 | 2495841.62 | 2438685.04 | 2556294.51 | 2496940.39 | 58812.43 |
| Sysbench Mem t 4 | 2814306.59 | 2783117.34 | 2846909.91 | 2814777.95 | 31898.90 |


Summary table of results:
| Test | Average | Avg min | Avg max | Stdev | StDev% |
|---|---|---|---|---|---|
| FIO READ IOPS | 1094.56 | 1068.67 | 1134.00 | 34.71 | 3.2% |
| FIO WRITE IOPS | 469.11 | 458.33 | 485.33 | 14.30 | 3.0% |
| STRESS-NG 1 CPU | 10353.56 | 10096.00 | 10641.00 | 273.73 | 2.6% |
| STRESS-NG 2 CPU | 20819.33 | 20489.00 | 21347.00 | 461.79 | 2.2% |
| STRESS-NG 4 CPU | 40362.44 | 39158.00 | 41720.33 | 1288.04 | 3.2% |
| Sysbench CPU for 1 | 10.10 | 9.80 | 10.42 | 0.31 | 3.0% |
| Sysbench CPU for 2 | 5.19 | 5.08 | 5.35 | 0.14 | 2.7% |
| Sysbench CPU for 4 | 2.71 | 2.66 | 2.77 | 0.05 | 2.0% |
| Sysbench Mem t 1 | 2536842.28 | 2457812.89 | 2623752.76 | 83250.19 | 3.3% |
| Sysbench Mem t 2 | 2409424.00 | 2352580.84 | 2496940.39 | 76912.65 | 3.2% |
| Sysbench Mem t 4 | 2752661.55 | 2676802.64 | 2814777.95 | 70006.71 | 2.5% |

Good performance, among the best of the platforms presented. Sadly, the price spoils everything.

Results


Performance


Let's start with a pivot table of results.

It is inserted as an image because I wanted to use colors, but the data in it comes from the tables presented above.


The smaller the better

Let's take a closer look at CPU performance:



Overall, AWS holds the lead in average measured performance for single- and dual-core loads. Google Cloud comes second.

Of the Russian providers, Selectel showed the best results. Besides third place under partial load, it takes a clear first place when all cores are loaded, even accounting for the uneven results between zones (which is unpleasant, but in this case does not affect the outcome).

Now memory:



In memory speed, AWS holds the lead for single-threaded mode, Azure for two threads, and Yandex.Cloud for four.

Disks:



In disk speed we have an unambiguous winner: Selectel. None of the other participants in the comparison offers anything comparable for similar money.

In second place is AWS, thanks to its allowed Burst and generally decent speed.
It is followed by GCE and Azure, with Yandex.Cloud and MCS closing the list, offering solutions of roughly equal performance.

Cost relative to performance


Now let's talk about one more interesting factor: cost.

This comparison in no way covers the total cost of solutions on the different platforms; its goal is simple: to compare the cost of a unit of performance across providers.

We will base the calculation on the stress-ng test.
Estimated prices for one month of use of each instance (excluding VAT):
| Provider | Yandex.Cloud | MCS | Selectel | GCE | AWS | Azure |
|---|---|---|---|---|---|---|
| Price (cur) | 3799.12 | 3608 | 4050.62 | 103.08 | 147.57 | 147.46 |
| Price (rub) | 3799.12 | 3608 | 4050.62 | 6747.6168 | 9659.9322 | 9652.7316 |
| Alt price (cur) | 3799.12 | 3608 | 3454.94 | 35.60 | 56.07 | 147.46 |
| Alt price (rub) | 3799.12 | 3608 | 3454.94 | 2330.376 | 3670.3422 | 9652.7316 |

The cost table requires some explanation.

For providers that offer the cost-reduction options described at the beginning of the article, there are two prices: the base one and an alternative one calculated with those options taken into account.

Since this is not a discount and depends on usage scenarios, which vary, I thought it a good idea to calculate the cost with these options as well.

Also, because of the currency difference, the cost of AWS, Azure (yes, I know it can somehow show prices in rubles, but its calculator showed me dollars) and GCE has been converted to a ruble equivalent at a rate of 65.46 rubles per US dollar.
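The conversion is a straight multiplication by that rate; as a check, here it is applied to the monthly USD prices from the table above, reproducing the ruble row:

```python
USD_TO_RUB = 65.46  # exchange rate used throughout the comparison

# Monthly USD prices from the providers' calculators (from the table above)
usd_prices = {"GCE": 103.08, "AWS": 147.57, "Azure": 147.46}

# Ruble equivalents, as listed in the "Price (rub)" row
rub_prices = {name: round(usd * USD_TO_RUB, 4) for name, usd in usd_prices.items()}
print(rub_prices)
```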

Also, for Azure I could not isolate the cost of the disk: the standard instance disk there is 16 GB, and it is not very clear from the calculator what a disk would really cost (the number of requests is counted too), so the price is given for the instance itself only. This does not change the overall picture anyway: Azure remains the most expensive.

So, the cost of each solution, expressed in rubles per "parrot" in the stress-ng test, for the minimum amount of resources obtained in the test:
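The "rubles per parrot" metric is just the monthly price divided by the stress-ng bogo-ops figure. A sketch using the Selectel numbers from the tables above (monthly price without VAT, minimum Stress-NG 4 CPU result):

```python
# Monthly price without VAT and the minimum Stress-NG 4 CPU result
# for Selectel, both taken from the tables above
price_rub = 4050.62
min_bogo_ops_4cpu = 43917.33

# Cost of one stress-ng "parrot" per month: lower is better
rub_per_parrot = price_rub / min_bogo_ops_4cpu
print(f"{rub_per_parrot:.4f} rub per parrot")
```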


Lower is better

If we calculate based on the average test results instead, the picture does not fundamentally change, but some things do shift:


Lower is better

It turns out that, counted without VAT, Selectel wins in every category, and in the heavy-load category by nearly a factor of two.

Now let's see what happens if we recalculate the cost taking into account the possible savings depending on the usage scenario.

The alternative cost, accounting for scenario-based savings, in rubles per parrot in the stress-ng test for the minimum amount of resources obtained in the test:


Lower is better

The same, but for the average amount of resources:



Here the picture changes.

In all scenarios except constant heavy full load, AWS and GCE pull ahead by a decent margin, with practically identical cost per unit of resources.

Under heavy load, Selectel competes with them, offering resources for practically the same money but with fewer concessions (after all, its nodes are permanent and do not shut down at an arbitrary moment, unlike AWS Spot and Google Preemptible instances).

That is how, with a careful and competent approach to architecture, you can save a great deal seemingly out of thin air.

In lieu of conclusions


The test turned out long, but to me it was interesting.

I drew some conclusions from the results for myself; I hope it also helps you look at cloud platform performance from a slightly different angle, perhaps eases the agony of choice a bit, and aids in diagnosing performance problems on some platforms caused by the "features" uncovered.
**UPDATE** Selectel's conclusions and prices have been updated, since VAT was accounted for incorrectly in them.
**UPDATE2** Azure results have been updated for the new node type and the conclusions revised, but fundamentally nothing changed.

Source: https://habr.com/ru/post/439690/