This post investigates the network performance of AWS R4 instances, focusing on the "Up to 10 Gigabit" networking expected from the smaller (r4.large - r4.4xlarge) instance types. Before starting, it should be noted that this post is based on observation and is therefore prone to imprecision and variance; it is intended as a guide to what can be expected, not a comprehensive or scientific review.
The R4 instance documentation states "The smaller R4 instance sizes offer peak throughput of 10 Gbps. These instances use a network I/O credit mechanism to allocate network bandwidth to instances based on average bandwidth utilization. These instances accrue credits when their network throughput is below their baseline limits, and can use these credits when they perform network data transfers." This is not particularly helpful in understanding the lower bounds on network performance, and gives no indication of what the baseline limits are; AWS simply recommends that customers benchmark the networking performance of various instances to evaluate whether an instance type and size will meet their application's network performance requirements.
Logically we would expect the r4.large to receive a fraction of the total 20 Gbps available on an r4.16xlarge. From the instance size normalisation table in the reserved instance modification documentation, a *.large instance (factor of 4) should expect 1/32 of the resources available on a *.16xlarge instance (factor of 128), which works out at 0.625 Gbps (20 Gbps / 32), or 625 Mbps.
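The scaling arithmetic can be sketched with a quick calculation. The normalisation factors come from the reserved instance modification table; the 20 Gbps figure is the advertised r4.16xlarge limit:

```python
# Normalisation factors from the reserved instance modification table.
FACTORS = {"large": 4, "xlarge": 8, "2xlarge": 16, "4xlarge": 32, "16xlarge": 128}

R4_16XLARGE_GBPS = 20  # advertised limit on the r4.16xlarge


def expected_gbps(size):
    """Naive expectation: bandwidth scales linearly with the size factor."""
    return R4_16XLARGE_GBPS * FACTORS[size] / FACTORS["16xlarge"]


print(expected_gbps("large"))  # 0.625 Gbps, i.e. 625 Mbps
```

Whether bandwidth actually scales linearly like this is exactly what the rest of the post sets out to test.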
Testing r4.large baseline network performance
Using iperf3 between two newly launched Amazon Linux r4.large instances in the same availability zone in eu-west-1, we run into the first interesting anomaly, with the network stream maxing out at 5 Gbps rather than the expected 10 Gbps:
$ iperf3 -p 5201 -c 172.31.7.67 -i 1 -t 3600 -f m -V
iperf 3-CURRENT
Linux ip-172-31-10-235 4.9.32-15.41.amzn1.x86_64 #1 SMP Thu Jun 22 06:20:54 UTC 2017 x86_64
Control connection MSS 8949
Time: Sun, 20 Aug 2017 07:35:48 GMT
Connecting to host 172.31.7.67, port 5201
Cookie: p2v6ry2kzjo2udittrzgmxotz7we3in5etmv
TCP MSS: 8949 (default)
[ 5] local 172.31.10.235 port 41270 connected to 172.31.7.67 port 5201
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks, omitting 0 seconds, 3600 second test, tos 0
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 598 MBytes 5015 Mbits/sec 9 664 KBytes
[ 5] 1.00-2.00 sec 596 MBytes 4999 Mbits/sec 3 559 KBytes
[ 5] 2.00-3.00 sec 595 MBytes 4992 Mbits/sec 9 586 KBytes
[ 5] 3.00-4.00 sec 595 MBytes 4989 Mbits/sec 0 638 KBytes
[ 5] 4.00-5.00 sec 596 MBytes 5000 Mbits/sec 0 638 KBytes
[ 5] 5.00-6.00 sec 595 MBytes 4989 Mbits/sec 0 638 KBytes
[ 5] 6.00-7.00 sec 595 MBytes 4990 Mbits/sec 6 638 KBytes
[ 5] 7.00-8.00 sec 595 MBytes 4990 Mbits/sec 3 524 KBytes
[ 5] 8.00-9.00 sec 596 MBytes 4997 Mbits/sec 0 586 KBytes
[ 5] 9.00-10.00 sec 596 MBytes 4997 Mbits/sec 0 603 KBytes
[ 5] 10.00-11.00 sec 595 MBytes 4990 Mbits/sec 0 638 KBytes
Interestingly, using 2 parallel streams results in us (mostly) reaching the advertised 10 Gbps:
$ iperf3 -p 5201 -c 172.31.7.67 -i 1 -t 3600 -f m -V -P 2
iperf 3-CURRENT
Linux ip-172-31-10-235 4.9.32-15.41.amzn1.x86_64 #1 SMP Thu Jun 22 06:20:54 UTC 2017 x86_64
Control connection MSS 8949
Time: Sun, 20 Aug 2017 07:37:38 GMT
Connecting to host 172.31.7.67, port 5201
Cookie: q343avscwpva5uyg2ayeinboxi5pllvw5l7r
TCP MSS: 8949 (default)
[ 5] local 172.31.10.235 port 41274 connected to 172.31.7.67 port 5201
[ 7] local 172.31.10.235 port 41276 connected to 172.31.7.67 port 5201
Starting Test: protocol: TCP, 2 streams, 131072 byte blocks, omitting 0 seconds, 3600 second test, tos 0
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 597 MBytes 5010 Mbits/sec 0 690 KBytes
[ 7] 0.00-1.00 sec 592 MBytes 4968 Mbits/sec 0 717 KBytes
[SUM] 0.00-1.00 sec 1.16 GBytes 9979 Mbits/sec 0
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 1.00-2.00 sec 595 MBytes 4994 Mbits/sec 0 690 KBytes
[ 7] 1.00-2.00 sec 592 MBytes 4962 Mbits/sec 18 638 KBytes
[SUM] 1.00-2.00 sec 1.16 GBytes 9956 Mbits/sec 18
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 2.00-3.00 sec 591 MBytes 4957 Mbits/sec 137 463 KBytes
[ 7] 2.00-3.00 sec 587 MBytes 4924 Mbits/sec 41 725 KBytes
[SUM] 2.00-3.00 sec 1.15 GBytes 9881 Mbits/sec 178
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 3.00-4.00 sec 593 MBytes 4973 Mbits/sec 46 367 KBytes
[ 7] 3.00-4.00 sec 591 MBytes 4956 Mbits/sec 40 419 KBytes
[SUM] 3.00-4.00 sec 1.16 GBytes 9929 Mbits/sec 86
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 4.00-5.00 sec 592 MBytes 4968 Mbits/sec 141 542 KBytes
[ 7] 4.00-5.00 sec 591 MBytes 4960 Mbits/sec 36 559 KBytes
[SUM] 4.00-5.00 sec 1.16 GBytes 9928 Mbits/sec 177
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 5.00-6.00 sec 595 MBytes 4995 Mbits/sec 30 664 KBytes
[ 7] 5.00-6.00 sec 588 MBytes 4934 Mbits/sec 8 568 KBytes
[SUM] 5.00-6.00 sec 1.16 GBytes 9929 Mbits/sec 38
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 6.00-7.00 sec 596 MBytes 5000 Mbits/sec 0 664 KBytes
[ 7] 6.00-7.00 sec 589 MBytes 4945 Mbits/sec 0 629 KBytes
[SUM] 6.00-7.00 sec 1.16 GBytes 9945 Mbits/sec 0
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 7.00-8.00 sec 588 MBytes 4935 Mbits/sec 7 655 KBytes
[ 7] 7.00-8.00 sec 594 MBytes 4982 Mbits/sec 0 682 KBytes
[SUM] 7.00-8.00 sec 1.15 GBytes 9917 Mbits/sec 7
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 8.00-9.00 sec 593 MBytes 4974 Mbits/sec 8 620 KBytes
[ 7] 8.00-9.00 sec 593 MBytes 4978 Mbits/sec 12 717 KBytes
[SUM] 8.00-9.00 sec 1.16 GBytes 9952 Mbits/sec 20
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 9.00-10.00 sec 596 MBytes 4999 Mbits/sec 0 638 KBytes
[ 7] 9.00-10.00 sec 590 MBytes 4951 Mbits/sec 0 717 KBytes
[SUM] 9.00-10.00 sec 1.16 GBytes 9950 Mbits/sec 0
This behaviour is not consistent: stopping and restarting the instances often resulted in the full 10 Gbps on a single stream, suggesting the issue relates to instance placement. This appears to be supported by the placement group documentation, which states: "Network traffic to and from resources outside the placement group is limited to 5 Gbps." It is also possible that the streams are incorrectly being treated as placement group or public internet flows, which have different limits. For consistency, I have used two parallel streams to avoid this issue in the rest of the article.
The CloudWatch graph shows us reaching a steady baseline around eight minutes after starting iperf3:
A quick word on the graph above: firstly, it is in bytes and, with detailed monitoring enabled, at one-minute granularity. For conversion purposes this means we need to divide the value of the metric by 60 to get bytes per second, and then multiply by 8 to get bits per second. Looking at the actual data behind the graph:
$ aws cloudwatch get-metric-statistics --metric-name NetworkOut --start-time 2017-08-20T08:23:00 --end-time 2017-08-20T08:35:00 --period 60 --namespace AWS/EC2 --statistics Average --dimensions Name=InstanceId,Value=i-0a7e009e7c0bf8fa8 --query 'Datapoints[*].[Timestamp,Average]' --output=text | sort
2017-08-20T08:23:00Z 486.0
2017-08-20T08:24:00Z 5726.0
2017-08-20T08:25:00Z 22711496136.0
2017-08-20T08:26:00Z 76376122845.0
2017-08-20T08:27:00Z 76403033046.0
2017-08-20T08:28:00Z 76357957564.0
2017-08-20T08:29:00Z 76304994405.0
2017-08-20T08:30:00Z 48667898310.0
2017-08-20T08:31:00Z 5776989873.0
2017-08-20T08:32:00Z 5816890095.0
2017-08-20T08:33:00Z 5692555065.0
2017-08-20T08:34:00Z 5692014471.0
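As a sanity check, the conversion described above can be applied directly to these datapoints (a quick sketch; the values are taken from the CloudWatch output):

```python
def bytes_per_minute_to_gbps(value):
    """CloudWatch NetworkOut with detailed monitoring reports bytes per minute:
    divide by 60 for bytes/second, multiply by 8 for bits/second."""
    return value / 60 * 8 / 1e9


peak = bytes_per_minute_to_gbps(76376122845)     # datapoint from 08:26
baseline = bytes_per_minute_to_gbps(5692555065)  # datapoint from 08:33

print(round(peak, 2), round(baseline, 2))  # 10.18 0.76
```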
The maximum average throughput (between 08:26 and 08:29) is around 76 GByte/minute, which works out at around 1.27 GByte/second, or approximately 10.1 Gbit/second. Similarly, the baseline (from 08:31 onwards) is in the region of 5.7 GByte/minute, which translates to around 95 MByte/second, or roughly 750 Mbit/second. These numbers are naturally averages, but they are fairly close to the actual iperf3 results, with a peak throughput of just over 10 Gbit/second:
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 6] 0.00-1.00 sec 604 MBytes 5065 Mbits/sec 0 551 KBytes
[ 8] 0.00-1.00 sec 604 MBytes 5062 Mbits/sec 0 524 KBytes
[SUM] 0.00-1.00 sec 1.18 GBytes 10127 Mbits/sec 0
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 6] 1.00-2.00 sec 601 MBytes 5046 Mbits/sec 0 551 KBytes
[ 8] 1.00-2.00 sec 602 MBytes 5048 Mbits/sec 0 551 KBytes
[SUM] 1.00-2.00 sec 1.18 GBytes 10094 Mbits/sec 0
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 6] 2.00-3.00 sec 602 MBytes 5046 Mbits/sec 0 577 KBytes
[ 8] 2.00-3.00 sec 602 MBytes 5047 Mbits/sec 0 577 KBytes
[SUM] 2.00-3.00 sec 1.17 GBytes 10093 Mbits/sec 0
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 6] 3.00-4.00 sec 601 MBytes 5045 Mbits/sec 0 577 KBytes
[ 8] 3.00-4.00 sec 602 MBytes 5049 Mbits/sec 0 577 KBytes
[SUM] 3.00-4.00 sec 1.18 GBytes 10095 Mbits/sec 0
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 6] 4.00-5.00 sec 602 MBytes 5049 Mbits/sec 0 577 KBytes
[ 8] 4.00-5.00 sec 601 MBytes 5045 Mbits/sec 0 577 KBytes
[SUM] 4.00-5.00 sec 1.18 GBytes 10094 Mbits/sec 0
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 6] 5.00-6.00 sec 602 MBytes 5049 Mbits/sec 0 577 KBytes
[ 8] 5.00-6.00 sec 601 MBytes 5042 Mbits/sec 0 577 KBytes
[SUM] 5.00-6.00 sec 1.17 GBytes 10092 Mbits/sec 0
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 6] 6.00-7.00 sec 602 MBytes 5046 Mbits/sec 0 577 KBytes
[ 8] 6.00-7.00 sec 602 MBytes 5049 Mbits/sec 0 577 KBytes
[SUM] 6.00-7.00 sec 1.18 GBytes 10095 Mbits/sec 0
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 6] 7.00-8.00 sec 603 MBytes 5063 Mbits/sec 0 1.44 MBytes
[ 8] 7.00-8.00 sec 601 MBytes 5041 Mbits/sec 66 524 KBytes
[SUM] 7.00-8.00 sec 1.18 GBytes 10104 Mbits/sec 66
- - - - - - - - - - - - - - - - - - - - - - - - -
And the baseline of around 750 Mbit/second:
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 6] 376.00-377.00 sec 43.8 MBytes 367 Mbits/sec 157 114 KBytes
[ 8] 376.00-377.00 sec 43.8 MBytes 367 Mbits/sec 157 78.7 KBytes
[SUM] 376.00-377.00 sec 87.5 MBytes 734 Mbits/sec 314
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 6] 377.00-378.00 sec 45.0 MBytes 377 Mbits/sec 161 69.9 KBytes
[ 8] 377.00-378.00 sec 45.0 MBytes 377 Mbits/sec 167 78.7 KBytes
[SUM] 377.00-378.00 sec 90.0 MBytes 755 Mbits/sec 328
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 6] 378.00-379.00 sec 43.8 MBytes 367 Mbits/sec 182 69.9 KBytes
[ 8] 378.00-379.00 sec 45.0 MBytes 377 Mbits/sec 168 105 KBytes
[SUM] 378.00-379.00 sec 88.8 MBytes 744 Mbits/sec 350
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 6] 379.00-380.00 sec 42.5 MBytes 357 Mbits/sec 150 61.2 KBytes
[ 8] 379.00-380.00 sec 46.2 MBytes 388 Mbits/sec 165 96.1 KBytes
[SUM] 379.00-380.00 sec 88.8 MBytes 744 Mbits/sec 315
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 6] 380.00-381.00 sec 36.2 MBytes 304 Mbits/sec 129 78.7 KBytes
[ 8] 380.00-381.00 sec 52.5 MBytes 440 Mbits/sec 203 105 KBytes
[SUM] 380.00-381.00 sec 88.8 MBytes 744 Mbits/sec 332
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 6] 381.00-382.00 sec 36.2 MBytes 304 Mbits/sec 147 96.1 KBytes
[ 8] 381.00-382.00 sec 52.5 MBytes 440 Mbits/sec 220 87.4 KBytes
[SUM] 381.00-382.00 sec 88.8 MBytes 744 Mbits/sec 367
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 6] 382.00-383.00 sec 46.2 MBytes 388 Mbits/sec 175 52.4 KBytes
[ 8] 382.00-383.00 sec 42.5 MBytes 357 Mbits/sec 167 114 KBytes
[SUM] 382.00-383.00 sec 88.8 MBytes 744 Mbits/sec 342
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 6] 383.00-384.00 sec 41.2 MBytes 346 Mbits/sec 165 61.2 KBytes
[ 8] 383.00-384.00 sec 47.5 MBytes 398 Mbits/sec 170 96.1 KBytes
[SUM] 383.00-384.00 sec 88.8 MBytes 744 Mbits/sec 335
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 6] 384.00-385.00 sec 50.0 MBytes 419 Mbits/sec 195 87.4 KBytes
[ 8] 384.00-385.00 sec 38.8 MBytes 325 Mbits/sec 157 52.4 KBytes
[SUM] 384.00-385.00 sec 88.8 MBytes 744 Mbits/sec 352
- - - - - - - - - - - - - - - - - - - - - - - - -
Calculating network credit rates
Using the baseline network performance we can draw some inferences about the rate at which network credits are accrued. For simplicity I am going to define a network credit as having a value of 1 Gbps, so an instance with 10 network credits could transmit at 10 Gbps for 1 second. Naturally, if the instance network limit is 10 Gbps, the maximum rate can't be exceeded even if the credit balance is sufficient (20 credits allows 2 seconds at 10 Gbps rather than 1 second at 20 Gbps). Given the baseline network performance in the previous section, we can assume that an r4.large has a network credit rate of around 0.75 credits per second. We can also assume a starting balance of around 2700, as we were able to maintain 10 Gbps for around 295 seconds ((10 - 0.75) * 295) at the start of the iperf3 run. Finally, it appears the maximum credit balance on the r4.large is the same as the initial balance: leaving the instances idle for 3 hours should have resulted in a credit balance of around 8100 (0.75 credits per second * 3600 seconds per hour * 3 hours), which should theoretically have allowed around 810 seconds at 10 Gbps, but instead provided only around 295 seconds.
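Under these assumptions the credit mechanism behaves like a token bucket. A minimal model of the burst duration follows; note that the 2700 initial balance and 0.75 accrual rate are the estimates derived above, not published figures:

```python
def burst_seconds(initial_credits, accrual_rate, burst_gbps=10):
    """Seconds an instance can sustain burst_gbps, spending credits at
    burst_gbps per second while still accruing at accrual_rate per second."""
    return initial_credits / (burst_gbps - accrual_rate)


# Estimated r4.large figures: ~2700 starting credits, ~0.75 credits/second.
print(round(burst_seconds(2700, 0.75)))  # ~292 seconds, close to the observed 295
```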
R4 network performance table
Below is a table of the expected performance for R4 instance sizes.
| Instance | Baseline Gbps (approximate) | Initial/Max Credit (approximate) | Maximum time at 10 Gbps (approximate seconds) |
|------------|------|-------|------|
| r4.large | 0.75 | 2700 | 295 |
| r4.xlarge | 1.25 | 5145 | 589 |
| r4.2xlarge | 2.5 | 8925 | 1191 |
| r4.4xlarge | 5 | 11950 | 2390 |
To calculate whether an instance will meet your network throughput requirements, take the difference between the baseline rate and your application's base network utilisation: this is the rate at which credits accrue. Dividing the required burst rate by this accrual rate gives the interval between bursts.
For example, if your application requires a baseline of 0.6 Gbps on an r4.large, you would accrue credits at around 0.15 per second, allowing you to burst at 10 Gbps for approximately one second every 66 seconds (10 / 0.15), or for 10 seconds every 660 seconds.
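This sizing calculation can be sketched as a small helper (the figures match the worked example above; the baseline estimates are, again, observations rather than published limits):

```python
def seconds_per_burst_second(baseline_gbps, app_gbps, burst_gbps=10):
    """Seconds of credit accrual needed to pay for one second at burst_gbps,
    given an application already consuming app_gbps of the baseline."""
    accrual = baseline_gbps - app_gbps  # net credits gained per second
    return burst_gbps / accrual


# r4.large (~0.75 Gbps baseline) with an application using 0.6 Gbps:
print(round(seconds_per_burst_second(0.75, 0.6)))  # roughly one burst second every ~67s
```

If the accrual rate is zero or negative (your application's base utilisation meets or exceeds the baseline), the instance can never recover credits and a larger size is needed.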