Understanding Linux Internals for Data Transfer – Part 2

In Part 1 of this series, we discussed the basic system calls required for communicating with a server.

In this blog post, our main goal is to understand how downloading data from S3 differs with and without the AWS SDK.

Experimental Setup:

  • i3.xlarge machine
  • Different Payload Sizes
    • 100KB
    • 500KB
    • 1MB
    • 5MB
    • 10MB
    • 50MB
  • Concurrency
    • 1 Thread
    • 10 Threads
  • Downloading Client
    • HttpClient
    • AWS SDK
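
A setup like the one above could be driven by a small timing harness. The following is only a sketch in Python under stated assumptions, not the code used for these measurements: `run_benchmark`, `download` and `fake_download` are hypothetical names, and the real HttpClient or AWS SDK call would be plugged in as the `download` callable. It starts the requested number of threads, has each perform one download, and reports the wall-clock time in milliseconds.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_benchmark(download, concurrency):
    """Run `download` once per thread and return (elapsed_ms, results).

    `download` is a hypothetical zero-argument callable that fetches one
    payload and returns its bytes; the client under test goes here.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(download) for _ in range(concurrency)]
        results = [f.result() for f in futures]  # re-raises any download error
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return elapsed_ms, results


def fake_download(size_bytes=100 * 1024):
    """Stand-in for a real client call: returns a payload of the given size."""
    return b"\0" * size_bytes


if __name__ == "__main__":
    for concurrency in (1, 10):
        elapsed_ms, results = run_benchmark(fake_download, concurrency)
        total_bytes = sum(len(r) for r in results)
        print(f"{concurrency} thread(s): {elapsed_ms:.1f} ms, {total_bytes} bytes")
```

Passing the client call in as a callable keeps the timing logic identical for both clients, so any difference in the numbers comes from the client rather than from the harness.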

Metrics to Monitor:

  • Download Speed (Mb/s)
  • CPU Utilisation
  • Network Percentage Utilisation
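
The first and third metrics can be derived from a payload size and an elapsed time. The sketch below is illustrative only: the function names are hypothetical, 1 Mb is taken to be 1,000,000 bits, and the link capacity is a parameter you would have to supply for your own machine, since it is not stated here.

```python
def download_speed_mbps(payload_bytes, elapsed_ms):
    """Return throughput in megabits per second (1 Mb = 1,000,000 bits)."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    bits = payload_bytes * 8
    seconds = elapsed_ms / 1000.0
    return bits / seconds / 1_000_000


def network_utilisation_percent(speed_mbps, link_capacity_mbps):
    """Return the share of the link capacity in use, as a percentage.

    `link_capacity_mbps` is an assumption supplied by the caller.
    """
    if link_capacity_mbps <= 0:
        raise ValueError("link_capacity_mbps must be positive")
    return speed_mbps / link_capacity_mbps * 100.0


if __name__ == "__main__":
    # 1,000,000 bytes in 1000 ms is 8 Mb/s.
    speed = download_speed_mbps(1_000_000, 1000)
    print(f"{speed:.1f} Mb/s")
    # On a hypothetical 1000 Mb/s link that is 0.8% utilisation.
    print(f"{network_utilisation_percent(speed, 1000.0):.1f} %")
```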

Experiment Results:

Download Time (in ms)

  • 100 KB Payload

    Concurrency   HttpClient   AWS SDK
    10 Threads    721 ms       615 ms
    1 Thread      734 ms       611 ms

  • 500 KB Payload

    Concurrency   HttpClient   AWS SDK
    10 Threads    1116 ms      970 ms
    1 Thread      1152 ms      1023 ms

  • 1 MB Payload

    Concurrency   HttpClient   AWS SDK
    10 Threads    1334 ms      1221 ms
    1 Thread      1362 ms      1268 ms

  • 5 MB Payload

    Concurrency   HttpClient   AWS SDK
    10 Threads    1932 ms      1788 ms
    1 Thread      1991 ms      1870 ms

  • 10 MB Payload

    Concurrency   HttpClient   AWS SDK
    10 Threads    2294 ms      2156 ms
    1 Thread      2247 ms      2177 ms

  • 50 MB Payload

    Concurrency   HttpClient   AWS SDK
    10 Threads    4448 ms      4096 ms
    1 Thread      4223 ms      4100 ms

CPU Statistics

  • [Figure: CPU utilisation with 10 threads]
  • [Figure: CPU utilisation with 1 thread]

Network Statistics

  • [Figure: network utilisation with 10 threads]
  • [Figure: network utilisation with 1 thread]

Conclusions

  • AWS SDK outperforms HttpClient for every payload size and concurrency level tested, with download times roughly 3% to 17% lower.
  • CPU utilisation with AWS SDK and with HttpClient is comparable.
  • Network throughput with AWS SDK and with HttpClient is also comparable.
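
To put a number on the first conclusion, the relative saving can be computed directly from the download times reported above. This is a small worked calculation, not part of the original experiment; `TIMES_MS` and `improvement_percent` are names introduced here for illustration.

```python
# Download times in ms, copied from the result tables:
# (payload, threads) -> (HttpClient, AWS SDK)
TIMES_MS = {
    ("100 KB", 10): (721, 615),
    ("100 KB", 1): (734, 611),
    ("500 KB", 10): (1116, 970),
    ("500 KB", 1): (1152, 1023),
    ("1 MB", 10): (1334, 1221),
    ("1 MB", 1): (1362, 1268),
    ("5 MB", 10): (1932, 1788),
    ("5 MB", 1): (1991, 1870),
    ("10 MB", 10): (2294, 2156),
    ("10 MB", 1): (2247, 2177),
    ("50 MB", 10): (4448, 4096),
    ("50 MB", 1): (4223, 4100),
}


def improvement_percent(http_ms, sdk_ms):
    """Return how much lower the AWS SDK time is, relative to HttpClient."""
    return (http_ms - sdk_ms) / http_ms * 100.0


if __name__ == "__main__":
    for (payload, threads), (http_ms, sdk_ms) in TIMES_MS.items():
        saving = improvement_percent(http_ms, sdk_ms)
        print(f"{payload:>6}, {threads:>2} thread(s): {saving:.1f}% faster")
```

The saving is largest for the smallest payload and shrinks as the payload grows, which suggests the difference is mostly a fixed per-request cost rather than a difference in sustained throughput. That reading is an inference from the numbers, not something measured here.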


In the next blog post, we will go into the details of these downloads and see whether we can improve the download time.
