Today’s blog focuses on the problem of latency when hosting applications or services on a cloud provider’s network such as AWS.

Have you ever noticed the awkward delays on international phone calls? It’s mostly network latency. This is the time it takes for your voice to travel through the air to the phone, be converted to an electrical signal, transmitted to the other phone, converted to sound and transmitted through the air to the other person’s ear.

[stextbox id="info"]
As a curious aside, there is a “natural” latency in face-to-face communication. When you talk to someone two metres away, it takes about 6 milliseconds for the sound to reach them. A local phone call can actually have less latency than a face-to-face conversation, because the sound travels only a short distance through the air to the handset and then travels mostly as an electrical signal at close to the speed of light.
[/stextbox]

Network latency in computing is primarily caused by the finite speed of light: packets can only travel so fast. Other network effects, such as the number of router hops and network device utilisation, also contribute.
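To get a feel for how hard this floor is, here is a back-of-the-envelope sketch in Python. The distances and the fibre speed of roughly 200,000 km/s (about two-thirds of c, due to the fibre’s refractive index) are illustrative assumptions, not measured figures, and real routes are longer than great-circle distance:

```python
# Best-case propagation delay over optical fibre.
# Assumed signal speed in fibre: ~200,000 km/s.
SPEED_IN_FIBRE_KM_S = 200_000

def propagation_delay_ms(distance_km: float) -> float:
    """One-way best-case delay in milliseconds over a straight fibre run."""
    return distance_km / SPEED_IN_FIBRE_KM_S * 1000

# Illustrative (approximate) distances:
for route, km in [("Singapore to Los Angeles", 14_100),
                  ("London to New York", 5_600)]:
    one_way = propagation_delay_ms(km)
    print(f"{route}: ~{one_way:.1f} ms one-way, ~{2 * one_way:.1f} ms round trip")
```

No amount of hardware spend beats these numbers; they are a property of physics and geography.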

As distributed applications evolve and essentially become an amalgamation of geographically dispersed services, the limits imposed by network latency will become more apparent.

One can imagine in the not-too-distant future an IT manager yelling at a subordinate to “throw bandwidth” or “new EC2 instances” at a poorly performing application and not understanding the real latency problem.

[stextbox id="info" caption="Latency and Bandwidth"]
If we imagine a jetliner flying from Singapore to Los Angeles, latency is analogous to the flight time. Once a network path is established, latency is essentially a fixed constraint. Bandwidth can be thought of as the number of passengers on board: adding passengers means more “packets” arrive per jetliner, but the flight still takes the same amount of time.
[/stextbox]
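The jetliner analogy can be put into numbers. A minimal sketch, assuming a 200 ms round trip and deliberately ignoring TCP slow start and protocol overhead:

```python
def transfer_time_s(payload_mb: float, bandwidth_mbps: float, rtt_ms: float) -> float:
    """Rough time to fetch a payload: one round trip plus serialisation time.
    Ignores TCP slow start and protocol overhead -- a deliberate simplification."""
    return rtt_ms / 1000 + (payload_mb * 8) / bandwidth_mbps

# Doubling bandwidth halves the serialisation component only;
# the 200 ms round trip is a fixed floor you cannot buy your way past.
for bw in (100, 200, 400):
    print(f"{bw} Mbps: {transfer_time_s(10, bw, 200):.2f} s")
```

For small payloads the round trip dominates entirely, which is why “throwing bandwidth” at a latency problem achieves so little.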

Latency is typically measured as round-trip latency: the time a packet takes to go from source to destination and back again. Round-trip latency excludes the time the destination system spends processing the packet. You typically need to compare network latency against application response times to work out whether the network or the application is the problem.
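One cheap way to sample round-trip latency when ICMP ping is unavailable is to time a TCP handshake: the three-way handshake completes in roughly one round trip and involves no application processing, so it isolates the network component. A rough sketch (the commented-out hostname is a placeholder; substitute your own endpoint):

```python
import socket
import time

def tcp_connect_rtt_ms(host: str, port: int = 443, timeout: float = 3.0) -> float:
    """Approximate round-trip latency by timing a TCP connect.
    The handshake takes about one round trip and no application work,
    so this is a rough proxy for network latency alone."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # close immediately; we only wanted the handshake
    return (time.perf_counter() - start) * 1000

# Example usage (placeholder hostname):
# print(f"{tcp_connect_rtt_ms('example.com'):.1f} ms")
```

Comparing this figure against end-to-end response times tells you how much of your delay is network and how much is application.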

One way around latency is to shorten the network path: place latency-sensitive parts of a system closer together. Consider also which network providers you use, as they have different network architectures and therefore different latencies to different locations. Look at caching strategies (e.g. CDNs, web optimisation) to essentially pre-load data at a location. Finally, reduce the network chattiness of your applications where possible.
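The chattiness point is worth quantifying, because over a long path the round trips dominate everything else. A toy model, assuming a 150 ms round trip (the figure is an assumption for illustration):

```python
def total_time_ms(requests: int, rtt_ms: float, per_request_ms: float = 0.0) -> float:
    """Total wall-clock time for sequential request/response exchanges."""
    return requests * (rtt_ms + per_request_ms)

RTT = 150  # assumed intercontinental round trip, in ms

chatty = total_time_ms(50, RTT)   # 50 small calls, one after another
batched = total_time_ms(1, RTT)   # the same data in one batched call
print(f"chatty: {chatty:.0f} ms, batched: {batched:.0f} ms")
```

Fifty sequential calls cost fifty round trips; batching them into one request collapses that to a single round trip, a far bigger win than any bandwidth upgrade.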

Latency-sensitive components of your system need to be considered up-front, in the planning phase. Once you’ve built your distributed application and locked in your provider agreements with AWS and other cloud providers, your latency constraints are locked in too. And once your environment is up and running, latency needs to be monitored continually for change and for its impact on your environment.
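For the monitoring side, a small sketch that summarises a window of round-trip samples. The sample values are made up; the point is to watch the tail (p95 and max) rather than the mean, since averages hide exactly the spikes that hurt users:

```python
import statistics

def summarise_latency(samples_ms: list[float]) -> dict[str, float]:
    """Summarise a window of round-trip samples; alert on the tail, not the mean."""
    ordered = sorted(samples_ms)
    p95_index = max(0, int(len(ordered) * 0.95) - 1)
    return {
        "median": statistics.median(ordered),
        "p95": ordered[p95_index],
        "max": ordered[-1],
    }

# Hypothetical samples collected once a minute by a probe (in ms).
# One spike (180.5) barely moves the median but shows up in p95/max.
samples = [42.0, 43.1, 41.8, 44.0, 180.5, 42.6, 43.3, 42.9, 41.7, 42.2]
print(summarise_latency(samples))
```

Feeding a summary like this into an alerting system catches the gradual drift that occurs when providers re-route traffic underneath you.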