Showing posts with label performance. Show all posts
Showing posts with label performance. Show all posts

Monday, January 17, 2011

What Have You Done for Me, Latency?

When developing web applications, it's not uncommon to develop and deploy the application on different network topoligies. During development, for example, you might be running a database server, web server and web browser all on localhost. Your application is running well and the performance is snappy. But user experience starts to lag when the application is deployed to production where web server and database reside on different boxes. What's to blame? Latency.

But fear not, intrepid programmer! Here's a simple command line hack that you can use to make network performance on localhost more closely resemble WAN or LAN performance.

#!/bin/bash

# This script adds a tc rule to the loopback device to 
# help developers approximate WAN performance on their 
# loopback device.

# add a delay rule to the loopback device
sudo tc qdisc replace dev lo root handle 1:0 netem delay 10msec 5msec 10%

# list current state of the loopback device
echo 'Loopback device is now configured to simulate a high latency network route.'
echo ''
echo "Current dev lo rules ... `tc qdisc show dev lo`"
echo ''
echo ''

# demonstrate delay by pinging localhost
ping -c 3 localhost

I run the above commands on login so that overly chatty code doesn't sneak past.

Sunday, August 1, 2010

Beware a slow death by regular expressions!

Overview & Motivation

The Java programming language provides great support for regular expressions. In general, the performance of regular expressions in Java is very good. The trade-off for that excellent performance is a (somewhat) protracted API. To see what I mean, have a a look at a canonical regular expression example from Sun's javadoc:

Pattern p = Pattern.compile("a*b");
Matcher m = p.matcher("aaaaab");
boolean b = m.matches();

Kind of verbose, considering all we've accomplished is checking a single CharacterSequence for a match against one regular expression! Such complaints about decreased simplicity are likely what motivated Sun to provide some regular expression "convenience" methods inside the API for the String class. Specifically, as of version 6, the String API provides five such convenience methods:

  • public boolean matches(String regex);
  • public String replaceFirst(String regex, String replacement);
  • public String replaceAll(String regex, String replacement);
  • public String[] split(String regex);
  • public String[] split(String regex, int limit);

Each of the above convenience methods constructs and delegates the method call to Pattern and Matcher objects. What's not immediately obvious from the documentation for those convenience methods is that they become extremely inefficient as you execute them many thousands of times.

Reg. Exp. API Timing Data

Consider the following bit of code that illuminates the inefficiency of the String.matches(regex) method:

public class RegExpExerciser
{
 public static void main(String[] args)
 {

  int numTimes = 10000000;

  String regexp = "a*b";
  String toMatch = "aaaaab";

  long begin1 = System.currentTimeMillis();
  for (int i = 0; i < numTimes; i++)
  {
   toMatch.matches(regexp);
  }
  System.out.println(
   "Milliseconds using convenience method from String: " 
   + (System.currentTimeMillis() - begin1)
  );

  long begin2 = System.currentTimeMillis();
  Pattern p = Pattern.compile(regexp);
  for (int i = 0; i < numTimes; i++)
  {
   p.matcher(toMatch).matches();
  }
  System.out.println(
   "Milliseconds when re-using compiled reg. exp. pattern: " 
   + (System.currentTimeMillis() - begin2)
  );

 }
}

Executing the above code, my box reveals these times:

Milliseconds using convenience method from String: 5074
Milliseconds when re-using compiled reg. exp. pattern: 1720

RegExpHelper.java to the Rescue!

The code sample above clearly demonstrates that the time to compile a regular expression Pattern dominates the time required to actually check whether a given string matches the pattern. Utilizing that key insight, I offer a helper class in the Big-Oh Software Common library that provides simple APIs for working with regular expressions without sacrificing performance: RegExpHelper.java (javadoc). The key feature to note about that class is that it caches compiled regular expressions, making its methods both simple and efficient.