Standard regular expression usage in .NET is the interpreted variety. In my experience, "interpreted" typically means less-than-optimal performance. The purpose of this post is to describe how to precompile or inline your regular expressions for better performance. I'm not going to make the argument for or against regular expressions vs. handwritten parsing code; you can search for those answers elsewhere.
It probably doesn't make much sense to worry about this when your regular expressions are used only infrequently. However, if you have expressions that are executed frequently, such as inside a loop, you may find that interpreted regular expressions drag you down a bit.
Pros and Cons
- Interpreted Regular Expressions start quickly but will not perform optimally at runtime.
- Precompiled Regular Expressions require a little more start time but provide improved runtime performance. If you have an expression that is executed frequently, you may want to consider this option. I've read that precompiled regular expressions can perform roughly 30% better than the interpreted variety.
- Inline Regular Expressions are compiled into an entirely new assembly, which provides the benefit of improved start time as well as the benefits of precompiled expressions. Since a new assembly is created, it can be reused across applications, which is always, IMO, a good thing.
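If you want to check the trade-off on your own machine rather than take the 30% figure on faith, a rough timing sketch like the one below can help. The class name, sample input, and iteration count are my own assumptions for illustration, and a Stopwatch loop is only a crude measurement, not a rigorous benchmark.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class RegexTimingSketch
{
    // Same email pattern used throughout this post.
    private const string Pattern = @"^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$";

    // Times 'iterations' calls to IsMatch on an already-constructed Regex.
    public static TimeSpan Time(Regex regex, string input, int iterations)
    {
        regex.IsMatch(input); // warm-up call so one-time setup is not measured

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            regex.IsMatch(input);
        }
        watch.Stop();
        return watch.Elapsed;
    }

    public static void Main()
    {
        const string sample = "someone@example.com"; // hypothetical test input
        const int iterations = 1000000;              // arbitrary loop count

        Regex interpreted = new Regex(Pattern);
        Regex compiled = new Regex(Pattern, RegexOptions.Compiled);

        Console.WriteLine("Interpreted: " + Time(interpreted, sample, iterations));
        Console.WriteLine("Compiled:    " + Time(compiled, sample, iterations));
    }
}
```

The numbers will vary with the pattern, the input, and the runtime version, so treat the output as a hint rather than a verdict.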
The following three examples show each approach in turn (interpreted, precompiled, then inline), using a simple regular expression that checks for a valid email address.
public bool IsValidEmailAddress(string email)
{
    Regex regex = new Regex(@"^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$");
    return regex.IsMatch(email);
}
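// Build the compiled Regex once and reuse it; constructing it inside the
// method would repeat the compilation cost on every call.
private static readonly Regex EmailRegex =
    new Regex(@"^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$", RegexOptions.Compiled);

public bool IsValidEmailAddress(string email)
{
    return EmailRegex.IsMatch(email);
}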
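Since the compiled option only pays off when the same expression runs many times, here is a small sketch of that scenario: one shared compiled Regex validating a batch of addresses in a loop. The class name, method name, and sample addresses are hypothetical and only there for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmailBatchSketch
{
    // One compiled instance shared by every call.
    private static readonly Regex EmailRegex =
        new Regex(@"^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$", RegexOptions.Compiled);

    // Counts how many of the candidate strings match the email pattern.
    public static int CountValid(string[] candidates)
    {
        int valid = 0;
        foreach (string candidate in candidates)
        {
            if (EmailRegex.IsMatch(candidate))
            {
                valid++;
            }
        }
        return valid;
    }

    public static void Main()
    {
        // Hypothetical sample data.
        string[] candidates =
        {
            "someone@example.com",
            "not-an-email",
            "first.last@mail.example.org"
        };

        Console.WriteLine(CountValid(candidates)); // prints 2
    }
}
```

The one-time compilation cost is paid when the static field is initialized, and every later IsMatch call reuses the compiled matcher.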
OK, the inline version is a little, but not much, more complicated. The goal here is to create an assembly with your expressions built into it. For my email validation expression we can do this:
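// Requires: using System.Reflection; using System.Text.RegularExpressions;
// Arguments: pattern, options, generated class name, namespace, make the class public.
RegexCompilationInfo regexInfo = new RegexCompilationInfo(@"^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$", RegexOptions.None, "EmailExpression", "MyRegEx", true);
Regex.CompileToAssembly(new RegexCompilationInfo[] { regexInfo }, new AssemblyName("MyRegEx"));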
This will create an assembly named "MyRegEx.dll" in the working directory of the program that runs this code, which is typically your Bin directory (see image below). Then, in the application that will use the expression, add a reference to your new regular expression assembly.
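For completeness, here is how those two lines might sit inside a small console program that you run once to generate the assembly. The class and method names are my own assumptions. Also note that Regex.CompileToAssembly is supported on .NET Framework only; on .NET Core and later it throws PlatformNotSupportedException.

```csharp
using System;
using System.Reflection;
using System.Text.RegularExpressions;

public static class BuildRegexAssembly
{
    // Describes the class to generate: MyRegEx.EmailExpression, public.
    public static RegexCompilationInfo CreateEmailInfo()
    {
        return new RegexCompilationInfo(
            @"^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$", // pattern
            RegexOptions.None,                   // options
            "EmailExpression",                   // generated class name
            "MyRegEx",                           // namespace of the generated class
            true);                               // make the class public
    }

    public static void Main()
    {
        RegexCompilationInfo[] infos = { CreateEmailInfo() };
        AssemblyName assemblyName = new AssemblyName("MyRegEx");

        // .NET Framework only: writes MyRegEx.dll to disk.
        Regex.CompileToAssembly(infos, assemblyName);

        // The file is expected in the current working directory.
        Console.WriteLine("Working directory: " + Environment.CurrentDirectory);
    }
}
```

More expressions can go into the same assembly by adding further RegexCompilationInfo entries to the array.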
Making use of the new assembly is quite easy:
private bool IsValidEmailAddress(string email)
{
    MyRegEx.EmailExpression ee = new MyRegEx.EmailExpression();
    return ee.IsMatch(email);
}
That's it. Pretty simple, eh?