Using the IIS 7 URL Rewrite Module to block crawlers
Jun 5th
Here’s an easy way to block the main web crawlers – Google, Bing and Yahoo – from indexing any site across an entire server. This is really useful if you push all your beta builds to a public-facing server, but don’t want them indexed yet by the search engines.
1. Install the IIS URL Rewrite Module.
2. At the server level, add a request blocking rule. Block user-agent headers matching the regex: googlebot|msnbot|slurp.
Or, just paste this rule into “C:\Windows\System32\inetsrv\config\applicationHost.config”
<system.webServer>
<rewrite>
<globalRules>
<rule name="RequestBlockingRule1" stopProcessing="true">
<match url=".*" />
<conditions>
<add input="{HTTP_USER_AGENT}" pattern="googlebot|msnbot|slurp" />
</conditions>
<action type="CustomResponse" statusCode="403"
statusReason="Forbidden: Access is denied."
statusDescription="You do not have permission to view this page." />
</rule>
</globalRules>
</rewrite>
</system.webServer>
This’ll block Google, Bing and Yahoo from indexing any site published on the server. To test it out, try the Firefox User Agent Switcher.
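If you'd rather sanity-check the pattern before touching the server config, here is a minimal C# sketch that runs the same regex against a few user-agent strings. The class name CrawlerPattern and the sample strings are illustrative assumptions, and RegexOptions.IgnoreCase is there to mirror the case-insensitive matching that, as far as I know, URL Rewrite conditions use by default.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CrawlerPattern
{
    // Same pattern as the rewrite rule's condition.
    private static readonly Regex Blocked =
        new Regex("googlebot|msnbot|slurp", RegexOptions.IgnoreCase);

    // True when the user-agent string would be caught by the rule.
    public static bool IsBlocked(string userAgent)
    {
        return userAgent != null && Blocked.IsMatch(userAgent);
    }

    public static void Main()
    {
        // Illustrative user-agent strings, not an exhaustive list.
        string[] samples =
        {
            "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
            "msnbot/2.0b (+http://search.msn.com/msnbot.htm)",
            "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)",
            "Mozilla/5.0 (Windows NT 6.1; rv:2.0) Gecko/20100101 Firefox/4.0"
        };

        foreach (var ua in samples)
        {
            Console.WriteLine("{0}: {1}", IsBlocked(ua) ? "blocked" : "allowed", ua);
        }
    }
}
```

Any crawler whose user agent doesn't contain one of the three tokens slips through, so it's worth checking the pattern against the user agents that actually show up in your logs.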
Counting blank lines in a large text file with C# and .NET
May 16th
There are a few different ways in .NET to count the number of blank lines in a text file, but which is the most efficient?
To test some different approaches, I’ve written this function to create a huge text file with some empty lines. To verify that my counting functions give correct results, I need this function to create a file with a specific number of blank lines.
private static void BuildFileWithBlankLines(string path, uint blanks)
{
var random = new Random();
var counter = 0;
using (var sw = File.CreateText(path))
{
while (counter < blanks)
{
var isBlank = (random.Next(100) == 0); // ~1% of lines blank
sw.WriteLine(isBlank ? "" : "NOT BLANK");
if (isBlank) { counter++; }
}
}
}
For my test, I’ve used 250,000 empty lines, which outputs a file just over 250 MB. This should be big enough to expose the differences between counting methods.
1. Regular expression matching in memory
var content = File.ReadAllText(path);
var re = new Regex(@"^\r?\n", RegexOptions.Multiline);
var count = re.Matches(content).Count;
This technique loads the whole text file into a string variable using File.ReadAllText, which is really just a wrapper for StreamReader.ReadToEnd. It loads the character bytes from a file stream into a StringBuilder. The load is relatively quick, taking about 8% of the total processing time. Once the file is loaded into memory, a regular expression finds and counts all the blank lines. This is, not surprisingly, glacially slow. Total duration was 16,531 ms.
2. Counting line by line through a string array
var count = File.ReadAllLines(path).Count(s => s.Length == 0);
I like this approach because it’s simple and readable. The first step is similar to the above, but this time uses a StreamReader (via File.ReadAllLines) to load the file into a string array, rather than into a single string. Enumerable.Count then loops through the array and tests each line against a predicate that checks for empty strings. This takes less than half the time of the regex approach, only 7,006 ms.
3. Streaming through the file
uint count = 0;
using (var sr = File.OpenText(path))
{
string line;
while ((line = sr.ReadLine()) != null)
{
if (line.Length == 0) { count++; }
}
}
Unlike the first two, this last approach doesn’t load the entire file into memory. It uses File.OpenText to create a new StreamReader, then reads each line from the FileStream, counting blank lines as it goes along. This is blazing fast compared to the other two, taking just 2,108 ms.
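If you like the one-liner from approach 2 but want the streaming behaviour of approach 3, File.ReadLines (available from .NET 4) might give you both. It returns a lazy sequence rather than an array, so lines are read as they're enumerated. This is a sketch only: the BlankLineCounter name is made up for illustration, and this variant wasn't part of the timings above.

```csharp
using System;
using System.IO;
using System.Linq;

public static class BlankLineCounter
{
    // Counts empty lines without materialising the whole file:
    // File.ReadLines yields one line at a time as Count enumerates it.
    public static int CountBlankLines(string path)
    {
        return File.ReadLines(path).Count(line => line.Length == 0);
    }
}
```

Call it as BlankLineCounter.CountBlankLines(path). Note that Count returns an int, so for files with more than int.MaxValue blank lines you'd want LongCount instead.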
Conclusion
There’s almost an order of magnitude difference between the best and worst methods. So if you’re working with large files, try not to load them into memory first, but stream directly from disk instead.
This applies especially to parsing large data files before loading them into a database. In some cases, it’s not just slow to load into memory first, but impossible.
For example, a 500 GB XML file wouldn’t load easily into an XDocument, but could be streamed with an XmlReader and parsed with XNode.ReadFrom. Similarly, a huge CSV file couldn’t be loaded easily into a string then String.Split all in one go. Instead, you’d probably parse it line by line with a StreamReader, or use a streaming CsvReader library.
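To make the XML case concrete, here's a minimal sketch of the XmlReader plus XNode.ReadFrom combination. The XmlStreaming class, the StreamElements method and the element name you pass in are all hypothetical. The idea is to walk the document with the reader and only materialise one matching element at a time as an XElement.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class XmlStreaming
{
    // Lazily yields each element named elementName, one at a time,
    // so only the current element is held in memory as an XElement.
    public static IEnumerable<XElement> StreamElements(TextReader input, string elementName)
    {
        using (var reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                {
                    // ReadFrom consumes the whole element and leaves the reader
                    // on the node that follows it, so don't call Read() here.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }
}
```

For a file on disk you'd pass in File.OpenText(path) and loop over the results with foreach, doing the database insert (or whatever) per element instead of per document.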
Customising Visual Studio debugger DataTips with DebuggerDisplay
Feb 18th
The DebuggerDisplay attribute lets you change how an object is displayed in the Visual Studio debugger. Take these two simple classes representing a Company and an Employee.
public class Company
{
public string Name { get; set; }
public string StockSymbol { get; set; }
public IEnumerable<Employee> Employees { get; set; }
public Company(string name) { Name = name; }
}
public class Employee
{
public string Name { get; set; }
public Employee(string name) { Name = name; }
}
Create and initialise a new Company object with some data.
var company = new Company("Microsoft")
{
StockSymbol = "MSFT",
Employees = new List<Employee>
{
new Employee("Anders Hejlsberg"),
new Employee("Steve Ballmer"),
new Employee("Scott Guthrie")
}
};
Now, if you break into the code and start a debug session, you can hover over the company object to reveal its debugger DataTip. By expanding the nodes you can drill down into the object’s data.

The default information shown in the DataTip, although useful, can be improved by decorating classes with the DebuggerDisplay attribute. For companies, it might be useful to display their Name and StockSymbol instead of the object type {Company}.
[DebuggerDisplay("{Name} - {StockSymbol}")]
public class Company
The same goes for the Employee object. We can show the employee name instead of the type {Employee}.
[DebuggerDisplay("{Name}")]
public class Employee
The result is a more useful and descriptive view of your Company objects.
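One refinement worth knowing about: string values in a DebuggerDisplay expression are shown wrapped in quotes by default, and the nq ("no quotes") format specifier removes them. Here's a small sketch using a hypothetical ListedCompany class, a cut-down stand-in for the Company class above.

```csharp
using System.Diagnostics;

// With ",nq" the DataTip should read: Microsoft (MSFT)
// Without it, each string value is displayed in double quotes.
[DebuggerDisplay("{Name,nq} ({StockSymbol,nq})")]
public class ListedCompany
{
    public string Name { get; set; }
    public string StockSymbol { get; set; }
}
```

The attribute only affects what the debugger shows. It doesn't change ToString or anything else at runtime.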

For more info about customising Visual Studio’s debugger DataTips, see this article from MSDN Magazine:
Fibonacci numbers iterator with C# yield statements
Feb 15th
Here’s a C# iterator for generating the sequence of Fibonacci numbers. It uses the yield return statement to pass back each number in turn. By default, the arithmetic addition won’t throw an exception when previous + current is greater than UInt64.MaxValue, so I’m using a checked expression to enable overflow checking.
public static IEnumerable<ulong> FibonacciNumbers()
{
yield return 0;
yield return 1;
ulong previous = 0, current = 1;
while (true)
{
ulong next = checked(previous + current);
yield return next;
previous = current;
current = next;
}
}
We can use the iterator with a foreach loop until it throws an OverflowException.
try
{
foreach (ulong i in FibonacciNumbers())
{
Console.WriteLine("{0:0,0}", i);
}
}
catch (OverflowException)
{
Console.WriteLine("Sorry, the next number is too big.");
}
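Because the iterator is lazy, you don't have to run it until it overflows. As a sketch, here's the same iterator wrapped in a hypothetical FibonacciDemo class and combined with LINQ's Take to pull just the first ten numbers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FibonacciDemo
{
    // Same iterator as above, repeated so this example stands alone.
    public static IEnumerable<ulong> FibonacciNumbers()
    {
        yield return 0;
        yield return 1;
        ulong previous = 0, current = 1;
        while (true)
        {
            ulong next = checked(previous + current);
            yield return next;
            previous = current;
            current = next;
        }
    }

    public static void Main()
    {
        // Take(10) stops enumerating after ten values, so the
        // overflow check is never reached.
        Console.WriteLine(string.Join(", ", FibonacciNumbers().Take(10)));
        // prints: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34
    }
}
```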
Building a string of repeated characters in C#
Feb 14th
Here are as many different techniques as I could think of for making a string of repeated characters in C#.
char ch = 'A';
int count = 100;
// clean and concise, using the string constructor overload
var spaces1 = new string(ch, count);
// using Enumerable.Repeat to build a sequence of repeated characters,
// then folding them into a string
var spaces2 = Enumerable.Repeat(ch, count).Aggregate("", (s,c) => s+c);
// brute force of looping into a StringBuilder
var sb = new StringBuilder(count);
for (int i = 0; i < count; i++) { sb.Append(ch); }
var spaces3 = sb.ToString();
// building an array of strings, then joining without a delimiter
var spacesArray = Enumerable.Repeat(ch.ToString(), count).ToArray();
var spaces4 = string.Join(string.Empty, spacesArray);
// dirty hack with StringBuilder.Insert
var spaces5 = new StringBuilder().Insert(0, ch.ToString(), count).ToString();
// succinct hack with String.PadRight
var spaces6 = string.Empty.PadRight(count, ch);
// using ArrayList.Repeat to build the array of repeated strings
// then smushing with String.Concat
var arrayList = ArrayList.Repeat(ch, count).ToArray();
var spaces7 = string.Concat(arrayList);
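The string constructor and PadRight tricks only work for a single character. If you need to repeat a whole string, a couple of the techniques above carry over directly. This is a sketch with a hypothetical RepeatString helper class, assuming .NET 4 or later for the string.Concat overload that takes an IEnumerable<string>.

```csharp
using System;
using System.Linq;
using System.Text;

public static class RepeatString
{
    // Enumerable.Repeat + string.Concat: short, and no delimiter to worry about.
    public static string Repeat(string value, int count)
    {
        return string.Concat(Enumerable.Repeat(value, count));
    }

    // The StringBuilder loop, pre-sized to the final length.
    public static string RepeatWithBuilder(string value, int count)
    {
        var sb = new StringBuilder(value.Length * count);
        for (int i = 0; i < count; i++) { sb.Append(value); }
        return sb.ToString();
    }
}
```

Both RepeatString.Repeat("ab", 3) and RepeatString.RepeatWithBuilder("ab", 3) return "ababab".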
Examples of C# Object Initializers
Feb 13th
A selection of the various C# object initialization syntaxes:
// explicit shorthand array, C# 1.0 style
int[] numbersIntArray1 = { 4, 8, 15, 16, 23, 42 };
// two dimensional 3x2 array
int[,] numbersInt3x2Array = { { 4, 8 }, { 15, 16 }, { 23, 42 } };
// implicitly typed array
var numbersIntArray2 = new[] { 4, 8, 15, 16, 23, 42 };
// typed array with an implicitly typed variable
var numbersDoubleArray = new double[] { 4, 8, 15, 16, 23, 42 };
// generic List<>
var numbersList = new List<int> { 4, 8, 15, 16, 23, 42 };
// generic HashSet<>
var numbersHashSet = new HashSet<int> { 4, 8, 15, 16, 23, 42 };
// generic Dictionary
var dictionary = new Dictionary<int, string>
{
{ 4, "four" },
{ 8, "eight" },
{ 15, "fifteen" },
{ 16, "sixteen" },
{ 23, "twenty three" },
{ 42, "forty two" }
};
// implicitly typed array of System.Uri objects
var searchEngines = new[]
{
new Uri("http://www.google.com/"),
new Uri("http://www.bing.com/"),
new Uri("http://www.yahoo.com/")
};
// implicitly typed array of an anonymous type
var complexNumbers = new[]
{
new { Real = 4, Imaginary = 7 },
new { Real = -10, Imaginary = 3 },
};
// explicitly typed array of System.Func<> delegates, initialised with lambdas
var operations = new Func<int, int>[]
{
(x => x * x),
(x => x + x)
};
// nested object initializers
var smtpClient = new SmtpClient
{
Host = "mail.example.com",
Port = 25,
Credentials = new NetworkCredential
{
UserName = "MailScheduler",
Password = @"%yI2Ei5GL0"
}
};
// implicitly typed jagged array of strings
var jaggedArray = new[]
{
new[] {"Jack", "Sayid", "Hurley", "Miles"},
new[] {"Sawyer", "Kate"}
};
// explicitly typed array of objects implementing IEnumerable<int>
var listOfLists = new IEnumerable<int>[]
{
new List<int> {4, 8, 15, 16, 23, 42},
new Collection<int> { 4, 8, 15, 16, 23, 42 },
new int[] {4, 8, 15, 16, 23, 42 }
};
Snack size C#: null coalescing operator
Feb 13th
You need to assign the value of an expression to a variable, unless that expression evaluates to null, in which case default to something else. And if that something else is also null, then fall back to another value, and so on.
For example, take a web page that uses a theme setting to control look-and-feel. If the user has selected their own theme then use that; otherwise use the site’s default theme; or, if that’s not defined, use the global theme; or finally if no global theme is defined then fall back to a system default.
The long, readable version
Using good old-fashioned conditional logic.
string theme;
if (userTheme != null)
{
theme = userTheme;
}
else if (siteTheme != null)
{
theme = siteTheme;
}
else if (globalTheme != null)
{
theme = globalTheme;
}
else
{
theme = defaultTheme;
}
The shorter, slightly befuddling version
By nesting the ternary conditional operator in a most disagreeable way.
string theme =
userTheme != null ? userTheme
: siteTheme != null ? siteTheme
: globalTheme != null ? globalTheme
: defaultTheme;
The deluxe C# 3.0 version
With the overkill of an implicitly-typed array, an IEnumerable extension method, and a lambda predicate.
string theme = new [] {
userTheme, siteTheme, globalTheme, defaultTheme
}.First(t => t != null);
The elegant coalescence version
With C# 2.0’s null-coalescing operator, doing what it does best.
string theme = userTheme ?? siteTheme ?? globalTheme ?? defaultTheme;
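To see the chain in action, here's a small runnable sketch. The ThemeResolver class and the theme names are made up for illustration. The operator evaluates left to right and returns the first operand that isn't null.

```csharp
using System;

public static class ThemeResolver
{
    // Returns the first non-null theme, checking from most to least specific.
    public static string Resolve(string userTheme, string siteTheme,
                                 string globalTheme, string defaultTheme)
    {
        return userTheme ?? siteTheme ?? globalTheme ?? defaultTheme;
    }

    public static void Main()
    {
        Console.WriteLine(Resolve("Dark", "Corporate", "Plain", "System"));  // prints: Dark
        Console.WriteLine(Resolve(null, "Corporate", "Plain", "System"));    // prints: Corporate
        Console.WriteLine(Resolve(null, null, null, "System"));              // prints: System
    }
}
```

One thing to watch: ?? only tests for null, so an empty string counts as a value and would be picked. If empty theme names are possible, you'd need to normalise them to null first.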
Free Microsoft certification beta exams coming soon
Feb 11th
New beta Microsoft certification exams are available free between March 31, 2010 and April 20, 2010.
- 70-515 TS: Web Applications Development with Microsoft® .NET Framework 4
- 70-516 TS: Accessing Data with Microsoft® .NET Framework 4
- 70-513 TS: Windows Communication Foundation Development with Microsoft® .NET Framework 4
- 70-519 Pro: Designing and Developing Web Applications using Microsoft® .NET Framework 4
More info on Gerry O’Brien’s blog here: Free Certification Exams
TableHeaderScope.Column and HTML standards compliance
Feb 5th
Neil spotted this quirky behaviour in the TableHeaderCell control. Suppose you’ve got a Table web control that you expect to render a <table> tag with a <th scope="col">, using code something like this:
Table table = new Table()
{
Rows =
{
new TableRow()
{
Cells =
{
new TableHeaderCell()
{ Scope = TableHeaderScope.Column, Text = "MyColumn" }
}
}
}
};
Then the rendered HTML won’t be W3C compliant because a scope value of “column” isn’t valid:
<table border="0">
<tr>
<th scope="column">MyColumn</th>
</tr>
</table>
According to the W3C HTML and XHTML specs, the acceptable values for scope are [row|col|rowgroup|colgroup]. This is confirmed by the W3C Markup Validation Service, which reports the error:

Stepping into the source code for the TableHeaderCell control with Reflector or the .NET Reference Source, you can see the root of the problem:
protected override void AddAttributesToRender(HtmlTextWriter writer)
{
TableHeaderScope scope = Scope;
if (scope != TableHeaderScope.NotSet)
{
writer.AddAttribute(HtmlTextWriterAttribute.Scope,
scope.ToString().ToLowerInvariant());
}
}
When the control sets the scope attribute’s value, it uses a lowercase string representation of the TableHeaderScope.Column enumeration, i.e. “column”. An enum value of TableHeaderScope.Col would’ve been ok, but TableHeaderScope.Column is invalid.
Workarounds
Neil has a good workaround by adding the scope attribute manually instead of using the TableHeaderCell.Scope property:
TableHeaderCell th = new TableHeaderCell();
th.Attributes["scope"] = "col";
I thought I’d have a go at extending this fix into a simple control adapter that makes sure TableHeaderCell always renders itself correctly. It works by checking whether the Scope property is set to TableHeaderScope.Column, and if so manually adds the attribute scope="col" instead. Here’s the code:
public class TableHeaderCellAdapter : WebControlAdapter
{
protected override void RenderBeginTag(HtmlTextWriter writer)
{
TableHeaderCell th = (TableHeaderCell) this.Control;
if (th.Scope == TableHeaderScope.Column)
{
th.Scope = TableHeaderScope.NotSet;
th.Attributes["scope"] = "col";
}
base.RenderBeginTag(writer);
}
}
To associate the control adapter class with the TableHeaderCell control, add a browser definition file to the App_Browsers folder:
<!-- file: ~/App_Browsers/ControlAdapters.browser -->
<browsers>
<browser refID="Default">
<controlAdapters>
<adapter
controlType="System.Web.UI.WebControls.TableHeaderCell"
adapterType="TableHeaderCellAdapter" />
</controlAdapters>
</browser>
</browsers>
Configuring IIS7 for ASP.NET on Windows Vista
Feb 5th
As part of Microsoft’s ongoing Trustworthy Computing initiative, Vista takes the secure by default principle more seriously than previous versions of Windows. Consequently, many features aren’t installed by default, one of which is IIS 7. So, to install IIS go to:
Control Panel – Programs – Turn Windows features on or off
Then find the Internet Information Services section and add everything you need, including ASP.NET.
