Bugfree.dk – Ronnie Holm's blog

Not anti-anything, just pro-quality

Unit testing LINQ to SQL using TypeMock

Posted by Ronnie Holm on 4th May 2010

Recent months have brought about a proliferation of mocking frameworks that mock what more traditional frameworks like Rhino Mocks cannot. Instead of creating and loading a mock implementation at runtime, the new breed of mocking frameworks hooks into the CLR to intercept and redirect calls. This opens up virtually every aspect of a class to mocking, which is useful for testing code not written with explicit testability in mind. Until recently, TypeMock was the only mocking framework around that took the latter approach, but it’s now being challenged by Moles from Microsoft Research and JustMock from Telerik.

Why traditional dependency-breaking techniques fall short

After watching a screencast on how to use Moles to unit test LINQ to SQL without hitting the database, I thought it would be interesting to do the same with TypeMock. But first, let’s make sure we understand why traditional dependency-breaking techniques fall short in testing LINQ to SQL. Assuming we want to put a repository under test, our goal is to mock how it accesses the database. Here’s a simple implementation of a repository that queries the Employee table of the AdventureWorks database:

    public class EmployeeRepository {
        public List<Employee> GetEmployeesByHireDate(DateTime start, DateTime end) {
            using (var ctx = new AdventureWorksDataContext())
                return (from e in ctx.Employees
                        where e.HireDate >= start && e.HireDate <= end
                        select e).ToList();
        }
    }

All calls to the database are routed through the AdventureWorksDataContext generated by Visual Studio. To mock access to the database, we therefore have to mock part of the data context. Easier said than done, though, for the context doesn’t expose an interface that a fake can implement. In addition, the tables are accessed through properties on the context that return a type of Table<TEntity>. Unfortunately, the constructor of Table<TEntity> is internal and the class itself is sealed, eliminating the hope of instantiating or subclassing the type by traditional means:

    public sealed class Table<TEntity> : IQueryProvider,
            ITable, IListSource, ITable<TEntity>, IQueryable<TEntity>,
            IEnumerable<TEntity>, IQueryable, IEnumerable
            where TEntity : class {
        internal Table(DataContext context, MetaTable metaTable) {
            ...
        }
    }

For an example of how the data context itself creates an instance of Table<TEntity>, take a look at the Employees property on the AdventureWorksDataContext. It relies on the GetTable<Employee> method on the DataContext class to create an instance of Table<Employee>. Although the constructor of Table<TEntity> is internal, the GetTable<TEntity> method has no trouble constructing an instance, as both types reside in the System.Data.Linq assembly:

    public partial class AdventureWorksDataContext : DataContext {
        public Table<Employee> Employees {
            get {
                return GetTable<Employee>();
            }
        }
    }

How to break the unbreakable

The design of LINQ to SQL leaves us short of a traditional testing seam, as Michael Feathers would phrase it: a place at which we can alter the behavior of a program without editing in that place. This explains why, with LINQ to SQL, we’ve traditionally had to test against a real database with all its constraints, making our tests brittle, slow, and painful to write and maintain. With the new breed of mocking frameworks, the issues of not being able to subclass or call an internal constructor go away (and new issues take their place). Regardless, here’s how to write a unit test for the EmployeeRepository that doesn’t hit the database:

    [TestClass]
    public class EmployeeRepositoryTest {
        private EmployeeRepository _repository;

        [TestInitialize]
        public void Initialize() {
            _repository = new EmployeeRepository();

            var fakeEmployees = new List<Employee> {
                new Employee {EmployeeID = 1, HireDate = new DateTime(2004, 12, 1)},
                new Employee {EmployeeID = 2, HireDate = new DateTime(2006, 7, 1)},
                new Employee {EmployeeID = 3, HireDate = new DateTime(2009, 3, 1)}
            }.AsQueryable();

            var fakeDataContext = Isolate.Fake.Instance<AdventureWorksDataContext>();
            Isolate.Swap.NextInstance<AdventureWorksDataContext>().With(fakeDataContext);

            // var fakeEmployeeTable = Isolate.Fake.Instance<Table<Employee>>();
            // Isolate.WhenCalled(() => fakeDataContext.Employees).WillReturn(fakeEmployeeTable);
            // Isolate.WhenCalled(() => fakeEmployeeTable).WillReturnCollectionValuesOf(fakeEmployees);
            // or by transitivity
            Isolate.WhenCalled(() => fakeDataContext.Employees).WillReturnCollectionValuesOf(fakeEmployees);
        }

        [TestMethod]
        public void GetEmployeesByHireDate_should_return_hires_from_2008_until_present() {
            var employees = _repository.GetEmployeesByHireDate(new DateTime(2008, 1, 1), DateTime.Now);
            Assert.AreEqual(1, employees.Count());
            Assert.AreEqual(3, employees[0].EmployeeID);
        }
    }

The test method itself looks exactly as if we’d been testing against a real database. The difference lies in the Initialize method, where we set up the fake data context and database contents. We instruct TypeMock to return the fake context in place of the real one inside EmployeeRepository. And whenever someone calls the Employees property on the fake context, we have TypeMock intercept the call and return a fake collection of type IQueryable<Employee>. We could’ve returned an instance of Table<Employee>, which implements IQueryable<Employee>, but in this case returning the collection is simpler and sufficient. Had we had more methods on our repository, we likely would’ve added additional rows to the Employee table and populated more of its columns.
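For comparison, here’s a minimal sketch of the kind of traditional seam that the repository lacks. It assumes we were free to change the repository so that it receives its employees as an IQueryable<Employee> through the constructor. The SeamedEmployeeRepository and SeamDemo names and the stripped-down Employee class are hypothetical and not part of the original code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the generated LINQ to SQL entity (assumption: only these two columns matter here).
public class Employee {
    public int EmployeeID { get; set; }
    public DateTime HireDate { get; set; }
}

// Hypothetical variant of the repository with a constructor seam.
public class SeamedEmployeeRepository {
    private readonly IQueryable<Employee> _employees;

    public SeamedEmployeeRepository(IQueryable<Employee> employees) {
        _employees = employees;
    }

    public List<Employee> GetEmployeesByHireDate(DateTime start, DateTime end) {
        return (from e in _employees
                where e.HireDate >= start && e.HireDate <= end
                select e).ToList();
    }
}

public static class SeamDemo {
    // Feeds the repository an in-memory list; no interception framework is involved.
    public static List<Employee> HiresFrom2008Through2010() {
        var fakeEmployees = new List<Employee> {
            new Employee {EmployeeID = 1, HireDate = new DateTime(2004, 12, 1)},
            new Employee {EmployeeID = 2, HireDate = new DateTime(2006, 7, 1)},
            new Employee {EmployeeID = 3, HireDate = new DateTime(2009, 3, 1)}
        }.AsQueryable();
        var repository = new SeamedEmployeeRepository(fakeEmployees);
        return repository.GetEmployeesByHireDate(
            new DateTime(2008, 1, 1), new DateTime(2010, 12, 31));
    }

    public static void Main() {
        foreach (var e in HiresFrom2008Through2010())
            Console.WriteLine(e.EmployeeID);
    }
}
```

The price of such a seam is that someone else has to create the data context and keep it alive while the query runs, which is exactly the kind of redesign that interception lets us postpone.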


The given-expect testing pattern

Posted by Ronnie Holm on 25th April 2010

I was watching Brett Schuchert’s TDD screencast on implementing the shunting yard algorithm in C#. In it Brett builds up his tests in a style I hadn’t come across before: each test is expressed as a given-expect statement. The pattern is particularly useful in situations in which a class has a main method that accepts an open-ended number of dissimilar inputs.

I found the given-expect pattern useful in testing a piece of code that I was working on this week. I was refactoring and adding tests around an ASP.NET control adapter that makes SharePoint 2007 pages more XHTML compliant. I wanted to reuse the transformations outside the control adapter and hence ended up moving the transformation logic to a new class. It accepts possibly malformed HTML and relies on heuristics of the HTML Agility Pack to build a DOM off of it. I can then query the DOM, looking for known violations, and patch them before returning XHTML to the caller.

    public class HtmlToXHtmlTransformer {
        private readonly HtmlDocument _document;

        public HtmlToXHtmlTransformer(string html) {
            _document = new HtmlDocument();
            _document.DetectEncoding(new StringReader(html));
            _document.LoadHtml(html);
        }

        private void Transform(string xpath, Action<HtmlNode> nodeMatch) {
            var nodes = _document.DocumentNode.SelectNodes(xpath);
            if (nodes != null)
                foreach (var node in nodes)
                    nodeMatch.Invoke(node);
        }

        private void FixDuplicateBorderAttributeOnSPGridViewControl() {
            Transform("//table[count(@border)=2]", node => node.Attributes.Remove("border"));
        }

        public string Transform() {
            FixDuplicateBorderAttributeOnSPGridViewControl();
            _document.OptionWriteEmptyNodes = true;
            return _document.DocumentNode.WriteTo();
        }
    }

The complete HtmlToXHtmlTransformer collects a dozen transformations. Its Transform method is what we want to call with various HTML fragments to verify that they come out as XHTML. For this purpose, we might do the tests as Visual Studio data-driven tests that read their input and output from a text file. But in most cases I prefer traditional tests, so I can describe the purpose of a test with a descriptive method name and possibly a comment.

    [TestClass]
    public class HtmlToXHtmlTransformerTest {
        private string _result;

        [TestMethod]
        public void Must_selfclose_nodes_when_allowed() {
            Given("<br>");
            Expect("<br />");
        }

        [TestMethod]
        public void Must_remove_duplicate_border_on_SPGridView_control() {
            Given(@"<table border=""0"" border=""0""></table>");
            Expect(@"<table border=""0""></table>");
        }

        private void Expect(string xhtml) {
            Assert.AreEqual(xhtml, _result);
        }

        private void Given(string html) {
            var transformer = new HtmlToXHtmlTransformer(html);
            _result = transformer.Transform();
        }
    }

I particularly like the clarity of the given-expect pattern and find that for a reasonable number of tests it’s a viable alternative to data-driven tests. I do, however, recognize the value of data-driven tests in situations where a non-developer wants to test a class, though at the unit test level I’ve never experienced this. It’s more characteristic of FitNesse for acceptance testing. However you unit test, just make sure your tests run with a minimum of effort on your part and that they run fast.
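To make the comparison concrete, here’s a minimal sketch of the same two cases written as a table of given/expected pairs that runs in a loop. The Transform function is a deliberately trivial stand-in (an assumption for illustration) so the example runs without the HTML Agility Pack; it isn’t the transformer shown above, and the GivenExpectTable name is hypothetical:

```csharp
using System;
using System.Collections.Generic;

public static class GivenExpectTable {
    // Trivial stand-in for HtmlToXHtmlTransformer.Transform (assumption, for illustration only).
    public static string Transform(string html) {
        return html
            .Replace("<br>", "<br />")
            .Replace(@" border=""0"" border=""0""", @" border=""0""");
    }

    // Runs every given/expected pair and returns the names of the failing cases.
    public static List<string> RunAll() {
        var cases = new[] {
            new { Name = "self-close nodes when allowed",
                  Given = "<br>",
                  Expect = "<br />" },
            new { Name = "remove duplicate border attribute",
                  Given = @"<table border=""0"" border=""0""></table>",
                  Expect = @"<table border=""0""></table>" }
        };

        var failures = new List<string>();
        foreach (var c in cases) {
            if (Transform(c.Given) != c.Expect)
                failures.Add(c.Name);
        }
        return failures;
    }

    public static void Main() {
        Console.WriteLine("Failures: {0}", RunAll().Count);
    }
}
```

The table is more compact, but the purpose of each case is demoted from a method name to a string, and a test runner reports one result for the whole table rather than one per case.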


SharePoint list access using the Repository pattern

Posted by Ronnie Holm on 18th January 2010

(Download code here. See related post on SharePoint list definition using the Template pattern)

Maybe it’s that the SharePoint API is hard to use. Maybe it’s that coding against SharePoint is about making smaller additions here and there. Maybe it’s that the Patterns & Practices SharePoint Guidance isn’t widely known. Whatever the reason, developing for SharePoint requires the same attention to the separation of presentation, business, and data access code as any other kind of development. Hence, starting with data access, we may want to create a repository and route queries through it (the SharePoint Guidance outlines a more sophisticated implementation than the one below):

    [TestClass]
    public class EmployeesRepositoryTest {
        private SPSite _siteCollection;
        private SPWeb _site;
        private EmployeesRepository _repository;

        private readonly Employee _duffyDuck = new Employee {
            Id = 1000, Name = "Duffy Duck", HireDate = new DateTime(2009, 12, 1),
            Remarks = "Looks like a duck, quacks like a duck, probably is a duck"
        };
        private readonly Employee _porkyPig = new Employee {
            Id = 1001, Name = "Porky Pig", HireDate = new DateTime(2010, 2, 1)
        };
        private readonly Employee _sylvesterTheCat = new Employee {
            Id = 1002, Name = "Sylvester the Cat", HireDate = new DateTime(2010, 3, 1)
        };
        private readonly Employee _bugsBunny = new Employee {
            Id = 1100, Name = "Bugs Bunny", HireDate = new DateTime(2010, 1, 1)
        };

        public void AddEmployees() {
            _repository.AddEmployee(_site, _duffyDuck);
            _repository.AddEmployee(_site, _porkyPig);
            _repository.AddEmployee(_site, _sylvesterTheCat);
        }

        public void ClearEmployees() {
            var definition = new EmployeesDefinition();
            var employees = _site.Lists[definition.ListName];
            while (employees.Items.Count > 0)
                employees.Items.Delete(0);
        }

        [TestInitialize]
        public void Initialize() {
            _siteCollection = new SPSite("http://localhost");
            _site = _siteCollection.OpenWeb("/");
            _repository = new EmployeesRepository();
            ClearEmployees();
            AddEmployees();
        }

        [TestCleanup]
        public void Cleanup() {
            _site.Dispose();
            _siteCollection.Dispose();
        }

        [TestMethod]
        public void AddEmployee_should_add_valid_employee() {
            _repository.AddEmployee(_site, _bugsBunny);
            var e = _repository.GetEmployeeById(_site, _bugsBunny.Id);
            Assert.AreEqual(_bugsBunny.Id, e.Id);
            Assert.AreEqual(_bugsBunny.Name, e.Name);
            Assert.AreEqual(_bugsBunny.HireDate, e.HireDate);
            Assert.AreEqual(_bugsBunny.Remarks, e.Remarks);
        }

        [TestMethod]
        public void GetEmployeesHiredBetween_should_return_2010_hires() {
            var from = new DateTime(2010, 1, 1);
            var to = new DateTime(2010, 12, 31);
            var employees = _repository.GetEmployeesHiredBetween(_site, from, to);
            Assert.AreEqual(2, employees.Count);
            Assert.AreEqual(_porkyPig.Id, employees[0].Id);
            Assert.AreEqual(_sylvesterTheCat.Id, employees[1].Id);
        }
    }

In the words of Martin Fowler, here’s the essence of the Repository pattern:

A Repository mediates between the domain and data mapping layers, acting like an in-memory domain object collection. Client objects construct query specifications declaratively and submit them to Repository for satisfaction. Objects can be added to and removed from the Repository, as they can from a simple collection of objects, and the mapping code encapsulated by the Repository will carry out the appropriate operations behind the scenes. Conceptually, a Repository encapsulates the set of objects persisted in a data store and the operations performed over them, providing a more object-oriented view of the persistence layer.

In SharePoint terms, business logic should query a repository which in turn queries a SharePoint list. Within the repository, the weakly typed items returned are then mapped to strongly typed data transfer objects, which are returned to the business layer.

With this approach to data access come a number of advantages: (1) duplicate data access code is eliminated. Only within the repository do we set up the query and transform the weakly typed SPListItemCollection into strongly typed data transfer objects. (2) The list definition classes introduced in SharePoint list definition using the Template pattern may be used to construct CAML queries from strongly typed field names. Lastly, (3) accessing data through a repository makes it easier to mock the data access part of the application and to integration test that part.
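As an illustration of advantage (3), here’s a minimal sketch of business logic tested against an in-memory fake. It assumes we put an interface in front of the repository; IEmployeesRepository, InMemoryEmployeesRepository, HireReport, and FakeRepositoryDemo are hypothetical names, and the interface leaves out the SPWeb parameter to keep the example free of SharePoint types:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee {
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime HireDate { get; set; }
}

// Hypothetical abstraction over the repository; not part of the original code.
public interface IEmployeesRepository {
    IList<Employee> GetEmployeesHiredBetween(DateTime start, DateTime end);
}

// In-memory fake that stands in for the SharePoint-backed repository.
public class InMemoryEmployeesRepository : IEmployeesRepository {
    private readonly List<Employee> _employees;

    public InMemoryEmployeesRepository(IEnumerable<Employee> employees) {
        _employees = employees.ToList();
    }

    public IList<Employee> GetEmployeesHiredBetween(DateTime start, DateTime end) {
        return _employees.Where(e => e.HireDate >= start && e.HireDate <= end).ToList();
    }
}

// Example of business logic that depends on the abstraction only.
public class HireReport {
    private readonly IEmployeesRepository _repository;

    public HireReport(IEmployeesRepository repository) {
        _repository = repository;
    }

    public int CountHiresIn(int year) {
        return _repository.GetEmployeesHiredBetween(
            new DateTime(year, 1, 1), new DateTime(year, 12, 31)).Count;
    }
}

public static class FakeRepositoryDemo {
    public static int HiresIn2010() {
        var fake = new InMemoryEmployeesRepository(new[] {
            new Employee { Id = 1000, Name = "Duffy Duck", HireDate = new DateTime(2009, 12, 1) },
            new Employee { Id = 1001, Name = "Porky Pig", HireDate = new DateTime(2010, 2, 1) },
            new Employee { Id = 1002, Name = "Sylvester the Cat", HireDate = new DateTime(2010, 3, 1) }
        });
        return new HireReport(fake).CountHiresIn(2010);
    }

    public static void Main() {
        Console.WriteLine(HiresIn2010());
    }
}
```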

The code below is a simple, yet usable, implementation of the Repository pattern. The idea is to have all repositories inherit from a common base class. Its purpose is to wrap the querying of a list and to log what’s going on. Because it’s a base class, it shouldn’t know how to transform the weakly typed result into strongly typed data transfer objects or which CRUD operations a particular repository supports:

    public abstract class ListRepository {
        protected ListDefinition Definition { get; set; }
        protected SPListItemCollection Result { get; set; }

        protected void Query(SPWeb site, string caml) {
            AssertValidSite(site);
            AssertValidCaml(caml);
            // Check the definition first; AssertListExistence dereferences it
            AssertListDefinitionSetBySubclass();
            AssertListExistence(site);

            var watch = new Stopwatch();
            var list = site.Lists[Definition.ListName];
            var query = new SPQuery {Query = caml};
            Debug.WriteLine(
                string.Format("About to run query against list '{0}': {1}", list, caml));
            watch.Start();
            Result = list.GetItems(query);
            watch.Stop();
            Debug.WriteLine(
                string.Format("Query against '{0}' returned {1} rows in {2} ms",
                              list, Result.Count, watch.ElapsedMilliseconds));
        }

        protected void AssertListExistence(SPWeb site) {
            if (!ListDefinition.ListExists(site, Definition.ListName))
                throw new ArgumentException(
                    string.Format("No '{0}' list on site '{1}'", Definition.ListName, site.Url));
        }

        protected void AssertListDefinitionSetBySubclass() {
            // Definition is null in the failing case, so the message must not dereference it
            if (Definition == null)
                throw new NullReferenceException(
                    "Subclass must set Definition property prior to querying the list");
        }

        protected void AssertValidCaml(string query) {
            if (string.IsNullOrEmpty(query))
                throw new NullReferenceException("Query must not be null or empty");
        }

        protected void AssertValidSite(SPWeb site) {
            if (site == null)
                throw new NullReferenceException("Site must not be null");
        }
    }

Each ListRepository connects to a corresponding ListDefinition, holding the name of the list to query and its strongly typed field names. It’s the responsibility of a concrete repository to set the Definition property prior to doing any querying. After running a query, the Result property holds the weakly typed result, which a concrete repository can then transform into data transfer objects to be passed to the business layer.

As an example, add to the concrete EmployeesRepository any CRUD method you deem necessary to fulfill the business requirements:

    public class EmployeesRepository : ListRepository {
        public EmployeesRepository() {
            Definition = new EmployeesDefinition();
        }

        public void AddEmployee(SPWeb site, Employee e) {
            var list = site.Lists[Definition.ListName];
            var item = list.Items.Add();
            item[EmployeesDefinition.EmployeeId] = e.Id;
            item[EmployeesDefinition.Name] = e.Name;
            item[EmployeesDefinition.HireDate] = e.HireDate;
            item[EmployeesDefinition.Remarks] = e.Remarks;
            item.Update();
        }

        public Employee GetEmployeeById(SPWeb site, int id) {
            var caml =
                string.Format(@"
                      <Where>
                        <Eq>
                          <FieldRef Name=""{0}"" />
                          <Value Type=""Integer"">{1}</Value>
                        </Eq>
                      </Where>",
                    EmployeesDefinition.EmployeeId, id);
            Query(site, caml);

            IList<Employee> employees = Map(Result);
            if (employees.Count == 0)
                throw new ArgumentException(string.Format("No employee with id = {0} exists", id));
            return employees[0];
        }

        public ReadOnlyCollection<Employee> GetEmployeesHiredBetween(SPWeb site, DateTime from, DateTime to) {
            var caml =
                string.Format(@"
                      <Where>
                        <And>
                            <Geq>
                              <FieldRef Name=""{0}"" />
                              <Value IncludeTimeValue=""TRUE"" Type=""DateTime"">{1}</Value>
                            </Geq>
                            <Leq>
                              <FieldRef Name=""{0}"" />
                              <Value IncludeTimeValue=""TRUE"" Type=""DateTime"">{2}</Value>
                            </Leq>
                        </And>
                      </Where>",
                    EmployeesDefinition.HireDate,
                    SPUtility.CreateISO8601DateTimeFromSystemDateTime(from),
                    SPUtility.CreateISO8601DateTimeFromSystemDateTime(to));
            Query(site, caml);
            return new ReadOnlyCollection<Employee>(Map(Result));
        }

        protected IList<Employee> Map(SPListItemCollection items) {
            var employees = new List<Employee>();
            foreach (SPItem item in items) {
                var e = new Employee {
                                Id = (int)item[EmployeesDefinition.EmployeeId],
                                Name = (string)item[EmployeesDefinition.Name],
                                HireDate = (DateTime)item[EmployeesDefinition.HireDate],
                                Remarks = (string)item[EmployeesDefinition.Remarks]
                            };
                employees.Add(e);
            }
            return employees;
        }
    }

When calling the GetEmployeeById or GetEmployeesHiredBetween methods, the caller is required to pass in an SPWeb instance pointing to the site holding the list to query. Outside the SharePoint context, you have to manually create this instance, as in the integration tests above. But within the SharePoint context, callers are likely to just pass in SPContext.Current.Web.

The above repository implementation deliberately ignores any issue of caching. If you find the need for it, however, you can replace SPList.GetItems with PortalSiteMapProvider.GetCachedListItemsByQuery. The advantage of using the PortalSiteMapProvider over ASP.NET caching of the result is that the provider takes care of invalidating cache entries when items are modified. The disadvantage is that the provider is part of the SharePoint publishing API, which isn’t part of WSS 3.0. In addition, it’s only available from code running within SharePoint, and likely requires cache settings to be tweaked.

      Another, non-trivial, improvement would be to apply the Unit of Work pattern. It’s commonly used in ORMs because it offers a way to “keep track of everything you do during a business transaction that can affect the database. When you’re done, it figures out everything that needs to be done to alter the database as a result of your work”. As with the DataContext class in LINQ to SQL or the DataSet class in ADO.NET, it could add transaction support to SharePoint lists.
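To give an idea of what that could look like, here’s a minimal in-memory sketch of the pattern. It isn’t SharePoint code: UnitOfWork, RegisterNew, RegisterRemoved, and Commit are hypothetical names, the store is a plain collection, and there’s no rollback if a change fails halfway through Commit:

```csharp
using System;
using System.Collections.Generic;

// Records pending changes and applies them in one go on Commit.
public class UnitOfWork<T> {
    private readonly ICollection<T> _store;
    private readonly List<Action> _pending = new List<Action>();

    public UnitOfWork(ICollection<T> store) {
        _store = store;
    }

    public int PendingCount { get { return _pending.Count; } }

    public void RegisterNew(T item) {
        _pending.Add(() => _store.Add(item));
    }

    public void RegisterRemoved(T item) {
        _pending.Add(() => _store.Remove(item));
    }

    // Applies the recorded changes in the order they were registered.
    public void Commit() {
        foreach (var change in _pending)
            change();
        _pending.Clear();
    }
}

public static class UnitOfWorkDemo {
    public static List<string> Run() {
        var store = new List<string> { "Duffy Duck" };
        var work = new UnitOfWork<string>(store);
        work.RegisterNew("Bugs Bunny");
        work.RegisterRemoved("Duffy Duck");
        // Up to this point the store is untouched; Commit applies both changes.
        work.Commit();
        return store;
    }

    public static void Main() {
        Console.WriteLine(string.Join(", ", Run().ToArray()));
    }
}
```

In a SharePoint-backed version, the recorded actions would presumably wrap calls such as adding or updating list items, and the hard part would be undoing changes that have already been applied when a later one fails.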


      SharePoint list definition using the Template pattern

      Posted by Ronnie Holm on 11th January 2010

      (Download code here. See related post on SharePoint list access using the Repository pattern)

      Books on SharePoint often show how to create lists from code by calling the SharePoint API directly from within a feature receiver. This receiver would then contain in-place string literals for column names, create the columns, update the default view, and so on. While such an approach provides for a nice demonstration of the SharePoint API, it tends to carry over into production code.

      My take on feature receivers, however, is that they should contain the least possible amount of code. Staying true to object-orientation and separation of concerns, list definition and creation should instead consist of a common set of classes and methods to be called from the receiver:

          public override void FeatureActivated(SPFeatureReceiverProperties p) {
              // The feature's parent web is owned by SharePoint, so we don't dispose it here
              var site = p.Feature.Parent as SPWeb;
              var employees = new EmployeesDefinition();
              if (!ListDefinition.ListExists(site, employees.ListName))
                  employees.CreateOnSite(site);
          }

      The first area to improve on is that of string literal field names getting duplicated across the code base. String literals make it impossible for the compiler to enforce consistent naming of columns. As column names are used in defining the list, in forming CAML queries against it, and in accessing the result, misspelled column names are a common source of runtime errors. The second area to improve on is that of duplication of code for doing common list operations, like checking for the existence of the list or removing its title column. Lastly, we want to get rid of duplicate control logic, i.e., verifying preconditions and calling the same series of methods with every list definition.

      One approach to improving on these areas is to start by defining a base class for list definitions. The base class combines helper methods, working on lists, with the Template pattern for defining the common steps of list creation:

          public abstract class ListDefinition {
              public string ListName;
              public string ListDescription;
              protected SPList List;
      
              public void CreateOnSite(SPWeb site) {
                  if (string.IsNullOrEmpty(ListName))
                      throw new ArgumentException("Expected not null or not empty list name member");
                  if (site == null) throw new NullReferenceException("Expected valid site");
                  CreateList(site);
                  if (List == null) throw new NullReferenceException("Expected list member not null");
                  CreateFields(List);
                  UpdateDefaultView(List);
                  SetAdditionalProperties(List);
              }
      
              protected abstract void CreateList(SPWeb site);
              protected virtual void CreateFields(SPList l) {}
              protected virtual void UpdateDefaultView(SPList l) {}
              protected virtual void SetAdditionalProperties(SPList l) {}
      
              public static void HideAndMakeTitleFieldNotRequiredOnItemNewAndEditPage(SPList l) {
                  var title = l.Fields.GetField("Title");
                  title.Required = false;
                  title.Hidden = true;
                  title.Update();
              }
      
              public static bool ListExists(SPWeb site, string listName) {
                  if (string.IsNullOrEmpty(listName))
                      throw new ArgumentException("List name must not be null or empty", "listName");
                  SPList list = null;
                  try {
                      list = site.Lists[listName];
                  }
                  catch (ArgumentException) {
                      // list not found
                  }
                  return list != null;
              }
          }

      At the core of the Template pattern is the template method. It defines the steps of an algorithm through a series of method calls, each either abstract or virtual depending on whether the step is mandatory or optional in the algorithm. Subclasses then specify the behavior of individual steps by implementing or overriding methods. In the case of the ListDefinition class, CreateOnSite is our template method. It ensures that context, in the form of an SPList instance, is passed along to each step in the algorithm.
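Stripped of the SharePoint types, the mechanics can be sketched as follows. SketchListDefinition, TasksDefinition, and TemplateDemo are hypothetical names, and the steps merely record that they ran, so this shows the shape of the pattern rather than a working list definition:

```csharp
using System;
using System.Collections.Generic;

public abstract class SketchListDefinition {
    protected readonly List<string> Steps = new List<string>();

    // Template method: fixes the order of the steps for every subclass.
    public IList<string> Create() {
        CreateList();
        CreateFields();
        UpdateDefaultView();
        return Steps;
    }

    protected abstract void CreateList();            // mandatory step
    protected virtual void CreateFields() {}         // optional step
    protected virtual void UpdateDefaultView() {}    // optional step
}

public class TasksDefinition : SketchListDefinition {
    protected override void CreateList() {
        Steps.Add("list created");
    }

    protected override void CreateFields() {
        Steps.Add("fields created");
    }

    // UpdateDefaultView isn't overridden, so the empty default runs.
}

public static class TemplateDemo {
    public static IList<string> Run() {
        return new TasksDefinition().Create();
    }

    public static void Main() {
        foreach (var step in Run())
            Console.WriteLine(step);
    }
}
```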

      Each subclass defines the specific characteristics of a list. It defines constants for column names and their data types, the views and which column to index, and so on. This makes for a concise, easy to read, and consistent list definition:

          public class EmployeesDefinition : ListDefinition {
              public const string EmployeeId = "EmployeeId";
              public const string Name = "Name";
              public const string HireDate = "HireDate";
              public const string Remarks = "Remarks";
      
              public EmployeesDefinition() {
                  ListName = "AcmeEmployees";
                  ListDescription = "Employees at Acme Corp.";
              }
      
              protected override void CreateList(SPWeb site) {
                  var guid = site.Lists.Add(ListName, ListDescription, SPListTemplateType.GenericList);
                  List = site.Lists[guid];
              }
      
              protected override void CreateFields(SPList l) {
                  l.Fields.Add(EmployeeId, SPFieldType.Integer, true);
                  l.Fields.Add(Name, SPFieldType.Text, true);
                  l.Fields.Add(HireDate, SPFieldType.DateTime, true);
                  var remarks = (SPFieldMultiLineText) l.Fields[l.Fields.Add(Remarks, SPFieldType.Note, false)];
                  remarks.RichText = true;
                  remarks.RichTextMode = SPRichTextMode.FullHtml;
                  remarks.Update();
                  l.Update();
              }
      
              protected override void UpdateDefaultView(SPList l) {
                  var v = l.DefaultView;
                  v.ViewFields.Delete("LinkTitle");
                  v.ViewFields.Add(EmployeeId);
                  v.ViewFields.Add(Name);
                  v.ViewFields.Add(HireDate);
                  v.ViewFields.Add(Remarks);
                  v.Update();
              }
      
              protected override void SetAdditionalProperties(SPList l) {
                  HideAndMakeTitleFieldNotRequiredOnItemNewAndEditPage(l);
                  l.Fields[HireDate].Indexed = true;
                  l.Update();
              }
          }

      Since the string literals have now become public constants, we get IntelliSense and compile-time checking when referencing them. In addition, by defining common operations on the base class and using the Template pattern, we reduce the amount of code that would otherwise have to go into each definition. Finally, the base class asserts, on behalf of all subclasses, that the list name is set before attempting to create the list and that the SPList field is set after calling the CreateList method.
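The constants pay off wherever a column name would otherwise be retyped, for instance when forming CAML queries. Here’s a minimal sketch of a query helper fed from such a constant. Caml.Eq and the EmployeeFields class are hypothetical (EmployeeFields stands in for EmployeesDefinition so the example compiles without SharePoint), and only the equality operator is covered:

```csharp
using System;
using System.Globalization;
using System.Security;

// Stand-in for the constants on EmployeesDefinition (assumption, to avoid SharePoint references).
public static class EmployeeFields {
    public const string EmployeeId = "EmployeeId";
    public const string Name = "Name";
}

public static class Caml {
    // Builds a CAML Where clause that matches a field against a single value.
    public static string Eq(string fieldName, string valueType, object value) {
        // Escape the value so characters like < and & can't break the query XML
        string text = SecurityElement.Escape(
            Convert.ToString(value, CultureInfo.InvariantCulture));
        return string.Format(
            @"<Where><Eq><FieldRef Name=""{0}"" /><Value Type=""{1}"">{2}</Value></Eq></Where>",
            fieldName, valueType, text);
    }
}

public static class CamlDemo {
    public static void Main() {
        // A misspelled constant name fails at compile time, not at query time
        Console.WriteLine(Caml.Eq(EmployeeFields.EmployeeId, "Integer", 1000));
        Console.WriteLine(Caml.Eq(EmployeeFields.Name, "Text", "Tom & Jerry"));
    }
}
```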


      Generating 2D random fractal terrains with C#

      Posted by Ronnie Holm on 23rd February 2009

      Back in 1999, when I was learning C++ and MFC, I remember spending a great deal of time writing an application that displayed the Mandelbrot set, probably the most famous of all fractals. And when I learned Remote Procedure Call, I even converted the application into a distributed Mandelbrot generator (which made sense given the CPU speed of the time).

      What’s so fascinating about the Mandelbrot set, and other fractals, is that the often simple equations that define them are able to give birth to such complex creations.

      In general, a fractal is defined as:

      “[…] ‘a rough or fragmented geometric shape that can be split into parts, each of which is (at least approximately) a reduced-size copy of the whole,’ a property called self-similarity. A fractal often has the following features: it has a fine structure at arbitrarily small scales; it is too irregular to be easily described in traditional Euclidean geometric language; it is self-similar (at least approximately or stochastically); [… and] it has a simple and recursive definition. […] Natural objects that approximate fractals to a degree include clouds, mountain ranges, lightning bolts, coastlines, and snowflakes.”

      The Mandelbrot set is an example of a fractal whose definition contains no stochastic element, i.e., the fractal looks the same every time it’s generated. Recently, though, I came across a simple algorithm that claimed to generate fractal terrains by adding randomness to the equation. So, given my history with fractals, I wanted to play with the algorithm, which is based on progressive refinement through midpoint displacement in one dimension:

          Start with a single horizontal line segment
          Repeat for a sufficiently large number of times
              Repeat over each line segment
                  Find the midpoint of the line segment
                  Displace the midpoint in y by a random amount
              Reduce the range for random numbers
      

      Here’s my implementation of the algorithm in C#:

          using System;
          using System.Collections.Generic;
          using System.Linq; // for Last()

          class Program {
              static void Main() {
                  var ys = new List<double>(new double[] { 0.0, 0.0 } );
                  double displacement = 1.0;
                  Random random = new Random();
      
                  for (int i = 0; i < 8; i++)
                      ys = Split(ys, displacement *= 0.5, random);
                  ComposeCoordinatePairs(ys);
              }
      
              static List<double> Split(List<double> ys, double displacement, Random random) {
                  if (ys.Count < 2)
                      throw new ArgumentException(">= 2 coordinates required");
                  var r = new List<double>();
                  for (int i = 0; i < ys.Count - 1; i++) {
                      double dy = (ys[i + 1] - ys[i]) / 2.0;
                      double d = random.NextDouble() * displacement;
                      r.Add(ys[i]);
                      r.Add(ys[i] + dy + d);
                  }
                  r.Add(ys.Last());
                  return r;
              }
      
              static void ComposeCoordinatePairs(List<double> ys) {
                  double dx = 1.0 / (ys.Count - 1);
                  for (int i = 0; i < ys.Count; i++)
                      Console.WriteLine("{0:0.000} {1:0.000}", i * dx, ys[i]);
              }
      

      When splitting line segments, the C# code doesn't store the x-components of the coordinate pairs because they can be inferred from the number of y-components and the fact that the x-components go from 0 to 1, both inclusive. Hence, starting with the line segment (0,0)-(1,0), the code splits it by calling the Split method, passing in a list of the y-components of the line segment. After the first iteration the single line segment becomes two, then four, then eight, and so on until the line segments have undergone eight iterations of splitting. Generally, after the i'th iteration the number of line segments is 2^i, so after eight iterations we end up with 256 line segments (257 points).

      As prescribed by the algorithm, merely splitting line segments doesn’t give rise to a mountain. The mountain emerges because each time a line segment is split in two, a random displacement is added to the y-component at the center of the line segment. The upper limit on the displacement within an iteration is what defines the roughness of the generated terrain. For the above implementation, the maximum displacement is half of that of the previous iteration, starting with 0.5. Thus, the first couple of iterations roughly define the terrain and additional iterations fill in the details.
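
      Because Random.NextDouble() returns values in [0, 1), the listing above only ever pushes midpoints upward. As a variation, here is a minimal sketch that displaces midpoints in both directions and takes a seed so that a terrain can be reproduced. The class name MidpointDisplacement and the Generate helper are my own naming for illustration, not part of the original program:

```csharp
using System;
using System.Collections.Generic;

public static class MidpointDisplacement {
    // One refinement pass: inserts a displaced midpoint between each pair of
    // neighbouring heights. The offset is drawn uniformly from
    // [-maxDisplacement, +maxDisplacement).
    public static List<double> Split(IList<double> ys, double maxDisplacement, Random random) {
        if (ys.Count < 2)
            throw new ArgumentException(">= 2 coordinates required");
        var result = new List<double>(2 * ys.Count - 1);
        for (int i = 0; i < ys.Count - 1; i++) {
            double midpoint = (ys[i] + ys[i + 1]) / 2.0;
            double offset = (random.NextDouble() * 2.0 - 1.0) * maxDisplacement;
            result.Add(ys[i]);
            result.Add(midpoint + offset);
        }
        result.Add(ys[ys.Count - 1]);
        return result;
    }

    // Runs the given number of passes on the segment (0,0)-(1,0), halving
    // the maximum displacement before each pass (0.5, 0.25, ...).
    public static List<double> Generate(int iterations, int seed) {
        var random = new Random(seed);
        var ys = new List<double> { 0.0, 0.0 };
        double displacement = 1.0;
        for (int i = 0; i < iterations; i++) {
            displacement *= 0.5;
            ys = Split(ys, displacement, random);
        }
        return ys;
    }
}
```

      Calling MidpointDisplacement.Generate(8, 42) yields 257 heights whose two endpoints stay at 0. Since the per-pass limits sum to less than 1, no height can leave the interval [-1, 1].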

      In animated form, the transformations from two to 256 line segments look like so:

      Among the practical applications of such a random fractal terrain generator are computer games. Not necessarily for generating 2D ridgelines, but for generating entire 3D terrains or even clouds. This works by generalizing 1D midpoint displacement to 2D in the form of the diamond-square algorithm. A cloud, for instance, can then be derived from the 3D terrain by applying a height map to it, i.e., a map in which different heights get assigned different colors.

      Posted in .Net | No Comments »

      Asp.Net error handling by HttpModule

      Posted by Ronnie Holm on 30th August 2008

      Download EmailingExceptionModule-1.0.zip.

      In any real-world application, unhandled exceptions are an unavoidable fact of life. Such exceptions should occur relatively infrequently, but when they do, some kind of supporting infrastructure had better be in place to capture the when and why of the exceptions. As a developer, you should have the ability to perform a postmortem analysis on what went wrong and learn from it. Paramount to such analysis is capturing and logging sufficient information at the point of failure.

      This post covers the thinking behind, and the development of, an Asp.Net HttpModule that captures and emails such unhandled exception information to a designated email address.

      To be clear, the goal of the EmailingExceptionModule isn’t to prevent unhandled exceptions from occurring per se. Within code, a developer may not know how to handle an exception, and so the exception should rightfully propagate up the call stack. In case no caller steps in and handles the exception, the best choice is most likely to terminate the application. It’s almost always better to be upfront with the user than to conceal the possibly inconsistent state of the application.

      So when does the EmailingExceptionModule come in handy then? One scenario is that of C# unchecked exceptions, where the compiler doesn’t force the caller to catch all types of exceptions thrown by the callee. Unchecked exceptions may materialize as type cast or null reference exceptions when accessing variables. What unchecked exceptions boil down to is that, given the cyclomatic complexity of some methods, it’s impractical to manually work through every conceivable path of execution ahead of time. Another scenario is problems with the runtime environment, such as running out of disk space or the database server being down. In either case, some map of the path to failure is helpful in preventing others from going down that same path.


      (Summary part of example email. Click here for complete output.)

      With web applications, we can take advantage of the application running in a centralized environment. In the spirit of Microsoft’s Doctor Watson technology, we can register our own web application error handler, hook it into the HTTP request pipeline, and have the application execute our code on unhandled exceptions. In practice, such an error handler is implemented either through the Application_Error method of an application’s Global.asax or by loading an HttpModule into the application.

      As hinted by the title, I went for the HttpModule. The reason may best be understood by taking a peek behind the .Net curtains: Global.asax is a filename hardwired into the framework so that whenever an application is first hit, the framework asks HttpApplicationFactory, an internal factory class, for an HttpApplication representing the application (there’s a unique one for each virtual application). As part of manufacturing the HttpApplication, the factory locates the application’s Global.asax, compiles it into a dynamically generated DLL, loads the DLL, and reflects over the Global.asax class looking for methods adhering to the convention of modulename_eventname. Methods found, such as Application_Error, are then stored in a list of MethodInfos and passed along to HttpApplication. Then, during HttpApplication initialization, events are bound to the methods within Global.asax.

      That’s how Application_Error of Global.asax gets called even though the method isn’t overriding a base class implementation as is typical for the template method pattern. That’s also why code within Global.asax isn’t binary reusable across applications. An HttpModule, on the other hand, can be loaded into any number of applications.
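
      To illustrate the idea of binding by naming convention, here is a minimal sketch using plain reflection. It is not the framework's actual code: FakeApplication, FakeGlobal, and ConventionBinder are hypothetical stand-ins that I made up so the example runs without System.Web, and only the Application_ prefix is handled:

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for HttpApplication; not the real System.Web type.
public class FakeApplication {
    public event EventHandler Error;
    public void RaiseError() { Error?.Invoke(this, EventArgs.Empty); }
}

// Hypothetical stand-in for the class compiled from Global.asax.
public class FakeGlobal {
    public int ErrorCount;
    public void Application_Error(object sender, EventArgs e) { ErrorCount++; }
    public void Unrelated() { }
}

public static class ConventionBinder {
    // Binds every public instance method named Application_<EventName> on
    // 'handlers' to the event called <EventName> on 'app'.
    // Returns the number of methods that were bound.
    public static int Bind(object app, object handlers) {
        const string prefix = "Application_";
        const BindingFlags flags =
            BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly;
        int bound = 0;
        foreach (MethodInfo method in handlers.GetType().GetMethods(flags)) {
            if (!method.Name.StartsWith(prefix, StringComparison.Ordinal))
                continue;
            EventInfo ev = app.GetType().GetEvent(method.Name.Substring(prefix.Length));
            if (ev == null)
                continue; // No event of that name; ignore the method.
            // Throws if the method signature doesn't match the event's delegate type.
            Delegate handler = Delegate.CreateDelegate(ev.EventHandlerType, handlers, method);
            ev.AddEventHandler(app, handler);
            bound++;
        }
        return bound;
    }
}
```

      After ConventionBinder.Bind(app, global), raising the Error event on the fake application calls Application_Error on the fake Global class, although neither class references the other at compile time.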

      In practice, implementing an HttpModule is a matter of creating a class that implements the IHttpModule interface:

         public interface IHttpModule {
            void Init(HttpApplication context);
            void Dispose();
         }
      

      The crux of error handling with Asp.Net is the Error event of HttpApplication. The easiest way to hook up the event to a method is within the Init method of the class. In addition, we assign the application object to a field so that later we can access the most recently thrown exception through it:

         public class EmailingExceptionModule : IHttpModule {
            private HttpApplication _application;
      
            public void Init(HttpApplication a) {
               _application = a;
               _application.Error += OnUnhaltedException;
            }
      
            public void Dispose() {}
      
            private void OnUnhaltedException(object sender, EventArgs e) {
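               // GetLastError() often returns a wrapper (e.g. HttpUnhandledException);
               // fall back to the wrapper itself when there is no inner exception.
               Exception last = _application.Server.GetLastError();
               Exception ex = last.InnerException ?? last;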
               // Implementation left out for brevity. Please download source
            }
         }
      

      The OnUnhaltedException method is where to add code that collects information about the exception and the environment, and that composes and ships the email. Depending on the environment, you may find the need to include additional information in the email. Given the source code, adding to the email is a matter of adding key/value pairs in the form of labels and values to a dictionary. Each dictionary then gets rendered as a table with a key column and a value column.
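
      As a rough sketch of the dictionary-to-table idea, the following renders one dictionary as an HTML table. The downloadable source may well do this differently; the class name ExceptionTableRenderer, the Render signature, and the choice of markup are assumptions on my part:

```csharp
using System.Collections.Generic;
using System.Net;
using System.Text;

// Hypothetical helper; not taken from the EmailingExceptionModule source.
public static class ExceptionTableRenderer {
    // Renders a caption followed by a two-column table, one row per
    // key/value pair. Keys and values are HTML-encoded so that characters
    // such as '<' in a URL or message can't break the markup.
    public static string Render(string caption, IDictionary<string, string> rows) {
        var html = new StringBuilder();
        html.Append("<h3>").Append(WebUtility.HtmlEncode(caption)).Append("</h3>");
        html.Append("<table>");
        foreach (KeyValuePair<string, string> row in rows) {
            html.Append("<tr><td>")
                .Append(WebUtility.HtmlEncode(row.Key))
                .Append("</td><td>")
                .Append(WebUtility.HtmlEncode(row.Value))
                .Append("</td></tr>");
        }
        html.Append("</table>");
        return html.ToString();
    }
}
```

      With this shape, including one more piece of information in the email is a single rows.Add("Label", value) call before rendering.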

      To have an Asp.Net application load the EmailingExceptionModule and to configure its email-related settings, add the following nodes to the application’s web.config:

         <configuration>
            <appSettings>
               <add key="EmailingExceptionModule_SenderAddress" value="sender@domain.com"/>
               <add key="EmailingExceptionModule_ReceiverAddress" value="receiver@domain.com"/>
               <add key="EmailingExceptionModule_SmtpServer" value="mail.domain.com"/>
               <add key="EmailingExceptionModule_SubjectPrefix" value="MyApp: "/>
            </appSettings>
            <system.web>
               <httpModules>
                  <add name="EmailingExceptionModule" type="Holm.AspNet.EmailingExceptionModule,
                             EmailingExceptionModule, Version=1.0.0.0, Culture=neutral,
                             PublicKeyToken=13806f613f05e959"/>
               </httpModules>
            </system.web>
         </configuration>
      

      So what’s the user experience of unhandled exceptions? Outside the confines of the EmailingExceptionModule, there’s no indication that the application took note of what happened. Depending on the user’s profile, it might be helpful to prompt for additional information at the crash point. Or perhaps turn to the Windows Live Messenger IM Control & Presence API and have the user engage in an MSN conversation with a developer.

      What’s the advantage of emailing developers unhandled exceptions over storing the information in the file system or a database? I believe in keeping things simple and practical, and handing off information to the file system or a database isn’t. Most likely the information will end up gathering virtual dust on some server. The email approach, on the other hand, is a proactive, visible, and efficient means to the end of rapidly course-correcting for wrong assumptions.

      Update, Sep 15: While you don’t need a PDB file to debug code through Visual Studio (that’s what <compilation debug="true"> in web.config is for), the PDB file is required for the stack trace to contain line numbers. Since the PDB file typically has to reside next to the corresponding DLL for the .Net runtime to pick it up, the PDB file may need to be deployed to the GAC along with the DLL. Refer to the first three paragraphs of the “Debugging assemblies that live in the GAC” section of this post for how to GAC deploy PDB files.

      For a centrally deployed Asp.Net application, it may be acceptable to ship and deploy the PDB file with the application. Keep in mind, though, that because the PDB file is a mapping of locations in the IL to the source file, having access to the PDB file makes it easier for someone to reverse engineer the code. It’s also possible to get at line numbers without the PDB file being deployed, but the approach is somewhat involved. Based on the mapping nature of the PDB file, the idea is to write out IL offsets as part of the exception and to post-process the offsets using a PDB file at a separate location.
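
      The first half of that idea, writing out IL offsets at the point of failure, might look like the sketch below. It relies on StackFrame.GetILOffset() and the method's metadata token, which as far as I know are available without symbols. The class name IlOffsetTrace and the output format are my own, and the post-processing step that maps offsets to line numbers using a PDB file elsewhere is not shown:

```csharp
using System;
using System.Diagnostics;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Text;

public static class IlOffsetTrace {
    // Formats one line per stack frame: declaring type, method name,
    // metadata token, and IL offset within the method.
    public static string Describe(Exception ex) {
        var text = new StringBuilder();
        // 'false' tells StackTrace not to look for file and line information.
        StackFrame[] frames = new StackTrace(ex, false).GetFrames() ?? new StackFrame[0];
        foreach (StackFrame frame in frames) {
            MethodBase method = frame.GetMethod();
            if (method == null)
                continue;
            string typeName = method.DeclaringType == null
                ? "<global>"
                : method.DeclaringType.FullName;
            text.AppendFormat("{0}.{1} token=0x{2:X8} il=0x{3:X4}",
                              typeName, method.Name,
                              method.MetadataToken, frame.GetILOffset());
            text.AppendLine();
        }
        return text.ToString();
    }

    // Hypothetical failing method, used only to produce a stack trace.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void Fail() {
        throw new InvalidOperationException("boom");
    }
}
```

      Catching the exception thrown by IlOffsetTrace.Fail() and passing it to Describe produces a line for the Fail method with its token and IL offset, which is the information a separate tool would need to look up the source line.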

      Yet another alternative to shipping the PDB file is to set up a symbol server. With this nifty piece of code, the StackTrace class loads the PDB file of the symbol server, adding line numbers to the stack trace.

      Posted in .Net, SharePoint | 2 Comments »

      Code based, dynamic CAML query composition

      Posted by Ronnie Holm on 3rd July 2008

      Today I faced an interesting challenge while querying list items in SharePoint. On a page I have a control that displays list items based on the terms used to tag the page, i.e., the content type on which the page is based contains a multi-valued field that holds a subset of predefined tags. Similarly, for items in a SharePoint list, each item is tagged using a subset of the same predefined tags. The responsibility of the control is then to (1) read the tags associated with the page and (2) query the tags of the list items in an OR-wise fashion. The (3) outcome is then displayed by the control as a set of context specific items.

      To decide on the items to display, an initial approach might be to go through the list one item at a time, looking for matching tags. But for large lists in SharePoint, this approach is not recommended. Alternatively, the predicate against which each item in the list is evaluated could take on the form of a CAML query. Looking for items matching a single tag is then easily expressed:

         <Where>
            <Eq>
               <FieldRef Name='MyField' />
               <Value Type='LookupMulti'>tag1</Value>
            </Eq>
         </Where>
      

      Next, consider a query with two tags OR’ed together. Although more verbose, the query is still fairly straightforward to compose on the fly:

         <Where>
            <Or>
               <Eq>
                  <FieldRef Name='MyField' />
                  <Value Type='LookupMulti'>tag1</Value>
               </Eq>
               <Eq>
                  <FieldRef Name='MyField' />
                  <Value Type='LookupMulti'>tag2</Value>
               </Eq>
            </Or>
         </Where>
      

      However, for CAML queries where more than two tags are OR’ed together, the binary nature of the OR operator works against us. Where real programming languages have their parsers automatically transform an expression like (a || b || c) into ((a || b) || c), the CAML query parser applies no such transform. In a sense, CAML is more like an XML-serialized abstract syntax tree than a query language intended for direct use:

         <Where>
            <Or>
               <Or>
                  <Eq>
                     <FieldRef Name='MyField' />
                     <Value Type='LookupMulti'>tag1</Value>
                  </Eq>
                  <Eq>
                     <FieldRef Name='MyField' />
                     <Value Type='LookupMulti'>tag2</Value>
                  </Eq>
               </Or>
               <Eq>
                  <FieldRef Name='MyField' />
                  <Value Type='LookupMulti'>tag3</Value>
               </Eq>
            </Or>
         </Where>
      

      So how do we go about composing a query that ORs together n tags (not to mention a mix of ORs and ANDs)? One approach might be using the U2U CAML Query Builder. The tool delivers a UI for formulating queries, but also exposes its CAML query builder API as a .Net assembly. Unfortunately, documentation on how to use the builder is sparse. Nonetheless, I went for the case of composing a query that ORs together three tags:

         string ComposeCamlQuery() {
            Builder b = new Builder(CamlTypes.GetListItems);
            b.AddViewField("Title");
            b.AddViewField("MyField");
      
            bool addCombinerNode;
            foreach (string tag in new string[] { "tag1", "tag2", "tag3" })
               b.AddWhereField("MyField", tag, "LookupMulti",
                               "Or", out addCombinerNode);
            return b.CamlDocument.InnerXml;
         }
      

      The outcome is a rather strange looking CAML query with extra <And></And> and missing <Eq></Eq> tags. Funny thing is that this type of query can be correctly composed through the UI, so most likely I’m not using the API correctly:

         <Where>
            <And>
               <And>
                  <Or>
                     <FieldRef Name="MyField" />
                     <Value Type="LookupMulti">tag1</Value>
                  </Or>
                  <Or>
                     <FieldRef Name="MyField" />
                     <Value Type="LookupMulti">tag2</Value>
                  </Or>
               </And>
               <Or>
                  <FieldRef Name="MyField" />
                  <Value Type="LookupMulti">tag3</Value>
               </Or>
            </And>
         </Where>
      

      I eventually abandoned the idea of using the CAML Query Builder API. Instead, a colleague pointed me to Waldek Mastykarz’s post on generating dynamic CAML queries. As long as all tags are to be either AND’ed or OR’ed together, Mastykarz provides an elegant solution, which I modified to be recursive, more generic, and read like an induction proof (at a negligible performance cost):

         ComposeCamlQuery(new[] { "tag1", "tag2", "tag3" },
                          "Or",
                          "<Where>{0}</Where>",
                          @"<Eq>
                             <FieldRef Name='MyField' />
                             <Value Type='LookupMulti'>{0}</Value>
                            </Eq>");
      
         string ComposeCamlQuery(IList<string> ops, string relOp,
                                 string query, string leaf) {
            return ops.Count == 1
               ? string.Format(query,
                               string.Format(leaf, ops[0]))
               : ComposeCamlQuery(ops.Skip(1).ToList(),
                                  relOp,
                                  string.Format(query,
                                     string.Format("<{0}>{1}{{0}}</{0}>",
                                     relOp,
                                     string.Format(leaf, ops[0]))),
                                  leaf);
         }
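
      As an alternative sketch to the string formatting above, the same kind of query can be built as an XML tree with LINQ to XML and a fold over the tags. This nests to the left, ((tag1 Or tag2) Or tag3), like the hand-written query earlier, and it lets XElement take care of escaping characters such as & in tag values. The class name CamlComposer and its parameters are hypothetical, and an empty tag list is deliberately left as an error:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class CamlComposer {
    // Builds a <Where> element in which all tags are combined by the binary
    // operator relOp ("Or" or "And"), nested to the left.
    // Throws InvalidOperationException if 'tags' is empty.
    public static string Compose(IEnumerable<string> tags, string relOp,
                                 string fieldName, string valueType) {
        XElement combined = tags
            // One <Eq> leaf per tag.
            .Select(tag => new XElement("Eq",
                new XElement("FieldRef", new XAttribute("Name", fieldName)),
                new XElement("Value", new XAttribute("Type", valueType), tag)))
            // Fold the leaves pairwise: the tree built so far becomes the
            // left child, the next leaf the right child.
            .Aggregate((left, right) => new XElement(relOp, left, right));
        return new XElement("Where", combined)
            .ToString(SaveOptions.DisableFormatting);
    }
}
```

      For example, CamlComposer.Compose(new[] { "tag1", "tag2", "tag3" }, "Or", "MyField", "LookupMulti") returns a Where element holding two nested Or elements and three Eq leaves. With a single tag, the Eq leaf sits directly inside Where.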
      

      Obviously, more complex code could be written for a mix of ANDs and ORs. Before doing so, however, you probably want to evaluate other approaches for filtering list items:

      1. Formulate a simpler, too general query and post-process the result
      2. Formulate smaller queries and optionally merge the results and do post-processing
      3. Use the SharePoint Search API

      I also considered such alternatives as Linq to SharePoint or Caml.net, but although the first holds a lot of promise, it isn’t ready for prime time. As for the latter, its focus is more on composing type-safe queries than dynamic ones, so it’s more of a supplement to the first two alternatives.

      As for SharePoint Search, the downside is that search has to be configured and that the list must have been indexed for the outcome to be accurate. Depending on the frequency with which the list is modified, the time between incremental crawls, and the context in which the results are displayed, the search approach may or may not be the solution you’re looking for.

      Update, July 19: Added recursive code solution.

      Posted in SharePoint | 2 Comments »