LINQ to XML: An introduction

Why LINQ to XML? The best way to understand why this addition was needed is to look at what you had to do to create a simple XML document like the following…

<users>
  <user>
    <name>Alice Wonderland</name>
  </user>
  <user>
    <name>Joe Smith</name>
  </user>
</users>

… in .NET 2.0:

using System.Xml;

class Program
{
    static void Main(string[] args)
    {
        // create the main xml document.
        XmlDocument xml = new XmlDocument();

        // create and add the users node.
        XmlElement users = xml.CreateElement("users");
        xml.AppendChild(users);

        CreateUserNode(xml, users, "Alice Wonderland");
        CreateUserNode(xml, users, "Joe Smith");

        // save the results to a temporary file.
        xml.Save(@"C:\Users\Chris\Desktop\output.xml");
    }

    private static void CreateUserNode(XmlDocument xml, XmlElement users, string name)
    {
        // create and add the user node.
        XmlElement user = xml.CreateElement("user");
        users.AppendChild(user);

        // create and add the name node.
        XmlElement nameNode = xml.CreateElement("name");
        user.AppendChild(nameNode);

        // create and add the text node.
        XmlText text = xml.CreateTextNode(name);
        nameNode.AppendChild(text);
    }
}

Quite a complex task for such a simple XML file, isn’t it?!

That’s also what the .NET team thought, and for .NET 3.5 they added the LINQ to XML feature set. LINQ to XML is not only a way to query XML but also a way to build XML documents more easily. Some languages, VB9 for example, take this concept a step further and hide the XML creation from the user entirely. Check out VB9’s XML literals to see what I mean.

Building the XML file from above in C# 3.0 (using LINQ to XML, which ships with .NET 3.5) looks like this (the key class here is XElement from the System.Xml.Linq namespace):

static void Main(string[] args)
{
    // create the root "users" element with its two user nodes.
    XElement xml = new XElement("users",
        CreateUserNode("Alice Wonderland"),
        CreateUserNode("Joe Smith"));

    // save the results to a temporary file.
    xml.Save(@"C:\Users\Chris\Desktop\output.xml");
}

private static XElement CreateUserNode(string name)
{
    return new XElement("user",
        new XElement("name", name));
}

Way shorter, way better! It’s also a lot easier to understand because the items are nested in exactly the same way as they are nested in the final XML file. Creating XML files is finally fun in C#. 8)

Querying XML files

But let us come to the more interesting part now: querying XML files. To allow us to do some proper querying, let’s use a more elaborate XML file:

<root>
  <user female="true">
    <firstName>Emma</firstName>
    <lastName>Heinz</lastName>
    <location>Salzburg</location>
  </user>
  <user female="false">
    <firstName>Joe</firstName>
    <lastName>Smith</lastName>
    <location>Vienna</location>
  </user>
  <user female="true">
    <firstName>Emma</firstName>
    <lastName>Smith</lastName>
    <location>Vienna</location>
  </user>
</root>

It would be interesting to query for all the females (users whose female attribute is present and set to true) and order the result by location. The result is returned as an anonymous type and displayed in the console window.

Querying XML files was also possible prior to .NET 3.5 and LINQ to XML, but you had to use XPath to do that. With LINQ to XML the query looks like this:

var xml = XElement.Load(@"C:\Users\Chris\Desktop\users.xml");

// create a query on the xml document.
var query = from u in xml.Elements()
            where u.Attribute("female") != null && u.Attribute("female").Value == "true"
            orderby u.Element("location").Value descending
            select new
            {
                FirstName = u.Element("firstName").Value,
                LastName = u.Element("lastName").Value,
                Location = u.Element("location").Value
            };

foreach (var item in query)
{
    Console.WriteLine("First: " + item.FirstName + "; Last: " + item.LastName + "; Location: " + item.Location);
}
Console.ReadLine();

The first line of the query asks for all elements underneath the root element: all “user” elements in our case. Then the query checks whether a “female” attribute is present and compares the value of the attribute with the “true” constant. After that the results are ordered by the value of the “location” node, in descending order. Finally, the result is returned as an anonymous type whose properties are initialized with the values of the “firstName”, “lastName” and “location” nodes.
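For comparison, here is a rough sketch of how the XPath approach from before .NET 3.5 might look for the same file. The class and method names (XPathUserQuery, FindFemales) are made up for this illustration, and the sketch skips the ordering step because a plain XPath 1.0 expression cannot sort:

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

static class XPathUserQuery
{
    // Hypothetical helper: returns "First Last (Location)" for every
    // user whose female attribute is set to "true".
    public static List<string> FindFemales(string xmlText)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xmlText);

        var result = new List<string>();

        // The XPath predicate does the attribute check in one expression.
        foreach (XmlNode user in doc.SelectNodes("/root/user[@female='true']"))
        {
            // The string indexer returns the first child element with that name.
            result.Add(user["firstName"].InnerText + " "
                + user["lastName"].InnerText
                + " (" + user["location"].InnerText + ")");
        }

        return result;
    }
}
```

The filter is compact, but everything around it (sorting, projecting into typed results) has to be written by hand, which is where the LINQ version pays off.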

LINQ to XML supports all the different query operators that you know from LINQ to Objects or LINQ to SQL.

The current LINQ to XML implementation still has one big problem: you need to specify the names of the nodes as strings and can’t use properties for them. That means that you only get errors for invalid node names at runtime and not at compile time. This could be solved if the XML file had a schema and a tool converted that schema into classes that you could use. This might come in the future, but right now nothing is available (for LINQ to XML) that does this task.
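A small sketch of what that means in practice: XML names are case-sensitive, so a misspelled node name compiles fine and simply yields no element at runtime. The NodeNameSample class and ReadValue method are names I invented for this example:

```csharp
using System;
using System.Xml.Linq;

static class NodeNameSample
{
    // Hypothetical helper: reads the text of a child element by name.
    // Element(...) returns null when no child with that name exists, and
    // the explicit (string) conversion passes that null through instead
    // of throwing the way ".Value" would.
    public static string ReadValue(XElement parent, string nodeName)
    {
        return (string)parent.Element(nodeName);
    }
}
```

With a user element that contains a firstName node, ReadValue(user, "firstName") returns the text, while ReadValue(user, "firstname") returns null. Writing user.Element("firstname").Value instead would fail with a NullReferenceException, and only when that line actually runs.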

Another interesting example is to extend the LINQ to XML concept to read data from another data source and use the new classes to create the XML stream. This might be very interesting if you want to return an ATOM/RSS feed for data that is found in your database or any other data source:

// create a simple list with three elements.
var list2 = new List<Foo>
{
    new Foo { FirstName = "Joe", LastName = "Smith" },
    new Foo { FirstName = "Emma", LastName = "Who" },
    new Foo { FirstName = "Andrea", LastName = "Smith" }
};

// query from the list all with the last name "Smith" and
// create an XML element for each result. The generated
// list of XElements is then added to the root "users" element.
var xml = new XElement("users",
            from f in list2
            where f.LastName == "Smith"
            select new XElement("user",
                new XElement("firstName", f.FirstName),
                new XElement("lastName", f.LastName)));

// save the result to the desktop.
xml.Save(@"C:\Users\Chris\Desktop\output.xml");

Published on Mar 16th, 2008

LINQ design guidelines

Mircea Trofin, a program manager on the .NET team at Microsoft, has posted an interesting article on the guidelines that should be followed when implementing LINQ features in your own classes and frameworks (he covers extension methods, Func, Action, Expression, IQueryable, IEnumerable etc).

These guidelines have been reviewed by the .NET team and are more or less official. They seem very logical, and they also match how I think LINQ should be implemented in your own classes.

Check them out.

Published on Mar 15th, 2008

Who listens where?

Ever needed a way to find out, in C#, whether an application is already listening on a certain port? Some people need such a feature. I don’t know exactly why, but it could be useful to make sure that you don’t use a port that is already used by another application.

In C# there are two ways to find out whether somebody already listens on a port: you could create a Socket and check for a SocketException (and there for a specific error code) to learn that something went wrong, or you could use the IPGlobalProperties class.

I tend to like the second approach a lot more because exceptions shouldn’t be used to control the flow of an application and are also quite slow.
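For completeness, the first (exception-based) approach could be sketched roughly like this. I’m using TcpListener instead of a raw Socket to keep it short; the PortProbe class and the IsPortInUse name are my own, and the sketch only probes the loopback address, so treat it as an illustration rather than a complete check:

```csharp
using System;
using System.Net;
using System.Net.Sockets;

static class PortProbe
{
    // Hypothetical helper: tries to bind a listener to the port and
    // treats an "address already in use" error as "somebody listens".
    public static bool IsPortInUse(int port)
    {
        var listener = new TcpListener(IPAddress.Loopback, port);
        try
        {
            listener.Start();
            return false;
        }
        catch (SocketException ex)
        {
            if (ex.SocketErrorCode == SocketError.AddressAlreadyInUse)
            {
                return true;
            }

            // Any other socket error is unexpected here, so rethrow it.
            throw;
        }
        finally
        {
            // Release the port again if we managed to bind it.
            listener.Stop();
        }
    }
}
```

Note that this version briefly occupies the port itself while probing, which is one more reason to prefer the IPGlobalProperties approach.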

Since we are in C# 3.0 now, we can spice the whole thing up with a LINQ expression. The GetActiveTcpListeners method of the IPGlobalProperties class returns an array with the endpoints of all active TCP listeners. If you have ever coded in C# 3.0 you know that an array implements the IEnumerable interface and that LINQ works with every class that implements that interface:

/// <summary>
/// Returns whether the given port is open.
/// </summary>
/// <param name="port">The port that is checked for being open.</param>
/// <returns>True if an active TCP listener uses the port; otherwise false.</returns>
public static bool IsPortOpen(int port)
{
    // get all active listeners and check if one of them has
    // the given port.
    var connection = (from c in IPGlobalProperties.GetIPGlobalProperties()
                          .GetActiveTcpListeners()
                      where c.Port == port
                      select c).FirstOrDefault();

    // if we got null, nothing is listening on the port.
    return (connection != null);
}

Published on Feb 26th, 2008

Opf3 and LINQ support

Opf3 is my little baby: it is the O/RM framework that I have created and that I love to tweak and add features to. As the latest addition I have added LINQ support to it!

What are the steps to enable LINQ in Opf3?

First, you need to download the Demo or request the Express Edition of the framework. You could also buy it because you think the framework rocks or even I rock ;) - Buying means that you get the FULL source code of the framework and all updates for free!

Next, you need to create a Console Application (or Windows Forms Application, or even an ASP.NET project) in which you reference the Chili.Opf3 assembly and the new Chili.Opf3.Linq assembly. I have implemented all the LINQ features in a separate assembly. It’s just an extension to the framework, which means you can still use the Opf3 “Core” in your .NET 2.0 projects (in Visual Studio 2005) without modifying one line of code.

Then, unless you have already used Opf3, you might want to read the beginner tutorials and guidelines for Opf3, just to get you started with the framework.

Next comes the LINQ fun! To enable LINQ you need to import the “Chili.Opf3.Linq” namespace in the files where you want to use it. That’s because LINQ has been implemented as an extension method for the ObjectContext class, and you won’t see the extension method without adding the namespace import to the file!
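The namespace rule is not specific to Opf3; it applies to every extension method. Here is a minimal sketch with made-up names (Demo.Extensions, Demo.App, Greet) that has nothing to do with the Opf3 API:

```csharp
using System;

namespace Demo.Extensions
{
    public static class GreetingExtensions
    {
        // Hypothetical extension method on string.
        public static string Greet(this string name)
        {
            return "Hello, " + name;
        }
    }
}

namespace Demo.App
{
    // Without this using directive, "Joe".Greet() below does not
    // compile, because the compiler only searches imported (and
    // enclosing) namespaces for extension methods.
    using Demo.Extensions;

    public static class Greeter
    {
        public static string Run()
        {
            return "Joe".Greet();
        }
    }
}
```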

You also need to register the so-called “LinqQueryCommandBuilder” with the storage class (that is the class that encapsulates the database for Opf3). The storage class then uses this command builder to convert the LINQ expression into an SQL statement: when a query reaches the storage, all registered command builders are asked whether they can handle it, and if none can, you get an exception. This looks like this:

// set the ObjectContext up, by using a SQL Server storage class instance.
ObjectContext context = new ObjectContext(
    new MsSqlStorage("sa", "password", "localhost\\sqlexpress", "Test"));
// specify the command builder to have LINQ queries translated into SQL.
context.Storage.StorageCommandBuilders.Add(
    new LinqQueryCommandBuilder());

Making your first query looks then like this:

// get all contacts ordered by the ID.
var query = from c in context.GetPersistents<Contact>()
                  orderby c.ContactID
                  select c;

LINQ for Opf3 also supports anonymous types as return values …

var anonquery = from c in contacts
                where c.ContactType.Type.ToUpper() == "BUYER"
                orderby c.LastName
                select new { ContactID = c.ContactID
                                 , LastName = c.LastName
                                 , Firstname = c.FirstName };

… and joins are also supported:

var query = from c in context.GetPersistents()             
            join u in context.GetPersistents() on c.ID equals u.ID
            select u;

The LINQ extension to the Opf3 framework also comes with a helper class that holds some useful methods, like “Overlaps” (checks for overlapping date/time ranges), In (checks the database values against a list of items that you specify), Min, Max and Field<T> (lets you specify a field that is not available as a property but only present in the database).

You might also extend the Opf3 LINQ extension by adding your own utility classes together with a class that implements the IMethodHandler interface. Such an extension results in two new classes: one that holds the utility methods and one that implements the IMethodHandler interface. The class that implements the IMethodHandler interface then needs to be registered with the LinqQueryCommandBuilder. The example shows the registration of a “RandomMethodHandler” that would handle the methods of a utility class that might be named “Randomizer”:

ObjectContext context = new ObjectContext(
     new MsSqlStorage("sa", "password", "localhost\\sqlexpress", "Test"));

// create the linq query command builder.
var builder = new LinqQueryCommandBuilder();
// register the method handler for the Randomizer class.
builder.MethodHandlers.Add(new RandomMethodHandler());

// register the query command builder with the storage
// of the ObjectContext.
context.Storage.StorageCommandBuilders.Add(builder);

The usage of the “Randomizer” looks like this:

var query = from i in context.GetPersistents<Foo>()
                where i.ID == Randomizer.GetNextRandom(1, 100)
                select i;

Now each time a method call is encountered in a query, the LinqQueryCommandBuilder asks all the registered method handlers whether they can handle it. In our case the RandomMethodHandler (since it is registered) can handle the method, does so and returns the result.

For an example on how to implement something like this check out the QueryUtility and QueryUtilityMethodsHandler classes in the Chili.Opf3.Linq project.

Thanks to Alfred Ortega, who already created a short intro to LINQ for Opf3 in the Opf3 forums.

Edit: Btw, if you are interested in what LINQ and lambda expressions in .NET are, you could have a look at one of my older posts.

Published on Feb 18th, 2008

Webcast on how LINQ expressions work internally

I have recorded a short screencast that explains how LINQ works internally. I think most people are scared when they see lambda expressions, so I tried to explain why they look the way they do, what they actually do and that they are very similar to normal delegate calls.

Tune in if you are interested in how LINQ works internally and what happens behind the scenes when you write LINQ expressions in the IDE editor.

Published on Nov 14th, 2007

How to write a simple ADO.NET ORM by using Linq and extension methods - in 10 minutes

Yesterday’s post from Alex James inspired me to create a very simple “ORM” by using C# 3.0’s extension methods and Linq. I wanted the ORM to be very simple and to build upon the existing ADO.NET infrastructure. That means that the “framework” is going to use the existing ADO.NET classes and extend them to return persistent object instances.

The classic methods are still available. The new methods are implemented as extension methods and therefore only add something without removing anything else, which is great!

Now let’s get started. First I thought of creating an attribute to define the mapping between the persistent object and the returned data. Note that in C# 3.0 you don’t need to create a private field for a property if the field is not accessed from anywhere other than the property itself (that’s why the getter and setter are empty!).

/// <summary>
/// Attribute used to decorate the properties that are filled with
/// the result from the database.
/// </summary>
[AttributeUsage(AttributeTargets.Property)]
public sealed class ColumnAttribute : Attribute
{
        /// <summary>
        /// Name of the column in the database.
        /// </summary>
        public string Name { get; set; }
}

Next we need to define some extension methods. I’m going to put that code into a static class called “ADONETExtensions”.

public static class ADONETExtensions
{
        // Methods…
}

The first method that’s going to be added is a method that returns an enumerator for the classes that implement the IDataReader interface. I have called the method GetEnumerator.

/// <summary>
/// Gets an enumerator for the given datareader.
/// </summary>
public static IEnumerable<T> GetEnumerator<T>(this IDataReader reader) where T : new()
{
        // Ensure to close the reader when done.
        using (reader)
        {
                // Read until we reach the end of the result.
                while (reader.Read())
                {
                        // Create a new instance of the object to return.
                        T instance = Activator.CreateInstance<T>();

                        // Get an enumerable of properties that have
                        // the ColumnAttribute specified.
                        // Return an anonymous type that holds the
                        // property and the ColumnAttribute.
                        var props = from c in typeof(T).GetProperties()
                                  where Attribute.GetCustomAttribute(c, typeof(ColumnAttribute)) != null
                                  select new { Property = c, Attribute = (ColumnAttribute)Attribute.GetCustomAttribute(c, typeof(ColumnAttribute)) };

                        // For each property returned set the value
                        // returned by the database.
                        props.ToList().ForEach(x => x.Property.SetValue(instance, reader.TryGetValue(x.Attribute.Name), null));

                        // Return the current instance.
                        yield return instance;
                }
        }
}

The method uses a Linq statement to get all properties of the specified persistent type that are decorated with the “ColumnAttribute” attribute. Each property is returned together with its attribute as a new anonymous type, and both are used later in the ForEach call. ForEach loops through all mapped properties and runs an action for each of them; in this case we set the value returned by the query as the property value of the persistent object instance. ForEach expects a delegate, and we pass a lambda expression here. A lambda expression is another notation for an anonymous delegate.
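To make the last point a bit more concrete, here is a small sketch that calls List<T>.ForEach once with an anonymous delegate and once with a lambda expression. The class and method names are made up for this illustration:

```csharp
using System;
using System.Collections.Generic;

static class LambdaVersusDelegate
{
    // Sums the list by passing an anonymous delegate (C# 2.0 style).
    public static int SumWithDelegate(List<int> values)
    {
        int sum = 0;
        values.ForEach(delegate(int x) { sum += x; });
        return sum;
    }

    // Sums the list by passing a lambda expression (C# 3.0 style).
    // The compiler turns it into the same kind of delegate as above.
    public static int SumWithLambda(List<int> values)
    {
        int sum = 0;
        values.ForEach(x => sum += x);
        return sum;
    }
}
```

Both methods return the same result for the same list; the lambda is simply the shorter way to write it.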

TryGetValue is an extension method that is also implemented in the ADONETExtensions class and looks like this:

/// <summary>
/// Tries to get a value for the given column name.
/// </summary>
public static object TryGetValue(this IDataReader reader, string name)
{
        // Get the schema table for the data reader.
        DataTable table = reader.GetSchemaTable();

        // The first column of the schema table
        // contains the names of the returned columns.
        // Loop through all rows of the first column to
        // understand if the given column name (name
        // argument) is found in the schema table.
        for (int i = 0; i < table.Rows.Count; i++)
        {
                if (table.Rows[i][0].ToString().ToLower() == name.ToLower())
                {
                        // If found return the value at the index i
                        // (null instead of DBNull for database NULLs).
                        return reader.IsDBNull(i) ? null : reader[i];
                }
        }
                       
        // Return null if the column is not in the result.
        return null;
}

What might be a little bit confusing is the “GetSchemaTable” call. This call returns a DataTable that describes the schema of the data returned by the IDataReader, with one row per result column. The best way to understand what’s found in this table is to set a breakpoint in the code and have a look at the table.
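If you don’t have a database at hand, a rough way to peek at such a schema table is to create a reader from an in-memory DataTable. The sketch below does that; the SchemaTableSample class and its two methods are names I made up, and it reads the “ColumnName” column by name instead of relying on it being the first column:

```csharp
using System;
using System.Collections.Generic;
using System.Data;

static class SchemaTableSample
{
    // Hypothetical helper: builds a reader over an in-memory table
    // so that no database connection is needed for the experiment.
    public static IDataReader CreateSampleReader()
    {
        var table = new DataTable("USERS");
        table.Columns.Add("ID", typeof(int));
        table.Columns.Add("NAME", typeof(string));
        table.Rows.Add(1, "Alex");
        return table.CreateDataReader();
    }

    // Hypothetical helper: collects the column names found in the
    // schema table. Each row of the schema table describes one
    // column of the reader's result.
    public static List<string> GetColumnNames(IDataReader reader)
    {
        var names = new List<string>();
        DataTable schema = reader.GetSchemaTable();

        foreach (DataRow row in schema.Rows)
        {
            names.Add((string)row["ColumnName"]);
        }

        return names;
    }
}
```

Calling GetColumnNames(CreateSampleReader()) yields the two names ID and NAME, in the order of the columns.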

What we need next is a way to get the objects from our database connection. That’s why we create a GetObjects extension method for IDbConnection. This method is very similar to the one that has been created by Alex. The differences are that the method expects a generic type (the type T of the persistent objects being returned), that it returns an IEnumerable and that it works with any IDbConnection. That’s possible because each IDbConnection has a “CreateCommand” method to create a command from the connection.

/// <summary>
/// Gets objects from the database.
/// </summary>
/// <param name="sql">The SQL statement to get the objects.</param>
public static IEnumerable<T> GetObjects<T>(this IDbConnection connection, string sql) where T : new()
{
        // Create a new command.
        IDbCommand command = connection.CreateCommand();
        command.CommandText = sql;

        // Get the reader.
        return command.ExecuteReader().GetEnumerator<T>();
}

Now we are done with our “ORM”. Next is to create a persistent object that uses our “ColumnAttribute” to specify the mapping:

class User
{
        /// <summary>
        /// Property Name maps to column NAME.
        /// </summary>
        [Column(Name = "NAME")]
        public string Name
        {
                get;
                set;
        }
}

The column name and the property name happen to match in this example; they don’t have to.
We are finally done with creating classes. Now let’s use them!

class Program
{
        static void Main(string[] args)
        {
                // Create the connection to the database.
                SqlConnection connection = new SqlConnection(@"user id=sa;server=waldemar\sqlexpress;database=Test;password=xxxx");
                connection.Open();

                // Filter the database results by using
                // a Linq expression.
                var result = from u in connection.GetObjects<User>("select * from [USERS]")
                         where u.Name == "Alex" || u.Name == "Christian"
                         orderby u.Name ascending
                         select u;

                // Loop over the results.
                foreach(var user in connection.GetObjects<User>("select * from [USERS]"))
                {
                        // Do something.
                }

                // Return a list with the results.
                List<User> list = connection.GetObjects<User>("select * from [USERS]").ToList();
        }
}

That’s it. Our little framework allows us to load persistent objects from the database. It is very simple and still leaves a lot of open points: there is, for example, no way to handle database parameters at the moment, and it also needs a way to save the persistent objects back to the database.

That’s now up to you… Happy coding :)

Published on May 12th, 2007