Tuesday, June 30, 2009

Introduction to Language Integrated Query with Delphi Prism: Part 1

Language Integrated Query, or LINQ (pronounced link), is a declarative query technology developed by Microsoft for the .NET framework. It is built into the programming language itself rather than being a separate language. In declarative programming you specify what result you want to achieve without needing to specify how that result is produced.

SQL SELECT statements are an example of declarative programming. With a given SQL statement you specify what data you want returned. It is up to the SQL engine to determine how that data is derived.

Traditional Delphi, on the other hand, is categorized as imperative programming. When writing Delphi code, you generally describe in detailed terms how you want the result obtained. You do this using control structures, expressions, and explicit calls to functions and procedures.

This article is designed to provide you with a general overview of using LINQ with Delphi Prism, the latest .NET development tool from Embarcadero Technologies and RemObjects. For a more detailed discussion of LINQ and its related topics, refer to the latest version of the .NET framework SDK.

Overview of LINQ

Language Integrated Query comes in a variety of different flavors, depending on the version of the .NET framework you are compiling against. For example, there is LINQ to Objects, LINQ to DataSet, LINQ to SQL, and LINQ to XML. Each of these technologies, which were introduced in .NET 3.5, provides classes you can use in conjunction with LINQ to work with objects in various domains of the .NET framework.

In addition, some of the major new features that are emerging in .NET also are LINQ related. One example is LINQ to Entities, which is part of the Entity Framework. The Entity Framework is one of the latest database technologies being promoted by Microsoft.

Turning our attention to the general case, LINQ queries can be divided into three basic parts. These are:

  1. A queryable source of data
  2. A query definition
  3. Query execution

The following sections describe each of these parts in greater detail.

Queryable Sources of Data

In strictly .NET terms, a queryable source of data is any object that implements the generic IEnumerable<T> interface, which includes List<T> and Dictionary<TKey, TValue>. In Delphi Prism, this also includes any collection that is declared as a sequence, as well as arrays (both fixed and dynamic).

Some of the LINQ technologies, such as LINQ to DataSet and LINQ to XML, are implemented through LINQ providers, which are support classes that enable LINQ operations. These providers usually include special methods that provide access to an IEnumerable<T> reference based on an object that otherwise does not support this interface. For example, LINQ to DataSet provides the AsEnumerable extension method to the DataTable class.

The following variable declaration defines a simple queryable data source. In this case, the data source is a sequence.

var seq: sequence of Integer := [1,2,3,4,5,6,7,8,9];

Query Definitions using Query Syntax

A query definition consists of two parts, a query variable and a query expression. Furthermore, query expressions are defined using one of two techniques. The first is called query syntax, and the second is called method syntax. This section describes query syntax. Method syntax is described later in this article.

Query expressions are defined using a combination of one or more LINQ statements, comparison operators, extension methods, and value references. At a minimum, a LINQ query includes a from clause, which defines an alias and a data source. Consider the following code segment:

var numbers: sequence of Integer := [1,2,3,4,5,6,7,8,9];

var myquery := from c in numbers select c;

The first line of code defines the data source, and the second line defines the query. The query in this case selects all items from the sequence numbers, as can be seen in the following figure (the code that populates the list box has not been shown yet).

Figure 1. The LINQExamples project

The c in the preceding query is an alias, and it is used to reference items in the numbers sequence. The select part of the query, which, contrary to SQL convention, appears at the end of the statement, defines what is returned by the query. In this case, the query returns each of the items in the sequence.

The preceding query really just creates another sequence that is no different from the one that it queried. Most LINQ queries, by comparison, select subsets of the data, perform data transformations, or carry out other similar operations. The following query includes a where clause that limits the myquery sequence to those numbers between 4 and 8, inclusive. As you can see, the alias is essential for performing this operation.

var numbers: sequence of Integer := [1,2,3,4,5,6,7,8,9];
var myquery := from c in numbers where (c >= 4) and (c <= 8) select c;

In both queries shown so far in this section, the select clause could have been omitted. Specifically, if what we are returning is exactly the same type of item as in the sequence, we can omit the select clause altogether. In other words, omitting the select clause is somewhat similar to a SELECT * clause in a SQL query. The following query, which does not have a select clause, produces the same result as the preceding one.

var numbers: sequence of Integer := [1,2,3,4,5,6,7,8,9];
var myquery := from c in numbers where (c >= 4) and (c <= 8);

The select clause in LINQ is required when you are querying objects with one or more members and want to return only a subset of those members, or want to return transformed data. For example, the following query returns a sequence of objects whose values are equal to the square of the original values. The objects returned by the select clause, in this case, are anonymous types. (An anonymous type is a type that is never explicitly declared, one of many advanced language features supported by Delphi Prism.)

var numbers: sequence of Integer := [1,2,3,4,5,6,7,8,9];
var myquery := from c in numbers select new class(Value := c * c);

LINQ queries can be relatively complicated. In addition to the from, where, and select keywords shown here, Delphi Prism also supports the following LINQ operations: order by, group by, join, with, take, skip, reverse, and distinct. In addition, LINQ queries can make use of lambda expressions, both in the where and select parts (lambda expressions are discussed briefly later in this article). Finally, similar to SQL, LINQ queries can include subqueries.

Executing LINQ Queries

Defining a query and executing a query are two distinct steps (though as you will learn shortly, Delphi Prism provides a mechanism for combining these operations). Specifically, the query defined in the preceding section identifies what the query will do, but does not execute the query. In LINQ this is referred to as deferred execution.

You cause the query to be executed by iterating over it using a for each loop. Each execution of the loop brings back another instance of whatever the query returns. For example, if you used a for each loop to iterate over the query shown in the preceding example, each iteration of the loop would return a different instance of the anonymous type.

The following is the entire code sequence (short as it is), which shows how the Value member of each returned object is added to the list box (which is named ResultsList in this example).

ResultsList.Items.Clear;
var numbers: sequence of Integer := [1,2,3,4,5,6,7,8,9];
var myquery := from c in numbers select new class(Value := c * c);
for each m in myquery do
  ResultsList.Items.Add(m.Value.ToString);

When written to the list box, this query result looks like the following.

Figure 2. Data from an anonymous type returned from a LINQ query

Similar to the alias in the query definition, the for each loop defines a local variable that holds the value of the item that is returned in each iteration. In the preceding code, this variable is named m. In most cases, Delphi Prism uses type inference to determine the type of this variable (which is fortunate in this case, since we used an anonymous type).
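Deferred execution is easiest to see when the data source changes between the definition of the query and the loop that runs it. The following is a minimal sketch, assuming a Delphi Prism console application that references System.Core. The namespace, class, and variable names are hypothetical, and a List<Integer> is used instead of a sequence literal only because items can be added to it after the query has been defined.

```oxygene
namespace DeferredSketch;

interface

uses
  System,
  System.Collections.Generic,
  System.Linq;

type
  ConsoleApp = class
  public
    class method Main(args: array of String);
  end;

implementation

class method ConsoleApp.Main(args: array of String);
begin
  // Hypothetical data source that can be modified after the query is defined
  var numbers := new List<Integer>();
  numbers.Add(1);
  numbers.Add(5);
  numbers.Add(9);

  // Defining the query does not execute it
  var myquery := from c in numbers where c >= 4 select c;

  // This item is added after the definition but before the first iteration
  numbers.Add(6);

  // The query executes here, so 6 is among the values written
  for each n in myquery do
    Console.WriteLine(n);
end;

end.
```

Because the where clause is evaluated only when the loop asks for the next item, the value added after the query definition still appears in the output.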

Not all LINQ queries use deferred execution. To cause the results of the query to be retrieved immediately, you can use either of the Enumerable extension methods ToList<TSource> or ToArray<TSource>. An example of this is shown in the following code.

var numbers: sequence of Integer := [1,2,3,4,5,6,7,8,9];
var MyResults: Array of Integer :=
  (from c in numbers where (c >= 4) and (c <= 8)).ToArray();
for each val in MyResults do
  ResultsList.Items.Add(val.ToString);
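One practical consequence of immediate execution is that the result is a snapshot: later changes to the data source do not affect it. The following is a minimal sketch, assuming a Delphi Prism console application that references System.Core. The namespace, class, and variable names are hypothetical, and ToList is used here in place of ToArray.

```oxygene
namespace SnapshotSketch;

interface

uses
  System,
  System.Collections.Generic,
  System.Linq;

type
  ConsoleApp = class
  public
    class method Main(args: array of String);
  end;

implementation

class method ConsoleApp.Main(args: array of String);
begin
  // Hypothetical data source that can be modified later
  var numbers := new List<Integer>();
  numbers.Add(1);
  numbers.Add(5);
  numbers.Add(9);

  // ToList runs the query right away and copies the matching items
  var snapshot := (from c in numbers where c >= 4 select c).ToList();

  // This change happens after the query has already been executed
  numbers.Add(6);

  // The snapshot still holds only the two items that matched earlier (5 and 9)
  Console.WriteLine(snapshot.Count);
end;

end.
```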

Similarly, if you use any of the operations that return a single value, such as Average, Count, First, and so forth, the query is executed immediately. This is demonstrated in the following code.

var numbers: sequence of Integer := [1,2,3,4,5,6,7,8,9];
var MyResults: Double := (from c in numbers).Average();
MessageBox.Show(MyResults.ToString);

LINQ Query Method Syntax

As mentioned earlier, LINQ query expressions can be defined using both query syntax and method syntax. Method syntax is implemented through extension methods defined in the System.Linq.Enumerable class, and these methods correspond roughly to the LINQ query syntax operators.

The truth is that when you use query syntax in your query expressions, the compiler converts these into method syntax. Another way to put this is that any query that you can write using query syntax can also be defined using method syntax. Interestingly, the opposite is not true. Specifically, there are some queries that you must define using method syntax, as there is no equivalent in query syntax.
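Counting the items that satisfy a condition is one example of an operation that has no keyword among the query syntax operators listed earlier in this article, so it is written as a method call. The following is a minimal sketch, assuming a Delphi Prism console application that references System.Core. The namespace, class, and variable names are hypothetical.

```oxygene
namespace MethodOnlySketch;

interface

uses
  System,
  System.Linq;

type
  ConsoleApp = class
  public
    class method Main(args: array of String);
  end;

implementation

class method ConsoleApp.Main(args: array of String);
begin
  var numbers: sequence of Integer := [1,2,3,4,5,6,7,8,9];

  // Count with a predicate is an extension method from System.Linq.Enumerable
  var evenCount := numbers.Count(c -> c mod 2 = 0);

  // 2, 4, 6, and 8 satisfy the predicate
  Console.WriteLine(evenCount);
end;

end.
```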

Unlike in query syntax, there are no aliases in method syntax. Instead, the methods are called on your IEnumerable object directly, using dot notation. These operations are typically performed in a single expression, in which each method is called on the result of the one before it, forming a chain of calls.

This is demonstrated in the following code segment, which returns the same items as the where query shown earlier in the section “Query Definitions using Query Syntax.”

var myquery := numbers.where(c -> c >= 4).
  where(c -> c <= 8).orderby(c -> c).select(c -> c);

The preceding statement is a single statement, even though it has been wrapped onto two lines due to the limits of the column space in this article. Fortunately, this wrapped form can still be compiled by Delphi Prism, since its compiler is designed to ignore white space. Also, the arguments of the where, orderby, and select methods are lambda expressions. Lambda expressions are discussed in the following section.

As with the query expressions, the select method call here is not necessary, since there is no transformation being performed on the returned items.

The syntax is a little different when your sequence contains more complex data. This can be seen from the following code, which is the method syntax version of the code segment described in the following section “LINQ to Objects.”

var Customers := GetCustomers;
var query := Customers.Where(Customer -> Customer.Age > 10).
  Where(Customer -> Customer.Age < 40).
  Where(Customer -> Customer.Active);

Lambda Expressions

Lambda expressions are anonymous functions that can accept zero or more parameters, and typically return a value. The parameters appear on the left-hand side of the lambda operator, and are enclosed in parentheses (unless there are zero or one parameters, in which case the parentheses are optional).

The parameters, if present, are followed by the lambda operator, which is the -> character sequence in Delphi Prism (C# uses => as the lambda operator). When reading a lambda expression aloud, the lambda operator is spoken “goes to.” As a result, the first lambda expression in the preceding query is read “customer goes to customer dot age greater than 10.”

The right-hand side of the lambda operator is either an expression or a statement block. When it is an expression, the lambda is called an expression lambda, and the value of that expression is what the lambda returns. It is possible, however, to include a statement block that does not resolve to an expression. These lambdas are called statement lambdas. Statement lambdas should not be used in method calls when using method syntax.

In most cases, lambda expressions take advantage of the compiler’s ability to infer the type of the input parameters.
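The parameter rules described above can be seen side by side in a small example. The following is a minimal sketch, assuming a Delphi Prism console application that references System.Core. The namespace, class, and variable names are hypothetical. Where and Aggregate are standard methods of System.Linq.Enumerable, and in both calls the compiler is expected to infer the parameter types from the sequence being queried.

```oxygene
namespace LambdaSketch;

interface

uses
  System,
  System.Linq;

type
  ConsoleApp = class
  public
    class method Main(args: array of String);
  end;

implementation

class method ConsoleApp.Main(args: array of String);
begin
  var numbers: sequence of Integer := [1,2,3,4,5,6,7,8,9];

  // One parameter: the parentheses around c are omitted
  var largeValues := numbers.Where(c -> c > 6);

  // Two parameters: the parentheses are required
  var total := numbers.Aggregate((acc, c) -> acc + c);

  // Writes the items greater than 6, then the sum of all the items
  for each n in largeValues do
    Console.WriteLine(n);
  Console.WriteLine(total);
end;

end.
```

Both lambdas are expression lambdas: the value of the expression to the right of the -> operator is what each call to the lambda returns.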

Summary

This article has provided you with a brief introduction to the composition of language integrated queries. In part 2 of this article, which will appear within a month of this article, numerous examples of LINQ queries are shown, including technologies such as LINQ to Objects, LINQ to DataSets, and LINQ to XML.

Copyright © 2009 Cary Jensen All Rights Reserved

1 comment:

  1. Great overview. Thanks for sharing it. I am looking forward to your CodeRage session too.
