Introduce DataBase,Asp.net,JavaScript,Xml,Html,Css,Sql,Php,ASP.NET Controls,AJAX,Tools,HTML,CSS,JavaScript,Open Source Project,WPF,.Net Framework,Linq
Top Recommended Hosting

Linq to sql Entity Inheritance

by the3factory 4/12/2008 9:26:00 AM

Sometime a single table contains many types of entities. For example, imagine a list of contacts-some of them can be customers, others can be suppliers, and others can be company employees. From a data point of view, each entity can have some specific fields. (For example, a customer can have a discount field, which is not relevant for employees and suppliers.) From a business logic point of view, each entity can implement different business rules. The best way to model this kind of data in an object-oriented environment is by leveraging inheritance to create a hierarchy of specialized classes. LINQ to SQL allows a set of classes derived from the same base class to map the same relational table.

The InheritanceMapping attribute decorates the base class of a hierarchy, indicating the corresponding derived classes that are based on the value of a special discriminator column. The Code parameter defines a possible value, and the Type parameter defines the corresponding derived type. The discriminator column is defined by the IsDiscriminator argument being set to true in the Column attribute specification. Listing 5-3 provides an example of a hierarchy based on the Contacts table of the Northwind sample database.

Listing 5-3: Hierarchy of classes based on contacts
Image from book

[Table(Name="Contacts")]
[InheritanceMapping(Code = "Customer", Type = typeof(CustomerContact))]
[InheritanceMapping(Code = "Supplier", Type = typeof(SupplierContact))]
[InheritanceMapping(Code = "Shipper", Type = typeof(ShipperContact))]
[InheritanceMapping(Code = "Employee", Type = typeof(Contact), IsDefault = true)] 
public class Contact {
[Column(IsPrimaryKey=true)] public int ContactID;
[Column(Name="ContactName")] public string Name;
[Column] public string Phone;
[Column(IsDiscriminator = true)] public string ContactType;
}
public class CompanyContact : Contact {
[Column(Name="CompanyName")] public string Company;
}
public class CustomerContact : CompanyContact {
}
public class SupplierContact : CompanyContact {
}
public class ShipperContact : CompanyContact {
public string Shipper {
get { return Company; }
set { Company = value; }
}
}
Image from book

Contact is the base class of the hierarchy. If the contact is a Customer, a Supplier, or a Shipper, the corresponding classes derive from an intermediate CompanyContact, which defines the Company field corresponding to the CompanyName column in the source table. The CompanyContact intermediate class is necessary because you cannot reference the same column (CompanyName) in more than one field, even if this happens in different classes in the same hierarchy. The ShipperContact class defines a Shipper property that exposes the same value of Company, with a different semantic meaning.

Note 

This approach requires you to flatten the union of all possible data columns for the whole hierarchy into a single table. If you have a normalized database, you might have data for different entities separated in different tables. You can define a view to use LINQ to SQL to support entity hierarchy, but to update data you must make the view updatable.

The level of abstraction offered by having different entity classes in the same hierarchy is well described by the sample queries shown in Listing 5-4. The queryTyped query uses the OfType operator, while queryFiltered relies on a standard where condition to filter out contacts that are not customers.

Listing 5-4: Queries using a hierarchy of entity classes
Image from book

var queryTyped =
from    c in contacts.OfType<CustomerContact>()
select  c;
var queryFiltered =
from    c in contacts
    where   c is CustomerContact
select  c;
foreach( var row in queryTyped ) {
Console.WriteLine( row.Company );
}
// We need an explicit cast to access the CostumerContact members
foreach( CustomerContact row in queryFiltered ) {
Console.WriteLine( row.Company );
}
Image from book

The SQL queries produced by these LINQ queries are functionally identical to the following one. (The actual query is different because of generalization coding.)

SELECT [t0].[ContactType], [t0].[CompanyName] AS [Company],
[t0].[ContactID], [t0].[ContactName] AS [Name],
[t0].[Phone]
FROM   [Contacts] AS [t0]
WHERE  [t0].[ContactType] = 'Customer'

The difference between queryTyped and queryFiltered queries lies in the returned type. A queryTyped query returns a sequence of CustomerContact instances, while queryFiltered returns a sequence of the base class Contact. With queryFiltered, you need to explicitly cast the result into a CustomerContact type if you want to access the Company property.

Unique Object Identity

An instance of an entity class stores an in-memory representation of table row data. If you try to instantiate two different entities containing the same row, you obtain a reference to the same inmemory object. In other words, object identity (same references) maintains data identity (same table row) using the entity unique key. The LINQ to SQL engine ensures that the same object reference is used when an entity instantiated from a query result is already in memory. This check does not happen if you create an instance of an entity by yourself. In Listing 5-5, you can see that c1 and c2 reference the same Contact instance, even if they originate from two different queries, while c3 is a different object, even if its content is equivalent to the others.

Note 

If you want to force reloading data from the database, you must use the Refresh method of the DataContext class. We will say more about this later in the “Concurrent Operations” section.

Listing 5-5: Object identity
Image from book

var queryTyped =
from    c in contacts.OfType<CustomerContact>()
orderby c.ContactID
select  c;
var queryFiltered =
from    c in contacts
where   c is CustomerContact
orderby c.ContactID
select  c;
Contact c1 = null;
Contact c2 = null;
foreach( var row in queryTyped.Take(1) ) {
c1 = row;
}
foreach( var row in queryFiltered.Take(1) ) {
c2 = row;
}
Contact c3 = new Contact();
c3.ContactID = c1.ContactID;
c3.ContactType = c1.ContactType;
c3.Name = c1.Name;
c3.Phone = c1.Phone;
Debug.Assert( c1 == c2 ); // same instance
Debug.Assert( c1 != c3 ); // different objects
Image from book

Entity Constraints

The entity classes cannot represent all possible check constraints of a relational table. No attributes are available to specify the same alternate keys (unique constraint), triggers, and check expressions that can be defined in a relational database. This fact is relevant when you start to manipulate data using entity classes, because you cannot guarantee that an updated value will be accepted by the underlying database. (For example, it could have a duplicated unique key.) However, because you can load into entity instances only parts (rows) of the whole table, these kinds of checks are not possible without accessing the relational database, anyway.

There is partial support for describing a primary key, unique constraint, and other indexes only through XML external metadata specification. You will see this discussed later in the “External Mapping” section. This support is useful only to generate a database starting from LINQ to SQL metadata.

More complete support is available for maintaining valid relationships between entities, just like the support offered by foreign keys in a standard relational environment.

Associations Between Entities

Relationships between entities in a relational database are modeled on the concept of foreign keys referring to primary keys of a table. Class entities can use the same concept through the Association attribute, which can describe both sides of a one-to-many relationship described by a foreign key.

EntityRef   Let us start with the concept of lookup, which is the typical operation used to get the customer related to one order. Lookup can be seen as the direct translation into the entity model of the foreign key relationship existing between the CustomerID column of the Orders table and the primary key of the Customers table. In our entity model, the Order entity class will have a Customer property (of type Customer), that shows the customer data. This property is decorated with the Association attribute and stores its information in an EntityRef<Customer> member (named _Customer), which enables the deferred loading of references that you will see shortly. Listing 5-6 shows the definition of this association.

Listing 5-6: Association EntityRef
Image from book

[Table(Name="Orders")]
public class Order {
[Column(IsPrimaryKey=true)] public int OrderID;
[Column] private string CustomerID;
    [Association(Storage="_Customer", ThisKey="CustomerID")]
public Customer Customer {
get { return this._Customer.Entity; }
set { this._Customer.Entity = value; }
}
    private EntityRef<Customer> _Customer;
}
Image from book

As you can see, the CustomerID column must be defined in Order because otherwise it would not be possible to obtain the related Customer. The ThisKey argument or the Association attribute indicates the “foreign key” column (which would be a comma-separated list if more columns were involved for a composite key) that is used to define the relationship between entities. If you want to hide this detail in the entity properties, you can declare that column as private, just as in the Order class shown earlier.

Using the Order class in a LINQ query, you can specify a Customer property in a filter without the need to write a join between Customer and Order entities. In the following query, the Country member of the related Customer is used to filter orders that come from customers of a particular Country:

Table<Order> Orders = db.GetTable<Order>();
var query =
from   o in Orders
where  o.Customer.Country == "USA"
select o.OrderID;

The previous query is translated into an SQL JOIN like the following one:

SELECT    [t0].[OrderID]
FROM      [Orders] AS [t0]
LEFT JOIN [Customers] AS [t1]
       ON [t1].[CustomerID] = [t0].[CustomerID]
WHERE     [t1].[Country] = "USA"

Until now, we have used entity relationships only for their metadata-building LINQ queries. When an instance of an entity class is created, a reference to another entity (such as the previous Customer property) works with a technique called deferred loading. The related Customer entity is not instantiated and loaded into memory from the database until it is accessed either in read or write mode.

More Info 

EntityRef<T> is a wrapper class that is instantiated with the container object to give a valid reference for any access to the referenced entity. Each read/write operation is filtered by a property getter and setter, which execute a query to load data from the database the first time this entity is accessed.

In other words, to generate an SQL query to populate the Customer related entity when the Country property is accessed, you would use the following code:

var query =
from   o in Orders
where  o.OrderID == 10528
select o;
foreach( var row in query ) {
Console.WriteLine( row.Customer.Country );
}

The process of accessing the Customer property involves checking to determine whether the related Customer entity is already in memory. If it is, that entity is accessed; otherwise, the following SQL query is executed and the corresponding Customer entity is loaded in memory and then accessed:

SELECT [t0].[Country], [t0].[CustomerID], [t0].[CompanyName]
FROM   [Customers] AS [t0]
WHERE  [t0].[CustomerID] = "GREAL"

The GREAL string is the CustomerID value for order 10528. As you can see, the SELECT statement queries all columns declared in the Customer entity, even if they are not used in the expression that accessed the Customer entity. (In this case, the executed code never referenced the CompanyName member.)

EntitySet   The other side of an association is a table that is referenced from another table through its primary key. Although this is an implicit consequence of the foreign key constraint in a relational model, you need to explicitly define this association in the entity model. If the Customers table is referenced from the Orders table, you can define an Orders property in the Customer class that represents the set of Order entities related to a given Customer. The relationship is implemented by an instance of EntitySet<Order>, which is a wrapper class over the sequence of related orders. You might want to directly expose this EntitySet<T> type, as in the code shown in Listing 5-7. In that code, the OtherKey argument of the Association attribute specifies the name of the member on the related type (Order) that defines the association between Customer and the set of Order entities.

Listing 5-7: Association EntitySet (visible)
Image from book

[Table(Name="Customers")]
public class Customer {
[Column(IsPrimaryKey=true)] public string CustomerID;
[Column] public string CompanyName;
[Column] public string Country;
    [Association(OtherKey="CustomerID")]
    public EntitySet<Order> Orders;
}
Image from book

You might also decide to hide this implementation detail, making only an ICollection<Order> visible outside of the Customer class, as in the declaration shown in Listing 5-8. In this case, the Storage argument of the Association attribute specifies the EntitySet<T> physical storage.

Listing 5-8: Association EntitySet (hidden)
Image from book

public class Customer {
[Column(IsPrimaryKey=true)] public string CustomerID;
[Column] public string CompanyName;
[Column] public string Country;
private EntitySet<Order> _Orders;
[Association(OtherKey="CustomerID", Storage="_Orders")]
public ICollection<Order> Orders {
get { return this._Orders; }
set { this._Orders.Assign(value); }
}
}
Image from book

With both models of association declaration, you can use the Customer class in a LINQ query, accessing the related Order entities without the need to write a join. You simply specify the Orders property. The next query returns the names of customers who placed more than 20 orders:

Table<Customer> Customers = db.GetTable<Customer>();
var query =
from   c in Customers
where  c.Orders.Count > 20
select c.CompanyName;

The previous LINQ query is translated into an SQL query like the following one:

SELECT [t0].[CompanyName]
FROM [Customers] AS [t0]
WHERE ( SELECT COUNT(*)
        FROM [Orders] AS [t1]
        WHERE [t1].[CustomerID] = [t0].[CustomerID]
       ) > 20

In this case, no instances of the Order entity are created. The Orders property serves only as a metadata source to generate the desired SQL query. If you return a Customer entity from a LINQ query, you can access the Orders of a customer on demand:

var query =
from   c in Customers
where  c.Orders.Count > 20
    select c;
foreach( var row in query ) {
Console.WriteLine( row.CompanyName );
foreach( var order in row.Orders ) {
Console.WriteLine( order.OrderID );
}
}

In the previous code, you are using deferred loading. Each time you access the Orders property of a customer for the first time (as indicated by the highlighted code in the preceding code sample), a query like the following one (which uses @p0 as the parameter to filter CustomerID) is sent to the database:

SELECT [t0].[OrderID], [t0].[CustomerID]
FROM [Orders] AS [t0]
WHERE [t0].[CustomerID] = @p0

If you want to load all orders for all customers into memory using only one query to the database, you need to request immediate loading instead of deferred loading. To do that, you have two options. The first approach, which is demonstrated in Listing 5-9, is to force the inclusion of an EntitySet using a DataShape instance and the call of its LoadWith<T> method.

Listing 5-9: Use of DataShape and LoadWith<T>
Image from book

DataContext db = new DataContext( ConnectionString );
Table<Customer> Customers = db.GetTable<Customer>();
DataShape ds = new DataShape();
ds.LoadWith<Customer>( c => c.Orders );
db.Shape = ds;
var query =
from   c in Customers
where  c.Orders.Count > 20
select c;
Image from book

The second option is to return a new entity that explicitly includes the Orders property for the Customer:

var query =
from   c in Customers
where  c.Orders.Count > 20
select new { c.CompanyName, c.Orders };

These LINQ queries send an SQL query to the database to get all customers who placed more than 20 orders, including the whole order list for each customer. That SQL query might be similar to the one shown in the following code:

SELECT [t2].[CompanyName], [t3].[OrderID], [t3].[CustomerID], (
SELECT COUNT(*)
FROM [Orders] AS [t4]
WHERE [t4].[CustomerID] = [t2].[CustomerID]
) AS [count]
FROM (
SELECT [t0].[CustomerID], [t0].[CompanyName]
FROM [Customers] AS [t0]
WHERE (
SELECT COUNT(*)
FROM [Orders] AS [t1]
WHERE [t1].[CustomerID] = [t0].[CustomerID]
) > 20
) AS [t2]
LEFT OUTER JOIN [Orders] AS [t3] ON [t3].[CustomerID] = [t2].[CustomerID]
ORDER BY [t2].[CustomerID], [t3].[OrderID]
Note 

You can observe that there is a single SQL statement here and the LINQ to SQL engine parses the result, extracting different entities (Customers and Orders). Keeping the result ordered by CustomerID, the engine can build in-memory entities and relationships in a faster way.

You can filter the subquery produced by relationship navigation. Suppose you want to see only customers who placed at least five orders in 1997, and you want to load and see only these orders. You can use the AssociateWith<T> method of the DataShape class to do that, as demonstrated in Listing 5-10.

Listing 5-10: Use of DataShape and AssociateWith<T>
Image from book

DataShape ds = new DataShape();
ds.AssociateWith<Customer>(
    c => c.Orders.Where(
        o => o.OrderDate.Value.Year == 1997 ) );
db.Shape = ds;
var query =
from   c in Customers
where  c.Orders.Count > 5
select c;
Image from book

You will appreciate that the C# filter condition (o.OrderDate.Value.Year == 1997) is translated into the following SQL expression:

(DATEPART(Year, [t2].[OrderDate]) = 1997)

Using AssociateWith<T> alone does not apply the immediate loading behavior. If you want both immediate loading and filtering through a relationship, you have to call both the LoadWith<T> and AssociateWith<T> methods. The order of these calls is not relevant. For example, you can write the following code:

DataShape ds = new DataShape();
ds.AssociateWith<Customer>(
c => c.Orders.Where(
o => o.OrderDate.Value.Year == 1997 ) );
ds.LoadWith<Customer>( c => c.Orders );
db.Shape = ds;

Loading all data into memory using a single query might be a better approach if you are sure you will access all data that is loaded, because you will spend less time in round-trip latency. However, this technique will consume more memory and bandwidth when the typical access to a graph of entities is random. Think about these details when you decide how to query your data model.

Other Association Attributes   The Association attribute can also have other parameters. Name corresponds to the foreign key constraint name, Unique defines a real one-to-one relationship, and OtherKey can specify the comma-separated value of members that forms the primary key of the related entity.

Graph Consistency   Relationships are bidirectional between entities-when an update is made on one side, the other side should be kept synchronized. LINQ to SQL does not automatically manage this kind of synchronization, which has to be done by the class entity implementation. LINQ to SQL offers an implementation pattern that is also used by code-generation tools such as SQLMetal, a tool that is part of the Microsoft .NET 3.5 Software Development Kit (SDK), which will be described later in this chapter. This pattern is based on the EntitySet<T> class on one side and on the complex setter accessor on the other side. The product documentation offers a detailed explanation of how to implement this pattern if you do not want to rely on tools-generated code.

Change Notification   You will see in the “Data Update” section that LINQ to SQL is able to track changes in entities, submitting equivalent changes to the database. This process is implemented by default through an algorithm that compares an object’s content with its original values, requiring a copy of each tracked object. The memory consumption can be high, but it can be optimized if entities participate in the change tracking service by announcing when an object has been changed.

The implementation requires an entity to expose all its data through properties, and each property setter needs to call the PropertyChanging method of DataContext. Entities that implement this pattern should also implement the System.Data.Linq.INotifyPropertyChanging interface. Further details are available in the product documentation. Tools-generated code for entities (such as that emitted by SQLMetal) already implements this pattern.

Note 

The System.Data.Linq.INotifyPropertyChanging interface should not be confused with the System.ComponentModel.INotifyPropertyChanged interface that is used in frameworks such as Windows Forms and Windows Presentation Foundation (WPF). These interfaces can work in conjunction with each other when performing data binding in a User Interface layer.

Related posts

Sign up for PayPal and start accepting credit card payments instantly.


Powered by BlogEngine.NET 1.2.0.0