seaN aitkeN – The Blog

February 7, 2007

Select DISTINCT from DataTable – Success!

Filed under: .NET — clevercoder @ 12:30 pm

So there was the evolution of ADO into .NET.. All these wonderful disconnected data objects to make life… well… easier? Turns out that the magic carpet of ADO.NET has it’s limitations. One such limitation is the lack of a "SELECT DISTINCT ..." against a DataTable. Why not? After all, it does have filtering and sorting… even a .Select(...) method.

Well, after scouring and searching the online resources, I found a half usable MS solution here: Microsofts one column hack

Nice, but I needed multiple columns…

So here’s my fix. Hope this helps someone somewhere:


       public static DataTable SelectDistinct(DataTable SourceTable, string[] FieldNames)
        {
            DataTable dt = new DataTable();
            foreach (string field in FieldNames)
            {
                dt.Columns.Add(field, SourceTable.Columns[field].DataType);
            }

            DataRow lastRow = null;
            DataRow tmpRow = null;
            DataRow[] sortedRows = SourceTable.Select("", CollectionHelper.ToStringCSV(FieldNames));
            bool added = false;
            foreach (DataRow dr in sortedRows)
            {
                added = true;

                if (lastRow == null)
                {
                    added = false;

                }
                else
                {

                    foreach (string field in FieldNames)
                    {
                        if (!ColumnEqual(dr[field], lastRow[field]))
                        {
                            added = false;
                        }
                    }
                }
                if (!added)
                {
                    tmpRow = dt.NewRow();
                    foreach (string field in FieldNames)
                    {
                        tmpRow[field] = dr[field];
                    }
                    dt.Rows.Add(tmpRow);
                    lastRow = tmpRow;
                }
            }

            return dt;
        }

        private static bool ColumnEqual(object A, object B)
        {

            // Compares two values to see if they are equal. Also compares DBNULL.Value.
            // Note: If your DataTable contains object fields, then you must extend this
            // function to handle them in a meaningful way if you intend to group on them.

            if (A == DBNull.Value && B == DBNull.Value) //  both are DBNull.Value
                return true;
            if (A == DBNull.Value || B == DBNull.Value) //  only one is DBNull.Value
                return false;
            return (A.Equals(B));  // value type standard comparison
        }

Cheers!
-Sean

November 30, 2006

Hashtable serialization and the IDeserializationCallback interface

Filed under: .NET — clevercoder @ 3:09 pm

It turns out that the process of serializing and deserializing certain classes in the .NET Framework isn’t always as straightforward as it seems. Let’s start with the problem. We have a Hashtable that we would want to maintain as part of the serialization of a particular class. Let’s call this BoringClass.

Inside the GetObjectData() method of BoringClass, we try to add a Hashtable to the SerializationInfo object. An example would be:

public void GetObjectData(SerializationInfo info, StreamingContext context)
{
     Hashtable hashtable = new Hashtable();
     hashtable["testItem"];

     info.AddValue("hash", hashtable);
}

The problem is when we try to deserialize the object, such as:

public BoringClass(SerializationInfo info, StreamingContext context)
{
     Hashtable hashtable = (Hashtable) info.GetValue("hash", typeof(Hashtable));

     Console.WriteLine("Value is: " + hashtable["testItem"]);
}

And, subsequently try to acccess said Hashtable, you may get a rather vague runtime exception like:

Exception has been thrown by the target of an invocation.

Inspecting the InnerException gives us a clue:

{"Object reference not set to an instance of an object."}

Warmer…
Setting a breakpoint in the special constructor and inspecting the SerializationInfo object prior to doing anything with it shows us more.. in that the Hashtable is basically broken:

info.m_data
{Dimensions:[4]}
    [0]: Count = 0
    [1]: null
    [2]: null
    [3]: null

The first element is recognized as a Hashtable, but has no elements!

Cutting to the chase, it all boils down to the fact that Hashtable implements the interface IDeserializationCallback. That interface is rather trivial, but the implications are far reaching.

Using Reflector for .NET, I cracked open the hood of the Hashtable class. Look at the special constuctor for serialization. Notice anything funny?

protected Hashtable(SerializationInfo info, StreamingContext context)
{
      this.m_siInfo = info;
}

Comparing this to the serialization constructor on a DataTable:

protected DataTable(SerializationInfo info, StreamingContext context) : this()
{
      bool flag1 = (context.Context != null) ? Convert.ToBoolean(context.Context, CultureInfo.InvariantCulture) : true;
      SerializationFormat format1 = SerializationFormat.Xml;
      SerializationInfoEnumerator enumerator1 = info.GetEnumerator();
      while (enumerator1.MoveNext())
      {
            string text1;
            if (((text1 = enumerator1.Name) != null) && (text1 == "DataTable.RemotingFormat"))
            {
                  format1 = (SerializationFormat) enumerator1.Value;
            }
      }
      this.DeserializeDataTable(info, context, flag1, format1);
}

It’s pretty obvious that something is different. Well, looking forther, I found the deserialization code in the OnDeserialization event handler method. So it seems that for the Hashtable to be fully deserialized, the OnDeserialization method must be invoked!

It turns out that a handful of classes in the framework depend on the IDeserializationCallback. (16 in fact). The IDeserializationCallback interface, and the process in general is described in length on this page: http://msdn2.microsoft.com/en-us/library/ms973893.aspx

The meaty part:

“Objects are reconstructed from the inside out, and calling methods during deserialization can have undesirable side effects, since the methods called might refer to object references that have not been deserialized by the time the call is made. If the class being deserialized implements the IDeserializationCallback, the OnSerialization method will automatically be called when the entire object graph has been deserialized. At this point, all the child objects referenced have been fully restored. A hash table is a typical example of a class that is difficult to deserialize without using the event listener described above. It is easy to retrieve the key/value pairs during deserialization, but adding these objects back to the hash table can cause problems since there is no guarantee that classes that derived from the hash table have been deserialized. Calling methods on a hash table at this stage is therefore not advisable.”

So, it seems that the Hashtable will serialize properly when created directly from Formatter.Serialize(…) and Formatter.DeSerialize(…). It seems that the Formatter wires up and calls the OnDeserialization(…) method on the object in question. The problem here is that we are using info.info.GetValue(...) to retrieve the Hashtable object. The SerializationInfo class only calls the special constructor and the OnDeserialization(…) method is never called.

The solution:

Simply call “OnDeserialization(…)” yourself after the object has been retrieved with GetValue! Simple, eh?!

So the modified code would look like:

        public BoringClass(SerializationInfo info, StreamingContext context)
        {
            Hashtable hashtable = (Hashtable) info.GetValue("hash", typeof(Hashtable));
            hashtable.OnDeserialization(this);

            Console.WriteLine("Value is: " + hashtable["testItem"]);

        }

I’m not sure if this can be considered the “best” practice, but it solves the problem for now.

ope someone else can benefit from this.

Cheers!
-Sean

July 14, 2006

Nullable datetime database columns… can we say DateTime?

Filed under: .NET — clevercoder @ 12:30 pm

I’m always trying to either re-invent the wheel, or at least make my code more efficient and readable.. or just cool. Struggling with the DBNull values returned from ADO data providers in .NET has always been slightly less than exciting. Problems always tend to creep up when you least expect them to. Recently, I encountered a situation where I wanted to simply read a very nullable datetime column in SQL Server to a Nullable object in .NET. Using strongly typed datasets lead me to think that this should be easily achieved. Was I wrong.. Although this simply is useless with strongly typed datasets (as accessing a DateTime column of a typed DataRow will always throw an exception), it’s still a neat trick that can be adapted for other use.

I was on a search to simplify the mundane task of assigning the value of a nullable datetime column in SQL Server to a Nullable<DateTime> object in .NET. This is what I started with:

obj.SomeDateField = (row["SomeDateField"] == DBNull)?null:(DateTime)row["SomeDateField"];

Writing this dozens of times over and over gets old. Real quicK! In reference to the mention of strongly typed datasets above, this is the alternative using them:

if (row.IsSomeDataFieldNull())
{
     obj.SomeDateField = null;
}
else
{
     obj.SomeDateField = row.SomeDateField;
}

Far from simple. So I got to thinking.. “There must be an easier way to do this!”..

In a nutshell, I created the following method for working with a cell returned from a DataRow to create a Nullable type that can be specified at runtime:

public static Nullable ToNullable(object dbSource) where T : struct
        {

Nullable retVal;

if (DBNull.Value == dbSource)
            {
                retVal = null;
            }
            else
            {
                retVal = (T?) dbSource;
            }

return retVal;
        }

And, using this trick, the example above would look like:

obj.SomeDateField = DBHelper.ToNullable<datetime>(row["SomeDateField"]);

Cheers!


April 6, 2006

Subversion over Apache.. on a Windows domain

Filed under: Subversion — clevercoder @ 11:56 am

Wow… this was fun. I just figured out the way to allow Windows domain users against a Subversion repository on using the Apache SSPI authentication module… Well, I had it working before, but I finally got it working so that:

1) Windows domain users can be authenticated and,
2) Non domain users can be authenticated (using Apache's htpasswd file)

Cutting to the chase, here was a useful link: http://svn.haxx.se/users/archive-2004-07/0544.shtml

And, here is my resulting config entry in the httpd.conf file:

<location /svn>
	DAV svn
	SVNParentPath "R:\SVN Repositories"
	#
	AuthName "Subversion Repositories"
	Require valid-user
	AuthAuthoritative Off #enables multiple authorities
	#
	AuthType SSPI
	SSPIAuth On
	SSPIAuthoritative Off
	#SSPIAuthoritative On - Forces only SSPI auth, skips passwd file
	SSPIOmitDomain On
	SSPIDomain YOURDOMAIN
	SSPIOfferBasic On
	#
	AuthType Basic
	AuthUserFile "R:SVN Repositorieshtpasswd"
	#
	AuthzSVNAccessFile "R:SVN Repositoriessvnaccessfile"
</location>

Another strange behavior I noticed was being double prompted for credientials. Once for "", the second for the actual name specified here. Ultimtately, changing the order of the items in this config section helped.Hope this saves someone elses day.

-Sean

March 22, 2006

Flickr me this.. Flickr me that

Filed under: Personal — clevercoder @ 7:54 pm

Blog at WordPress.com.