Subscribe to A Programmer's Casual Blog        RSS Feed
-----

An Interesting Revisit to basic algibra notation while working with DataSets

Icon Leave Comment
So... A DataSet is a place where you can store Tables of information, right? Well I was working on a project where I needed to find the difference between two tables (held in two different DataSets, so when I say DataSet, you may take it also to mean it's DataTable).

I was floundering on the issue, because, if you haven't worked with DataSets before, they're kind of a bitch. You need to specially use their .Copy() method to make a new one, because anytime you store a DataSet in a variable, it's actually just a reference to the same DataSet in memory. To me that's wicked bizarre, I mean, what if you made an int variable...

int myInt = 5;

int secondInt = myInt;

secondInt = 3;




Yeah, and what if setting secondInt to 3 also made myInt return 3 as well!? That's crazy! But that's how DataSets are. DataSets are full of weird things like that which I I could probably fill up a blog post about, but we'll leave at that for now.

So anyway, I'm trying to take DataSet A and see what it has in its table which DataSet B doesn't have. I eventually figured out, you could express this problem in a mathematic notation --it's good to know all that high school math wasn't a complete waste =p

A = a new DataSet that has the most recent datarow in it

B = an older DataSet that doesn't have the latest datarow in it

...

A = B + Additions - Deletions

...



This was a groundbreaking revelation for me! I got excited and then pooped. Then when I came back I pondered this question:

In C#, how could I possibly solve for Additions across two DataSets?



After some searching, I found my way to:
(don't bother analyzing this, I just explain it)
        public static DataSet GetNewRecords(DataSet dsOld, DataSet dsNew)  // dsOld would be B and dsNew A
        {
            DataSet ut = new DataSet();   // upload this

            // having a primary key is the only way to get the merges to work.  
            dsOld.Tables[0].PrimaryKey = new DataColumn[] { dsOld.Tables[0].Columns["P_Id"] };
            dsNew.Tables[0].PrimaryKey = new DataColumn[] { dsNew.Tables[0].Columns["P_Id"] };

            DataSet delta = new DataSet();

            delta.Merge(dsOld, true);
            delta.AcceptChanges();
            delta.Merge(dsNew, true);   // also try with true parameter


            // this is where the algo gets impossibly weird.  Read the GetChanges Parameter, I'm getting all rows that didn't change...
            DataSet changes = delta.GetChanges(DataRowState.Unchanged);
            // this must work because the new rows of dsNew didn't get modified, since they didn't exist in dsOld, they were simply coppied (which doesn't count as changed?)

            return changes; // returns records originally only in dsNew
        }



OK, don't mind that C# code, basically all it can do is tell me if dsNew contained records that dsOld didn't contain. What about deletions? Well, it's screwy about deletions and so the returned dataset will contain additions and deletions...
Not very practical, because there's no way to tell deletions from additions based on the DataSet returned. But at least I made progress.

I wrote down in my comments, modifying my notation to be more clear:
// dsNew =  dsOld + Additions - Deletions
//
// Additions = A.Merge(B); A.GetChanges(DataRowState.Unchanged);     {I thought this worked better at the time}
//
// Deletions = (dsNew - dsOld) - Additions
//


And so once I got to that last line I was like, "WOOT! I can solve for each individual factor of the expression to the right of the equal sign, therefor I can solve the left!"

...But I was mad wrong.

//   (dsNew - dsOld) != dsOld.Merge(dsNew); dsOld.GetChanges(DataRowState.Unchanged);!!???
//
//   (dsNew - dsOld) = dsAdditions + dsDeletions ??? wtf ???



I was wrong because, how the heck do you actually solve (dsNew - dsOld)? My merge method fails to work as I realized while making notes. If you have deletions they get returned in the return DataSet along with the additions =(

So I wrote in...
// A - B = A.Minus(B)




Then I began writing an extension function that actually worked the way I wrote it above.

Here's my actual journal, which might be cool to look at on a blog:
Posted Image
At the bottom, instead of using a plain old equal sign, I decided to go with hash rocket since equals didn't feel right at the time.

Here's the code that eventuated:
    public static class DataSetExtensions
    {


        public static DataSet Minus(this DataSet mainDataSet, DataSet subtractorDataSet)
        {
            DataSet Result = mainDataSet.Copy();

            string formatterText = ""; // this is used to format the svl query on the dataset    eg "{0} = {1}"
            string[] myColumnNamesAndVals = new string[subtractorDataSet.Tables[0].Columns.Count*2];
            for (int i = 0; i < subtractorDataSet.Tables[0].Columns.Count*2; i = i + 2)
            {
                myColumnNamesAndVals[i] = subtractorDataSet.Tables[0].Columns[i/2].ColumnName;


                formatterText += " {" + i + "} = {" + (i + 1) + "} AND";
            }
            formatterText = formatterText.Substring(1, formatterText.Length - 4);
            

            foreach (DataRow subtractorRow in subtractorDataSet.Tables[0].Rows)
            {
                // select query:  index = 1, data = bla
                List<string> nullDateOrCurrencyColumnNames = new List<string>();
                bool doNulls = false;

                for (int i = 0; i < subtractorDataSet.Tables[0].Columns.Count; i++)
                {
                    string rowsType = subtractorDataSet.Tables[0].Columns[i].DataType.Name.ToLower();

                    bool skip = false;

                    string rowsDataAsString = "";

                    switch (rowsType)
                    {
                        case "autoincrement":
                                rowsDataAsString = Convert.ToString(subtractorRow[i]);
                                break;
                        case "datetime":
                                if (subtractorRow[i].GetType().Name == "DBNull")
                                {
                                    //throw new Exception("You have a null date field in one of your databases, and I don't know what to do to select a null date filed in a dataset.  Sry =(");
                                    // Alt:
                                    // 
                                    //rowsDataAsString = "\"\"";
                                    myColumnNamesAndVals[(i * 2)] = myColumnNamesAndVals[((i - 1) * 2)];
                                    myColumnNamesAndVals[(i * 2) + 1] = myColumnNamesAndVals[((i - 1) * 2) + 1];
                                    skip = true;

                                    doNulls = true;
                                    nullDateOrCurrencyColumnNames.Add(subtractorDataSet.Tables[0].Columns[i].ColumnName);
                                }
                                else
                                {
                                    DateTime theDate = (DateTime)subtractorRow[i];
                                    rowsDataAsString = "#" + theDate.ToShortDateString() + "#";
                                }
                                break;
                        case "int32":
                                rowsDataAsString = Convert.ToString(subtractorRow[i]);
                                break;
                        case "decimal":
                                rowsDataAsString = Convert.ToString(subtractorRow[i]);
                                if (rowsDataAsString == "")
                                {
                                    myColumnNamesAndVals[(i * 2)] = myColumnNamesAndVals[((i - 1) * 2)];
                                    myColumnNamesAndVals[(i * 2) + 1] = myColumnNamesAndVals[((i - 1) * 2) + 1];
                                    skip = true;

                                    doNulls = true;
                                    nullDateOrCurrencyColumnNames.Add(subtractorDataSet.Tables[0].Columns[i].ColumnName);
                                }
                                break;
                        case "double":
                                rowsDataAsString = Convert.ToString(subtractorRow[i]);
                                break;
                        case "string":
                                rowsDataAsString = "'" + (string)subtractorRow[i] + "'";
                                break;
                        //case "yesno":
                        //        rowsDataAsString = Convert.ToString(subtractorRow[i]);
                        //        break;
                        case "boolean":
                                if (subtractorRow[i].GetType().Name == "DBNull")
                                {
                                    break;
                                }
                                rowsDataAsString = Convert.ToString(subtractorRow[i]);
                                break;
                        default:
                                throw new Exception("unexpected datatype.  Add this type to the case switch in the Minus extention.");
                    }

                    if (!skip)
                        myColumnNamesAndVals[(i * 2) + 1] = rowsDataAsString;
                }

                

                string svl = string.Format(formatterText,   //  eg  "{0} = {1}, {2} = {3}"
                    myColumnNamesAndVals);                  //  eg  { "Index", "1", "Data", "bla" }

                DataRow[] HitRows = Result.Tables[0].Select(svl);

                foreach (DataRow rw in HitRows)
                {
                    // I could check for nulls here manually, but that might be a sloppy method
                    if (doNulls)
                        foreach (string nullDate in nullDateOrCurrencyColumnNames)
                        {
                            if (rw[nullDate].GetType().Name == "DBNull")
                            {
                                rw.Delete();
                                break;
                            }
                        }
                    else
                        rw.Delete();
                }
            }
            Result.AcceptChanges();

            return Result;
        }
    }



Because of the way databases are, this function might throw an exception. I'm going to ask in the forums for some feedback and maybe make some corrections.


Anyway, that was my fun day writing a DataSet extension. All thoughts are welcome below =)

Here's another notebook image of a page that I used during testing. As you can see, I wrote some debugging notes down at the bottom.

Posted Image

0 Comments On This Entry

 

Trackbacks for this entry [ Trackback URL ]

There are no Trackbacks for this entry

November 2018

S M T W T F S
    123
45678910
11121314151617
1819 20 21222324
252627282930 

Tags

    Recent Comments

    Search My Blog

    0 user(s) viewing

    0 Guests
    0 member(s)
    0 anonymous member(s)

    Categories