Page 1 of 1

Collections - In Depth

#1 andrewsw  Icon User is online

  • Fire giant boob nipple gun!
  • member icon

Reputation: 3320
  • View blog
  • Posts: 11,229
  • Joined: 12-December 12

Posted 01 September 2013 - 02:30 PM

*
POPULAR

Learning C# Series
Learning C# Series - Basic Collections

There were two main resources that helped me construct the following notes/tutorial:

  • C# 4.0 in a Nutshell J & B Alhabri, O'Reilly
  • Professional C# 4 and .NET 4 Nagel, Evjen et al., Wrox

This is not a hold-my-hand tutorial so I recommend that you go through the Basic Collections tutorial linked above first.

Collection Classes

Collection classes mostly appear in the System.Collections namespace, although some are based on Dictionary structures within the System.Collections.Specialized namespace. Comparing arrays to collections:

  • have to use an integer index to access elements
  • are of a fixed size *
  • store value types, although the values may point to reference types (an array of objects)
  • accept, hold and return elements as object types (but see Generics later in these notes)


* This is not entirely the case since Microsoft introduced the Array.Resize method (for a one-dimensional, 0-indexed, array) from .NET 3.5 (but supported earlier). Resize has a performance cost so using a Collection is preferable.

Posted Image

Collections that support the Add method can be initialized using syntax similar to that for arrays. Collections such as Hashtable's can be initialized by specifying each key/value pair as an anonymous type.

ArrayList numbers = new ArrayList() { 10, 9, 8, 7, 6 };
Hashtable people = new Hashtable() { {"John",34},{"Mary",22},{"Pete",40} };


Stacks and Queues

A Stack is a last in, first out (LIFO) collection of items. The most recent thing to go on the stack ("pushed") is also the first to come off ("popped"). Stacks (and Queue's) also support "peeking" , which returns the object on top of the stack (or queue) without removing it. You can also use foreach to loop through the items on the stack or queue.

Random r = new Random();
System.Collections.Stack myStack = new System.Collections.Stack();
for (int i = 0; i < 6; i++)
{
    myStack.Push(r.Next(1,49));
    Console.WriteLine(myStack.Peek().ToString());
}
for (int i = 0; i < 3; i++)
{
    if (myStack.Count > 0) Console.WriteLine("Item popped is " + myStack.Pop().ToString());
}


A Queue is just like a Stack except that objects collected are first in, first out (FIFO). Instead of Push and Pop they have Enqueue and Dequeue methods. Again you need to check the queue's Count property when dequeue-ing.
Items on stacks and queues are maintained as objects. You can implicitly cast from string to object (when pushing or enqueue-ing), but not from object to string when popping (dequeue-ing).

The ArrayList

The ArrayList works like an array except that it is dynamically resized, depending on how many elements are stored in it.

The ArrayList is essentially deprecated in favour of List<T>

Spoiler

As with arrays, if you use foreach, you cannot use the iteration variable to modify the contents of the ArrayList. You can add any object to an ArrayList so it is not typesafe:

Spoiler

Dictionaries

Dictionaries are structures that implement the IDictionary interface, providing for a collection of keys and values – the key is used to acces the value. There are several dictionary classes: HashTable and SortedList (in the System.Collections namespace) and HybridDictionary, ListDictionary and StringDictionary (in the System. Collections.Specialized namespace).

The HashTable Class

A Hashtable cannot contain duplicate keys. Attempting to Add a duplicate key will generate an exception, although using square brackets' notation prevents this exception. Hashtable implement indexers to enable this alternative array [] notation.

A Hashtable is a sparse data structure that operates best with plenty of memory. The size of a Hashtable in memory can grow quite quickly as you insert more elements.

Spoiler

Using foreach to iterate through a Hashtable returns a DictionaryEntry:

foreach (DictionaryEntry elem in myH) string name = (string) elem.Key;

The SortedList Class

One of the more useful classes that implements IDictionary is SortedList, which is a cross between a dictionary key/value list and an array accessed by index. The keys array is always sorted.

Spoiler

Collection Interfaces

An interface provides a binding contract with any class that uses the members specified in the interface. In other words, when a class implements an interface it tells any object that uses the class that it will support the methods, properties, events and indexers of the named interface.

Posted Image

Generics

Many of the classes in the System.Collections namespace can store objects of any kind. This presents problems:

  • A collection may be intended to store objects of a particular kind, but any object could be added
  • Items returned from a collection usually need to be cast to an appropriate type (to prevent an exception)
  • Value types need to be boxed and unboxed when being added and removed from a collection.


Generic classes and methods accept type parameters, which specify the type of objects that they operate on. The .NET Framework class library includes generic versions of many of the collection classes and interfaces in the System.Collections.Generic namespace, such as List<T> and Dictionary<TKey,TValue>. You should always use these in preference to the non-generic versions.

Most of the time you can just use a suitable generic collection to store your objects. Alternatively, you can either derive your own collection class from a generic collection, or implement a generic interface. These and other options are discussed in the section "Creating a Collection Class".

[Within your generic classes you can use default(T) to obtain the appropriate default value for the type parameter. That is, null for an object, 0 for an integer, etc..]

The List<T> Class

Use generic collections (List and Dictionary) in preference to arrays, HashTables or ArrayLists. The List class gives everything that an ArrayList gives but is also typesafe. It lives in the System.Collections.Generic namespace.

List<Account> accountList = new List<Account>();


The characters < and > tell the list what kind of things it can store, making it typesafe.

List<int> scores = new List<int>();       // a list of integers
KitchenSink k = new KitchenSink();
accountList.Add(k);                 // would cause an error.


A List has members (Add, Count, Remove, Contains) that work the same as ArrayList members. We don't have to cast an item in the list as the type has already been established. A collection uses Count whereas an array uses the Length property. However, they are indexed in the same way using accountList[j].

The Dictionary<TKey,TValue> Class

Just as the List class extends the ArrayList class using generics, so the Dictionary class extends the Hashtable class to make it typesafe.

Dictionary<string, Account> accDictionary = new Dictionary<string, Account>();
accDictionary.Add("Rob",robsAccount);
accDictionary["Rob"].PayInFunds(50);


Attempting to reference a key element that doesn't exists results in a KeyNotFoundException. Use accDictionary.ContainsKey("keyValue") to prevent this.

foreach (KeyValuePair<string,Account> accKeyVal in accDictionary)
{
    Console.WriteLine("Acc Key {0} Name {1}",accKeyVal.Key,accKeyVal.Value.Name);
}


Constraints

Using a constraint you can, for example, limit the type parameters of a generic class to those that implement a particular set of interfaces, and therefore provide the methods defined by those interfaces.

public class PrintableCollection<T> where T : IPrintable


Posted Image

The IComparable<T> Interface

Implementing either the non-generic IComparable or generic IComparable<T> interface in a class requires that we create a CompareTo method. We can use this method ourselves, but it is also enables us to make use of other methods, such as a Sort method, which uses types that implement this interface. Note that sorting can also be achieved with the IComparer interface, which enables a method parameter for the Sort method.

Spoiler

If, instead, we implemented the non-generic IComparable interface, then we would need to perform an explicit cast to a Circle:

  public int CompareTo(object obj)
    {
        Circle circ = (Circle)obj;
        if (this.Area() == circ.Area()) ..


Generic Methods

A generic method declares type parameters within the signature of the method.

static void Swap<T> (ref T first, ref T second)
{
    T temp = first;
    first = second;
    second = temp;
}

int a1 = 1, a2 = 5;
Swap<int> (ref a1, ref a2);


Covariance and Contravariance

Type variance is a new feature in C# 4.0 that allows generic interfaces and generic delegates to mark their type parameters as covariant or contravariant (using out or in modifiers). This enables code such as the following to work:

IEnumerable<string> x = ... ;
IEnumerable<object> y = x;


The following example uses generic delegates to demonstrate the principles of variance.

Spoiler

The following example uses generic interfaces. Note that, although our Stack class implements both the IPoppable and IPushable interfaces, they remain distinct. That is, we are still unable to Push a Bear and Pop a Camel!

Spoiler

  • Covariance If the methods in a generic interface can return strings they can also return objects. (All strings are objects.)
  • Contravariance If the methods in a generic interface can take object parameters they can take string parameters. (If you can perform an operation using objects, you can perform the same operation using a string because all strings are objects.)


Creating a Collection Class

You can create your own collection classes by inheriting from one of the many .NET Framework collection classes and adding code to implement custom functionality. The System.Collections namespace contains classes Stack, Queue and Dictionary which are specialized for specific roles. Others such as CollectionBase and DictionaryBase, which are abstract (MustInherit) classes, can be customized for our purpose.

The CollectionBase class already has implementations for the Clear method and the Count property, and it maintains a protected property called List, which it uses for the internal storage and organization. Other methods, such as Add and Remove, as well as the Item property, require implementation.

Spoiler

This is a viable approach to maintaining a type-safe collection, but it does not take advantage of generics. In particular, the protected List object that it uses is still a non-generic List. There are a number of alternatives for maintaining a collection of our objects:

  • Use a non-generic collection such as ArrayList or Dictionary. However, generic collections should always be preferred over their non-generic counterparts.
  • Create a class which derives from the CollectionBase or DictionaryBase abstract classes. This is the approach taken above and, before generics, was a very common approach.
  • Use a generic collection such as List<YourType> or Dictionary<TKey,YourType>. This is the recommended, type-safe approach. It is straight-forward to implement and will provide most of the functionality you might need.
  • Create a class which derives from a generic collection. This is as straight-forward as:

public class Persons : List<Person> { /* no code required */ } 
Persons pers = new Persons();


This has some advantages over the previous suggestion:
  • It is almost as simple to implement
  • 'Persons p' is preferable to List<Person> p when instantiating the class
  • We can easily add our own methods to extend List<Person>:

public class Persons : List<Person>
{
    public void Add(string first, string last)
    {
        base.Add(new Person(first, last));
        Console.WriteLine("Added");
    }
}


This approach is discussed further in the section, Customizable Collections.
  • Create our own collection class with, for example, custom Add and Remove methods. However, we would still need to store items in, for example, an ArrayList or List<OurClass>. This creates unnecessary work for ourselves and has little to recommend it.
  • Create our own custom generic collection class, usually extending an existing generic collection class:

public class CustomCollection<T> : List<T>        // or
public class CustomCollection<T>
{
    private List<T> myList;
}


This could be considered if we have a number of classes that we want to store in custom collections. It would require careful planning to ensure that each collection behaves as expected, for the different types that it might store.

It is even possible to implement a collection's interface:

public class Persons : IList<Person>

This would require us to implement every method and property of the interface, and possibly to create additional (helper) classes. Not recommended.

Customizable Collections

Previously we could use CollectionBase, DictionaryBase or ReadOnlyCollectionBase as base classes to create our own collection classes. Now, the .NET Framework provides Collection<T>, KeyedCollection<TKey,TItem> and ReadOnlyCollection<T> generic versions for this purpose (in the System.Collections.ObjectModel namespace).

An example in the previous section derived from List<Person>. This is perfectly acceptable, but it does not give direct access to the "inner list", or control over the process of adding or removing items. Collection<T> is a customizable wrapper for List<T>.

public class Collection<T> : IList<T>, ICollection<T>, IEnumerable<T>, IList, ICollection, IEnumerable
{
    // includes these additional members:
    protected IList<T> Items { get; }

    protected virtual void ClearItems();
    protected virtual void InsertItem (int index, T item);
    protected virtual void RemoveItem (int index);
    protected virtual void SetItem (int index, T item);
}


These virtual methods enable us to change or enhance the list's normal behaviour, and the protected Items property allows direct access to the "inner list".

The following code is adapted from 'C# in a Nutshell' by the Albahari brothers. The virtual methods (ClearItems, etc.) do not need to be included at all, but then there would be no advantage over a simple List<Person>. By overriding these methods we add extra functionality (such as business rules). In this example, we override these methods to maintain a reference to the Society that each Person belongs to (which is a property of Person).

Spoiler

The advantage of this approach is that PersonCollection can encapsulate business rules that should apply when adding or removing items, and the Society class can provide public methods without worrying about these rules.

KeyedCollection<TKey,TItem> subclasses Collection<TItem>. It adds the ability to access items by key, similar to a dictionary, but removes the ability to proxy your own inner list; that is, to specify an existing list in construction.

Enumerating Collections

The foreach construct is an elegant way to iterate through the items in an array or collection. To use this construct the array or collection must be enumerable. That is, it must implement the System.Collections.IEnumerable interface. Arrays are instances of the System.Array class which, together with all other collection classes, implements the IEnumerable interface.

If we create a collection class that derives from an existing collection class, then our class will be enumerable. If we decide to implement the IEnumerable (or IEnumerable<T>) interface, then we can either borrow the collection class's enumerator – using 'return base.GetEnumerator();' – or create our own enumerator by implementing the IEnumerator interface in another (perhaps nested) class. The following example demonstrates creating an enumerator but rarely would you need to do this.



Remember that the control variable within a foreach structure cannot be used to modify item(s) in the collection. This variable provides a read-only copy of each item. If we wish to modify an item while iterating, we should use a for loop, terminated at .Count-1 or .Length-1 (depending on the collection type).




Spoiler

  • MoveNext – The first time this is called it should initialize the data to be enumerated and move to the first piece of data. It doesn't return any data, but returns true or false to indicate whether there is anything left to retrieve.
  • Current – All this needs to do is return the current piece of data. It should check that the data has been initialized (meaning that MoveNext has been called first) and raise an error if not, such as InvalidOperationException.
  • Reset – Should return the pointer to before the first item of data. Implementing it is optional and instead you could just throw a NotSupportedException exception.


Enumerating Using an Iterator

The enumerator can be supplied as an iterator. An iterator is a block of code that yields an ordered sequence of values.

  IEnumerator<Person> IEnumerable<Person>.GetEnumerator()
    {
        foreach (Person p in this)
            yield return p;
    }


C# internally builds an enumerator from the sequence. The keyword yield identifies which value should be returned for each iteration. For our example we had to use this (or base.ToArray()) as we don't have direct access to the base class's List.

To provide the data in a different sequence we can create different properties that implement the IEnumerable interface, and use an iterator for returning data:
  public new IEnumerable<Person> Reverse
    {
        get {
            for (int i = this.Count - 1; i >= 0; i--) yield return this[i];
        }
    }
    
    foreach (Person p in pers.Reverse) Console.WriteLine("Last name is :{0}", p.LastName);


See all the C# Learning Series tutorials here!

This post has been edited by andrewsw: 10 October 2013 - 10:49 AM


Is This A Good Question/Topic? 6
  • +

Replies To: Collections - In Depth

#2 Curtis Rutland  Icon User is online

  • (╯□)╯︵ (~ .o.)~
  • member icon


Reputation: 4451
  • View blog
  • Posts: 7,752
  • Joined: 08-June 10

Posted 02 September 2013 - 04:27 PM

This is an excellent tutorial!
Was This Post Helpful? 0
  • +
  • -

Page 1 of 1