Sets in C#

Author: David Jimenez

What is a set?

A set is an unordered collection of items containing no duplicates. This means that { 3, 1, 2 } is a set, but { 1, 1, 2, 2 } is not. Because order does not matter, two sets are equal when they contain the same elements; for example, { 1, 2, 3 } is equal to { 3, 1, 2 }.

There are many interesting questions that can be asked about how two sets relate to one another. The most common are:

  1. What elements do they have in common?
  2. What happens when we combine the two sets into one?
  3. What elements are in one set, but not the other?
  4. What elements are in one set, or the other, but not both?
  5. Are the two sets equal?

HashSet<T>

HashSet<T> is the generic implementation of this type of collection in C#. This class offers a number of methods, but the most commonly used are Add and Contains. Both operate on a single set: Add inserts an element into the set, and Contains checks whether an element is already in the set.
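As a minimal sketch of these two methods (the variable names are illustrative, and the snippet assumes C# 9 or later for top-level statements), note that Add also reports whether the element was actually inserted:

```csharp
using System;
using System.Collections.Generic;

var seen = new HashSet<int>();

// Add returns true when the element was not in the set yet...
bool addedFirst = seen.Add(42);
// ...and false for a duplicate, leaving the set unchanged.
bool addedAgain = seen.Add(42);

Console.WriteLine(addedFirst);        // True
Console.WriteLine(addedAgain);        // False
Console.WriteLine(seen.Count);        // 1
Console.WriteLine(seen.Contains(42)); // True
Console.WriteLine(seen.Contains(7));  // False
```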

HashSet<T> also offers methods that modify a set based on another set. These methods help answer the questions listed above, but keep in mind that they change the set the method is called on rather than returning a new one.
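If the original set needs to stay intact, one possible approach is to copy it first and modify the copy. The sketch below uses IntersectWith, which is covered in the next section; the variable names are illustrative and the snippet assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Collections.Generic;

var a = new HashSet<int> { 1, 2, 3 };
var b = new HashSet<int> { 3, 4, 5 };

// The copy constructor creates an independent set with the same elements.
var common = new HashSet<int>(a);

// Only the copy is modified; a keeps all of its elements.
common.IntersectWith(b);

Console.WriteLine(common.Count);       // 1
Console.WriteLine(common.Contains(3)); // True
Console.WriteLine(a.Count);            // 3
```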

1. What elements do two sets share?

A.IntersectWith(B) modifies A so that it only contains elements that exist in both A and B. If no elements are shared, then A becomes an empty set. B remains unchanged by the operation.

var A = new HashSet<int> { 1, 2, 3 };
var B = new HashSet<int> { 3, 4, 5 };
 
A.IntersectWith(B);
// A = { 3 }
// B = { 3, 4, 5 }

2. If we combine the two sets to form a new one, what does the new set look like?

A.UnionWith(B) modifies A so that it contains every element that is in A, in B, or in both:

var A = new HashSet<int> { 1, 2, 3 };
var B = new HashSet<int> { 3, 4, 5 };
 
A.UnionWith(B);
// A = { 1, 2, 3, 4, 5 }
// B = { 3, 4, 5 }

3. What elements are in A but not B?

A.ExceptWith(B) removes from A every element that is also in B, so A keeps only the elements that are in A but not in B:

var A = new HashSet<int> { 1, 2, 3 };
var B = new HashSet<int> { 3, 4, 5 };
 
A.ExceptWith(B);
// A = { 1, 2 }
// B = { 3, 4, 5 }

4. What elements are in A or B, but not both?

If we want A to have only the elements that are in A or in B, but not both, then we would use the method SymmetricExceptWith:

var A = new HashSet<int> { 1, 2, 3 };
var B = new HashSet<int> { 3, 4, 5 };
 
A.SymmetricExceptWith(B);
// A = { 1, 2, 4, 5 }
// B = { 3, 4, 5 }

Something to note about the methods we have looked at so far: the arguments they take are of type IEnumerable<T>, so the other collection does not have to be a set. This means, for example, that you can update a set to contain only the elements it shares with a list: starting from A = { 1, 2, 3 }, after calling A.IntersectWith(new List<int> { 3, 3, 3 }), A = { 3 }.
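A short sketch of this behaviour, using a list and an array as arguments (variable names are illustrative; assumes C# 9 or later for top-level statements):

```csharp
using System;
using System.Collections.Generic;

var a = new HashSet<int> { 1, 2, 3 };

// Any IEnumerable<int> is accepted; duplicates in the argument do not matter.
a.IntersectWith(new List<int> { 3, 3, 3 });
Console.WriteLine(a.Count);       // 1
Console.WriteLine(a.Contains(3)); // True

// An array works as well. 8 appears twice but is stored only once.
a.UnionWith(new[] { 7, 8, 8 });
Console.WriteLine(a.Count);       // 3
```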

5. When are A and B equal?

To determine if a set contains the same elements as another collection, you can use SetEquals. As with the previous methods, the collection we are comparing our set to is of type IEnumerable<T>, and both order and duplicates are ignored. This means that if we had the set C = { 3, 2, 1 }, A.SetEquals(C) would evaluate to true. Equally, if we had the list D = [ 1, 2, 2, 3, 3, 3 ], A.SetEquals(D) would also return true.

var A = new HashSet<int> { 1, 2, 3 };
var C = new HashSet<int> { 3, 2, 1 };
 
A.SetEquals(C); // Returns true
 
var D = new List<int> { 1, 2, 2, 3, 3, 3 };
 
A.SetEquals(D); // Returns true as well
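It may help to contrast SetEquals with the default equality checks. HashSet<T> does not override Equals or the == operator, so both compare references rather than contents. A small sketch (assumes C# 9 or later for top-level statements):

```csharp
using System;
using System.Collections.Generic;

var a = new HashSet<int> { 1, 2, 3 };
var c = new HashSet<int> { 3, 2, 1 };

// SetEquals compares contents, ignoring order.
bool sameElements = a.SetEquals(c);

// Equals and == compare references: a and c are two distinct objects.
bool sameObjectViaEquals = a.Equals(c);
bool sameObjectViaOperator = a == c;

Console.WriteLine(sameElements);          // True
Console.WriteLine(sameObjectViaEquals);   // False
Console.WriteLine(sameObjectViaOperator); // False
```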

What about LINQ?

Set operations can also be performed using LINQ extension methods. The LINQ counterparts of the methods we have reviewed so far are Intersect, Union, and Except. LINQ does not have a direct equivalent for SymmetricExceptWith or SetEquals.
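Although there is no single LINQ method for the symmetric difference, it can be composed from Except and Union. The following is one possible sketch, not the only way to do it; the variable names are illustrative and the snippet assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var a = new List<int> { 1, 2, 3 };
var b = new List<int> { 3, 4, 5 };

// Symmetric difference: (a except b) union (b except a).
var symmetricDifference = a.Except(b).Union(b.Except(a)).ToList();

Console.WriteLine(string.Join(", ", symmetricDifference)); // 1, 2, 4, 5

// The source lists are untouched.
Console.WriteLine(a.Count); // 3
Console.WriteLine(b.Count); // 3
```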

Three main differences between HashSet and LINQ:

  • LINQ does not modify objects. Instead, it generates a new collection. A.Intersect(B) does not modify A; it returns a new IEnumerable<T> sequence.
  • LINQ extensions can be called on any collection implementing the IEnumerable<T> interface, not just sets.
  • HashSet<T> contains additional operations that are not available in LINQ. We have already covered SymmetricExceptWith and SetEquals, but there are others, such as IsSubsetOf, IsSupersetOf, and Overlaps.

For example, Intersect can be called directly on two lists, and it leaves both of them unchanged:

var A = new List<int> { 1, 2, 2, 3, 3, 3 };
var B = new List<int> { 3, 4, 5, 5 };
 
var C = A.Intersect(B);
// C yields { 3 }
// A = [ 1, 2, 2, 3, 3, 3 ]
// B = [ 3, 4, 5, 5 ]
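To give a flavour of the additional HashSet<T> operations, here is a brief sketch of a few relationship checks. Unlike the methods covered earlier, these return a bool and do not modify either set. The variable names are illustrative, and the snippet assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Collections.Generic;

var small = new HashSet<int> { 1, 2 };
var large = new HashSet<int> { 1, 2, 3 };
var other = new HashSet<int> { 3, 4 };

// Every element of small is also in large.
Console.WriteLine(small.IsSubsetOf(large));       // True
Console.WriteLine(large.IsSupersetOf(small));     // True

// A set is a subset of itself, but not a proper subset.
Console.WriteLine(large.IsProperSubsetOf(large)); // False

// Overlaps checks whether at least one element is shared.
Console.WriteLine(large.Overlaps(other));         // True
Console.WriteLine(small.Overlaps(other));         // False
```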