Comparing two arrays (or IEnumerables) in C#

Much to my surprise, I found that .NET 3.5 doesn’t seem to have a native method for comparing two arrays or collections of any type. The LINQ extension methods add a whole lot of functionality to IEnumerable<T> collections, but as far as I could tell, not comparison. The Equals method is still the base object.Equals method, which tests reference equality (are these two the same object?) rather than value equality (do they contain the same values?).
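
To make the distinction concrete, here is a minimal sketch (the class and method names are invented for illustration) showing that two arrays holding the same values are not equal according to object.Equals, while an element-by-element check treats them as equal.

```csharp
using System;

public static class EqualityDemo
{
    // Hypothetical helper: compares two int arrays element by element.
    public static bool SameValues(int[] first, int[] second)
    {
        if (first.Length != second.Length)
        {
            return false;
        }

        for (int i = 0; i < first.Length; i++)
        {
            if (first[i] != second[i])
            {
                return false;
            }
        }

        return true;
    }

    public static void Main()
    {
        int[] first = { 1, 2, 3 };
        int[] second = { 1, 2, 3 };

        // Arrays inherit object.Equals, so this asks "is it the same instance?".
        Console.WriteLine(first.Equals(second));      // False
        Console.WriteLine(first.Equals(first));       // True

        // Value equality has to look at the contents.
        Console.WriteLine(SameValues(first, second)); // True
    }
}
```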

So, finding myself needing an equality comparison between two arrays, I’ve written the following extension method:

    using System.Collections;
    using System.Collections.Generic;
    using System.Linq;

    // Extension methods must live in a static class; the class name here is arbitrary.
    public static class EnumerableExtensions
    {
        /// <summary>
        /// Checks whether a collection is the same as another collection
        /// </summary>
        /// <param name="value">The current instance object</param>
        /// <param name="compareList">The collection to compare with</param>
        /// <param name="comparer">The comparer object to use to compare each item in the collection.  If null uses EqualityComparer(T).Default</param>
        /// <returns>True if the two collections contain all the same items in the same order</returns>
        public static bool IsEqualTo<TSource>(this IEnumerable<TSource> value, IEnumerable<TSource> compareList, IEqualityComparer<TSource> comparer)
        {
            if (value == compareList)
            {
                // Same instance, or both null
                return true;
            }
            else if (value == null || compareList == null)
            {
                return false;
            }
            else
            {
                if (comparer == null)
                {
                    comparer = EqualityComparer<TSource>.Default;
                }

                IEnumerator<TSource> enumerator1 = null;
                IEnumerator<TSource> enumerator2 = null;

                try
                {
                    // Acquire the enumerators inside the try so both are disposed even if the second call throws
                    enumerator1 = value.GetEnumerator();
                    enumerator2 = compareList.GetEnumerator();

                    bool enum1HasValue = enumerator1.MoveNext();
                    bool enum2HasValue = enumerator2.MoveNext();

                    while (enum1HasValue && enum2HasValue)
                    {
                        if (!comparer.Equals(enumerator1.Current, enumerator2.Current))
                        {
                            return false;
                        }

                        enum1HasValue = enumerator1.MoveNext();
                        enum2HasValue = enumerator2.MoveNext();
                    }

                    // Equal only if both sequences ran out at the same time
                    return !(enum1HasValue || enum2HasValue);
                }
                finally
                {
                    if (enumerator1 != null) enumerator1.Dispose();
                    if (enumerator2 != null) enumerator2.Dispose();
                }
            }
        }

        public static bool IsEqualTo<TSource>(this IEnumerable<TSource> value, IEnumerable<TSource> compareList)
        {
            return IsEqualTo(value, compareList, null);
        }

        public static bool IsEqualTo(this IEnumerable value, IEnumerable compareList)
        {
            if (value == null || compareList == null)
            {
                return value == compareList;
            }

            // Cast<object>() keeps null items; OfType<object>() would silently skip them
            return IsEqualTo<object>(value.Cast<object>(), compareList.Cast<object>());
        }
    }

Updated: Jurgen (see the comments below) made some quality suggestions that I’ve used to improve the code here.  To see the state of the code when the comment was made, see here.

It gives you an extension method on any collection that implements IEnumerable<T>.  There is an optional parameter of type IEqualityComparer<TSource> which, if not null, is used to compare each item in the collections; otherwise the default comparer for TSource is used. It also works for untyped IEnumerable collections: the overloaded method passes the collections through to the IsEqualTo<TSource> method with object as the TSource.  This is really just there for backwards compatibility.
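
As a rough usage sketch, the snippet below compares an array with a List<string>, first with the default comparer and then with a case-insensitive one. To keep it self-contained it carries a condensed restatement of the method (written with using statements rather than try/finally); the class names SequenceComparison and ComparerUsageDemo are placeholders, not part of the original code.

```csharp
using System;
using System.Collections.Generic;

public static class SequenceComparison
{
    // Condensed restatement of IsEqualTo, for illustration only.
    public static bool IsEqualTo<T>(this IEnumerable<T> value, IEnumerable<T> compareList, IEqualityComparer<T> comparer)
    {
        if (ReferenceEquals(value, compareList)) return true;
        if (value == null || compareList == null) return false;
        if (comparer == null) comparer = EqualityComparer<T>.Default;

        using (IEnumerator<T> e1 = value.GetEnumerator())
        using (IEnumerator<T> e2 = compareList.GetEnumerator())
        {
            bool has1 = e1.MoveNext();
            bool has2 = e2.MoveNext();

            while (has1 && has2)
            {
                if (!comparer.Equals(e1.Current, e2.Current)) return false;
                has1 = e1.MoveNext();
                has2 = e2.MoveNext();
            }

            // Equal only if both sequences ended together.
            return !(has1 || has2);
        }
    }
}

public static class ComparerUsageDemo
{
    public static void Main()
    {
        string[] menu = { "Fish", "Chips" };
        List<string> order = new List<string> { "fish", "chips" };

        // Default comparer: string comparison is case-sensitive.
        Console.WriteLine(menu.IsEqualTo(order, null));                             // False

        // Supplying a comparer changes how each pair of items is compared.
        Console.WriteLine(menu.IsEqualTo(order, StringComparer.OrdinalIgnoreCase)); // True
    }
}
```

Because the method only relies on IEnumerable<T>, the two sides do not have to be the same concrete collection type.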

As far as speed goes, I ran a test with two collections of 10,000,000 items each (I stopped there because I started getting out-of-memory exceptions while populating larger test collections) and it took 0.89 seconds, so I think that’ll do for most scenarios.  If you want to use this code you can grab a copy of it here.

22 thoughts on “Comparing two arrays (or IEnumerables) in C#”

  1. Don’t know whether comparing two IEnumerable has a real use but ok.

    One point though : maybe you should avoid the use of Count() extension method on IEnumerable.
    As you surely know, Count() tries to convert the IEnumerable to an ICollection first, but if that conversion fails, it enumerates the IEnumerable.
    In my opinion that wouldn’t be very efficient : you already enumerate the IEnumerable so why would you do it twice for each of them ?

    So I would get rid of the lines 15-18 and fix line 24 in order to be able to receive IEnumerable with different “lengths”.

    Of course then, you may want to speed up the evaluation of the comparison if _both_ IEnumerable are ICollection.

    I have no 3.5 compiler here (and though am unsure of what follows) but isn’t it possible to call an extension method with a null “this” ? (I think I may call the method without the extension-method thingee, no ? Just calling Namespace.Class.IsEqualTo((IEnumerable)null, (IEnumerable)null); ) If this is the case then your code misses a check.
    Moreover :
    * what about (value == compareList) ? No need to enumerate then…
    * as an extension, you haven’t defined what happens when both IEnumerable are null : true or false ? :)

    Despite what I’ve said, this is a good try and please see no offense in this comment (remember it’s easier to comment than to produce) : none of my .NET developers colleagues here is able to write such code (they don’t know about generics and extension methods).

    By the way, when you test a method that receives IEnumerable, don’t forget to test with a real IEnumerable that has no Count property (ie not an array or ICollection). As an example you may just wrap up your collection with the (untested) method :
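    static IEnumerable<T> GetIEnumerableOverIEnumerable<T>(IEnumerable<T> enumerable)
    {
        // An iterator block cannot "return null", so a null source yields an empty sequence instead
        if (enumerable == null) yield break;
        foreach (T item in enumerable)
            yield return item;
    }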
    Hope you see my point.
    Doing so you will (in theory) see that your test with two collections of 10,000 items takes a bit longer.
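
A small sketch of why the wrapper suggested in the comment above matters for testing: an array can be cast to ICollection<T> (so Count is cheap), whereas the sequence produced by an iterator method cannot. The names LazyWrapperDemo and AsLazy are invented for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class LazyWrapperDemo
{
    // Iterator method: the compiler-generated sequence exposes no Count property.
    public static IEnumerable<T> AsLazy<T>(IEnumerable<T> source)
    {
        foreach (T item in source)
        {
            yield return item;
        }
    }

    public static void Main()
    {
        int[] numbers = { 1, 2, 3 };
        IEnumerable<int> lazy = AsLazy(numbers);

        Console.WriteLine(numbers is ICollection<int>); // True
        Console.WriteLine(lazy is ICollection<int>);    // False
    }
}
```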

  2. Jurgen, thanks for your comments. Honestly, I’d forgotten about the Count() method enumerating over IEnumerable if the cast failed, good point. I’d also forgotten about the ability to call the method on null. Noob mistake, but that’s why I like publicly releasing code.

    I’ve updated the code with the changes.

    If both IEnumerables are null I’m returning true, based on the idea that .NET returns true for (null == null). Obviously if either but not both are null, then return false. Also I’m no longer checking count, I’m just enumerating over the collections until one of them runs out, then checking whether either still has more items. If either does, then it returns false, as they can’t be equal. If not, return true.

    In reference to your first point, as to why IEnumerable, I’m guessing you’re asking why not just ICollection. I guess I did IEnumerable because I like making things as decoupled as possible. It can compare IEnumerables, so we may as well. I’ve run some tests now comparing 2 strings, which implement IEnumerable<char>, and it works fine.

  3. Glenn,

    [As generic types are "eaten" by the commenting system, I would replace <T> with (T).. I don't like that kind of VBish syntax, but anyway...]

    Sorry I wasn’t clear enough: I wasn’t wondering why you use IEnumerable(T) as your interface; that’s good and consistent with what the .NET Framework already exposes.

    What I was trying to say is to perform something like :

    /* Some tests keep being done before */
    /* … */
    /* Try cast as ICollection(T) */
    ICollection(T) valueCollection = value as ICollection(T);
    ICollection(T) compareListCollection = compareList as ICollection(T);
    /* If both are collections and have different count, then they are not equal */
    if((valueCollection!=null) && (compareListCollection!=null) && (valueCollection.Count != compareListCollection.Count))
        return false;
    /* … */
    /* Continue with rest of code */

    I don’t know whether this might be a good idea… I guess it depends on the “length” of the IEnumerables ! I think I’d personally put that code in the method, but that may be a bad choice (won’t be the first of my life !!)

    I’ve read your updated code and I see a new mistake. I’ll try to explain it before giving you a solution. You “MoveNext” the enumerators in your while condition. If one MoveNext fails, your while loop is over. Great, that’s expected !
    But just after that you’re doing another MoveNext on the same enumerator. That’s wrong; your code will say that strings "123" and "12" are equal (in this order) :
    "123".IsEqualTo("12"); // returns true; false expected.

    Hope you see my point.

    Here’s a fix based on your solution. I only performed a few tests.
    Of course, feel free to comment, use, etc. I’m your guest !

    public static bool IsEqualTo(T) (IEnumerable(T) value, IEnumerable(T) compareList, IEqualityComparer(T) comparer)
    {
        if (comparer == null)
        {
            comparer = EqualityComparer(T).Default;
        }

        if (value == null && compareList == null)
        {
            return true;
        }
        else if (value == null || compareList == null)
        {
            return false;
        }
        else
        {
            IEnumerator(T) enumerator1 = value.GetEnumerator();
            IEnumerator(T) enumerator2 = compareList.GetEnumerator();

            bool enum1HasValue = enumerator1.MoveNext();
            bool enum2HasValue = enumerator2.MoveNext();
            while (enum1HasValue && enum2HasValue)
            {
                if (!comparer.Equals(enumerator1.Current, enumerator2.Current))
                {
                    return false;
                }
                enum1HasValue = enumerator1.MoveNext();
                enum2HasValue = enumerator2.MoveNext();
            }

            return !(enum1HasValue || enum2HasValue);
        }
    }

    Regards.
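
The ICollection shortcut proposed in the comment above can be sketched in real generic syntax as follows. This is only an illustration of the idea under the assumption that it runs before the element-by-element loop; CountShortcut and HaveDifferentKnownCounts are made-up names.

```csharp
using System;
using System.Collections.Generic;

public static class CountShortcut
{
    // True only when BOTH sequences expose a Count and the counts differ.
    // If either side is a plain IEnumerable<T>, nothing can be concluded cheaply.
    public static bool HaveDifferentKnownCounts<T>(IEnumerable<T> value, IEnumerable<T> compareList)
    {
        ICollection<T> first = value as ICollection<T>;
        ICollection<T> second = compareList as ICollection<T>;

        return first != null && second != null && first.Count != second.Count;
    }

    // Lazy sequence with no Count, used to show the inconclusive case.
    private static IEnumerable<int> TwoLazyItems()
    {
        yield return 1;
        yield return 2;
    }

    public static void Main()
    {
        List<int> three = new List<int> { 1, 2, 3 };
        int[] two = { 1, 2 };

        Console.WriteLine(HaveDifferentKnownCounts(three, two));            // True
        Console.WriteLine(HaveDifferentKnownCounts(three, TwoLazyItems())); // False
    }
}
```

A False result here does not mean the sequences are equal; it only means the full comparison still has to run.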

  4. Jurgen, I’ve tested the function with different length strings & it returns false as expected. The reason is that it keeps looping until either enumerator has no more items, then it checks if both have no more items. If they both have no more, they’re the same length, so return true. If the other enumerator has more items then they’re different lengths, so it returns false.

    Both the following tests pass:

    [Test]
    public void TestDifferentLengthStrings()
    {
        string a1 = "fi";
        string a2 = "fish";

        Assert.IsFalse(a1.IsEqualTo(a2));
    }

    [Test]
    public void TestDifferentLengthStringsReverse()
    {
        string a1 = "fish";
        string a2 = "fi";

        Assert.IsFalse(a1.IsEqualTo(a2));
    }

  8. Glenn,

    I’m ok to say "fi" != "fish" (no matter what order).
    However, did you try "fish".IsEqualTo("fis"); ? [or to take my example "123".IsEqualTo("12");]
    I did with the latest version of the code of your post above (in .NET 2.0; I removed the “this” in the method parameters). It returns true.

    When the enumerators “lengths” differ by 1 element and the extension method is run on the “bigger” enumerator, the problem arises. The sequence is the following :
    A- looping on common elements. Ok.
    B- enumerator2 runs out of elements (MoveNext == false) while enumerator1 points to the last element of the “bigger” enumerator.
    C- the while condition is thus false. Moving to return statement.
    D- enumerator2.MoveNext still remains false.
    E- enumerator1.MoveNext becomes false because it’s now pointing after the last element. We’ve sort of ignored the last element of enumerator1 (we made twice MoveNext on it : at steps B then E).
    F- Return statement thus returns true.

    To solve this issue you :
    - either have to MoveNext() transactionally both enumerators in the while condition. I don’t know how to do this properly in an elegant way…
    - or have to store the result of the MoveNext() as I did in the previous code I posted. These values are the states of the last calls to MoveNext(). I can therefore use them in the return statement, without “MoveNexting” the enumerator…

    As an alternative to my code, you may also want to remove additional MoveNext() calls in the body. This gives the following code (we have here one single call MoveNext for each enumerator) :

    public static bool IsEqualTo<T>(IEnumerable<T> value, IEnumerable<T> compareList, IEqualityComparer<T> comparer)
    {
        /* begin of method goes here */
        else
        {
            IEnumerator<T> enumerator1 = value.GetEnumerator();
            IEnumerator<T> enumerator2 = compareList.GetEnumerator();

            bool enum1HasValue;
            bool enum2HasValue;
            while ((enum1HasValue=enumerator1.MoveNext()) & (enum2HasValue=enumerator2.MoveNext()))
            {
                if (!comparer.Equals(enumerator1.Current, enumerator2.Current))
                {
                    return false;
                }
            }
            return !(enum1HasValue || enum2HasValue);
        }
    }

    Note that this last snippet uses a non-short-circuiting & operator in the while condition. If you short-circuit with a &&, you’ll have a bad value in enum2HasValue whenever enumerator1 first runs out of elements.
    However I only recommend this implementation if it’s well documented and not subject to frequent code review, because the use of the non-short-circuiting & operator may lead to confusion or anti-optimisation (“hey, man, && is cooler, because it doesn’t evaluate more than needed !”).
    I personally wouldn’t use such a cryptic implementation : the performance should be equivalent (if not the same), and readability decreases.
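
The off-by-one problem walked through in the comment above can be reproduced with a short sketch. BuggyEquals is a reconstruction based on the description in this thread (MoveNext() in the while condition and again in the return statement); it is an assumption, not the exact earlier revision of the post. FixedEquals is the single-call variant using the non-short-circuiting & operator. The class and method names are invented for the example.

```csharp
using System;
using System.Collections.Generic;

public static class MoveNextPitfall
{
    // Reconstructed (assumed) flawed loop: the enumerators are advanced in the
    // while condition and then advanced AGAIN in the return statement.
    public static bool BuggyEquals<T>(IEnumerable<T> a, IEnumerable<T> b)
    {
        IEqualityComparer<T> comparer = EqualityComparer<T>.Default;
        using (IEnumerator<T> e1 = a.GetEnumerator())
        using (IEnumerator<T> e2 = b.GetEnumerator())
        {
            while (e1.MoveNext() && e2.MoveNext())
            {
                if (!comparer.Equals(e1.Current, e2.Current)) return false;
            }

            // Bug: when 'a' is exactly one item longer, its last item was consumed
            // by the loop condition, so both calls below report "no more items".
            return !(e1.MoveNext() || e2.MoveNext());
        }
    }

    // One MoveNext() call per enumerator per step; '&' evaluates both operands,
    // so both flags always hold the result of the latest call.
    public static bool FixedEquals<T>(IEnumerable<T> a, IEnumerable<T> b)
    {
        IEqualityComparer<T> comparer = EqualityComparer<T>.Default;
        using (IEnumerator<T> e1 = a.GetEnumerator())
        using (IEnumerator<T> e2 = b.GetEnumerator())
        {
            bool has1;
            bool has2;
            while ((has1 = e1.MoveNext()) & (has2 = e2.MoveNext()))
            {
                if (!comparer.Equals(e1.Current, e2.Current)) return false;
            }

            return !(has1 || has2);
        }
    }

    public static void Main()
    {
        Console.WriteLine(BuggyEquals("123", "12"));  // True (wrong)
        Console.WriteLine(BuggyEquals("fish", "fi")); // False (differs by two, so the bug is hidden)
        Console.WriteLine(FixedEquals("123", "12"));  // False
        Console.WriteLine(FixedEquals("123", "123")); // True
    }
}
```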

  9. Ah, now I see it. The joys of working with enumerators. I like the non-short-circuiting solution, but I tend to agree with your opinion on readability over nominal performance improvement. Besides, I speed tested both ways on a massive collection & I couldn’t detect any material difference.

    Thanks for your help with this, and for your patience in getting your point through! I added in some exception handling to dispose of resources in this revision too.

  11. If you compare the IL of both versions you’ll see the non-short-circuiting solution is a bit smaller (it uses fewer instructions : it has fewer branches because the ‘&&’ shortcut feature creates branches), but it shouldn’t be really faster. What takes time on a big IEnumerable<T> is the callvirt to MoveNext(), and both versions call MoveNext the same number of times.

    Two more remarks and I’m done/gone (and I stop bugging you !) :
    A- You can remove the catch block. A try/finally is more than enough in that case.
    B- Don’t you want to return true also when value == compareList ? (If you do you can replace line 15 with if(value == compareList) )

    Now you’ve provided your readers with an “IsEqualTo” method, I propose an exercise for their spare time : implement “StartsWith” (easy) and “EndsWith” (ouch ! seems harder) extension methods on IEnumerable<> ! :)

    Regards
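
For readers who want a starting point for the exercise suggested above, here is one possible sketch. It assumes non-null arguments and the default equality comparer; the class name SequencePrefixSuffix and the buffering approach used for EndsWith are choices made for this example, not something taken from the thread.

```csharp
using System;
using System.Collections.Generic;

public static class SequencePrefixSuffix
{
    // True if 'source' begins with every item of 'prefix', in order.
    public static bool StartsWith<T>(this IEnumerable<T> source, IEnumerable<T> prefix)
    {
        IEqualityComparer<T> comparer = EqualityComparer<T>.Default;
        using (IEnumerator<T> s = source.GetEnumerator())
        using (IEnumerator<T> p = prefix.GetEnumerator())
        {
            while (p.MoveNext())
            {
                // Fails if the source runs out first or the items differ.
                if (!s.MoveNext() || !comparer.Equals(s.Current, p.Current)) return false;
            }
            return true;
        }
    }

    // True if 'source' ends with every item of 'suffix', in order.
    // Keeps a sliding window holding the last suffix-length items of the source.
    public static bool EndsWith<T>(this IEnumerable<T> source, IEnumerable<T> suffix)
    {
        List<T> tail = new List<T>(suffix);
        if (tail.Count == 0) return true;

        Queue<T> window = new Queue<T>(tail.Count);
        foreach (T item in source)
        {
            if (window.Count == tail.Count) window.Dequeue();
            window.Enqueue(item);
        }

        if (window.Count < tail.Count) return false;

        IEqualityComparer<T> comparer = EqualityComparer<T>.Default;
        int index = 0;
        foreach (T item in window)
        {
            if (!comparer.Equals(item, tail[index])) return false;
            index++;
        }
        return true;
    }

    public static void Main()
    {
        List<int> numbers = new List<int> { 1, 2, 3, 4 };

        Console.WriteLine(StartsWith(numbers, new List<int> { 1, 2 })); // True
        Console.WriteLine(EndsWith(numbers, new List<int> { 3, 4 }));   // True
        Console.WriteLine(EndsWith(numbers, new List<int> { 2, 3 }));   // False
    }
}
```

EndsWith is the harder of the two because a forward-only enumerator cannot be read from the end, so the sketch reads the whole source once while remembering only the last few items.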

  15. “Much to my surprise I found that .NET 3.5 doesn’t seem to have a native method for comparing two arrays or collections of any type.”

    Well, I’ve just discovered it does : SequenceEqual<TSource> (see http://msdn2.microsoft.com/en-us/library/bb342073.aspx) enumerates the elements and compares them.
    The main difference with IsEqualTo<TSource> is that it throws exceptions whenever one of the IEnumerable<T> is null.

    However, the names of the methods are consistent :
    - IsEqualTo compares the containers and their content if any.
    - SequenceEqual compares the content (the sequence). Therefore the containers are supposed to exist (i.e. not null)

    The implementation of method SequenceEqual (seen through Reflector) :
    - does not use local vars. As a result, there is one more call to MoveNext than in IsEqualTo, but the code looks straightforward in my opinion.
    - reminds me you could remove try/finally and replace that with using statements. It then behaves the same but it’s much more readable. (Haven’t seen that !.. I must still be disturbed by the fact IEnumerator<T> implements IDisposable…)
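
A brief sketch of the difference described in the comment above: Enumerable.SequenceEqual compares contents but throws ArgumentNullException when an argument is null, whereas a thin wrapper can add the null handling that IsEqualTo provides. NullSafeSequenceEqual is a hypothetical helper written for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SequenceEqualDemo
{
    // Hypothetical wrapper: null == null is true, null vs non-null is false,
    // otherwise defer to the built-in LINQ method.
    public static bool NullSafeSequenceEqual<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        if (ReferenceEquals(first, second)) return true;
        if (first == null || second == null) return false;
        return first.SequenceEqual(second);
    }

    public static void Main()
    {
        IEnumerable<int> a = new[] { 1, 2, 3 };
        IEnumerable<int> b = new List<int> { 1, 2, 3 };

        Console.WriteLine(a.SequenceEqual(b));                     // True
        Console.WriteLine(NullSafeSequenceEqual<int>(null, null)); // True
        Console.WriteLine(NullSafeSequenceEqual(a, null));         // False

        try
        {
            a.SequenceEqual(null);
        }
        catch (ArgumentNullException)
        {
            Console.WriteLine("SequenceEqual rejects null arguments");
        }
    }
}
```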

  17. You can't use the UnitTesting stuff because that is part of Visual Studio, not part of the .NET Framework. And the DLL files for Visual Studio are not redistributable.

  18. This is exactly what I needed, thank you! I battled linq for a few hours, then found this – just what I was looking for. Generics are already “old school” I guess, but this is clear what’s happening, and it works.
