Arrays and Strings: Is Unique

Source
CtCI, Chapter 1, Question 1.1

Topic
Arrays and Strings

Language
C#

Question
Is Unique: Implement an algorithm to determine if an array has all unique strings. What if you cannot use additional data structures?

Process
My first though was to iterate through the array, adding each item to a new array, which I would search through each time I added an element.

//Solution 1
public bool IsUnique(string[] arr){
  string[] newArray = new string[arr.Length];
  for (int i = 0; i < arr.Length; i++){
    if (newArray.Contains(arr[i])){
      return false;
    }
    newArray[i] = arr[i];
  }
  return true;
}

Then I tried to solve the problem without using the additional “newArray”. My first though was if there was some LINQ tool that could solve this for me. I didn’t know off the top of my head, so I looked it up. While there is a .Distinct within LINQ, it does not return a boolean; it returns a new list of distinct items from the original collection. I could use that and compare the counts, but I would technically have created an additional data structure, so this does not solve the problem. It did seem better than my original solution though, so wrote it out:

//Solution 2
public bool IsUnique(string[] arr){
  string[] newArray = arr.Distinct().ToArray();
  if (newArray.Length < arr.Length){
    return false;
  }
  return true;
}

My next idea was to compare each item in the array to each other item by iterating through the array twice. This would solve the problem, but I wasn’t sure if it was possible. Turns out that is was pretty simple (though I accidentally the for to set j=i+1 at first, which meant that the function always returned false. But, it did solve the entire problem!

//Solution 3
public bool IsUnique(string[] arr){
  for (int i = 0; i < arr.Length; i++){
    for (int j = i+1; j< arr.Length; j++){
      if (arr[i] == arr[j]) {
        return false;
      }
    }
  }
  return true;
}

After finding my own solution, I checked out the hints in Cracking the Coding Interview. The first one was: Try a Hash Table. A solution wasn’t immediately clear to me, but I figured that if I assigned each word in the array to a key in the hash table, a duplicate key exception would lead to a false return. However, I didn’t want to use the exception as part of my logic flow, so I thought I would try using the word as the key and then incrementing the count as the value. While writing this code, I realized that I could simply check if the hashtable contains the key, and if it does, I could return false; I wouldn’t have to increment the value. The value wouldn’t really actually matter. That said, this is pretty much identical to my first solution, just using a hashtable instead of an array.

//Solution 4
public bool IsUnique(string[] arr){
  Hashtable ht = new Hashtable();
  foreach (string s in arr){
    if (ht.ContainsKey(s)){
      return false;
    }
    ht[s] = 0;
  }
  return true;
}

After reviewing the book’s solution, I realized that my last solution efficiency could be improved by using an if statement, and testing if the value was not null. This would prevent me from having to loop (via Hashtable.ContainsKey).

//Solution 5
public bool IsUnique(string[] arr){
  Hashtable ht = new Hashtable();
  foreach (string s in arr){
    if (ht[s] != null){
      return false;
    }
    ht[s] = true;
  }
  return true;
}

Analysis

  • Solution 1: Loops through each element once, and then loops through the entire array (Array.Contains) for each iteration of the first loop. O(n²)
  • Solution 2: Loops through the array once when it queries it for distinct values. O(n)
  • Solution 3: Loops through the array once, and then loops through each remaining element for each element. O(n²)
  • Solution 4: Loops through each element, and then loops through each key in the hashtable for each element (Hashtable.ContainsKey). O(n²)
  • Solution 5: Loops through the array until it “fails”. O(n)

Notes

  • I wrote all of my Array.Length with the wrong case originally; I wrote Array.length. Writing C# outside of Visual Studio is new to me, and this is an example of the kind of thing that I’m going to need to memorize for when I can’t rely on IntelliSense. That said, this one should be pretty easy to remember. Array.Length is a public getter, and is therefore capitalized.
  • This was the first time using a HashTable (I had only used Dictionaries). I’ll want to study up on the available helper function for this data structure.
  • I did not think of using the hashtable until the prompted by the book. I’ll try to keep this method in mind for other problems.