Find unique strings for a string array
Page 1 of 1
9 Replies - 29249 Views - Last Post: 24 December 2008 - 09:58 AM
#1
Find unique strings for a string array
Posted 22 December 2008 - 10:16 PM
I have a string array that may contain duplicate strings. Is there a built-in or smart way to remove the duplicates and generate a string array that contains only the unique ones?
For example, if the input array is {"abc", "bcd", "abc"}, the unique output array is {"abc", "bcd"}.
Thanks in advance,
George
Replies To: Find unique strings for a string array
#2
Re: Find unique strings for a string array
Posted 22 December 2008 - 10:30 PM
public static ArrayList UniqueValues(string value)
{
    ArrayList values = new ArrayList();
    //make sure "value" doesn't already exist
    if (!(values.Contains(value)))
    {
        //since we've made it this far we can add it
        values.Add(value);
    }
    return values;
}
For Generics
public static List<string> UniqueValues(string value)
{
    List<string> values = new List<string>();
    //make sure "value" doesn't already exist
    if (!(values.Contains(value)))
    {
        //since we've made it this far we can add it
        values.Add(value);
    }
    return values;
}
Now granted, in your situation your ArrayList or generic collection would have to be a form-level global so you can add to it as you go; these examples create a new object each time.
Now, if you're in a position where you have to use a string array (such as a homework assignment), here is an example of passing a string array to a method and ensuring that the value being passed doesn't already exist in the new string array:
public static string[] UniqueValues(string[] value)
{
    //create a string array the length of the values
    //being passed to it
    string[] values = new string[value.Length];
    //loop through the initial array being passed
    //to the method
    for (int i = 0; i < value.Length; i++)
    {
        //make sure that "newValue" doesn't already
        //exist in our new string array
        if (!(values[i] == value[i]))
        {
            //add it to the new array
            values[i] = value[i];
        }
    }
    return values;
}
Hope that helps
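The array-only method above compares values[i] with value[i] at the same index, so every element is copied, duplicates included. A minimal sketch of an array-only version that checks each candidate against everything kept so far is shown below; the class name ArrayOnlyUnique and the variable names are made up for illustration, and this is one possible fix rather than the poster's own code.

```csharp
using System;

public static class ArrayOnlyUnique
{
    // Returns a new array holding the first occurrence of each string in input.
    // Uses only arrays: an outer loop over the input and an inner loop over
    // the values kept so far.
    public static string[] UniqueValues(string[] input)
    {
        string[] kept = new string[input.Length]; // worst case: no duplicates at all
        int count = 0;                            // number of slots used in kept

        for (int i = 0; i < input.Length; i++)
        {
            bool seen = false;
            for (int j = 0; j < count; j++)
            {
                if (kept[j] == input[i]) // == on strings compares contents
                {
                    seen = true;
                    break;
                }
            }
            if (!seen)
            {
                kept[count] = input[i];
                count++;
            }
        }

        // Trim the unused tail so the result has no empty (null) slots.
        string[] result = new string[count];
        Array.Copy(kept, result, count);
        return result;
    }
}
```

Calling ArrayOnlyUnique.UniqueValues on {"abc", "bcd", "abc"} gives a two-element array. The nested loops make this quadratic in the worst case, which is usually fine for small arrays.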
#3
Re: Find unique strings for a string array
Posted 22 December 2008 - 10:33 PM
See ArrayList Methods
#4
Re: Find unique strings for a string array
Posted 22 December 2008 - 10:46 PM
// Items with duplicates
String[] values = { "item1", "item2", "item1", "item3", "item2" };
// Create a hashset of strings
System.Collections.Generic.HashSet<String> hash = new System.Collections.Generic.HashSet<String>();
// Loop through values and add to hashset
foreach (String val in values)
{
    hash.Add(val);
}
// Now loop through the hashset to show you no more duplicates
foreach (String hval in hash)
{
    MessageBox.Show(hval);
}
As you will notice, the message box then shows you item1, item2 and item3. No duplicates. This is because a HashSet stores each distinct value only once: a duplicate hashes to the same value and compares equal to the item already stored, so Add leaves the set unchanged (and returns false) instead of storing it again.
Enjoy!
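Since the original question asks for a string array back, here is a rough sketch of wrapping the HashSet idea in a method. It assumes .NET 3.5 or later (where HashSet<T> is available); the class and method names are invented for this example. CopyTo is used so that no LINQ call is needed.

```csharp
using System;
using System.Collections.Generic;

public static class HashSetUnique
{
    // Returns the distinct strings of input as an array.
    // The order of the result is whatever order the set enumerates in.
    public static string[] UniqueValues(string[] input)
    {
        HashSet<string> set = new HashSet<string>();
        foreach (string s in input)
        {
            // Add returns false and leaves the set unchanged
            // when an equal string is already present.
            set.Add(s);
        }
        string[] result = new string[set.Count];
        set.CopyTo(result);
        return result;
    }

    // Counts how many items Add rejected, that is, how many duplicates there were.
    public static int CountRejected(string[] input)
    {
        HashSet<string> set = new HashSet<string>();
        int rejected = 0;
        foreach (string s in input)
        {
            if (!set.Add(s))
            {
                rejected++;
            }
        }
        return rejected;
    }
}
```

For the five-item array in the post above, UniqueValues returns three strings and CountRejected returns 2.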
"At DIC we be hashset tossing code ninjas... and no we do not do hash. Period."
#5
Re: Find unique strings for a string array
Posted 23 December 2008 - 06:44 AM
PsychoCoder, on 23 Dec, 2008 - 12:30 AM, said:
I would like to correct this. A string array is more efficient than an ArrayList because, with the string array, the objects remain the same type (string). With an ArrayList, when you add a string to the list, it is converted to an OBJECT datatype, and when you loop through the list with "foreach string", those OBJECT datatypes are converted back to strings. This boxing/unboxing issue with the ArrayList causes more of a performance hit than using a string array would.
Now, an ArrayList (and generic List) is easier to use and has more ways of manipulating the data than an array, but the ArrayList is certainly not more efficient than an array.
#6
Re: Find unique strings for a string array
Posted 23 December 2008 - 07:27 AM
I prefer this solution:
var values = new[] { "abc", "acd", "abc" };
var distinctQuery = (from value in values select value).Distinct();
//or even better
var distinctValues = values.Distinct();
You need to import the System.Linq namespace (using System.Linq;).
This post has been edited by beatles1692: 23 December 2008 - 07:29 AM
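Distinct() returns a lazy IEnumerable<string>, so to get a string array back it has to be materialised. A small sketch, assuming .NET 3.5 or later for LINQ; the class and method names are hypothetical, and the case-insensitive variant is an extra that the thread does not ask for.

```csharp
using System;
using System.Linq;

public static class LinqUnique
{
    // Distinct strings as an array, compared by exact contents.
    public static string[] UniqueValues(string[] input)
    {
        return input.Distinct().ToArray();
    }

    // Same idea, but "ABC" and "abc" count as the same string.
    // StringComparer.OrdinalIgnoreCase is passed as the equality comparer.
    public static string[] UniqueValuesIgnoreCase(string[] input)
    {
        return input.Distinct(StringComparer.OrdinalIgnoreCase).ToArray();
    }
}
```

It is safer not to rely on the order of the result unless you have checked it for your framework version.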
#7
Re: Find unique strings for a string array
Posted 23 December 2008 - 07:42 AM
eclipsed4utoo, on 23 Dec, 2008 - 07:44 AM, said:
You are assuming considerable overhead in boxing; I'm not sure I'd agree.
In any case, I'd prefer this method, for the original poster:
string[] GetUnique(string[] list)
{
    List<string> uList = new List<string>();
    foreach (string s in list)
    {
        if (!uList.Contains(s)) { uList.Add(s); }
    }
    return uList.ToArray();
}
Here, we're using a generic List of string, presumably with fewer "boxing" issues than ArrayList. We're also taking advantage of the built-in ToArray method, which will feed us back an array based on the generic type automatically.
While I agree with the Hashtable method or Dictionary method, my feeling is that the price of a unique key lookup is being paid somewhere, so I'm not sure if it's particularly more effective. In some languages that practically run on associative arrays, it probably would be best, though.
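On the cost question: List<string>.Contains scans the list from the start, so the loop above does work proportional to the square of the input size in the worst case, whereas a Dictionary key lookup takes roughly constant time on average. A sketch of the Dictionary variant follows; it uses only types that exist in .NET 2.0 and 3.0, the class name DictionaryUnique is made up, and it assumes the input contains no null entries (a Dictionary key cannot be null).

```csharp
using System.Collections.Generic;

public static class DictionaryUnique
{
    // Returns the first occurrence of each string, in input order.
    // Assumption: list contains no null elements.
    public static string[] GetUnique(string[] list)
    {
        // The bool value is a placeholder; only the keys matter here.
        Dictionary<string, bool> seen = new Dictionary<string, bool>();
        List<string> uList = new List<string>();
        foreach (string s in list)
        {
            if (!seen.ContainsKey(s))
            {
                seen.Add(s, true); // remember the string for fast lookup
                uList.Add(s);      // keep output in first-seen order
            }
        }
        return uList.ToArray();
    }
}
```

For a handful of strings the difference is unlikely to matter; it starts to show when the array holds many thousands of entries.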
#8
Re: Find unique strings for a string array
Posted 24 December 2008 - 01:23 AM
1.
PsychoCoder, on 22 Dec, 2008 - 09:30 PM, said:
Could you describe what you mean by "efficient", please? And why do you think ArrayList is more efficient than a string array, and generics more efficient than ArrayList?
2.
PsychoCoder, on 22 Dec, 2008 - 09:30 PM, said:
What do you mean by "form level global"? Do you mean a general solution which could be used in the future?
By the new object created, do you mean new List<string>() and new ArrayList()?
3. I think the following code is wrong. You need two loops. :-)
PsychoCoder, on 22 Dec, 2008 - 09:30 PM, said:
public static string[] UniqueValues(string[] value)
{
    //create a string array the length of the values
    //being passed to it
    string[] values = new string[value.Length];
    //loop through the initial array being passed
    //to the method
    for (int i = 0; i < value.Length; i++)
    {
        //make sure that "newValue" doesn't already
        //exist in our new string array
        if (!(values[i] == value[i]))
        {
            //add it to the new array
            values[i] = value[i];
        }
    }
    return values;
}
Hope that helps
regards,
George
I like your method, thanks n8wxs!
regards,
George
Thanks Martyr2,
Your method works!
regards,
George
Thanks eclipsed4utoo,
I disagree with you; I think a string does not need to be boxed/unboxed. It is not a value type but a reference type. Any comments?
regards,
George
Sorry, beatles1692!
I need to use .NET version 3.0. LINQ is from .NET 3.5.
regards,
George
Hi baavgai,
Sorry, I disagree with you. Boxing/unboxing relates to value types, but string is a reference type, so no boxing/unboxing is needed. Please feel free to correct me if I am wrong.
regards,
George
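The value type versus reference type point can be checked directly. The sketch below (class and method names invented for illustration) stores a string in an ArrayList and confirms the list holds the very same object, then shows that converting an int to object produces a fresh boxed object each time.

```csharp
using System;
using System.Collections;

public static class BoxingDemo
{
    // Returns true: adding a string to an ArrayList stores the same reference.
    // No boxing happens because string is a reference type.
    public static bool StringKeepsIdentity()
    {
        string s = "abc";
        ArrayList list = new ArrayList();
        list.Add(s);
        return object.ReferenceEquals(list[0], s);
    }

    // Returns false: int is a value type, so each conversion to object
    // boxes the value into a separate heap object.
    public static bool IntKeepsIdentity()
    {
        int n = 42;
        object first = n;
        object second = n;
        return object.ReferenceEquals(first, second);
    }
}
```

So an ArrayList of strings pays for a cast when reading items back, but not for boxing.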
#9
Re: Find unique strings for a string array
Posted 24 December 2008 - 09:48 AM
George2, on 24 Dec, 2008 - 03:23 AM, said:
PsychoCoder, on 22 Dec, 2008 - 09:30 PM, said:
What do you mean by "form level global"? Do you mean a general solution which could be used in the future?
By the new object created, do you mean new List<string>() and new ArrayList()?
This is what he means...
// ....using statements....
namespace WindowsFormsApplication4
{
    public partial class Form1 : Form
    {
        //This is a form level global variable.
        //It is accessible to all methods and events of the form.
        List<string> listOfStrings = new List<string>();

        public Form1()
        {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            listOfStrings.Add("some new string");
        }
    }
}
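To tie this back to the uniqueness check from earlier in the thread: once the list lives at class level, the "add only if not already present" test can accumulate values across calls. A minimal sketch without the Windows Forms parts; the class UniqueStringStore and its members are hypothetical names, not something from the posts above.

```csharp
using System.Collections.Generic;

public class UniqueStringStore
{
    // Class-level field: it survives between calls, unlike a list
    // created inside the method.
    private readonly List<string> listOfStrings = new List<string>();

    // Adds value only if it is not already stored.
    // Returns true when the value was added, false when it was a duplicate.
    public bool AddUnique(string value)
    {
        if (listOfStrings.Contains(value))
        {
            return false;
        }
        listOfStrings.Add(value);
        return true;
    }

    // Snapshot of the stored values as a string array.
    public string[] ToArray()
    {
        return listOfStrings.ToArray();
    }
}
```

In a form, the same field and method could sit directly in Form1, with event handlers calling AddUnique as values arrive.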
#10
Re: Find unique strings for a string array
Posted 24 December 2008 - 09:58 AM
George2, on 24 Dec, 2008 - 02:23 AM, said:
You are correct, of course. My nomenclature was a play on that of the poster I was referencing, and not accurate.
What I should have said is the type obfuscation that comes from referencing an instance through its parent type. In searching for equality this should be irrelevant, since Equals is a base method of all objects and can be safely called without casting. The only overhead would come from casting from object to string, and that should be minimal.
Hope this makes sense. It's not really relevant to the code I offered, but you asked.
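As a rough illustration of that last point: ArrayList.Contains determines equality by calling Object.Equals, which String overrides to compare contents, so the search needs no cast. A cast is only needed when an item is read back as a string. The names below are invented for the example.

```csharp
using System;
using System.Collections;

public static class ArrayListLookup
{
    // Searches by value. No cast is needed because Contains
    // relies on Equals, which string overrides.
    public static bool ContainsValue(ArrayList list, string value)
    {
        return list.Contains(value);
    }

    // Reading an element back as a string needs an explicit cast.
    // The cast is a runtime type check; it does not copy the string.
    public static string FirstAsString(ArrayList list)
    {
        return (string)list[0];
    }
}
```

The search works even when the argument is a different string object with the same characters as the stored one.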