http://blog.bodurov.com/Performance-SortedList-SortedDictionary-Dictionary-Hashtable
Please note that the advantage of Hashtable over generic Dictionary for insert and search operations demonstrated here is actually because the tests are based on NON generic IDictionary interface, so each insert or search action is accompanied with check for
the key type. For more information see the Sean's comment below. (Thanks Sean!) Without that Dictionary seem to perform better than the Hashtable. The rest of the test conclusions will not be affected. So to summarize, generic Dictionary is the absolute winner for insert and search operations if used with generic dictionary interface or directly. This is a sequence of tests comparing the performance results for four different implementations of IDictionary and in particular generic Dictionary, generic SortedDictionary, the old non-generic Hashtable and generic SortedList. I performed several tests, comparing the following parameters: memory used in bytes, time for the insertion in ticks, time for the item search in ticks, and the time for looping with foreach in ticks. The test was performed 8000 times for all the four implementation, I performed the tests in five stages, to observe the relationship between the number of entries and the performance. In the first stage the collections had 50 items, in the second 500, in the third 5,000 in the fourth 50,000 items. In this particular test, lower numbers of memory usage or time taken for the execution means better performance. So if we want to present visually a performance chart for each of the parameters we have to deduce a performance coefficient from the raw data. Performance Coefficient for value x = min(value 1, value 2, … value n) / value x
This way, because the best performing value is the lowest one it will be transformed as value 1 of the performance coefficient and any other value will be a fraction of that value. This is the chart of the memory usage: The results stay consistent with the increase of the number of items in collections. Best memory footprint we see in the SortedList, followed by Hashtable, SortedDictionary and the Dictionary has highest This is the chart for the time taken for insert operations: When the number of records is small the differences between all four implementations are not significant but with the increase of items in the collection the performance of the SortedList drops dramatically, SortedDictionary is This is the chart for the time taken for search operations: The absolute leader is Hashtable, but the test does not consider the type of item being stored. That could be a possibility for a future test. The next best performer is the generic Dictionary followed by the other two implementations. This is the chart for the time taken for for-each collection loop operations: Here the leader is SortedList then Dictionary consistantly better than Hashtable and the words performer isSortedDictionary Here you can see the task manager during the test it shows us a picture of the memory usage. Area A is during the insertion phase when more and more memory has been allocating. Area B is from the end of the insertion until the garbage collection of the object. This is the code I used for this test. The variable NumberInsertedKeys was changed to 50, 500, 5000, 50000, or 10000 for the different stages. The computer used for the test has the following characteristics: These is the raw data. Memory Used is in bytes and the insert, search and looping time in ticks: Share this post: digg |
feedback temporarily disabled
Moritz Kroll
Posted on 1/3/2008 11:45:36 AM
Thanks for these interesting measurements! |
Vinit
Posted on 4/25/2008 1:17:46 AM
Thanks for the above test! |
Flavius
Posted on 6/11/2008 5:18:08 AM
Thanks! I just started to think about what to use and then found your artikle. It's good! |
Flavius
Posted on 6/11/2008 5:55:46 AM
You forgot to add measure units (bytes) for memory allocation chart. |
Vladimir Bodurov
Posted on 6/11/2008 8:15:29 AM
Thanks Flavius, I've added that now. |
Gary Chojnowski
Posted on 8/12/2008 8:37:12 PM
I found your results useful. However, it may be important to point out that the hashtable does not support generics. |
Tanveer Ansari
Posted on 8/25/2008 6:23:02 AM
Thanks for an article with immediate practical use. It helped in finalizing on which data structure to use for my app. In my test for retrieval time using the hashtable with unboxing vs. generic dictionary the hashtable still performed three times as fast as the generic dictionary. --Time in ticks-- |
JR
Posted on 2/15/2009 1:27:20 PM
This article was awesome! Thanks for sharing your findings with us. It was very helpful to the code I am writing. One thing that would have been nice to see would have been insertion time for the sorted collections in the case that the strings were already sorted, as that happens sometimes. Thanks! |
Jeffery
Posted on 4/17/2009 12:28:42 AM
Hi, I just did the same test but using ASP.NET 3.5 and IIS 7.0, the results seems to indicate otherwise: Dictionary is better than SortedDictionary! Take a look at http://jefferytay.wordpress.com/2009/04/16/performance-of-generics-sorteddictionary-and-dictionary/ |
Vladimir Bodurov
Posted on 4/17/2009 12:47:25 AM
Hi Jeffery, First of all thank you for the excellent work you have done! I like the attitude to never trust any source in the Internet and always check your self. Undoubtedly that’s the way to go. However there is one problem that I see with your test. You always process SortedDictionary before Dictionary. If you read the description of my tests I have tried those in different order and what is more important each was performed separately. When you |
ET
Posted on 4/23/2009 1:00:30 PM
I think you're missing an important variable/dimention in your tests - number of items in the collection. |
Vladimir Bodurov
Posted on 4/23/2009 1:14:18 PM
Yes you are right in the comming days I will make a revision of the test as it seems that as pointed by Jeffery tests done with bigger sample show Dictionary performing better than SortedDictionary |
Jeffery
Posted on 4/23/2009 8:16:50 PM
Hi Vladimir, Actually based on your results, i actually got my team to convert all Dictionary objects to SortedDictionary when they are doing their development. Because i remembered that dictionary does not maintain any sort order, and SortedDictionary, which uses a However during when i was doing some coding, i realized that for some wierd reason, my new codes seem to be performing much slower, that's when i decided to take our code for a test run |
Vladimir Bodurov
Posted on 4/23/2009 9:43:43 PM
Sorry man, my data is just not big enough and when I was doing it something influenced SortedDictionary to perform better than Dictionary and because of the small sample that wasn’t normalized. I'll have soon results with very large sample, but my preliminary |
Jeffery
Posted on 4/29/2009 2:32:08 PM
Hi Valdimir, Thanks for your extremely detailed test which tallies with mine. Now it just makes me wonder. Since Dictionary is better than SortedDictionary, why is it still there? Any ideas? |
Vladimir Bodurov
Posted on 4/29/2009 3:20:00 PM
Because if you try foreach iteration with SortedDictionary the keys will be sorted while Dictionary will not give you sorted keys. |
Jeffery
Posted on 4/30/2009 7:01:55 PM
Well if i really did a foreach, that is a possibility, but it does not make sense for SortedDictionary to perform a sort on the key lists each and everytime we requests for it, i do not believe Microsoft developers will write such lousy codes. Interesting thing is that i cannot seem to be able to find the source for Dictionary using reflector, any idea where it is residing in? Will try to perform a test next week to test the differences in doing foreach and direct indexing for Dictionary and SortedDictionary, there must be some benefits of SortedDictionary |
Vladimir Bodurov
Posted on 4/30/2009 10:37:13 PM
SortedDictionary does not peform sort each time you request a key it simply stores the data as a binary tree, that's why it can iterate them in order. If you had to show the keys in order how would you do it with Dictionary? If the keys are not just 1,2,3,4 but something like 2,24,255,999 and you don't know what are the gaps? With SortedDictionary you just use foreach. The code of Dictionary is in mscorlib |
Nugpot
Posted on 5/5/2009 4:10:36 PM
Your pictures looks nice, and it looks like you tried your best, but your results are definitely not accurate. I did similar tests today, and went on the web to see what other people found. Just to help everybody I am going to add my results. 5/5/2009 2:22:39 PM - Array 5/5/2009 2:22:40 PM - ArrayList 5/5/2009 2:22:41 PM - List<> 5/5/2009 2:22:42 PM - HashTable 5/5/2009 2:22:44 PM - Dictionary<> 5/5/2009 2:22:46 PM - SortedList<> 5/5/2009 3:57:05 PM - SortedDictionary<> |
Vladimir Bodurov
Posted on 5/5/2009 4:34:43 PM
Thanks for posting your results Nugpot, but I really don't understand what do you mean when you say "but your results are definitely not accurate", because your results are just like my results, in particular: It is also not clear what you mean by "unload data" as you didn't post reference to where your code is. Your insert and search results are just like mine I can't speak for the rest because I haven't seen it but if you compare things they must be similar |
Shital Shah
Posted on 10/13/2009 1:15:41 PM
What was the hardware used for these tests? |
Sean
Posted on 1/5/2010 1:52:21 PM
FYI your average calculation is wrong. Averaging 2,3,4 would come out as ((((2 + 3) / 2) + 4) / 2) or 3.25, instead of 3. Also you should time how long it takes to execute all the lookups at once and take the average off of that instead of timing each individual one. I ran some tests designed purely to test performance of lookups, no calling of GC(as .NET should pick when for you in the real world), although i did semi-track inserts. Test was inserting X keys, then doing x*2 lookups (x where the key existed x where it didn't) I let each of those tests of X keys run 10-15 times, and I came up with |
Vladimir Bodurov
Posted on 1/6/2010 4:32:33 AM
Thanks for sharing your results Sean. You are right about the avg formula. The way I was doing it would shift the result toward the latest run but at the end the result will still average out. Still I corrected my code to make it more clean and I also extended http://blog.bodurov.com/images/IDictTest01_old.gif |
Jeffery
Posted on 1/7/2010 5:56:08 PM
Hi Valdimir, appreciate if you can change the link to my post tohttp://jefferytay.wordpress.com/2009/04/16/performance-of-generics-sorteddictionary-and-dictionary/ |
Sean
Posted on 1/15/2010 12:58:05 PM
I made a few slight changes to your code and came up with different results With string keys dictionary performed searches roughly 10% faster with all combinations of keys I tried. 1: I created another test method called TestGeneric which accepts IDictionary<String, String>. If you look in reflector, calling Dictionary as an IDictionary reference results in validation code being called at runtime to ensure the objects passed in are 2: I changed the key to C_key{i} instead of {random_letter}_key{i} (to make #3 possible) 3: I changed your search from a single lookup to NumberInsertedKeys lookups, and just checking for each value that had inserted as the key (so not using a foreach loop to eliminate that possible conflict) 4: I commented out the randomness to determine which Also when I commented out the garbage collect code to test memory usage Dictionary tests seemed to increase more (I personally prefer to test without garbage collection since in production my code will rely on .NET to select when to run it for me anyways). I switched it back to using random letters and looking up every entry and got roughly the same results, I had to make more changes to do that though by caching the random entries and reusing them. |
Eric
Posted on 3/3/2010 8:10:05 AM
Thank you this was really very helpful. Many of the constructors for these collections support an initial capacity. While I understand that it is not always possible to predict this capacity, I do quite often make a range estimate. If I recall, the default |
Raj Aththanayake
Posted on 4/10/2010 5:13:28 PM
Excellent blog. I was looking for these type of information. I was comparing the SortedDictionary and SortedList. Now I know SortedList is much better on performances. |
Joshua DeLong
Posted on 5/26/2010 8:07:56 AM
Has anyone tried this type of thing with millions? I'm looking at putting roughly 30 million integer pairs into a hashtable for lookup purposes only. Any ideas which storage method would be best? |
phoog
Posted on 11/25/2010 3:54:20 AM
Another factor: string.GetHashCode() iterates through the entire string. So presumably, the longer the keys, the worse performance you'll find from Dictionary. Another point about SortedList: If the items are added in order, its insertion performance is |
Scanpst
Posted on 4/13/2011 6:02:20 AM
Sean, Joshua, |