Jeffrey Stedfast's (fejj) series on sorting algorithms has spurred me to compare his results with equivalent implemetations in C#/Mono. I expected Mono's performance to be no match to the direct implementation in C. After all with .NET one has the additional layer of the VM and no direct access to memory. But the actual results I got were quite astonishing.
What did I do?
I did two comparisions: One with a standard array and one with an ArrayList.
System.Array
I simply took fejj's source code and changed "int a[100000]" to "int[] a = new int[100000]" - and by the way removed the unnessesary size_t n Parameter of the sorting functions. .NET's arrays are more intelligent than their C/C++ counterparts, which are mostly just a convenient shorthand for direct pointer arithmetic. System.Array objects are conscious of their own size and check if the programmer stays within them.
Of course the initialization of the array is different to C, so I got following skeleton for the tests:
using System;
public class Sort{
static void Main(){ int[] array = new int[100000];
Random random = new Random(); for (int i = 0; i < 100000; i++) { array[i] = random.Next(); }
DoTheSorting(array);}}
System.Collections.ArrayList
Additionally I wanted to know how well the ArrayList does scale.
using System;using System.Collections;
public class Sort{
static void Main(){ ArrayList list = new ArrayList(100000);
Random random = new Random(); for (int i = 0; i < 100000; i++) { list.Add(random.Next()); }
DoTheSorting(list)}}
The system
Everything was tested on my AMD Athlon XP 2200+ system running Gentoo Linux, gcc 3.3.3 and Mono 1.0. It was also done while the whole desktop was running and timed using the time command. The numbers are just approximations but should be good enough to give an impression of the picture. I also do not know how good the two pseudo-random number generators are and whether they influenced the performance.
Doing nothing
The first thing I wanted to know was how long it takes just for starting the programm and initializing the array. I'll just give you the figures for 100.000 and 2.000.000 entries (I the latter for an additional comparision of the faster algorithms):
- C: 0.018s / 0.100s
- Mono/Array: 0.075s / 0.23s
- Mono/ArrayList: 0.12s / 1.13s
Nothing unexpected here: ArrayList are the slowest (and scaled the worsed) and Mono has some additional startup overhead compared to plain C.
Bubble sort
For C code and explanations see fejj's original post on bubble sort. I only used the second (improved) version. The C implementation finished after 43 seconds. The C# versions needed minimal changes:
public static void BubbleSort(int[] a){ int tmp, max, j;
for (int i = a.Length - 1; i >= 0; i--) { for (max = 0, j = 1; j < i + 1; j++) { if (a[j] > a[max]) max = j; }
if (max < i) { tmp = a[max]; a[max] = a[i]; a[i] = tmp; } }}
and
public static void BubbleSort(ArrayList a){ int max, j; object tmp;
for (int i = a.Count - 1; i >= 0; i--) { for (max = 0, j = 1; j < i + 1; j++) { if ((int)a[j] > (int)a[max]) max = j; }
if (max < i) { tmp = a[max]; a[max] = a[i]; a[i] = tmp; } }}
Running the compiled code was an ambivalent experience. The array version finished sorting after 35 seconds. That is 8 seconds off the C version! I just could not believe my eyes. But even rerunning the programm did not change the outcome. How can it be that the C# version with all these additional layers of complexity and abstraction (the VM, the more intelligent arrays...) is so much faster than simple, straightforward C code. Is the JIT just better at optimizing the code than gcc is? I just do not believe it and have not explanation of these results.
The ArrayList on the other hand performed so bad that I got impatient an killed the test after about 10 minutes. For 10.000 items the sorting takes about 2 seconds. As I figure access to ArrayList items is pretty slow and gets slower with increased list size - and bubble sort has a lot of item accesses (~100mio for the 10 thousend items sample I ran the profiler on).
Tomorrow I will show you comparisions for insertion sort and how one can speed up moving large portions of arrays in .Net.
No comments
Comments feed for this article
Trackback link: http://caramdir.at/blog/2004/08/sorting-in-mono/trackback/