Programming

You are currently browsing the archive for the Programming category.

I just made my first few experiments with PovRay. While the syntax is surprisingly accessible, PovRay's use of a left-handed coordinate system was rather strange to me. In mathematics it is common to use a right-handed system and at least in Austria it is also used in other disciplines. So one of the first things I wanted to do, was telling PovRay to use my standard coordinate system and with the help of the excellent documentation came up with the following:

camera {
location <2,2,2>
up <0,1,0>
right <-1.33,0,0>
look_at <0,0,0>
sky <0,0,1>
}

The factors in up and right have of course to be set according to ratio of the resulting image. The "-" in right makes the coordinate system right handed and the "<0,0,1>" sky-vector makes the z-coordinate pointing upwards.

So now I can use a "decent" (meaning the one I am accustomed to) coordinate system! :)

Vielleicht hilft es ja auch jemand anderem bei den Visualisierungs-Aufgaben.

I have witten a small C# (Mono) program to generate Cayley tables for symmetric and alternating groups. The source code is available (put into public domain). It also contains a small class to represent permutations.

The resulting tables for S3, A3, S4 and A4 are also online.

Wanted to try generics support in Mono, and write a simple matrix class. Sounds pretty easy, but soon I noticed, it wasn't. The are no constraints for operators, and so how does the compiler now if something like

public class Calculator {	public T Add(T val1, T val2) {		return val1 + val2;	}}

is valid? So it throws an error. Looking through the MSDN Library and such I found no obvious solution to this problem, but Google eventually found me this very good article at The Code Project. The proposed solution is not the simplest one, but has (when the compiler is goof enough) no performance drawback, is quite transparent to the user of the generic class, and the downloadable code supplied with the article does much of the work anyway.

Oh, and by the way, I found to other articles at MSDN, which are wort a read: Introduction to C# generics and Create Elegant Code with Anonymous Methods, Iterators, and Partial Classes.

To finish this trilogy of benchmarks, I'll give you the results for two faster algorithms (shell sort and merge sort) and tell you why C is not slower after all.

But first I have to add a small, but nevertheless important, reason for ArrayList's slowness to my discussion in the last post: ArrayList use a System.Object[] array for internal storage of its contents. This simply means that a lot of extra boxing/unboxing is going on and thus ArrayLists impose lots of overhead. I'll come back to this a later.

I did not even think about this, but when invoked without any flags, gcc uses -O0 by default, which translates to "just do the fasted compilation possible, don't think about code performace and do not alter the code in any way." This is very practical for debugging, but of course not for production use. A friend of mine pointed this out, when we were discussing the results of my tests. Obviously mcs/mono do optimize the execution. With -O3 I got totally different results.

Here is a table of all average execution times I got on my machine. Please note that they are just approximations.

  C (gcc) C (gcc -O3) Mono C# (Array) Mono C# (ArrayList)
Initialization 0.018/0.100 0.018 0.075/0.23 0.12/1.13
Array[List].Sort() - - 1.5/32 0.89/20.5
BubbleSort 43 22 35 too long
InsertionSort 32 7.7 19 too long
InsertionSort (optimized) 25 13.5 18.5 7m 24s
BinaryInsertionSort 23 7.0 17 too long
BinaryInsertionSort (optimized) 6.2 6.5 7.1 7.8
BinaryInsertionSort (no recursion) 6.5 6.4 6.0 7.8
ShellSort (value set 1) 7.5 4.2 5.8 2m 20s
ShellSort (value set 2) 0.09/4.6 0.056/3.3 0.15/5.5 1.6/61
MergeSort 0.054/1.091 0.045/0.917 0.132/1.66 0.78/21
MergeSort (no recursion) 3.3 3.3 - -

The "optimized" algorithms are the ones with memmove, Array.Copy(), etc. Where two values are given, the first one is for 100.000 items and the second one for 2.000.000 items. A discussion of gcc's perfomance with optimizations turned on (especially in the case of the optimized insertion sort) is out of scope for this small series.

So there are only two questions left: Why is Array.Sort() so slow, and why is ArrayList.Sort() faster. The answer to the former one: Array.Sort() uses the IComparable interface. This means boxing and function calls, i.e. loss in efficency. To get a feeling for the impact of (un)boxing, I tested morge sort (version 2) with an object[]-array of 100.000 (boxed) integers. It was considerably (about 3 times) slower than the version with an (not boxed) integer array.

The answer to the latter question is probably, that the values are already boxed in an ArrayList, so there is only the overhead of function calls.

To sum this up: With .NET one should really look out for (unintentional) boxing of value types. This might get (at least a bit) easier with the inclusion of generics in .NET 2.0. Until then, one has to watch out for this pitfall, especially when writing performance ciritical code.

After I realized (as described in the first part of this series), that Mono was faster than I had imagined, I continued my benchmarks with the two versions of insertion sort and wondered if .NET could somehow counter the efficency of memmove.

Insertion Sort

Again, explanations and C source code can be found in fejj's original post. A description of the testing procedure is in my first article. The unoptimzied C code finished after about 32 seconds. Converting the code to C# was again very straightforward and I won't bore you with listings.

Mono arrays finished even better than last time. Only 19 seconds! The ArrayList again was so slow, that I aborted the test.

So, how could I optimize the C# code to "counter" C's memmove? "Luckily", the optimized C code takes 25s to sort the 100.000-item-array, which is still slower than the unoptimzed C# code! At first I thought that there was no way to do something like memmove in C# as it is a very low-level operation and the .NET Framework is all about abstraction. I didn't even bother looking through the docs. But I did take I closer look at the memberlist of the ArrayList class. Two interesting functions caught my attention: GetRange() and SetRange(). The former returns an ArrayList, but "does not create copies of the elements: the new System.Collections.ArrayList instance is a view window into the source list." (from MonoDoc). The latter takes and ICollection and a startindex and copies the values from the collection to itself. If the implementation was done well and recognized that the source and the target were the same, then it could speed up things quite a bit. And indeed, it did so. Insertion sort on ArrayLists was now down to 7 minutes and 24 seconds...

After this success, I thought that I might take a closer look at System.Array and found the Copy(Array sourceArray, int sourceIndex, Array destinationArray, int destinationIndex, int length) method. From the docs: "[The function] Copies the specified number of elements from a source array starting at the specified source index to a destination array starting at the specified destination index. [...] If sourceArray and destinationArray are the same array, [Copy()] copies the source elements safely to their destination as if the copy were done through an intermediate array.". Looks promising, but in this case it only sped up execution by about 0.5 seconds. Not much, but this technique might prove to be of good use later.

Binary Insertion Sort

fejj's code (without memmove) takes 23s, C# Arrays are again faster, but only by 6 seconds, and ArrayLists are again far too slow. Plugging in the optimizations I got a bigger improvement than last time: 6.2s (C), 7.1s (C#), 7.8s (ArrayList). The world is normal again! C code is faster than C# code! This is also the first algorithm, here ArrayLists can compete with ordinary arrays. Supposedly get/set operations are very slow with ArrayLists as the optimized binary insertion sort has very few of them compered to all former algorithms. This is probably due to the fact that the System.Array indexer can be directly mapped to a few assembler instructions (like C arrays), while when using an ArrayList there is an additional method call, which produces quite a lot of machine instructions. And of course there is the second checking of list bounds and the internal version bumping.

Eliminating the recursion only speeds up the C# array version (don't ask me why), so that it runs in about 6.0s.

Jeffrey Stedfast's (fejj) series on sorting algorithms has spurred me to compare his results with equivalent implemetations in C#/Mono. I expected Mono's performance to be no match to the direct implementation in C. After all with .NET one has the additional layer of the VM and no direct access to memory. But the actual results I got were quite astonishing.

What did I do?

I did two comparisions: One with a standard array and one with an ArrayList.

System.Array

I simply took fejj's source code and changed "int a[100000]" to "int[] a = new int[100000]" - and by the way removed the unnessesary size_t n Parameter of the sorting functions. .NET's arrays are more intelligent than their C/C++ counterparts, which are mostly just a convenient shorthand for direct pointer arithmetic. System.Array objects are conscious of their own size and check if the programmer stays within them.

Of course the initialization of the array is different to C, so I got following skeleton for the tests:

using System;

public class Sort{

static void Main(){   int[] array = new int[100000];

   Random random = new Random();   for (int i = 0; i < 100000; i++) {       array[i] = random.Next();   }

   DoTheSorting(array);}}
System.Collections.ArrayList

Additionally I wanted to know how well the ArrayList does scale.

using System;using System.Collections;

public class Sort{

static void Main(){   ArrayList list = new ArrayList(100000);

   Random random = new Random();   for (int i = 0; i < 100000; i++) {       list.Add(random.Next());   }

   DoTheSorting(list)}}
The system

Everything was tested on my AMD Athlon XP 2200+ system running Gentoo Linux, gcc 3.3.3 and Mono 1.0. It was also done while the whole desktop was running and timed using the time command. The numbers are just approximations but should be good enough to give an impression of the picture. I also do not know how good the two pseudo-random number generators are and whether they influenced the performance.

Doing nothing

The first thing I wanted to know was how long it takes just for starting the programm and initializing the array. I'll just give you the figures for 100.000 and 2.000.000 entries (I the latter for an additional comparision of the faster algorithms):

  • C: 0.018s / 0.100s
  • Mono/Array: 0.075s / 0.23s
  • Mono/ArrayList: 0.12s / 1.13s

Nothing unexpected here: ArrayList are the slowest (and scaled the worsed) and Mono has some additional startup overhead compared to plain C.

Bubble sort

For C code and explanations see fejj's original post on bubble sort. I only used the second (improved) version. The C implementation finished after 43 seconds. The C# versions needed minimal changes:

public static void BubbleSort(int[] a){   int tmp, max, j;

   for (int i = a.Length - 1; i >= 0; i--) {       for (max = 0, j = 1; j < i + 1; j++) {           if (a[j] > a[max])               max = j;       }

       if (max < i) {           tmp = a[max];           a[max] = a[i];           a[i] = tmp;       }   }}

and

public static void BubbleSort(ArrayList a){   int max, j;   object tmp;

   for (int i = a.Count - 1; i >= 0; i--) {       for (max = 0, j = 1; j < i + 1; j++) {           if ((int)a[j] > (int)a[max])              max = j;       }

       if (max < i) {           tmp = a[max];           a[max] = a[i];           a[i] = tmp;       }   }}

Running the compiled code was an ambivalent experience. The array version finished sorting after 35 seconds. That is 8 seconds off the C version! I just could not believe my eyes. But even rerunning the programm did not change the outcome. How can it be that the C# version with all these additional layers of complexity and abstraction (the VM, the more intelligent arrays...) is so much faster than simple, straightforward C code. Is the JIT just better at optimizing the code than gcc is? I just do not believe it and have not explanation of these results.

The ArrayList on the other hand performed so bad that I got impatient an killed the test after about 10 minutes. For 10.000 items the sorting takes about 2 seconds. As I figure access to ArrayList items is pretty slow and gets slower with increased list size - and bubble sort has a lot of item accesses (~100mio for the 10 thousend items sample I ran the profiler on).

Tomorrow I will show you comparisions for insertion sort and how one can speed up moving large portions of arrays in .Net.

As I needed a library to read/write ID3 tags in a C# program, I had to write bindings to Scott Wheeler's TagLib yesterday. You might want to download the source code. Since directly accessing C++ classes is not possible from .NET, and the only C++ to .NET bytecode compiler I am aware of is Microsoft's (which I don't have), I currently only wrapped the C bindings of taglib. They are not as powerful as the C++ classes, but at least the most important functionallity is there. I have not really tested the C# bindings yet (especially the setters), but they should work...