Author: Rory Winston

Coding

Visual C# Very Slick

I must confess, I’ve always been a fan of Microsoft’s IDEs for developers. One thing you have to give the Redmondians credit for is that they realized at a very early stage that healthy developer support was the key to maintaining interest in and support for their products in the marketplace. To that end, they released more APIs, SDKs, and IDEs than you could shake a large stick at, and some of them were even quite good. The Visual Studio family of tools, for instance, was a mixed bag, but it had a couple of gems. Alongside the tragic Visual InterDev, the utterly terrible Visual SourceSafe, and the vomit-inducing Visual Basic, there were some slick and useful applications. One of these was (and is) Visual C++, which easily wiped the floor with any other C++ IDE, even in Borland’s heyday. This product has matured and improved with age – I recently gave a quick test drive to the Visual C++ 2005 beta whilst fixing a bug in some legacy code, and I was impressed.

Another very nice addition to the Visual Studio family was unfortunately short-lived. Way back when, MS introduced a product called Visual J++. This was their first (and last) step into the Java IDE market. The language was Java, but being Microsoft, it wasn’t 100% “pure” Java – it had some custom extensions that made it very useful in a Windows-only environment. Chief among these was J/Direct, which basically was a mechanism that inserted proprietary bytecode into generated .class files, relying on the extensible bytecode mechanism written into the JVM specification by Sun. This turned out to be really powerful, and a godsend for those of us who were Windows programmers at the time. It allowed you to declare and invoke native function calls in a similar manner to how VB did, and all the parameter and datatype marshalling was done under the hood. The entire Windows API was at your beck and call, and all for very little effort. A set of wrapper classes for common chunks of the Win32 API was supplied, and was called WFC (Windows Foundation Classes). I remember giving a presentation at a Java conference in San Jose back in 1998 on this stuff, and the Java guys who managed to get over their disgust at me presenting on an MS-specific topic were amazed at what it could do. There were two other great attributes to this package as well – it had a very nice event model (called delegation, which you can find in Visual C#), and a drag-and-drop visual form editor, good examples of which were relatively rare in the Java world at the time. So you got the elegance of the Java syntax, with the ease of GUI construction that at the time only existed in VB.

Unfortunately, it couldn’t last. MS and Sun fell out, and took their mutual dislike to the courts. Visual J++ gradually became sidelined, and then an MS technology evangelist told me at a conference that J++ was quietly being dropped, and a replacement was being mooted (at the time, it was codenamed “Cool”).

Which brings us back to 2005, and Visual C#. What you get when you work with Visual C# is the direct descendant of Visual J++. The same guy (Anders Hejlsberg) has led the development stream for both J++ and C#, and he has fused a lot of the original ideas of J++ with many (many) more directly from Java. And it’s really, really good. I’m actually going to try to earmark a piece of work that I know will be Windows desktop-specific and I’m going to do it using Visual C# 2005. It took me no time at all to knock up a desktop client for CruiseControl, and I was rather pleased with the result.

Of course, there are limitations – portability being the obvious one. But I do think it’s wise to have as many tools at your disposal as you have individual problem areas to attack (also a famous theory in economics). And I think Visual C# will be able to solve quite a few problems for me in the future.

Coding

Blob Hell

I was happy to see that the JDBC 4.0 spec will contain support for improved BLOB and CLOB handling. Blob handling is still one of the most awkward areas of cross-database implementation work. It would be nice to have a consistent interface for blob creation, update, and streaming across databases. Ostensibly, that’s what we have in JDBC 3. However, in practice, things are not so simple.

One of the fundamental issues is how the RDBMS handles LOB data. To perform an INSERT or UPDATE of blob data, the RDBMS hands the client back a pointer or “locator” to the blob data. The manner in which they do this varies across implementations and vendors. This means in practice that you have different limitations depending on which DBMS you are using. Oracle has always been a culprit (until version 10) with its infamous 4k limit on streaming data, which has resulted in hundreds of applications and frameworks (Hibernate among them) having to write Oracle-specific JDBC-level code to work around this issue.

We are currently working on a system where we have written a server component which accepts standard FTP connections from clients. The clients send large binary files via FTP, which are streamed and stored into the database within a single transaction. Various consistency checks are performed along the way, and if any of the checks fail, an error is returned to the client and the transaction is rolled back. This means that if the client receives a successful return, they can be sure that the data is stored and ready to be processed.

The main problem has been with trying to successfully stream the blob data in a clean and portable way. Our production database is Oracle, but it’s nice to be able to test on, say, MySQL or HSQL. If you want to go straight to JDBC, you can use Oracle’s empty_blob() function, grab a blob locator from that, and then use the oracle.sql.BLOB type’s getBinaryOutputStream() to get an OutputStream to write to. You can then chunk the data from a socket directly to the database server. However, achieving this at a higher level is altogether more challenging. Thankfully, Spring has some streaming LOB support, which is what we have based our current solution on.

Coding

Debugging CVSGraph

Continuing on from my last post about fixing some issues related to MySQL and Python on Solaris, I came across another issue this morning which also necessitated digging out a copy of good ole gdb. The basic issue was that CVSGraph segfaulted every time I attempted to generate a revision graph. This was reproducible every time, no matter what the input. Turning the verbosity to the maximum allowed level did not produce anything useful.

The first thing I did was download a copy of gdb and run configure/make/make install. I initially thought that if I could get cvsgraph to produce a core dump on exit, I would be able to examine it within gdb and get some clues about the cause. However, I couldn’t get CVSGraph to automatically core dump, even after setting core policy using coreadm, as shown here.
UPDATE: I found the missing piece of the puzzle – the maximum core dump size had not been set via ulimit (for example, ulimit -c unlimited in the shell that launches the program). Setting this enabled automatic core dumping.

The next step was to actually load cvsgraph into gdb and run a test session inside the debugger. After a couple of runs, I had isolated the problem to a specific routine. I set a breakpoint and ran through the test case again:

First, I set up the command line arguments:

(gdb) set args -c /usr/local/viewcvs-1.0-dev/cvsgraph.conf -r /usr/cvsroot cobra/build.xml,v

Then set a breakpoint at the relevant location:

(gdb) break cvsgraph.c:1092
Breakpoint 1 at 0x1311c: file cvsgraph.c, line 1092.

Then kick off the target program:

(gdb) run
Starting program: /root/cvsgraph-1.5.1/cvsgraph -c /usr/local/viewcvs-1.0-dev/cvsgraph.conf -r /usr/cvsroot cobra/build.xml,v

Once gdb hits the breakpoint, it stops and waits for instructions:

Breakpoint 1, expand_string (s=0x50d51 "d", rcs=0x54450, r=0x545b8, rev=0x50770, prev=0x0, tag=0x0) at cvsgraph.c:1092
1092 t = mktime(&tm);

I manually stepped forward a couple of times until I hit the problem:

(gdb) n
1094 if(env)
(gdb)
1095 setenv("TZ", env, 1);
(gdb)

Program received signal SIGSEGV, Segmentation fault.
0xff3a0510 in memcpy () from /usr/platform/SUNW,Sun-Fire-V240/lib/libc_psr.so.1

Now we can do a stack backtrace to see where we were at the time:

(gdb) bt
#0 0xff3a0510 in memcpy () from /usr/platform/SUNW,Sun-Fire-V240/lib/libc_psr.so.1
#1 0x0001ed6c in setenv (name=0x1f4e0 "TZ", value=0xffbfffac "GB", replace=1) at ../../../libiberty/setenv.c:156
#2 0x00013140 in expand_string (s=0x50d51 "d", rcs=0x54450, r=0x545b8, rev=0x50770, prev=0x0, tag=0x0) at cvsgraph.c:1095
#3 0x00016a28 in make_layout (rcs=0x54450) at cvsgraph.c:2937
#4 0x00019ecc in main (argc=327696, argv=0x50010) at cvsgraph.c:3879

So now we have narrowed the problem down to setenv(), which is implemented in the GNU libiberty adapter library. I exited gdb and wrote a simple test case based on what CVSGraph was doing at the point in question, and found that the problem can be easily reproduced by calling putenv() to create an environment variable, and then immediately calling setenv() to reset the value. This may be due to a bug in the libiberty putenv implementation.
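
For anyone who wants to try the same experiment, here is a minimal sketch of that kind of test case. It is not the original test program: the function name putenv_then_setenv and the "TZ=UTC0" value are my own placeholders, and the buffer handed to putenv() is deliberately a writable array rather than a string literal, on the assumption that some setenv() implementations may overwrite an existing entry in place. On a libc without the problem, the function simply returns 0.

```c
#define _XOPEN_SOURCE 600   /* expose putenv() and setenv() under strict modes */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper that mimics the sequence described in the post:
 * putenv() creates TZ, then setenv() immediately resets it.
 * Returns 0 if TZ ends up equal to `restored`, -1 otherwise.
 */
int putenv_then_setenv(const char *restored)
{
    /* putenv() keeps a pointer to this buffer, so it must outlive the call
     * and (as an assumption, see above) should be writable. */
    static char utc[] = "TZ=UTC0";
    const char *now;

    assert(restored != NULL);

    if (putenv(utc) != 0)
        return -1;

    /* This is the call that crashed in the backtrace. */
    if (setenv("TZ", restored, 1) != 0)
        return -1;

    now = getenv("TZ");
    return (now != NULL && strcmp(now, restored) == 0) ? 0 : -1;
}
```

Calling putenv_then_setenv("GB") mirrors the values visible in the backtrace. To get closer to the failing setup, you would link the same code against the libiberty setenv() rather than the system one.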

In reality, we don’t really need the call to putenv() here – it’s actually redundant, as setenv() will allocate space for the new variable if necessary. So I simply commented out the offending line, remade CVSGraph, and now we have the (very useful IMHO) graphical branching display from ViewCVS.
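
If you would rather keep a temporary timezone override around mktime() than drop it, the same effect can be had with setenv() alone. What follows is a sketch under assumptions, not the CVSGraph source and not the fix I applied: the name mktime_utc and the "UTC0" value are hypothetical, and I am guessing that the surrounding code wanted mktime() to interpret the broken-down time as UTC.

```c
#define _XOPEN_SOURCE 600   /* expose setenv(), unsetenv(), strdup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: interpret *tm as UTC by temporarily overriding TZ
 * with setenv() only (no putenv()), then restore the previous value.
 * Returns (time_t)-1 on failure.
 */
time_t mktime_utc(struct tm *tm)
{
    const char *old;
    char *saved;
    time_t t;

    assert(tm != NULL);

    /* Copy the old value: a later setenv() may invalidate the pointer
     * returned by getenv(). */
    old = getenv("TZ");
    saved = old ? strdup(old) : NULL;
    if (old != NULL && saved == NULL)
        return (time_t)-1;

    if (setenv("TZ", "UTC0", 1) != 0) {
        free(saved);
        return (time_t)-1;
    }
    tzset();
    t = mktime(tm);

    /* Put the environment back the way we found it. */
    if (saved != NULL) {
        setenv("TZ", saved, 1);
        free(saved);
    } else {
        unsetenv("TZ");
    }
    tzset();
    return t;
}
```

Because setenv() copies its arguments, there is no buffer whose lifetime or writability you need to worry about, which is the main reason to prefer it over putenv() in this kind of save-and-restore code.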