Daft Developer: April 2006

Sunday, April 30, 2006

Apply a Default Sort Order

Always sort your object lists

I tend to get pretty lazy with my coding sometimes. My TODOs occasionally require more typing than it would take to actually do the TODO (I sometimes get lazy with the thinking part). So why would I recommend* (*insist on) sorting every list of objects in your system? Two reasons.

It makes testing a whole lot easier

You're less likely to have an unsorted list of items appear in your GUI

I have only just started doing this with my code* (* I may post a more educated and experienced blog later which includes the phrase 'it depends' when talking about when to sort lists) and it seems to be paying off so far.

Writing Tests on Sorted Lists

In the past I would put together a small 'IsInList' method to test that an object was in a list. Or I would create a 'GetObjectInList' to get an object and test it for expected values. The fact that the objects were stored in a random order of course made this necessary. The unsorted nature of the lists also meant the search had to be sequential which seemed amateurish (the performance implications were negligible since the lists were tiny, but...).

In any event, the nice thing about using a database is the ease of adding sorting. With a simple ORDER BY in MS Access I can sort on any column ascending or descending. Very little code. The .Net Framework also provides some simple and powerful ways to sort lists. Sorting my lists makes testing easier since object location is predictable. Testing is all about ensuring predictability anyway, right?

Your GUI lists should have default sort orders

If you have object lists and a GUI, you probably have list controls. I'll bet users are expecting data in those lists to be sorted either according to a scheme of their choosing, or in a commonly expected order. Ah, but if you use a layered design* (*like most kind-hearted developers) the question here becomes... where should sort logic exist? In the presentation layer or in the business model? I prefer a thin* (*super thin) presentation layer; the only code in the presentation layer should deal with form and page controls. I suggest, then, that sorting objects is the responsibility of the model and is a business rule. Sorting is part of the analysis of data, like calculations. Testing logic in the GUI is also more difficult, so move as much logic as possible into the model layer. If you are consciously sorting your object lists and your tests are checking this, there is very little chance of a nasty unsorted list control bug appearing in your GUI (this happened to me once, and was caught by a customer! egad.)

The .Net Framework offers some powerful sorting tools, namely the IComparable interface and Comparer objects. These two tools give you unlimited sorting power, easily and elegantly.
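Here is a minimal sketch of what a default sort order in the model layer could look like, using IComparable<T> for the default order and an IComparer<T> for an alternative one. The Contact class, its properties, and the ContactLists helper are made up for illustration; they are not from my actual code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical domain class; the name and properties are illustrative only.
public sealed class Contact : IComparable<Contact>
{
    public string Name { get; }
    public DateTime Joined { get; }

    public Contact(string name, DateTime joined)
    {
        Name = name;
        Joined = joined;
    }

    // Default sort order: by name, ignoring case.
    public int CompareTo(Contact other)
    {
        if (other == null) return 1; // by convention, any instance sorts after null
        return string.Compare(Name, other.Name, StringComparison.OrdinalIgnoreCase);
    }
}

// An alternative order, kept separate from the class itself.
// Assumes neither argument is null.
public sealed class NewestFirstComparer : IComparer<Contact>
{
    public int Compare(Contact x, Contact y)
    {
        // Swapping the operands gives a descending sort on Joined.
        return y.Joined.CompareTo(x.Joined);
    }
}

public static class ContactLists
{
    // The model layer hands out lists that are already in the default order.
    public static List<Contact> SortedByDefault(IEnumerable<Contact> contacts)
    {
        var list = new List<Contact>(contacts);
        list.Sort(); // uses Contact.CompareTo
        return list;
    }

    public static List<Contact> SortedNewestFirst(IEnumerable<Contact> contacts)
    {
        var list = new List<Contact>(contacts);
        list.Sort(new NewestFirstComparer());
        return list;
    }
}
```

With the order fixed in the model, a test can assert on positions directly (for example, that the first item of SortedByDefault has the expected name) instead of searching the list for the object.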

Monday, April 24, 2006

ASP.Net Timeout

HTTPUnhandledException: Request Timed Out.

Every now and then, when a request had a lot of work to do, I would get this exception. It didn't 'ripple up' from a method in the regular way that an exception does. It just caused execution to cease and was caught by the default error handler. There are a few places where timeouts are set. One is in IIS. In the Web Site properties there is a setting called 'connection timeout'. I thought this might be the candidate. So I increased the timeout, but no luck. The operation causing the timeout was a data integration over a network. The integration API had a connection timeout setting too. Changing this timeout had no effect.

Another strange behavior was that in debug mode the problem went away. The operation would complete without the timeout. Sadly, I had to debug the problem in release mode, with many symbol files missing. After around the same amount of time each run, I would get an 'ObjectDisposedException' on my NetworkStream object, followed by a ThreadAbortException.

Turkey Trails

I found some threads online about the .Net garbage collector prematurely destroying objects if it thought they were out of scope (or might as well be since there were no subsequent references to them). So I daftly started adding Garbage Collector control statements to my code to preserve my objects. Of course this just 'felt' wrong, but hey, desperate times... Well of course no luck.

Day 2

It always helps to be fresh, and to bring in a fresh perspective. So, with the help of a coworker we clarified 2 important clues.

Something is terminating the worker thread

The thread is terminated after 90 seconds

There was definitely a timeout somewhere which was bringing everything to a halt. I already knew it wasn't the IIS connection timeout, but what about this ASP Script timeout under virtual directory configuration? It was set to 90 seconds too! Alas, it was not the culprit. Which of course made sense after I tested it, but ya never know.

The Almighty Interweb

How did I code before the Internet? (Yes I am that old). Some Googling later I discovered, da na na naah! An obscure, yet highly important setting in machine.config. Something that ASP.Net uses to terminate threads. Something that is defaulted to 90 seconds. The answer, the culprit! A glorious little piece of XML.

ExecutionTimeout

Allow me to introduce executionTimeout. It looks like this,

<configuration>
  <system.web>
    <httpRuntime executionTimeout="90"/>
  </system.web>
</configuration>

It is set to 90 seconds in machine.config, and does not appear in the default web.config file that VS.NET creates for Web Applications. Every ASP.NET developer should be aware of this setting. Why is it kept hidden away? Ahhh, the eternal mysteries... Anyhow, there it is. Add it to your web.config. Pass it on.
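For what it's worth, here is a sketch of what the override might look like in an application's web.config. The value of 300 seconds is an arbitrary example, not a recommendation; pick whatever your longest legitimate request needs. The documentation for httpRuntime also notes that the timeout is only enforced when the compilation element has debug set to false, which would fit the way the problem vanished in debug mode.

```xml
<configuration>
  <system.web>
    <!-- 300 seconds is an arbitrary example value -->
    <httpRuntime executionTimeout="300" />
    <!-- executionTimeout is only enforced when debug is false -->
    <compilation debug="false" />
  </system.web>
</configuration>
```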

Thursday, April 20, 2006

ToString()

What should I do with ToString()?

In the Microsoft .Net Framework, ToString is defined in Object and is therefore ubiquitous (and everywhere too). You have the chance to override ToString whenever you want. But if you're thinking 'hmm... I don't override ToString() very much', then I would suggest maybe it's time to start. It certainly is handy to call whenever you need to output an error message. I almost just expect it to work (but nothing in life is free). Just call ToString() on your object and add to your trace messages. Very nice, but for the fact that error log messages aren't the only possible use for ToString(). Alas, with such a ubiquitous method, come multitudinous options for usage. Int.ToString() is an example of the obvious and straightforward, but what do you do with an object with multiple properties and subobjects, and inheritance and so forth?

Proposed Behavior

I like the idea of ToString() displaying object identity. If you work with objects that model something real, like a business concept, then output information that identifies the object. Not the internal key or database id, but the natural key. The data that people understand to represent the essence of the thing. For instance, if your object represents a customer, ToString() would return the customer's name. Return a string that would be identifiable to a user using your software. Take a look at how DateTime.ToString() works; it is a good, simple model. The default ToString() returns the date as a string formatted according to the current thread culture. It displays the date in a way that I can read it. DateTime.ToString() takes the various members (day, month, year etc.) of DateTime and assembles them into a reasonably formatted presentation. Most objects have a 'name' property or a similar identifier. This is what ToString() should output. Converting an Address object with ToString() would probably output the address as it would be written on a letter.

Complex Objects

What about something like a Report object? Should ToString() return the report content as a properly formatted string? I would suggest that this kind of behavior is beyond the intent of ToString(). I would be inclined to only output details that, again, identify the report, like its title, run date, and so on, not the content of the report. The content should be left to specialized methods that can be expanded for more flexible use. Output a string that would be written on a file folder tab. The content is in the folder; ToString is the human-readable index. The information you would need to retrieve the object.

Format Providers

Now it gets complicated (well, more complicated for me). What about customizing the output of ToString()? Back to my favorite, DateTime. People want to see dates in all sorts of different ways. DateTime.ToString() solves this by taking an optional IFormatProvider to display dates according to cultural preferences. Exposing ToString() with parameters is a good way to provide extra control over formatting. Again it still only outputs identity, just in a different format.

I am starting to implement the base ToString() on every class I write, and it is very convenient. I truly dislike seeing the Namespace.classname output when I call ToString(), very annoying in my view. Implement ToString and make users of your code happy (including yourself).
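A minimal sketch of the idea, assuming a made-up Customer class: the base ToString() returns the natural key, and an IFormattable overload offers a second format without ever exposing the database id. The "G" and "L" format codes are my own invention for this example.

```csharp
using System;

// Hypothetical class for illustration; not taken from any real code base.
public sealed class Customer : IFormattable
{
    public int Id { get; }          // internal database key, deliberately never printed
    public string Name { get; }     // the natural key a user would recognize
    public DateTime Since { get; }

    public Customer(int id, string name, DateTime since)
    {
        Id = id;
        Name = name;
        Since = since;
    }

    // Base behavior: identity only.
    public override string ToString()
    {
        return Name;
    }

    // Same identity, different presentation.
    // "G" (or no format) = name only, "L" = name plus the start date.
    public string ToString(string format, IFormatProvider provider)
    {
        if (string.IsNullOrEmpty(format) || format == "G") return Name;
        if (format == "L")
            return string.Format(provider, "{0} (customer since {1:d})", Name, Since);
        throw new FormatException("Unknown format: " + format);
    }
}
```

Because the class implements IFormattable, composite formatting such as string.Format("{0:L}", customer) picks up the custom format too, and the date part follows whatever culture the provider describes.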

Monday, April 17, 2006

Elegance is Not Optional

I have an old book (1994) on Prolog called The Craft of Prolog written by Richard A. O'Keefe (MIT Press). One of the crucial 'values' of the book is that "Elegance is Not Optional". I find this to be a very striking statement, as I have always felt that design suffered under the realities of practical use. Application performance always seems to ruin a program. It may be that Prolog, because it runs in an isolated workspace, has the pleasure of elegance. There are no nasty operating systems or databases to deal with. Yet, I am still drawn to this vision, I want to believe. I have added a small quotation from the text.

Elegance is not optional (1)

What do I mean by that? I mean that in Prolog, as in most halfway decent programming languages, there is no tension between writing a beautiful program and writing an efficient program. If your Prolog code is ugly, the chances are that you either don't understand your problem or you don't understand your programming language, and in neither case does your code stand much chance of being efficient. In order to ensure that your program is efficient, you need to know what it is doing, and if your code is ugly, you will find it hard to analyse. (The Craft of Prolog, Richard A. O'Keefe, MIT Press 1994)

I believe that a developer should strive for readability first and foremost. Most of the guidance in Code Complete (Steve McConnell, Microsoft Press) is aimed at improving code readability. I would even go so far as to say readability first, function second. You can always debug a readable program and make it right. But a mess of code that works might as well not exist since any changes to it probably mean rewriting it anyway.

Readability is About Design

Comments, documents, unit tests, all of these things merely support readability and understandability, in the way that a textbook supports a professor. The professor must still be able to teach and must stand on their own with or without the text to be truly effective. Such is also true with code. It should make sense without the supporting pieces, since so often these pieces fall into disrepair. Code is never 100% covered by tests. Comments are usually sparse, out of date or poorly written, as with documentation. The organization and naming in the code must point to its intent and function. The code expresses the solution, it doesn't just implement it. This should be the developer's prime concern: express the problem and its solution with the program itself.

Elegant Design

What is elegance? Have you ever heard the expression 'Form follows function'? Everything in the system must have a good reason for being there. Nothing is superfluous, there is no programmer ego, or 'interesting' technology, or 'feats' in the code. The code should read like a well written technical document. Step by step to a predictable ending. Detail at each level is consistent, methods and data are grouped logically. Nothing is out of place or mysterious. Design also subscribes to the expression 'Function follows form' whereby choosing an elegant form results in a superior design. This implies an intuitive element to coding where you can't just apply rules and refactorings. If it looks good it probably is good.

If you understand the problem, you should be able to look at the solution and say ahhh... yes, I see.

Sunday, April 16, 2006

Exceptions vs Return Codes - Performance Implications

I am working on inventing or finding a satisfactory validation strategy for my .Net web applications. The final model must satisfy the following criteria:

I must be able to validate the entire screen in one shot.

There must be no duplicated code.

Objects in the system must always be 'valid'.

I want to use as much C# as possible, and minimize javascript

I must be able to validate the entire screen in one shot

This is purely a usability concern. Who wants to go through one error at a time on a page? Fix this, now fix that, oops... this one too. Oh, that would be a duplicate, you'll have to come back later. I think the system should be able to point out all problems immediately.

There must be no duplicated code

I don't want to maintain two sets (or more) of validation code. As in, validation objects on the page, and validations in the model layer. This requirement is in conflict with the other two, which makes my demands demanding.

Objects in the system must always be valid

I don't like the idea of an object being somehow malformed or incomplete. A situation where I have to check an object before I do anything seems to contradict my understanding of encapsulation. An object that starts throwing errors because it is invalid contradicts the very reason for using objects. Which is, keeping things simple. This requirement conflicts with requirement 2 because now you can't just take the validations out of the object, to ensure validity the object has to do validation.

I want to use as much C# as possible, and minimize javascript

I like C#. Visual Studio is an excellent IDE for coding in C#. C# is strongly typed and Object Oriented and is a fine language all around. I do not like Javascript. It is not strongly typed, and not really Object Oriented. Javascript is difficult to debug and VERY browser specific. The end.

The requirements may be asking for too much, but I feel that there is somehow a way to satisfy them all, and still have a fairly elegant solution.

Back to Performance

So, I am currently comparing validation through exceptions against the more traditional (pre OO) return codes. My first test is performance. I have read much on the Internet regarding the poor performance of exceptions so I wanted to see it for myself and get some hard numbers. The evidence indicates that Exceptions are slower (Using 1.1 of the .Net Framework). Slower by about 100%!

The Tests

The tests were fairly simple. I created a class whose constructor took 2 arguments, a string and an integer. If the string was null or empty, that was a validation error. If the int wasn't between 0 and 100, that was an error. The exceptions method just threw the various 'Argument' exceptions provided by the framework and filled in a nice descriptive error message which was passed to Debug.WriteLine. The Return code method was more difficult to code and even required a trip through the debugger (Remember, I am a bit daft after all). I had to pass an error object into the class constructor so I could get the same amount of information about the validation errors. Foolishly I declared this object as a 'struct', which meant it was being copied to the stack and thrown away, with all my error information too!
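To make the two styles concrete, here is a rough sketch of the kind of class described above. It is not my original test code: the Setting class, its member names, and the TryCreate factory are hypothetical, and I have used a static factory with a List<string> of errors rather than passing an error object into the constructor.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class under test: a name that must be non-empty and a
// percentage that must lie between 0 and 100.
public sealed class Setting
{
    public string Name { get; }
    public int Percent { get; }

    // Style 1: the constructor throws, so an invalid instance can never exist.
    public Setting(string name, int percent)
    {
        if (name == null)
            throw new ArgumentNullException(nameof(name));
        if (name.Length == 0)
            throw new ArgumentException("Name must not be empty.", nameof(name));
        if (percent < 0 || percent > 100)
            throw new ArgumentOutOfRangeException(nameof(percent), percent, "Percent must be 0 to 100.");
        Name = name;
        Percent = percent;
    }

    // Style 2: a return code plus an error collector supplied by the caller.
    // The collector is a reference type. A struct passed by value would be
    // copied, and anything recorded in the copy would be lost on return.
    public static bool TryCreate(string name, int percent, List<string> errors, out Setting result)
    {
        result = null;
        int before = errors.Count;
        if (string.IsNullOrEmpty(name)) errors.Add("Name must not be null or empty.");
        if (percent < 0 || percent > 100) errors.Add("Percent must be 0 to 100.");
        if (errors.Count > before) return false; // report every problem found, not just the first
        result = new Setting(name, percent);
        return true;
    }
}
```

One thing the sketch shows is that the return code style can report every problem in one pass, while the throwing constructor stops at the first one.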

The Results

My test tried to create an object with a problem that would be caught with each of the validations. This code was run in a 10,000 iteration loop, with DebugView (sysinternals.com) open to verify the output. Here are the numbers (ms) I got for 8 runs

Exceptions	Return Codes
9,157	5,234
9,953	5,204
9,247	5,235
10,469	5,360
10,204	5,484
10,203	4,734
10,875	4,610
10,843	5,156

I can't account for why the exceptions test speed slowed down. It would probably be worth looking into. The tests were extremely quick to run considering they were attempting to create 30,000 objects each. I was amazed that so many exceptions could be handled in 10 seconds. So, while the exceptions method is much quicker to code and debug, the return codes method is the clear winner when it comes to performance.

Thursday, April 13, 2006

Config File Mayhem

There are config files all over the place!

I don't know if you've ever gotten stuck in the config file swamp, but I'm there now and it is stiiinnkkkeeee! Each Project in my VS2005 solution has a config file. There is one for the Unit Test case project, one for the web app (web.config), and now I've just added one for the Console App. Part of the reason for all of these config files is the fact that all the projects rely on the model library. You know, the library that does all the work.

Notice in the spiffy diagram, how the model library depends on each config file. This creates what I would call 'tight coupling' between layers of the application. The model layer loses reusability and independence. A sort of 'indirect cohesion' between the subsystems through the config file.

My first thought is to refactor the code to move the dependency on the config file up to the 'owners'. The configs are after all owned by these assemblies, so it makes sense for those assemblies to have exclusive interaction with the config files.

So, by the diagram, all interaction with the config file would be up to the config file owners. Settings would then have to be passed into the model as parameters. If this creates a long parameter list, I could group settings into structures.
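A rough sketch of that refactoring, under some assumptions: the setting names, the ModelSettings class, the default timeout of 30, and the CustomerRepository stand-in are all hypothetical. The NameValueCollection parameter is used because that is the type ConfigurationManager.AppSettings exposes, so each config file owner could pass its own settings straight in.

```csharp
using System;
using System.Collections.Specialized;

// Hypothetical settings group; the names are illustrative.
public sealed class ModelSettings
{
    public string ConnectionString { get; }
    public int CommandTimeoutSeconds { get; }

    public ModelSettings(string connectionString, int commandTimeoutSeconds)
    {
        if (string.IsNullOrEmpty(connectionString))
            throw new ArgumentException("A connection string is required.", nameof(connectionString));
        ConnectionString = connectionString;
        CommandTimeoutSeconds = commandTimeoutSeconds;
    }

    // Called by the config file owner (web app, console app, test project),
    // never by the model library itself.
    public static ModelSettings FromAppSettings(NameValueCollection appSettings)
    {
        string timeout = appSettings["CommandTimeoutSeconds"];
        return new ModelSettings(
            appSettings["ConnectionString"],
            timeout == null ? 30 : int.Parse(timeout)); // 30 is an assumed default
    }
}

// Stand-in for a class in the model library: it never touches a config file.
public sealed class CustomerRepository
{
    private readonly ModelSettings settings;

    public CustomerRepository(ModelSettings settings)
    {
        this.settings = settings;
    }

    public string Describe()
    {
        return "timeout=" + settings.CommandTimeoutSeconds;
    }
}
```

A unit test can then build a ModelSettings by hand and skip config files entirely, which is one less config file in the swamp.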

As usual, foreseeing such problems is always beyond me and it will probably take me some time to work this out to my satisfaction. More postings on this to come :)

Tuesday, April 11, 2006

Creativity and TDD

I just finished reading a blog post on creativity and speed (Posted to the only blog I actually read regularly) which made me think about Test Driven Development as a creativity booster. Kent Beck describes the TDD cycle as

Write a test

Make it run

Make it right

Step 2 is what intrigues me. He suggests "Quickly getting that bar to go green dominates everything else." (Test Driven Development By Example, Kent Beck, p. 11). "Quick green excuses all sins". I believe Kent's intent here is to keep up development momentum, to be driven by the tests. Moving quickly ensures focus and doing only what is necessary; the point is to ensure that you only write code to satisfy the test. However, Kathy Sierra's post made me realize there is another (perhaps unintentional) outcome from being test driven.

I have discovered an interesting side-effect to being Test Driven - creativity. Making that test pass, quickly, stimulates creativity. I've noticed that during the frantic push to green my mind races for options and crazy (Hackery for sure) solutions to make the test pass. The feeling of 'How can I make this pass, quick! you only have 10 minutes! Time! Time!' is a very invigorating, demanding process, and super-rewarding when that test goes green. Of course, once the test passes, I'm usually left with a horrible, disorganized, non-optimal mess. But I've usually learned something new, or done something creative with what I already know. This is completely consistent with the creativity/speed theory however. Creative thought comes from the messy part of the mind and it's not going to be clean coming out. Worse yet, trying to create 'clean' kills ideas. You've gotta get those ideas out and down. Don't think, just go! Get that test to pass, then you can clean it up later (refactor).

Refactoring of course satisfies the right side of the brain, where you get a chance to be calm and analytical. Creating organization and collecting tasks, and related concepts. Giving proper names to objects etc. I think this mental flip-flop, left-right activity is what makes TDD appealing to use. The brain is fully utilized with TDD, creativity is allowed to prosper without compromising quality. This may explain why Test Driven developers can't imagine coding any other way.

Thursday, April 06, 2006

ADO.NET OLEDB Syntax Errors

I have a large MS Access SQL script that creates my Access database. I need to run it programmatically to create databases on the fly. So I've created a C# application that reads each line of the file and applies the SQL against the database. However, whenever I use the OleDb data provider I get syntax errors. The following SQL for example causes a 'CREATE TABLE' syntax error.

CREATE TABLE Action (
      Id INTEGER NOT NULL,
      ActionTypeId INTEGER NOT NULL,
      Name TEXT(50) NOT NULL,
      Description TEXT(255) NULL,
      ParentId INTEGER NULL,
      SortOrder TEXT(255) NULL,
      Tooltip TEXT(255) NULL,
      RequiredLicense INTEGER NULL,
      OptionalLicense INTEGER NULL,
      TableItem TEXT(50) NULL,
      SystemPrefEnabled TEXT(255) NULL,
      PRIMARY KEY (Id)
)

If I paste this SQL into Access it runs successfully. Hmm... I cut down the statement to create a table with only the first column, still fails. So I change the name 'Action' to 'tblAction' and the statement executes without error and creates my table with the name 'tblAction'. So it would seem that 'Action' is some kind of important or reserved word. But why does it work from a SQL window in MS Access? I do not know.

I was about to give up on this and change careers, when I thought I would try connecting to my Access database through ODBC. Well, it's been a while since I've used ODBC, but the syntax is pretty much the same so the change is easy. I found a good connection string at www.connectionstrings.com and voila! My entire SQL script runs without error. What!?

I would dig into this and understand why my SQL runs in some cases and not others, but alas, I am already late on this deliverable...
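A possible explanation, offered as an assumption rather than a confirmed diagnosis: ACTION appears on Microsoft's list of Jet 4.0 reserved words, and the OLE DB provider is reported to parse SQL in a stricter mode than the Access query window or ODBC. If that is what is going on, wrapping the table name in square brackets should let the original name through. The JetSql helper below is hypothetical and only shows the string handling; I have not run the resulting statement against OleDb.

```csharp
using System;

public static class JetSql
{
    // Wraps an identifier in square brackets so the parser treats it as a
    // name rather than a keyword. Names containing brackets are rejected
    // instead of escaped, to keep the sketch simple.
    public static string QuoteIdentifier(string name)
    {
        if (string.IsNullOrEmpty(name))
            throw new ArgumentException("Identifier must not be empty.", nameof(name));
        if (name.IndexOf('[') >= 0 || name.IndexOf(']') >= 0)
            throw new ArgumentException("Identifier must not contain brackets.", nameof(name));
        return "[" + name + "]";
    }

    // A cut-down version of the CREATE TABLE statement with the table name bracketed.
    public static string CreateActionTable()
    {
        return "CREATE TABLE " + QuoteIdentifier("Action") + " (" +
               "Id INTEGER NOT NULL, " +
               "Name TEXT(50) NOT NULL, " +
               "PRIMARY KEY (Id))";
    }
}
```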

Wednesday, April 05, 2006

ExecuteNonQuery?

I create an Insert query string and pass that into my command 'CommandText' property, and then I call 'ExecuteNonQuery'? NonQuery? Shouldn't that be 'ExecuteQuery'?

But wait, if I set my CommandType to StoredProcedure, then I guess I'm not executing a query. Why wouldn't I call ExecuteStoredProcedure? Well I suppose MS Access doesn't support Stored Procedures.

But hold on, my command object is a SqlCommand object not an OleDbCommand object. Hmmm... so mysterious...

Sunday, April 02, 2006

Aggregate or Association?

Ever since reading Domain Driven Design by Eric Evans, I have been obsessively object modelling every application I write. I am surprised to find that I can hardly ever use aggregation (although I think I should be modelling aggregates). My limited use of aggregates comes from my inability to follow the Aggregation rules, which are,

Objects whose lifecycles are tied to another object indicate a possible aggregate relationship

Objects outside of the aggregate cannot hold references to objects inside the aggregate, only the aggregate 'root' is referenceable

The Aggregate is responsible for ensuring invariants (consistency and integrity) within the aggregation

Domain Driven Design encourages 'Aggregate Boundaries', which means classes outside of the aggregate do not hold references to objects within the boundary. Eric Evans suggests that "Transient references to internal members can be passed out for use within a single operation only." (P. 129), which I imagine means convenience access for one-off operations, but no persisted references.

My aggregate issue tends to crop up whenever I have relationships between objects that validate combinations for proper construction of another object (Does this make any sense?). An example is required to explain here. Take the following diagram,

Imagine an application designed to help people configure and purchase PCs. This application restricts the Processors that can be included in a PC by displaying and allowing the user to only select valid processors for a chosen manufacturer. The system wouldn't allow you to select an "Intel Athlon 50MHz", because such a chip doesn't exist. Manufacturers produce certain classes of chips which in turn run at certain clock speeds. I believe this is an aggregate relationship since if I delete a manufacturer from the system, all the related chip classes should be removed as they cannot be shared across manufacturers. The model seems to support the requirements.

However, the problem comes when I have an actual instance of a PC with a selected Chip. I want the Chip object to have a specified manufacturer, class and clock speed, which are a valid combination and are references to these objects in the system (i.e. if I change a chip class name it should be applied to every PC that has it selected). Connecting my PC object to a Chip object and a Clock speed violates the aggregate boundaries. See the following "UML" diagram, which demonstrates this violation.

I have now run into this problem on 3 of the real world applications I have Domain Modeled. My suspicion is that my concepts are not quite modelled correctly, or that perhaps my aggregate is not really an aggregate. In any case, I have left the relationships as simple associations which I feel doesn't completely express the domain.
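For reference, here is a minimal sketch of the manufacturer side of the model written as an aggregate, with the root owning its chip classes and answering the 'is this combination valid?' question itself. The class shape and member names are my own guesses for illustration. It does not solve the PC reference problem; it only shows what keeping to the three rules looks like for the part of the model where they do fit.

```csharp
using System;
using System.Collections.Generic;

// Aggregate root. Chip classes and their clock speeds live and die with
// their manufacturer and are never handed out as references.
public sealed class Manufacturer
{
    private readonly Dictionary<string, HashSet<int>> speedsByClass =
        new Dictionary<string, HashSet<int>>(StringComparer.OrdinalIgnoreCase);

    public string Name { get; }

    public Manufacturer(string name)
    {
        Name = name;
    }

    // All changes go through the root, so it can enforce the invariants.
    public void AddChipClass(string className, params int[] clockSpeedsMHz)
    {
        if (speedsByClass.ContainsKey(className))
            throw new InvalidOperationException("Duplicate chip class: " + className);
        speedsByClass.Add(className, new HashSet<int>(clockSpeedsMHz));
    }

    // Outsiders ask the root a question instead of holding internal references.
    public bool Offers(string className, int clockSpeedMHz)
    {
        HashSet<int> speeds;
        return speedsByClass.TryGetValue(className, out speeds)
            && speeds.Contains(clockSpeedMHz);
    }
}
```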