Saturday, September 23, 2006

Attaching Debugger takes Forever!

Why does it take so long for Visual Studio to start my web app in debug mode?


Check your Symbol Server


If you have a symbol server set up via the '_NT_SYMBOL_PATH' environment variable, your debugger may be retrieving symbols from a slow store. After some experiments (like uninstalling and reinstalling VS add-ins, etc.), I removed the _NT_SYMBOL_PATH variable from my system and voila! Attaching the debugger is fast again.


Investigation Required


Some analysis required here. My symbol path is quite long.


SRV*C:\data\symbols\OsSymbols*http://msdl.microsoft.com/download/symbols;c:\data\symbols\ProductionSymbols;C:\Program Files\Microsoft Visual Studio .Net 2003\SDK\v1.1\symbols;C:\winnt\system32

My plan was to remove entries from the path one at a time until performance returned. It appears that VS caches the symbol path however (blah), making the investigation slow. I am suspicious of the Microsoft web address.


The Culprit


It appears that removing the Microsoft web address has fixed the problem. It's not really necessary to check for those symbols over and over again anyway. So I have removed it from the _NT_SYMBOL_PATH variable and I will put it back when I need it.

Sunday, September 17, 2006

Less Up Front Design == More Supple Design?

Does reducing the amount of up front design encourage a cleaner, more supple application design?


One Case


My little (ongoing) project, the car maintenance reminder app, is a good example of how designing for one feature at a time encourages a highly extendable code base. Since the amount of time I can spend on it is highly variable, I am not able to plan sprints. So I merely tackle each item on the backlog as it comes, and once I'm happy with the feature, I release it. In a more usual agile development environment it is fairly inefficient to release software each time a feature is completed; there is usually a fair amount of process involved. In my case however, not having paying customers, or customers at all for that matter, means I'm pretty much free to take risks. I don't need the release rigor.


Anyway, back to suppleness. I noticed that adding a new feature meant fully re-examining the system design as a whole. Talk about reducing the cognitive load! I only had to ensure that one new feature could be integrated with the current model. If it couldn't, I would look at what needed to be added or changed to make it work. Re-examining the design helped me look at awkward areas and forced me to think about them over and over. In the beginning the design refactoring took longer than the changes to add the feature, but I swear that this ratio changed as the model built up. At the moment the model is extremely extendable and I have a collection of 'patterns' that I have used throughout the system that I can draw upon for new features. I have not created frameworks; I have common code and patterns, but no frameworks!


Getting Real


My experience with carcarecalendar.com does not match my experience on real projects with budgets and ROI. Corporations are very concerned about risk these days, and planning is still the favourite tool for risk mitigation. I think planning is good as a communication tool. It is very considerate to inform people that they will be needed to do some work for you, in advance, and preferably with a time frame. I dislike 'emergency' panic situations that come out of bad planning. It is not this type of planning that I question; it is system architecture and design. My aging brain is having more and more difficulty remembering and grasping massive and intricate systems (or I just don't care so much anymore). I am more capable and successful at designing for a handful of needs than for a multitude. Accurately designing a system up front from 100 requirements is just not possible.


But lack of planning means risk, doesn't it? If we can plan, we must plan. And system design is planning. I would suggest that the only 'planning' exercise that is worth pursuing is proof-of-concept type stuff, where you're breaking new ground. Everything else is just project manager CYA.

But UML?


UML should be used to describe a design as it exists at a certain point in time: not as a way to plan the creation of software, but to tell the story of already functioning code. By the way, if you like UML, try StarUML; it's much better than Rose or Visio (it doesn't beat a whiteboard though).


Design as You Go


So dispense with the docs and pictures, keep things thin. Nobody wants to read that stuff, and anyway software never turns out as designed. Stop wasting time doing stuff you hate and start building; your software will be softer, your customers will be happier, and you will be too.

Wednesday, September 06, 2006

Agile Methodologies - Anti-Reusability?

If you subscribe to the Agile notions of YAGNI (you aren't gonna need it) and DTSTTCPW (do the simplest thing that could possibly work), are you potentially writing code with limited reusability?


The Purpose of a Routine


In Steve McConnell's Code Complete he identifies many reasons to create a routine. Avoiding duplicate code is the most popular, but there are other reasons too. One of them is 'Promoting code reuse'. In the Agile age then, is it still prudent to try to make code reusable? Or should developers simply abstract when the need arises? Developing for the immediate need, and not some uncertain future.


If we apply the Agile XP practices in their purest form, it could be argued that new routines should only be created when needed. Perhaps that's not quite right. Perhaps we should create a function or method when a refactoring requires it. This still means blocking out any thoughts of reusability. Making a method more 'reusable' than necessary still contradicts the spirit of agile.


For example, if my code always multiplies a number x by y and y is always 2, my function should look something like,


public int multiplyXAxisBy2(int x)
{
    return x * 2;
}

If my multiply method took two arguments x and y and multiplied them it could be argued that I am 'building for the future'. Alright, I'll admit this is a contrived and extreme example. But I have seen developers argue for hard-coded values in their methods on the grounds that exposing those values as parameters violates the YAGNI principle.


Balance


Like most things in software, there is no simple answer. The best you can hope for is 'it depends'. In the case of agile design and reuse, the 'it depends' postulate seems to hold. YAGNI is perhaps a reaction to the 'model the world' design dreams of the past. It is a way to pull back on the programmer's reins and say 'Hey, the customer needs something real, today! Stop dreaming and get on track'.


Achieving balance between YAGNI and reuse means looking at your method interface and asking 'does it stand on its own, does it make sense?'. Constantly changing method names through refactorings is probably an indication of a poor interface. The method names should hardly change at all, so make them specific and understandable. The parameters should jive with the method name. For example, a method like SaveAttachment() should take a parameter like an Attachment object. It should not take parameters that leave the caller trying to understand the internals of the method. Something like SaveAttachment(UserLogin, Attachment, UrlLink, AttachmentType) is probably a sign of a bad object design (some of these parameters should probably be contained in the object itself).
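
To illustrate that last point with a sketch (the Attachment members shown here are invented for the example, not from a real project), the extra parameters usually belong on the object being saved:

public class Attachment
{
    // Details that describe the attachment live on the attachment itself.
    public string UserLogin;
    public string UrlLink;
    public string AttachmentType;
    public byte[] Content;
}

public class AttachmentService
{
    // The interface stands on its own: one meaningful parameter.
    public void SaveAttachment(Attachment attachment)
    {
        // persistence details omitted
    }
}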


So, in sum I have to say use your judgement and try to look at your interfaces in isolation, not as interconnected pieces, and hopefully you will create reusable code without straying too far from YAGNI.

Saturday, August 05, 2006

An Acceptable Validation Strategy

How do you architect business validation logic without creating duplicate code, but ensuring a positive user experience?


Validation as Business Logic


Mandatory data fields or duplicate checks are user/business imposed requirements. These types of requirements are best served by the code in the system that reflects the business: the Domain Model. The business 'Domain' is the part of the system that immediately reflects the needs and requirements of the business and other vested interests, and the 'Model' or 'Domain Model' is the code that implements those requirements. All business validations must exist in the model or business layer. This is why the model exists; its job is to model the business, and it is the part of the system that is accountable for the business requirements.


Implementation


In C# I like to separate the model classes from other utility classes with a namespace, say the 'model' namespace. If the model is large, the key classes can be further separated into a 'core' namespace. At the moment my preferred method for communicating validation errors from the model is through exceptions, using the built-in .NET framework exceptions and implementing my own custom exceptions (make sure to run FxCop against your code to ensure your exceptions are CLS compliant).
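
To make that concrete, here is a minimal sketch of a model class reporting a broken business rule through an exception (the Vehicle class and its rule are invented for the example):

using System;

namespace CarCare.Model
{
    public class Vehicle
    {
        private string _identifier;

        public Vehicle(string identifier)
        {
            // The model enforces the business rule and reports
            // violations with exceptions.
            if (identifier == null || identifier.Length == 0)
                throw new ArgumentException("A vehicle must have an identifier.", "identifier");

            _identifier = identifier;
        }

        public string Identifier
        {
            get { return _identifier; }
        }
    }
}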


Validation as Presentation Logic


Exceptions from the model layer passed up to the application screens lead to a very poor user experience. The user is forced to work through each validation one at a time, and the exception messages may not be very helpful.


It is not the responsibility of the model layer to solve this problem. The model reflects and ensures the business requirements, not the user requirements. The presentation layer is the user part of the system; if it enforces business requirements, it is merely doing this to improve the user experience. Re-implementing business validation with validation controls or code-behind checks is a good way to improve the user experience. The presentation layer understands the user and can provide far better messages and help than the model layer. The presentation layer should wrap the model by either ensuring that the model doesn't return errors, or, if that is not possible, by translating errors and retrieving other information about an error to help the user fix the problem.
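
A rough sketch of what 'wrapping the model' might look like in code-behind (the control names and the DuplicateObjectException type are hypothetical):

private void btnSave_Click(object sender, System.EventArgs e)
{
    try
    {
        // Call into the model/business layer.
        _vehicleRepository.Save(_vehicle);
    }
    catch (DuplicateObjectException)
    {
        // Translate the business error into language the user understands.
        lblError.Text = "A vehicle with this name already exists. Please choose a different name.";
    }
}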


Validation as Data Logic


Databases also provide tools to enforce business logic. Unique constraints, foreign key constraints, non-null fields: these are all database-imposed business validations. It is possible to structure a relational database without these things and make it the responsibility of the application to manage the data, but that is not a practical solution. Many times data must be manipulated 'behind the scenes', usually for technical/performance reasons. The database exists to store the model, and it must be able to do this reliably. Reliability and integrity are why validations exist in the database.


Validations in all Layers


Although it may be more difficult to manage and change, a good system has business validations in all layers.


  • Presentation Layer - screens may implement entry validations to provide a rich user experience

  • Business Layer - Model objects enforce business rules because that is their job

  • Data Layer - Databases enforce business rules to ensure data integrity, which ensures that the data layer will satisfy the demands of the model.


I have seen many strategies that try to move validation logic into a single place. For example, data-driven strategies where a validation can be changed by simply updating a row in a table, or the 'Naked Objects' pattern, which attempts to improve the transparency of business validations in the business layer. Some of these strategies create a great deal of complexity and an unpleasant learning curve. Homegrown meta-data systems require maintenance developers to essentially understand the entire system to grasp the impact of a change.


In my straightforward model, admittedly, the simple task of making a field mandatory requires updating code in the presentation layer, changing the business model object, and setting not-null on the database table. While this change has broad impact on the system, it is an intuitive change, and each code change is highly isolated from the rest of the application. It is very easy to understand a system that is implemented in this way, and that is very important. Using a well understood validation pattern is acceptable, but creating an abstract/opaque solution reduces maintainability. Nobody likes maintenance work, so make it straightforward. How the change is made must be obvious, not necessarily easy. Yes, it is easy to change a validation by updating a row in a validation logic table, but is it obvious? What is the impact to the system? How do you know it's going to work - talk to the developer that coded it? Not acceptable.


Validations in each of the layers all stem from the same requirement, but have distinct purposes. Trying to combine these purposes is difficult and probably not worth the complexity.

Saturday, June 24, 2006

Can't Open Dump File in Visual Studio

I don't seem to have the option to open 'Dump Files (*.dmp; *.mdmp)' in Visual Studio


Debugging with WinDbg


WinDbg ships with the Debugging Tools for Windows, and it is considered the most powerful Windows debugging tool. It is truly a tool for the expert. As a senior developer I am often called upon to 'fix the problem', which is usually a production problem that cannot be reproduced in test. Since this has become a frequent occurrence, I have decided to take on learning how to use WinDbg. Some good resources I've found so far include the Microsoft Patterns and Practices document 'Production Debugging for .NET Framework Applications' and John Robbins' Debugging Applications for Microsoft .NET and Microsoft Windows.


Why Can't I Open Dump Files?


My current employer provides the development team with a 'corporate' install of Visual Studio .Net 2003, which includes only the features they think we need. Since most development is done in C# and some in VB.Net, C++ is not included in the install. I believe C++ is required to open dump files in Visual Studio .Net. You can see what languages and tools you have installed by going to Help -> About Microsoft Development Environment...


I wonder why I would have to have C++ installed to open dump files. Presumably it is expected that dumps are generated for native C++ code, and if you don't develop in that language you don't look at dump files. Maybe it is not useful to look at a dump of managed code in Visual Studio? I imagine you would have to load the SOS debugger extension to debug a dump in Studio, and I seem to be having more luck using WinDbg anyway. Stick with WinDbg; I'm sure it's worth the effort to learn.

Friday, June 09, 2006

.Net Assemblies and Layered Architecture

How should you physically implement a logically layered application?


My Story


In the book Domain Driven Design, Eric Evans encourages the use of Repositories. A repository is an object that is responsible for persistence and retrieval of Model or Business Objects. The intent of the repository is to abstract the object storage mechanism: the interface to a repository should not be specific to a database technology, or to any technology being used to permanently store or 'persist' objects. This abstraction allows you to easily change the underlying storage technology. Adding support for a new storage technology (i.e. another relational DBMS) means creating another matching set of repository objects.


Multiple Assemblies


With this in mind, I decided to create two assemblies to match the two layers: a Domain Model assembly and a Persistence assembly. The idea being that the persistence assembly could be dynamically loaded at runtime and thus chosen from a list of assemblies that support various DBMSes.



After running FxCop on my work, I tried adding the CLSCompliant attribute to AssemblyInfo.cs to satisfy the FxCop audit.


using System;
[assembly:CLSCompliant(true)]

The CLSCompliant attribute caused my Persistence assembly to fail on compilation, due to the fact that it referenced and returned types declared in another assembly. The other assembly was of course my Model assembly. This got me thinking...


Single Assembly


In other projects I have worked on, the persistence layer and model layer were combined in one assembly. The repository classes were included in a separate namespace (and project sub-folder) and were therefore still distinct from the model objects. So logically the layers were separate, but the physical implementation was a single assembly. The two layers worked very closely together and in many ways really acted like one layer, so this configuration also made sense.

The difficulty with the single assembly comes from the desire to 'swap' between persistence technologies. In the combined model, you can no longer load just the repository objects you need at run-time. Supporting multiple database technologies requires the Domain assembly to contain multiple sets of repository objects collected in different namespaces; you would use the Strategy pattern to provide the runtime loading (abstraction from specific types), and some sort of controller to serve up the concrete repositories (based on a config setting perhaps). Each of the architectures has benefits and drawbacks. I like to reduce my assembly count though, and systems don't generally support a huge set of relational database technologies. So I think I will move my separate assemblies into one. As the XP mantra goes, 'you aren't gonna need it' (Y.A.G.N.I.).
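
Here is a minimal sketch of that single-assembly idea (all the type names are invented for illustration; real concrete repositories would live in their own namespaces):

using System.Configuration;

// The model only depends on this abstraction.
public interface IMemberRepository
{
    void Save(object member);
}

// One concrete repository per storage technology.
public class AccessMemberRepository : IMemberRepository
{
    public void Save(object member) { /* OleDb code here */ }
}

public class SqlServerMemberRepository : IMemberRepository
{
    public void Save(object member) { /* SqlClient code here */ }
}

// The 'controller' that serves up a concrete repository based on a config setting.
public class RepositoryFactory
{
    public static IMemberRepository CreateMemberRepository()
    {
        string repositoryType = ConfigurationSettings.AppSettings["RepositoryType"];

        if (repositoryType == "SqlServer")
            return new SqlServerMemberRepository();

        return new AccessMemberRepository();
    }
}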

Friday, June 02, 2006

Application Specific Exception Classes

When should you create specialized exceptions in your C# application?

The .NET framework provides a few exceptions for use by applications, they are,


  • ApplicationException

  • ArgumentException

  • ArgumentNullException

  • ArgumentOutOfRangeException


and others...

These exception types are very useful and cover most cases. But they are usually not enough.
Let's say you have some special validation where arguments to a method have to 'jive', such as a ChangePassword method on a User object. The ChangePassword method takes maybe three arguments: oldPassword, newPassword and verifyNewPassword. The newPassword and verifyNewPassword arguments must match. You could just throw an ArgumentException and populate the message appropriately. This causes a potential issue, since by raising a generic exception type you are forcing the caller to handle the exception in a generic way. If the caller wants to do something special for this error, it has to parse the error message. Not very nice.


Error Codes


The structured programming world employs error codes to relay the type of error. You could implement this in C# by defining a special exception type, extended from Exception, that includes an integer ErrorCode property. Then assign every specific error a number using a series of enum types defined in the custom exception class. Use enumeration types to logically group the codes and reduce the need to reserve blocks of numbers.


The error code solution works. However, imagine what the handler looks like from the caller's point of view. There is probably a switch statement of some kind, which is not a very Object Oriented construct. In an Object Oriented design, switch statements on enum codes do not belong (see Fowler's 'Replace Type Code with Subclasses' refactoring). Enter custom exception classes.


Custom Exception Classes


If you have error type codes, and you have switching logic on those codes, applying 'Replace Type Code with Subclasses' will lead you down the path of custom exceptions. This essentially means creating a catalog of specialized exceptions derived from the Exception class in the System namespace.


Deciding what exception classes to create can be difficult. Over-specialized classes could result in a huge and unwieldy group of exceptions; under-specialized classes are essentially meaningless. As in all object modelling, you identify classes to represent concepts in the domain, and exceptions are no exception(!) to this rule. Ask what error condition concept you are trying to communicate. You might create an exception class to represent violation of an object relationship, or an exception to indicate an attempt to create a duplicate object. These classes would most likely include information regarding the objects involved, so that as much information as possible can be relayed to the user interface. Depending on the situation your exception classes may be more specific. For example, you may wish to express certain types of object relationship violation errors.


Be Agile


Attempt to model your exceptions based on the current need. Don't attempt to satisfy all possible scenarios. If you know how the caller will deal with the exception, model to that requirement. Extend later. If you find yourself tempted to add error numbers and switch on them - refactor to create exception classes.


It is somewhat painful to create exception classes properly. Run FxCop against your assembly and you'll see what I mean. If you are creating many exception classes it would be wise to create a codegen macro in VS, or at least have some code to copy/paste from to make it easier. You may also find yourself creating exception classes that add nothing: no additional data or methods, just a new type. I think this is OK, so long as you have catch statements for those exception types.
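
For reference, here is roughly the boilerplate that FxCop pushes you toward (DuplicateObjectException is just an example name; substitute your own domain concept):

using System;
using System.Runtime.Serialization;

[Serializable]
public class DuplicateObjectException : Exception
{
    // The three standard constructors plus the serialization constructor
    // keep the design rule checks happy.
    public DuplicateObjectException() { }

    public DuplicateObjectException(string message)
        : base(message) { }

    public DuplicateObjectException(string message, Exception innerException)
        : base(message, innerException) { }

    protected DuplicateObjectException(SerializationInfo info, StreamingContext context)
        : base(info, context) { }
}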

Friday, May 26, 2006

Handling the Browser 'Refresh' Button

When the user hits the 'refresh' button, the browser resends the previous request to the server, which usually results in unexpected behavior* for web applications (*bugs).


One of the difficulties with building web applications is the fact that they are hosted within a browser. The browser contains features that allow the user to customize their Internet browsing experience. On this note, it is EXTREMELY annoying when an application attempts to mess around with browser settings, if not threatening (in a securityish sort of way). So, in my opinion, disabling the refresh button is not an option! Besides, the user can always hit Ctrl-R to get a browser refresh (and yes, you could probably catch Ctrl-R with JavaScript, but that's not the point).


Response.Redirect


My current preferred method of dealing with the refresh button is to redirect back to the current page for all data modifying postback events. Something like the following,


Response.Redirect("Model.aspx");

where Model.aspx is the current page. It might not be a great idea to hardcode the page name (in case you want to change it). I've seen code where each page exposes a Url property. In this case the code would look like the following.

Response.Redirect(this.Url);

Placing these redirects in the event handlers ensures that if the user hits 'refresh' immediately after a data-changing postback, the event will not be re-fired. The refresh merely calls the redirect again (essentially) and reloads the page (as expected! how wonderful and easy too!).


Drawbacks


While the redirect method works, it is not without its difficulties. Remember, it is like a fresh navigation to the page, so it re-fires your "if( !Page.IsPostBack )" code. This can be a problem. Usually the "!IsPostBack" code populates list controls, and re-running this code will cause current selections to be lost (very irritating for the user). Normally the ASP.NET ViewState mechanism ensures the current selections on list controls are maintained through postbacks. My solution to the list selection problem is to store the current selection in Session and set the list selection manually.
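
A minimal sketch of that workaround, assuming a DropDownList called ddlVehicle (all the names here are illustrative):

private void btnSave_Click(object sender, System.EventArgs e)
{
    SaveChanges();  // hypothetical data-modifying work

    // Remember the list selection before redirecting back to this page.
    Session["SelectedVehicleId"] = ddlVehicle.SelectedItem.Value;
    Response.Redirect(this.Url);  // assumes the page exposes a Url property
}

private void Page_Load(object sender, System.EventArgs e)
{
    if( !Page.IsPostBack )
    {
        BindVehicleList();  // hypothetical method that populates the list

        // Restore the selection that the redirect would otherwise lose.
        if (Session["SelectedVehicleId"] != null)
        {
            ListItem item = ddlVehicle.Items.FindByValue((string)Session["SelectedVehicleId"]);
            if (item != null)
                item.Selected = true;
        }
    }
}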


Besides the extra management of Session variables, there is also the performance issue to consider. Each postback is now causing 2 hits to the server. Potentially calling into the database for data already displayed on the page. This seems like a high price to pay to deal with the 'refresh' button. Output caching may be a solution here...

Friday, May 19, 2006

Cost of Calling Methods in C#

What does a method call cost in C#? Should I use temps to reduce method calls?


In university, so many years ago, we learned that one of the most performance-intensive operations was the method call. The professor explained the work required to create a stack frame, move values into it, etc. Well... that was then, when maybe compilers weren't quite so efficient. Is this still true? What is the overhead when calling a method in a modern language like C#?


Why do I care about this?


Martin Fowler in his book Refactoring: Improving the Design of Existing Code, (which I am currently re-reading) describes several method creation refactorings, one of which is driven by the desire to remove temporary variables from a method (Replace Temp with Query). The motivation behind this refactoring is based on the idea that temporary variables encourage large methods and make refactoring difficult. Upon reading this refactoring I recalled the university lecture where we discussed the cost of calling methods and use of the C++ inline keyword (inline is a C++ compiler hint to not actually create a function and call it, but to generate inline code instead). In C++ 'inline' exists because function calls can be expensive. So I created a simple test to measure the overhead with method calls.


Method Call Overhead Results


Here is the code I wrote to measure the method call overhead, one method that simply does a calculation inline, and another that calls a function to do the calculation.



public class MethodCallCost
{
    private int _iterations;
    private int _valueA;
    private int _valueB;

    public MethodCallCost(int iterations, int valueA, int valueB)
    {
        _iterations = iterations;
        _valueA = valueA;
        _valueB = valueB;
    }

    public void MethodCall()
    {
        double temp = 0;
        for( int i = 1; i < _iterations; i++ )
        {
            temp = CalculateAmount(_valueA, _valueB, i);
        }
    }

    private double CalculateAmount(int valueA, int valueB, int divisor)
    {
        int result = valueA * valueB / divisor;
        return result;
    }

    public void Inline()
    {
        double temp = 0;
        for( int i = 1; i < _iterations; i++ )
        {
            temp = _valueA * _valueB / i;
        }
    }
}
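
Timing the two methods can be done with something as simple as Environment.TickCount; a minimal driver might look like this (the constructor arguments here are arbitrary, not the exact values I used):

using System;

public class MethodCallCostDriver
{
    public static void Main()
    {
        MethodCallCost test = new MethodCallCost(3000000, 7, 13);

        int start = Environment.TickCount;
        test.MethodCall();
        Console.WriteLine("Method Call (ms): " + (Environment.TickCount - start));

        start = Environment.TickCount;
        test.Inline();
        Console.WriteLine("Inline (ms): " + (Environment.TickCount - start));
    }
}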


I created the MethodCallCost object to run 3,000,000 iterations. Here are some results,


Method Call (ms)    Inline (ms)
93                  47
110                 47
93                  63
94                  62
94                  62
94                  47
94                  62
94                  47
109                 63
78                  62

So, my conclusion (with this simple and perhaps insufficient test) is that method calls are cheap, and the benefits of 'Replace Temp with Query' are probably worth it. Now of course you may have a complex method that calls a web service or a database, and in that case it probably makes sense to store the result instead of re-querying. There are still judgement calls to be made, but go ahead and create method calls; just remember to profile and tune.

Tuesday, May 16, 2006

UrlReferrer - Handle with Care

What page was the user on before this one? Hey! Request.UrlReferrer seems to have that information.


Ok, I'll admit I'm a little scared of this feature. It seems to undermine the atomicity of web requests. If I could trust it though, UrlReferrer would be extremely handy for a web app with sophisticated navigation. Imagine multiple ways to get to a screen (as any good app should allow), and the user hits the 'cancel' button, and magically is returned to the previous screen. After all, the user's natural expectation is to be returned to the screen they were just on, isn't it? Here's some code that does just that.




private void btnCancel_Click(object sender, System.EventArgs e)
{
    if(Request.UrlReferrer != null)
        Response.Redirect(Request.UrlReferrer.AbsoluteUri);
    else
        Response.Redirect("Dashboard.aspx");
}


This code checks that the UrlReferrer HTTP header is set and, if it is, redirects the browser to the referrer page. Otherwise the user is sent to a default page. Note that NUnitAsp doesn't set the UrlReferrer header, hence my null check. But...


Postbacks



If the page posts back, the UrlReferrer is set to the current page, and your navigation is now broken. You either have to store the referrer in the page load event for later use, or not put any postbacks in the page (danger! danger! you will probably forget about this and break your redirection). I don't know about you, but my pages post back a lot, especially in their immature state.
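
A minimal sketch of the 'store it in the page load event' option (using ViewState here is just one choice; Session would work too):

private void Page_Load(object sender, System.EventArgs e)
{
    // Capture the referrer before any postback overwrites it.
    if (!Page.IsPostBack && Request.UrlReferrer != null)
        ViewState["ReturnUrl"] = Request.UrlReferrer.AbsoluteUri;
}

private void btnCancel_Click(object sender, System.EventArgs e)
{
    if (ViewState["ReturnUrl"] != null)
        Response.Redirect((string)ViewState["ReturnUrl"]);
    else
        Response.Redirect("Dashboard.aspx");
}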


Server.Transfer


You can't use Response.Redirect to navigate to pages that reference the UrlReferrer. Response.Redirect causes the browser to send a GET request for the new page, so subsequent requests, such as a POST to go to another page, now carry the UrlReferrer of the current page because of the GET request made by Response.Redirect (or something like that; trace the packets and you'll see what I mean).


The more I write about it, the more convinced I am to avoid UrlReferrer. The design limitations to 'make it work' are extremely constraining and easy to forget. It doesn't work with NUnitAsp (important for me, anyway). I am also unsure of which browsers even support this header.


Alternatives


You can potentially change your navigation strategy and use the breadcrumb trail approach: just save a navigation tree in the user's Session. I have also had some success passing navigation information as URL query parameters (i.e. page.aspx?PreviousPage=Main.aspx). UrlReferrer is a fragile construct; try to find another way.

Friday, May 12, 2006

Protected or Private?

As of late, I've been setting the access level on class properties to protected.


I was creating a class, and the protected keyword came up in IntelliSense, and I paused to think. Maybe if a class inherits from this class it would be useful to have access to the member data of the class. The same goes for private methods: why not make them protected?


So, right now, I am generally going with 'protected' for all internal class stuff. Classes that I know will not be extended will have private members, and I make sure to seal those classes.


I imagine that a truly carefully designed class has a mix of private, protected and public access levels, and that setting everything internal to 'protected' is a bit naive. But for now...

Wednesday, May 10, 2006

OleDbCommand Parameters - Order Matters

Having trouble with your MS Access Update statement? Check your parameter order.


So I've written a small parameterized Update statement to create an OleDbCommand.
I was surprised to find that my Update command was returning 0 rows updated. Everything looks fine on inspection, so I pull the command into MS Access and replace the parameters with the specified values and it works fine. My code looks like this,


string sql = "UPDATE tblMemberVehicle SET MemberId=@MemberId, VehicleTypeId=@VehicleTypeId, Identifier=@Identifier WHERE Id=@Id";

OleDbCommand dbCommand = new OleDbCommand(sql, conn );

dbCommand.Parameters.Add("@Id", OleDbType.Integer).Value = vehicle.Id;

dbCommand.Parameters.Add("@MemberId", OleDbType.Integer).Value = vehicle.User.Id;

dbCommand.Parameters.Add("@VehicleTypeId", OleDbType.Integer).Value = vehicle.VehicleType.Id;

dbCommand.Parameters.Add("@Identifier", OleDbType.VarChar).Value = vehicle.Identifier;

Parameter names all match nicely, no exceptions from the database. I check the rows updated after running my ExecuteNonQuery statement, and always - 0 rows updated.


I have another update statement which is working, so I take a quick look at it. Lo and behold, the @Id was the last parameter added, and it corresponds with the parameter order as it appears in the SQL statement. Could it be that OleDbCommand works just like OdbcCommand and requires parameters to be added in the order they appear in the SQL? I begin to suspect that the parameter names are actually meaningless, so I conduct a small experiment: I change the parameter names to nonsensical names and leave the SQL parameter names alone.


string sql = "UPDATE tblMemberVehicle SET MemberId=@MemberId, VehicleTypeId=@VehicleTypeId, Identifier=@Identifier WHERE Id=@Id";

OleDbCommand dbCommand = new OleDbCommand(sql, conn );

dbCommand.Parameters.Add("@Foo", OleDbType.Integer).Value = vehicle.User.Id;

dbCommand.Parameters.Add("@Bar", OleDbType.Integer).Value = vehicle.VehicleType.Id;

dbCommand.Parameters.Add("@Try", OleDbType.VarChar).Value = vehicle.Identifier;

dbCommand.Parameters.Add("@This", OleDbType.Integer).Value = vehicle.Id;

Any guesses as to what happens? Surprise, surprise, this code works. The parameter names are truly meaningless, well against MS Access anyway. I believe SQL Server respects these parameter names and actually uses them.
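
Since the OLE DB provider treats parameters positionally, the honest way to write this is probably the provider's own '?' placeholder syntax, something like,

string sql = "UPDATE tblMemberVehicle SET MemberId=?, VehicleTypeId=?, Identifier=? WHERE Id=?";

OleDbCommand dbCommand = new OleDbCommand(sql, conn);

// Parameters must be added in the same order as the '?' markers appear in the SQL.
dbCommand.Parameters.Add("@MemberId", OleDbType.Integer).Value = vehicle.User.Id;
dbCommand.Parameters.Add("@VehicleTypeId", OleDbType.Integer).Value = vehicle.VehicleType.Id;
dbCommand.Parameters.Add("@Identifier", OleDbType.VarChar).Value = vehicle.Identifier;
dbCommand.Parameters.Add("@Id", OleDbType.Integer).Value = vehicle.Id;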


And all this time I've been sooo careful about my parameter names, sigh.

Monday, May 08, 2006

Returning Null Objects

What does it 'mean' when a method call returns a Null object?


I believe you must define interfaces very carefully. First, because once an interface is in use it is difficult to change. Second, because every method name, parameter and output is part of the description of what the interface does. The interface 'expresses' a mental model to the developer using it: it explains how the library works, or at least presents a mental model that can be used to apply the library effectively. While the code only deals with inputs and outputs, we developers use interface I/O to fabricate an understanding of what's happening under the covers. It is therefore important to specify an interface carefully. So, with that in mind, I propose a few instances where it makes sense to return a Null object from a method.


Errors


If you are not using exceptions to relay errors or unexpected system conditions, and you are instead setting a global or passed-in error structure, your method should return a Null object reference. You don't want the processing to continue as if nothing was wrong; in the case of an error, the application should proceed into recovery mode. Oh look, I got a Null, something bad must have happened. I can either check the error structure and avoid using the Null object, or I can ignore the error structure and get a null reference exception (No! Bad programmer!).


Object not found


In many cases I have returned Null from a 'find' or 'get' method when it was unable to retrieve a requested object. While it is necessary to check the return value for Null in these cases, the implementation is simple. A Null return is sometimes even acceptable as it stands, when it forms an input to another method which allows a null parameter. I am still in favor of this use of Null object references. After all, the database supports the concept of Null too.


Null Object Pattern


The Null Object Pattern entails essentially 'stubbing' out methods and creating an object indistinguishable from the real object. The client object can use the Null object as it would the real thing, and no unpleasant Null checks are required.
I believe the Null Object Pattern is only useful for stubbing out code where the object's behavior is not important to the client object. Maybe I can stub out the application logging object, for example: if I haven't configured my logging, the logging system returns a Null logging object that doesn't fail on the log call, but does nothing. I can't, however, stub out the Math object from which I am expecting important calculations. I would be at risk of introducing difficult-to-find bugs; I would rather get an 'object reference not set' exception than a series of zeroes displayed in a report.
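
A quick sketch of the logging example (the types are invented for illustration):

public interface ILogger
{
    void Log(string message);
}

// The real logger writes somewhere useful.
public class FileLogger : ILogger
{
    public void Log(string message)
    {
        // write to a file, the event log, etc.
    }
}

// The Null Object: safe to call, does nothing.
public class NullLogger : ILogger
{
    public void Log(string message)
    {
        // intentionally empty
    }
}

Callers never need a null check; if logging isn't configured they simply get a NullLogger and carry on.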


Empty List or Null


If my method returns a list of objects, and there are no objects to return, it makes sense to me to return an empty list. See Tor Norbye's blog post on this. You certainly can't return either Null or an empty list and ascribe the same meaning to both; they're two different things. An empty list is easy: no objects found. Null? That just means an error to me, like the object wasn't properly initialized or something.

Tuesday, May 02, 2006

Repository Create Pattern

I'm working on a new object creation pattern. This pattern is an elegant solution that fits within the Repository pattern described in Domain Driven Design by Eric Evans. I'm trying to ensure the following constraints in my application,


  • Objects are valid at all times

  • An Entity object without an identifier (db key) is invalid

  • Don't want to have to use GUIDs as identifiers

  • Objects that cannot be saved (because of their state) are invalid


Essentially, if an object exists I must be able to save it without getting constraint errors from the database.


Introducing 'Repository Create'


The Repository Create pattern works a lot like the factory patterns: you create objects through the repository. 'Entity' objects (objects that must be persisted) must all be created by a repository. That repository ensures uniqueness of business keys (if there are any) and applies an id to the new object. Any errors creating this new object and 'no object for you!'. The application never has partially formed or duplicate objects floating around.


What about updates? Model objects should not permit updating of their keys. Simple. But what if I modify a property and thus make the object a duplicate? You shouldn't be able to do this. Properties that are part of the uniqueness constraint must be modified through the repository, by a repository Update method.

An Example


Let's say I'm writing a project management application and I have a 'Project' object. The Project object must have a unique name so users can identify it, but that name can change too.


I simply create my project objects by calling 'create' and passing the project name to my project repository.

Project p = projectRepository.Create("Project 1");

If the name 'Project 1' is a duplicate I get an exception from the repository. If not, I get a new Project object with a valid database Id, and I am assured the name is not duplicated. Updating the name would look something like this,

projectRepository.Update(p, "Project One");

Access to the project name setter has to be restricted, either by using the C# internal keyword or by interface casting* (*a slippery way to implement 'friends' in C#).
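
For completeness, the repository side of this might look roughly like the following (sketched under my own assumptions; the persistence details are stubbed out):

using System;

public class Project
{
    private int _id;
    private string _name;

    // internal: only the repository can create projects or change the name
    internal Project(int id, string name)
    {
        _id = id;
        _name = name;
    }

    public int Id { get { return _id; } }
    public string Name { get { return _name; } }

    internal void SetName(string name) { _name = name; }
}

public class ProjectRepository
{
    public Project Create(string name)
    {
        // 1. Check the business key for uniqueness (hits the database).
        if (NameExists(name))
            throw new ApplicationException("A project named '" + name + "' already exists.");

        // 2. Insert the row and get the new database id back.
        int newId = InsertProject(name);

        // 3. Only now does a Project object exist in the application.
        return new Project(newId, name);
    }

    public void Update(Project project, string newName)
    {
        if (NameExists(newName))
            throw new ApplicationException("A project named '" + newName + "' already exists.");

        UpdateProjectName(project.Id, newName);
        project.SetName(newName);
    }

    // Stubs standing in for the real data access code.
    private bool NameExists(string name) { return false; }
    private int InsertProject(string name) { return 1; }
    private void UpdateProjectName(int id, string name) { }
}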


This is the Repository Create pattern in a nutshell. It seems to be working fairly well so far, of course my requirements have also been fairly simple and I haven't had to do much optimization. More to come on this pattern...

Sunday, April 30, 2006

Apply a Default Sort Order

Always sort your object lists


I tend to get pretty lazy with my coding sometimes. My TODOs occasionally require more typing than it would take to actually do the TODO (I sometimes get lazy with the thinking part). So why would I recommend* (*insist on) sorting every list of objects in your system? Two reasons.


  • It makes testing a whole lot easier

  • You're less likely to have an unsorted list of items appear in your GUI


I have only just started doing this with my code* (*I may post a more educated and experienced blog later which includes the phrase 'it depends' when talking about when to sort lists) and it seems to be paying off so far.


Writing Tests on Sorted Lists


In the past I would put together a small 'IsInList' method to test that an object was in a list, or I would create a 'GetObjectInList' method to get an object and test it for expected values. The fact that the objects were stored in a random order made this necessary. The unsorted nature of the lists also meant the search had to be sequential, which seemed amateurish (the performance implications were negligible since the lists were tiny, but...).

In any event, the nice thing about using a database is the ease of adding sorting. With a simple ORDER BY in MS Access I can sort on any column ascending or descending. Very little code. The .Net Framework also provides some simple and powerful ways to sort lists. Sorting my lists makes testing easier since object location is predictable. Testing is all about ensuring predictability anyway, right?


Your GUI lists should have default sort orders


If you have object lists and a GUI, you probably have list controls. I'll bet users are expecting data in those lists to be sorted either according to a scheme of their choosing, or in a commonly expected order. Ah, but if you use a layered design* (*like most kind-hearted developers) the question here becomes... where should sort logic exist? In the presentation layer or in the business model? I prefer a thin* (*super thin) presentation layer; the only code in the presentation layer should deal with form and page controls. I suggest then, that sorting objects is the responsibility of the model and is a business rule. Sorting is part of the analysis of data, like calculations. Testing logic in the GUI is also more difficult, so move as much logic as possible into the model layer. If you are consciously sorting your object lists and your tests are checking this, there is very little chance of a nasty unsorted list control bug appearing in your GUI (this happened to me once, and was caught by a customer! egad.)


The .NET Framework offers some powerful sorting tools, namely the IComparable interface and Comparer objects. These two tools give you unlimited sorting power, easily and elegantly.
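
For instance, a default sort order for a model object can be expressed with IComparable, and alternate orders with separate IComparer implementations (the Vehicle class here is just for illustration):

using System;
using System.Collections;

public class Vehicle : IComparable
{
    private string _name;

    public Vehicle(string name) { _name = name; }

    public string Name { get { return _name; } }

    // Default sort order: alphabetical by name.
    public int CompareTo(object obj)
    {
        Vehicle other = (Vehicle)obj;
        return String.Compare(_name, other._name, false);
    }
}

With that in place, sorting in the model is one line: build the collection in an ArrayList and call Sort(), or pass an IComparer to Sort() for an alternate order.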

Monday, April 24, 2006

ASP.Net Timeout

HTTPUnhandledException: Request Timed Out.


Every now and then, when a request had a lot of work to do, I would get this exception. It didn't 'ripple up' from a method in the regular way that an exception does; it just caused execution to cease and was caught by the default error handler. There are a few places where timeouts are set. One is in IIS: in the web site properties there is a setting called 'connection timeout'. I thought this might be the candidate, so I increased the timeout, but no luck. The operation causing the timeout was a data integration over a network. The integration API had a connection timeout setting too, but changing it had no effect.


Another strange behavior was that in debug mode the problem went away; the operation would complete without the timeout. Sadly, I had to debug the problem in release mode, with many symbol files missing. After around the same amount of time I would get an 'ObjectDisposedException' on my NetworkStream object, followed by a ThreadAbortException.


Turkey Trails


I found some threads online about the .NET garbage collector prematurely destroying objects if it thought they were out of scope (or might as well be, since there were no subsequent references to them). So I daftly started adding garbage collector control statements to my code to preserve my objects. Of course this just 'felt' wrong, but hey, desperate times... Well, of course, no luck.


Day 2


It always helps to be fresh, and to bring in a fresh perspective. So, with the help of a coworker we clarified 2 important clues.


  • Something is terminating the worker thread

  • The thread is terminated after 90 seconds


There was definitely a timeout somewhere which was bringing everything to a halt. I already knew it wasn't the IIS connection timeout, but what about this ASP Script timeout under virtual directory configuration? It was set to 90 seconds too! Alas, it was not the culprit. Which of course made sense after I tested it, but ya never know.


The Almighty Interweb


How did I code before the Internet? (Yes I am that old). Some Googling later I discovered, da na na naah! An obscure, yet highly important setting in machine.config. Something that ASP.Net uses to terminate threads. Something that is defaulted to 90 seconds. The answer, the culprit! A glorious little piece of XML.


ExecutionTimeout


Allow me to introduce executionTimeout. It looks like this,

<configuration>
  <system.web>
    <httpRuntime executionTimeout="90"/>
  </system.web>
</configuration>

It is set to 90 seconds in machine.config, and does not appear in the default web.config file that VS.NET creates for web applications. Every ASP.NET developer should be aware of this setting. Why is it kept hidden away? Ahhh, the eternal mysteries... Anyhow, there it is. Add it to your web.config, with a value that suits your longest-running requests, and pass it on.

Thursday, April 20, 2006

ToString()

What should I do with ToString()?


In the Microsoft .NET Framework, ToString is defined on Object and is therefore ubiquitous (and everywhere too). You have the chance to override ToString whenever you want. But if you're thinking 'hmm... I don't override ToString() very much', then I would suggest maybe it's time to start. It certainly is handy to call whenever you need to output an error message. I almost just expect it to work (but nothing in life is free). Just call ToString() on your object and add it to your trace messages. Very nice, but error log messages aren't the only possible use for ToString(). Alas, with such a ubiquitous method come multitudinous options for usage. int.ToString() is an example of the obvious and straightforward, but what do you do with an object with multiple properties, sub-objects, inheritance and so forth?


Proposed Behavior


I like the idea of ToString() displaying object identity. If you work with objects that model something real, like a business concept, then output information that identifies the object. Not the internal key or database id, but the natural key: the data that people understand to represent the essence of the thing. For instance, if your object represents a customer, ToString() would return the customer's name. Return a string that would be identifiable to a user of your software. Take a look at how DateTime.ToString() works; it is a good, simple model. The default ToString() returns the date as a string formatted according to the current thread culture. It displays the date in a way that I can read it. DateTime.ToString() takes the various members (day, month, year, etc.) of DateTime and assembles them into a reasonably formatted presentation. Most objects have a 'name' property or a similar identifier, and this is what ToString() should output. Converting an Address object with ToString() would probably output the address as it would be written on a letter.
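
Something along these lines, using an invented Customer class:

public class Customer
{
    private int _id;       // database key: not identity to a human
    private string _name;  // natural key: what a user recognizes

    public Customer(int id, string name)
    {
        _id = id;
        _name = name;
    }

    public int Id { get { return _id; } }
    public string Name { get { return _name; } }

    // Identity, not content: the string a user of the software would recognize.
    public override string ToString()
    {
        return _name;
    }
}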


Complex Objects


What about something like a Report object? Should ToString() return the report content as a properly formatted string? I would suggest that this kind of behavior is beyond the intent of ToString(). I would be inclined to only output details that, again, identify the report (its title, run date, and so on), not the content of the report. That should be left to specialized methods that can be expanded for more flexible use. Output the string that would be written on a file folder tab: the content is in the folder, ToString is the human readable index, the information you would need to retrieve the object.


Format Providers


Now it gets complicated (well, more complicated for me). What about customizing the output of ToString()? Back to my favorite, DateTime. People want to see dates in all sorts of different ways. DateTime.ToString() solves this by taking an optional IFormatProvider to display dates according to cultural preferences. Exposing ToString() with parameters is a good way to provide extra control over formatting. Again it still only outputs identity, just in a different format.


I am starting to implement a ToString() override on every class I write, and it is very convenient. I truly dislike seeing the default Namespace.ClassName output when I call ToString(); very annoying in my view. Implement ToString and make users of your code happy (including yourself).

Monday, April 17, 2006

Elegance is Not Optional

I have an old book (1994) on Prolog called The Craft of Prolog, written by Richard A. O'Keefe (MIT Press). One of the crucial 'values' of the book is that "Elegance is Not Optional". I find this to be a very striking statement, as I have always felt that design suffered under the realities of practical use. Application performance always seems to ruin a program. It may be that Prolog, because it runs in an isolated workspace, has the pleasure of elegance; there are no nasty operating systems or databases to deal with. Yet I am still drawn to this vision, I want to believe. I have added a small quotation from the text.


Elegance is not optional (1)


What do I mean by that? I mean that in Prolog, as in most halfway decent programming languages, there is no tension between writing a beautiful program and writing an efficient program. If your Prolog code is ugly, the chances are that you either don't understand your problem or you don't understand your programming language, and in neither case does your code stand much chance of being efficient. In order to ensure that your program is efficient, you need to know what it is doing, and if your code is ugly, you will find it hard to analyse. (The Craft of Prolog, Richard A. O'Keefe, MIT Press, 1994)


I believe that a developer should strive for readability first and foremost. Most of the guidance in Code Complete (Steve McConnell, Microsoft Press) is aimed at improving code readability. I would even go so far as to say readability first, function second. You can always debug a readable program and make it right. But a mess of code that works might as well not exist, since any changes to it probably mean rewriting it anyway.


Readability is About Design


Comments, documents, unit tests: all of these things merely support readability and understandability, in the way that a textbook supports a professor. The professor must still be able to teach and must stand on their own, with or without the text, to be truly effective. Such is also true with code. It should make sense without the supporting pieces, since so often these pieces fall into disrepair. Code is never 100% covered by tests. Comments are usually sparse, out of date or poorly written, as is documentation. The organization and naming in the code must point to its intent and function. The code expresses the solution, it doesn't just implement it. This should be the developer's prime concern: express the problem and its solution with the program itself.


Elegant Design


What is elegance? Have you ever heard the expression 'form follows function'? Everything in the system must have a good reason for being there. Nothing is superfluous; there is no programmer ego, or 'interesting' technology, or 'feats' in the code. The code should read like a well written technical document: step by step to a predictable ending. Detail at each level is consistent, methods and data are grouped logically, nothing is out of place or mysterious. Design also subscribes to the reverse expression, 'function follows form', whereby choosing an elegant form results in a superior design. This implies an intuitive element to coding, where you can't just apply rules and refactorings. If it looks good, it probably is good.


If you understand the problem, you should be able to look at the solution and say ahhh... yes, I see.

Sunday, April 16, 2006

Exceptions vs Return Codes - Performance Implications

I am working on inventing or finding a satisfactory validation strategy for my .Net web applications. The final model must appease the following criteria,



  • I must be able to validate the entire screen in one shot.

  • There must be no duplicated code.

  • Objects in the system must always be 'valid'.

  • I want to use as much C# as possible, and minimize javascript


I must be able to validate the entire screen in one shot


This is purely a usability concern. Who wants to go through one error at a time on a page? Fix this, now fix that, oops... this one too. Oh, that would be a duplicate, you'll have to come back later. I think the system should be able to point out all problems immediately.


There must be no duplicated code


I don't want to maintain two sets (or more) of validation code. As in, validation objects on the page, and validations in the model layer. This requirement is in conflict with the other two, which makes my demands demanding.


Objects in the system must always be valid


I don't like the idea of an object being somehow malformed or incomplete. A situation where I have to check an object before I do anything seems to contradict my understanding of encapsulation. An object that starts throwing errors because it is invalid contradicts the very reason for using objects, which is keeping things simple. This requirement conflicts with requirement 2 because you can't just take the validations out of the object; to ensure validity, the object has to do validation.


I want to use as much C# as possible, and minimize javascript


I like C#. Visual Studio is an excellent IDE for coding in C#. C# is strongly typed and Object Oriented and is a fine language all around. I do not like JavaScript. It is not strongly typed, and not really Object Oriented. JavaScript is difficult to debug and VERY browser specific. The end.


The requirements may be asking for too much, but I feel that there is somehow a way to satisfy them all, and still have a fairly elegant solution.


Back to Performance


So, I am currently comparing validation through exceptions against the more traditional (pre OO) return codes. My first test is performance. I have read much on the Internet regarding the poor performance of exceptions so I wanted to see it for myself and get some hard numbers. The evidence indicates that Exceptions are slower (Using 1.1 of the .Net Framework). Slower by about 100%!


The Tests


The tests were fairly simple. I created a class whose constructor took two arguments, a string and an integer. If the string was null or empty, that was a validation error. If the int wasn't between 0 and 100, that was an error. The exceptions method just threw the various 'Argument' exceptions provided by the framework and filled in a nice descriptive error message, which was passed to Debug.WriteLine. The return code method was more difficult to code and even required a trip through the debugger (remember, I am a bit daft after all). I had to pass an error object into the class constructor so I could get the same amount of information about the validation errors. Foolishly I declared this object as a 'struct', which meant it was being copied to the stack and thrown away, with all my error information too!
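
The two variations looked roughly like this (the class and its rules are illustrative, not the exact test code):

using System;

public class WidgetWithExceptions
{
    public WidgetWithExceptions(string name, int amount)
    {
        if (name == null || name.Length == 0)
            throw new ArgumentNullException("name", "Name is required.");
        if (amount < 0 || amount > 100)
            throw new ArgumentOutOfRangeException("amount", "Amount must be between 0 and 100.");
    }
}

// The return code flavour collects errors in a passed-in object
// (a class, not a struct, so the caller actually sees the errors).
public class ValidationErrors
{
    public string Messages = "";
    public void Add(string message) { Messages += message + Environment.NewLine; }
}

public class WidgetWithReturnCodes
{
    public WidgetWithReturnCodes(string name, int amount, ValidationErrors errors)
    {
        if (name == null || name.Length == 0)
            errors.Add("Name is required.");
        if (amount < 0 || amount > 100)
            errors.Add("Amount must be between 0 and 100.");
    }
}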


The Results


My test tried to create an object with a problem that would be caught by each of the validations. This code was run in a 10,000 iteration loop, with DebugView (sysinternals.com) open to verify the output. Here are the numbers (ms) I got for 8 runs:


Exceptions    Return Codes
9,157         5,234
9,953         5,204
9,247         5,235
10,469        5,360
10,204        5,484
10,203        4,734
10,875        4,610
10,843        5,156

I can't account for why the exceptions test slowed down over the runs; it would probably be worth looking into. The tests were extremely quick to run considering they were attempting to create 30,000 objects each. I was amazed that so many exceptions could be handled in 10 seconds. So, while the exceptions method is much quicker to code and debug, the return codes method is the clear winner when it comes to performance.

Thursday, April 13, 2006

Config File Mayhem

There are config files all over the place!


I don't know if you've ever gotten stuck in the config file swamp, but I'm there now and it is stiiinnkkkeeee! Each project in my VS2005 solution has a config file. There is one for the unit test project, one for the web app (web.config), and now I've just added one for the console app. Part of the reason for all of these config files is the fact that all the projects rely on the model library. You know, the library that does all the work.

Notice in the spiffy diagram how the model library depends on each config file. This creates what I would call 'tight coupling' between layers of the application. The model layer loses reusability and independence: a sort of 'indirect cohesion' between the subsystems through the config file.


My first thought is to refactor the code to move the dependency on the config file up to the 'owners'. The configs are after all owned by these assemblies, so it makes sense for those assemblies to have exclusive interaction with the config files.

So, by the diagram, all interaction with the config file would be up to the config file owners. Settings would then have to be passed into the model as parameters. If this creates a long parameter list, I could group settings into structures.
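
A rough sketch of what I have in mind (the ModelSettings members and class names are invented for the example):

using System.Configuration;

// Owned by the model: a plain settings class with no config file access.
public class ModelSettings
{
    public string ConnectionString;
    public string SmtpServer;
}

// The model consumes settings; it never reads a config file itself.
public class ReminderService
{
    private ModelSettings _settings;

    public ReminderService(ModelSettings settings)
    {
        _settings = settings;
    }

    public ModelSettings Settings { get { return _settings; } }
}

// The owner of a config file (web app, console app, test project)
// reads its own config and hands the values to the model.
public class ConsoleHost
{
    public static ReminderService CreateService()
    {
        ModelSettings settings = new ModelSettings();
        settings.ConnectionString = ConfigurationManager.AppSettings["ConnectionString"];
        settings.SmtpServer = ConfigurationManager.AppSettings["SmtpServer"];
        return new ReminderService(settings);
    }
}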


As usual, foreseeing such problems is always beyond me and it will probably take me some time to work this out to my satisfaction. More postings on this to come :)

Tuesday, April 11, 2006

Creativity and TDD

I just finished reading a blog post on creativity and speed (Posted to the only blog I actually read regularly) which made me think about Test Driven Development as a creativity booster. Kent Beck describes the TDD cycle as


  • Write a test

  • Make it run

  • Make it right


Step 2 is what intrigues me. He suggests "Quickly getting that bar to go green dominates everything else" (Test Driven Development By Example, Kent Beck, p. 11). "Quick green excuses all sins". I believe Kent's intent here is to keep up development momentum, to be driven by the tests. Moving quickly ensures focus and doing only what is necessary; the point is to ensure that you only write code to satisfy the test. However, Kathy Sierra's post made me realize there is another (perhaps unintentional) outcome from being test driven.


I have discovered an interesting side-effect of being test driven: creativity. Making that test pass, quickly, stimulates creativity. I've noticed that during the frantic push to green my mind races for options and crazy (hackery for sure) solutions to make the test pass. The feeling of 'How can I make this pass? Quick! You only have 10 minutes! Time! Time!' is a very invigorating, demanding process, and super-rewarding when that test goes green. Of course, once the test passes, I'm usually left with a horrible, disorganized, non-optimal mess. But I've usually learned something new, or done something creative with what I already know. This is completely consistent with the creativity/speed theory: creative thought comes from the messy part of the mind and it's not going to be clean coming out. Worse yet, trying to create 'clean' kills ideas. You've gotta get those ideas out and down. Don't think, just go! Get that test to pass; you can clean it up later (refactor).


Refactoring, of course, satisfies the analytical side of the brain, where you get a chance to be calm and methodical: creating organization, collecting tasks and related concepts, giving proper names to objects, and so on. I think this mental flip-flop between creative and analytical activity is what makes TDD so appealing to use. The brain is fully utilized with TDD; creativity is allowed to prosper without compromising quality. This may explain why test driven developers can't imagine coding any other way.

Thursday, April 06, 2006

ADO.NET OLEDB Syntax Errors

I have a large MS Access SQL script that creates my Access database. I need to run it programmatically to create databases on the fly, so I've created a C# application that reads the file and applies the SQL against the database. However, whenever I use the OleDb data provider I get syntax errors. The following SQL, for example, causes a 'CREATE TABLE' syntax error.

CREATE TABLE Action (
Id INTEGER NOT NULL,
ActionTypeId INTEGER NOT NULL,
Name TEXT(50) NOT NULL,
Description TEXT(255) NULL,
ParentId INTEGER NULL,
SortOrder TEXT(255) NULL,
Tooltip TEXT(255) NULL,
RequiredLicense INTEGER NULL,
OptionalLicense INTEGER NULL,
TableItem TEXT(50) NULL,
SystemPrefEnabled TEXT(255) NULL,
PRIMARY KEY (Id)
)

If I paste this SQL into Access it runs successfully. Hmm... I cut the statement down to create a table with only the first column; it still fails. So I change the name 'Action' to 'tblAction' and the statement executes without error, creating my table with the name 'tblAction'. So it would seem that 'Action' is some kind of important or reserved word. But why does it work from a SQL window in MS Access? I do not know.


I was about to give up on this and change careers, when I thought I would try connecting to my Access database through ODBC. Well, it's been a while since I've used ODBC, but the syntax is pretty much the same, so the change is easy. I found a good connection string at www.connectionstrings.com and voila! My entire SQL script runs without error. What!?
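For what it's worth, the ODBC version of the script runner looks something like this. The file paths and the semicolon splitting are assumptions on my part; the connection string shape is the standard Access ODBC one from connectionstrings.com:

using System.Data.Odbc;
using System.IO;

public class ScriptRunner
{
    public static void Main()
    {
        // Hypothetical paths to the Access database and the SQL script.
        string connectionString =
            @"Driver={Microsoft Access Driver (*.mdb)};Dbq=C:\data\cars.mdb;Uid=Admin;Pwd=;";
        string script;
        using (StreamReader reader = new StreamReader(@"C:\data\create_tables.sql"))
        {
            script = reader.ReadToEnd();
        }

        using (OdbcConnection connection = new OdbcConnection(connectionString))
        {
            connection.Open();

            // Naive split on ';' (assumes no semicolons inside string literals).
            foreach (string statement in script.Split(';'))
            {
                string sql = statement.Trim();
                if (sql.Length == 0)
                    continue;

                using (OdbcCommand command = new OdbcCommand(sql, connection))
                {
                    command.ExecuteNonQuery();
                }
            }
        }
    }
}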

I would dig into this and understand why my SQL runs in some cases and not others, but alas, I am already late on this deliverable...

Wednesday, April 05, 2006

ExecuteNonQuery?

I create an Insert query string and pass it into my command's 'CommandText' property, and then I call 'ExecuteNonQuery'? NonQuery? Shouldn't that be 'ExecuteQuery'?

But wait, if I set my CommandType to StoredProcedure, then I guess I'm not executing a query. Why wouldn't I call ExecuteStoredProcedure? Well, I suppose MS Access doesn't support stored procedures.

But hold on, my command object is a SqlCommand object not an OleDbCommand object. Hmmm... so mysterious...
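As far as I can tell, the naming maps onto three cases: ExecuteNonQuery for statements that change data (it returns the number of rows affected), ExecuteScalar for queries that return a single value, and ExecuteReader for queries that return rows. A quick sketch (the table and column names are invented):

using System.Data.SqlClient;

public class CommandExamples
{
    public static void Run(SqlConnection connection)
    {
        // ExecuteNonQuery: INSERT/UPDATE/DELETE (or DDL); returns rows affected.
        SqlCommand insert = new SqlCommand(
            "INSERT INTO Car (Name) VALUES ('Corolla')", connection);
        int rowsAffected = insert.ExecuteNonQuery();

        // ExecuteScalar: a query that returns a single value.
        SqlCommand count = new SqlCommand("SELECT COUNT(*) FROM Car", connection);
        int total = (int)count.ExecuteScalar();

        // ExecuteReader: a query that returns a result set.
        SqlCommand select = new SqlCommand("SELECT Id, Name FROM Car", connection);
        using (SqlDataReader reader = select.ExecuteReader())
        {
            while (reader.Read())
            {
                string name = reader.GetString(1);
            }
        }
    }
}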

Sunday, April 02, 2006

Aggregate or Association?

Ever since reading Domain Driven Design by Eric Evans, I have been obsessively object modelling every application I write. I am surprised to find that I can hardly ever use aggregation (although I think I should be modelling aggregates). My limited use of aggregates comes from my inability to follow the Aggregation rules, which are,


  • Objects whose lifecycle are tied to another object indicate a possible aggregate relationship

  • Objects outside of the aggregate cannot hold references to objects inside the aggregate, only the aggregate 'root' is referenceable

  • The Aggregate is responsible for enforcing invariants (consistency and integrity) within the aggregation


Domain Driven Design encourages 'Aggregate Boundaries', which means classes outside of the aggregate do not hold references to objects within the boundary. Eric Evans suggests that "Transient references to internal members can be passed out for use within a single operation only." (P. 129), which I imagine means convenience access for one-off operations, but no persisted references.


My aggregate issue tends to crop up whenever I have relationships between objects that validate combinations for proper construction of another object (Does this make any sense?). An example is required to explain here. Take the following diagram,

Imagine an application designed to help people configure and purchase PCs. This application restricts the processors that can be included in a PC by displaying, and allowing the user to select, only valid processors for a chosen manufacturer. The system wouldn't allow you to select an "Intel Athlon 50MHz", because such a chip doesn't exist. Manufacturers produce certain classes of chips which in turn run at certain clock speeds. I believe this is an aggregate relationship, since if I delete a manufacturer from the system, all the related chip classes should be removed as well; they cannot be shared across manufacturers. The model seems to support the requirements.


However, the problem comes when I have an actual instance of a PC with a selected Chip. I want the Chip object to have a specified manufacturer, class and clock speed, which are a valid combination and are references to these objects in the system (I.e. if I change a chip class name it should be applied to every PC that has it selected). Connecting my PC object to a Chip object and a Clock speed violates the aggregate boundaries. See the following "UML" diagram, which demonstrates this violation.
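In code, the shape of the problem is something like the sketch below. The class names follow the example; the members are my own invention:

using System.Collections.Generic;

// Aggregate root: the Manufacturer owns its chip classes and is responsible
// for the invariant that a selection is a valid combination.
public class Manufacturer
{
    private List<ChipClass> chipClasses = new List<ChipClass>();

    public void AddChipClass(ChipClass chipClass)
    {
        chipClasses.Add(chipClass);
    }

    public bool IsValidSelection(string chipClassName, int clockSpeedMhz)
    {
        foreach (ChipClass chipClass in chipClasses)
        {
            if (chipClass.Name == chipClassName && chipClass.Supports(clockSpeedMhz))
                return true;
        }
        return false;
    }
}

// Internal to the aggregate: nothing outside should hold a reference to it.
public class ChipClass
{
    public string Name;
    private List<int> clockSpeedsMhz = new List<int>();

    public void AddClockSpeed(int mhz) { clockSpeedsMhz.Add(mhz); }
    public bool Supports(int mhz) { return clockSpeedsMhz.Contains(mhz); }
}

// The violation: a configured PC wants a long-lived reference to a ChipClass
// (and a clock speed) that live inside the Manufacturer aggregate.
public class PC
{
    public ChipClass SelectedChipClass;   // reaches inside the aggregate boundary
    public int SelectedClockSpeedMhz;
}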

I have now run into this problem on 3 of the real world applications I have Domain Modeled. My suspicion is that my concepts are not quite modelled correctly, or that perhaps my aggregate is not really an aggregate. In any case, I have left the relationships as simple associations which I feel doesn't completely express the domain.

Thursday, March 30, 2006

Multiply Loaded .Net Assemblies

I have a .Net web application that seems to use an inordinate* (*Outrageously high) amount of server memory. So, using Process Explorer (which I downloaded from SysInternals) I examined the ASP.NET worker process (aspnet_wp.exe), displayed its loaded DLLs, and noticed that most of the assemblies in my application are loaded multiple times, which probably explains the memory usage.
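One thing I can at least do from inside the app is dump the assemblies loaded into the current AppDomain and compare that list with what Process Explorer shows for the whole worker process; duplicates that only appear at the process level might point at multiple AppDomains or evidence sets. A rough sketch (to be called from a scratch page or handler):

using System;
using System.Reflection;
using System.Web;

public class LoadedAssemblyDump
{
    // Writes each loaded assembly in the current AppDomain to the response,
    // so repeats are easy to spot.
    public static void WriteTo(HttpResponse response)
    {
        foreach (Assembly assembly in AppDomain.CurrentDomain.GetAssemblies())
        {
            response.Write(assembly.FullName + "<br/>");
        }
    }
}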


As usual I have no cure, just a frightening diagnosis. So I did some exploring on the Inter-Web* (*My name for the WWW) and found Shawn Farkas' blog on .NET security, and an article on how security evidence will cause an assembly to be loaded once for each evidence set. Of course I am left confused, as my .NET security knowledge is weak.


I have also heard that thread culture settings can cause assemblies to load multiple times, but I have not confirmed this behavior.


My quest is now to understand this behavior, and hopefully rectify it.

Wednesday, March 29, 2006

Upgrading NUnitAsp to .Net Framework 2.0

I want to use NUnitAsp with my VS 2005 web projects. So I added a reference to the 1.1 framework compiled version in my test suite project. But the NUnit GUI doesn't find my tests? The code compiles, however. I suspect it may be possible to mix code from different framework versions, but perhaps I need to look at my .NET architecture books again. Anyhow, I thought perhaps mixing versions doesn't work, so I proceeded to convert NUnitAsp to version 2.0. Of course NUnitAsp has references to 1.1 libraries (NUnit.Framework for example), so I also updated these references to point to 2.0 libraries. But still my tests don't show up. I could always create a 2003 GUI test project and test the 2005 pages, but I would really like everything together. Since it's only test cases, I don't mind the requirement of a mixed environment.


I am not giving up on this one; I really like using NUnitAsp, even if I have to run a separate GUI-only test project.


Some further investigation revealed some interesting clues. Browsing the properties of my referenced libraries showed me that NUnitAsp is compiled against 1.0.3705, framework version 1.0! I don't even have that version installed! So... I guess it's not compiled against a specific framework version? How does this work? 2+ years of .NET and I don't even grasp the basics. Man!


After a couple of hours of messing with the NUnitAsp files and trying to get the supplied tests working, I discovered that NUnit was not seeing my GUI tests because the test fixture class was not declared public. That is twice now that I have been burned by the new default class visibility. Visual Studio 2005's new-class template omits the access modifier, leaving classes internal (I hope there is a way I can change this code-gen setting), and I have obviously become too used to the VS 2003 defaults. In any case, I managed to get NUnitAsp working against my 2005 code, and I am happy* (*temporary condition only). I suspect that I could have used the 1.0-compiled version and it would have also worked.
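For the record, the gotcha boils down to one keyword. A minimal sketch (the fixture name is made up):

using NUnit.Framework;

// Without 'public', the VS2005 template leaves the class internal and the
// NUnit GUI silently skips the fixture; the project still compiles fine.
[TestFixture]
public class HomePageTests
{
    [Test]
    public void FixtureIsVisible()
    {
        Assert.IsTrue(true);
    }
}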

Tuesday, March 28, 2006

Red-Green-Refactor-Profile-Tune

When should code be tuned for performance? I have seen far too many applications built with little or no consideration for performance, or at best where performance is left as an afterthought. Even the phrase 'performance tune' is problematic because it assumes that performance problems can simply be 'tuned' away, and that performance problems are trivial (which is far from the reality). Tuning code is rarely possible, as the fundamental design is usually the source of the problem. Should you design for performance up front, then? I've seen that case too, and it usually means a complicated design and poor usability (sorry, we can't do that, it would be too slow). So when do we worry about performance?


If "Premature optimization is the root of all evil", then what does premature mean? and when does it make sense to optimize. Most performance problems stem from a naive approach to design (it works for 10 records so it's good), and ignoring performance until it becomes a problem. I've seen applications that used XML to pass data between every object in the system. I've seen applications where every object is a COM object (I built an app like this once). What I also see is developers making excuses for these poor design decisions and I hear sentences like "It's too late to fix that now, that would require a full rewrite". I've been involved in far too many last minute performance 'blitzes', hacking away at code and patching up anything that can easily be fixed. I propose that optimization should be as constant as any other activity in an Agile development cycle.


Enter RGRPT. I propose an addition to Kent Beck's famous Red-Green-Refactor mantra: optimization becomes part of the refactoring process, so that the enhanced mantra goes something like Red-Green-Refactor-Profile-Tune. So, what does this mean? Here's how it works. Follow the same rules for TDD: write a test, define the desired interface, and write code to satisfy the test. Create a naive initial design; just make the test pass. Now define a performance goal and write a test for that too. The performance tests should probably run against a well-sized database with indicative data (a real customer database is ideal here). Run your performance test against the large database and, if it fails, profile it. I highly recommend the JetBrains profiler. It is important to use a profiler, as assumptions about where code is slow are very often incorrect. Tune the code, fix the design, do whatever is needed to get that test green.
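A performance test in this scheme doesn't need to be fancy. Something like the following sketch would do, where the CustomerRepository class and the two-second goal are stand-ins for whatever the feature and its target actually are:

using System.Diagnostics;
using NUnit.Framework;

[TestFixture]
public class SearchPerformanceTests
{
    // Hypothetical goal: a name search against a realistically sized
    // database should come back in under two seconds.
    [Test]
    public void SearchCompletesWithinTwoSeconds()
    {
        CustomerRepository repository = new CustomerRepository();  // hypothetical class under test

        Stopwatch timer = Stopwatch.StartNew();
        repository.FindByName("Smith");
        timer.Stop();

        Assert.IsTrue(timer.ElapsedMilliseconds < 2000,
            "Search took " + timer.ElapsedMilliseconds + " ms");
    }
}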


Using TDD (Test Driven Development) in the context of the Agile 'attitude' and my proposed RGRPT pattern ensures that the entire development process is followed in small, rapid cycles. Leaving performance tuning to the end is like leaving testing to the end, or writing the installer at the end. Ken Schwaber compares Agile development to 'sashimi', an 'all at once' software engineering methodology. Performance tuning is as much a part of that as testing, documentation, and any other product development activity.

Monday, March 27, 2006

Dude, where's my namespace?

So, lots has changed in Web Project land between VS 2003 and 2005 (See Scott's excellent summary explaining the changes). No more project files, no Front Page Server Extensions (FPSE), no compiled binaries - and no namespaces! Yarrrgh!


For example, I created a User Control folder and put a control in there. My new user control is named UserControl_ControlName. Underscore? What happened to my nice namespaces, like Web.UI.UserControls? I want my namespaces back, waahh. This isn't good; how do I organize stuff? How do I not get lost in a sea of files with a**x extensions? Sometimes there is a need to create non-visual 'helper' classes in a web project, classes that assist the UI and know all about the presentation layer. I don't want to create another project for these classes, and I don't want their names preceded with underscores. Well, maybe I can configure VS2005 to give me back my namespaces, and my dlls too.


In my infinite wisdom I decided to take matters into my own hands and wrap my control class in a namespace and change the class name to remove the 'UserControls_' prefix. This was unwise. There is stuff happening that I don't yet understand. I suspect that classes in new VS2005 web projects are not expected to be organized into folders, as tacking the folder name to the front of the class name is just so absolutely ridiculous. I am afraid I am at a loss, and must succumb to the drag and drop development model for the time being.

Sunday, March 26, 2006

Auto-numbering with MS Access

Well, here's my challenge, which I'm sure has been run into umpteen times by every developer who's used MS Access. I have auto-number primary keys in all of my tables, and when I add a record, I want to get that Id back. So, in my research I found this solution (from ADO.net Cookbook by Bill Hamilton, O'Reilly),
First, create the data adapter,
da = new OleDbDataAdapter(sqlSelect, connectionString);
where sqlSelect is the select statement that retrieves rows from your table. Next, add your insert statement and parameters. You then attach an event handler to the RowUpdated event,
da.RowUpdated += new OleDbRowUpdatedEventHandler(OnRowUpdated);

which will fire for each insert. Lastly, add the event handler.

private void OnRowUpdated(object sender, OleDbRowUpdatedEventArgs args)
{
    if (args.StatementType == StatementType.Insert)
    {
        // Retrieve the identity value Jet generated for this insert
        OleDbCommand cmd = new OleDbCommand("SELECT @@IDENTITY", da.SelectCommand.Connection);
        // Store the id back into the inserted row
        args.Row[ID_FIELD_NAME] = (int)cmd.ExecuteScalar();
    }
}
So, this looks like a lot of work to me (and I cut a lot of code out of the example above, too). I looked at this example and thought, "Why don't I just call that SELECT @@IDENTITY statement after my insert?" So I cut out the event stuff (I felt like I was losing some automatic protection and handling, but the feeling soon left) and the code seems to work. But I'm still hitting the database twice for an insert!
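The simplified version, without the data adapter and event plumbing, comes out something like this. The table and column names are invented, and both commands have to share the same open connection for @@IDENTITY to return the right value:

using System.Data.OleDb;

public class CarRepository
{
    // Inserts a record and returns the autonumber id Jet generated for it.
    public static int InsertCar(string connectionString, string name)
    {
        using (OleDbConnection connection = new OleDbConnection(connectionString))
        {
            connection.Open();

            OleDbCommand insert = new OleDbCommand(
                "INSERT INTO Car (Name) VALUES (?)", connection);
            insert.Parameters.Add("Name", OleDbType.VarChar, 50).Value = name;
            insert.ExecuteNonQuery();

            // Second round trip: ask Jet for the identity it just generated.
            OleDbCommand identity = new OleDbCommand("SELECT @@IDENTITY", connection);
            return (int)identity.ExecuteScalar();
        }
    }
}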
In SQL Server I can pass the id in as an out parameter and hit the DB once. My suspicion is that I can't use auto-number ids and have one-hit inserts. My next option is to use the Identity Field pattern (Patterns of Enterprise Application Architecture, Martin Fowler et al., Addison-Wesley).
I haven't implemented this solution yet, and I am nervous about taking identity control out of the database and implementing it manually, as I will probably do it wrong and spend my remaining years debugging Ids.
Another option is to use GUIDs as my ids; however, I suspect GUIDs cause slow retrievals from Access.
Here are some results from some preliminary tests. I created an MS Access 2000 database with two tables: one using a text(64) field to store GUIDs as the primary key plus a text(50) data field, and the other with an autonumber primary key plus a text(50) data field. I created a test to add 1000 records to each table and then retrieve a record using the primary key. The timings came out as follows:




Key Technique   1000 Inserts   Select by key
GUID            46922 ms       62 ms
Autonumber      38469 ms       16 ms

So, the autonumber technique was faster for both inserts and retrieves. I was surprised the insert numbers were better, since every autonumber insert also comes with an identity retrieval using "SELECT @@IDENTITY". I suppose sorting and comparing the text strings is much less efficient, probably due to conversions and comparisons. The retrieve was four times faster! So, while the GUID technique is very convenient for development, it is comparatively slow. I suspect I can even improve on the autonumber technique by using the Identity Field pattern.


Well, I did a test with an Identity Singleton and got about the same result as the GUID tests: 46906 ms for the inserts, and the retrieve took 47 ms. I can't explain these results, but it looks like the autonumber technique is the way to go.

Friday, March 24, 2006

Car Care Calendar

Well, my first post is going to be a pitch for my new web site, the car care calendar hosted web application. It is currently being developed (.NET 1.1), and of course it's taking a lot longer than I was hoping. Naturally I'm trying to build it 'right', keeping on top of security issues and performance.
So, just getting started was pain indeed. I wish creating projects in Visual Studio 2003 was easier. Since I have such a limited short-term (and long-term) memory, I always forget what "automatic" stuff VS creates when you start a new project. I invariably create a new (duplicate) directory one level lower than I wanted and have to move the project file etc. up a level. The solution file ends up in some Documents and Settings user folder, completely disassociated from the project files. Now whenever I open the solution, VS complains that it was unable to refresh some folder or other. It always takes me far longer than expected to move stuff, edit the .sln and .csproj files, and verify and reopen before I ever start coding. It would probably be worthwhile creating a VS macro to automate this... but of course, once I get set up I quickly forget about the setup pain as I begin to feel the onset of new pain.
Tools I'm using include NCover. It's an open source project so I cut it a lot of slack. Right now I've set up a batch file to instrument my files, compile them and produce a test coverage report. NCover is not great at dealing with errors, like invalid file paths, so it took me a fair amount of trial and error to get it working. As I'm writing this, I'm looking at the NCover site and it looks like they have a new version, so maybe it's friendlier. NCover has NAnt support, and I suspect that works much better than batch files.
I've also set up Subversion to help me back out of my usual plethora of bone-headed mistakes. As a one-man development operation it seems strange to use a source control system, but with my history of wackhackery, a source control system is a minimum must-have. I'm still new to Subversion; I like it so far... (sort of: why are my folders always red, and why do I have to update so much?). I suspect I will be blogging more on SVN soon. It is not set up to run as a service yet, which I would also like to do.
Of course I'm using NUnit. Love this tool.
And NUnitAsp, which I really like and think has great potential, but it seems to have lost momentum? If I had more time* (*If I was smarter, and knew how browsers worked), I would join this SourceForge project and get it back on track. I need NUnitAsp for .NET 2.0 and all of the fancy new ASP.NET 2.0 controls.
I would also like to introduce a profiler, maybe NProf? And probably CruiseControl.NET, which I am essentially afraid of, basically due to what I remember reading about configuring it. But I will conquer my fear over time, get CCNet up and running and cruising and notifying, and I will blog extensively about all of my pain with that too.