Tuesday, December 23, 2008

Alt .Net Seattle Open Spaces is on!

I just heard back from DigiPen about using their facility for the Alt.Net open spaces this year, and they are once again graciously willing to host the event. It's looking like it will happen the weekend before the MVP summit: Feb 27 - Mar 1, 2009. I don't have a lot of details yet, but last year's open spaces event was phenomenal and this year's should be at least as good. Keep watching the blogosphere for details. I'll post updates here and link to other information once it becomes available. Registration hasn't opened yet, but here's the site: http://altnetseattle.pbwiki.com/

Saturday, December 20, 2008

Can't blog now, I have a class to teach!

So it's official: I will be teaching the networking class (CS 260) at DigiPen starting in January. This was a bit of a surprise to me (a friend of mine recommended me for the job), but I'm really excited about it. This will be my first official foray into the classroom, so I'm looking forward to the experience and, more importantly, the chance to teach and influence about 100 aspiring developers (now would be the time when some of you have a heart attack). I've done plenty of public speaking and teaching on a much smaller scale, so I view this as a challenge to my current abilities, but not something that is beyond me.

Why am I doing this? Well, it goes back to something that Scott Hanselman said back at Alt.net Seattle last year. I don't remember his exact words, but he said to go out to your local schools and talk about programming and that engaging the next generation of programmers is important. Well, you can't say I wasn't paying attention.

So the obvious question is, why me? Well, to start with, I work with networking every day. I am on the Network Class Library Team at Microsoft, and we are the owners of System.Net, which is where pretty much all low-level networking functionality lives in the .Net framework (technically it lives in winsock and we call into it via P/Invokes). I'm not a game programmer (although I originally started coding to write games), but I've written a few games in my day (a Tetris clone, a text-based RPG, all the common ones that you start tinkering with), so I at least have a good idea about how these things work. And games are very non-trivial to write, even the simple ones.

Also, networking for games isn't horribly different from networking for anything else in terms of passing data; however, there are a LOT of interesting considerations in making a networked game. For example, Nagling (Nagle's algorithm buffering up small sends) is absolute death for a game. Sending a dataset that is too large is also death, for a few reasons: longer packets are more prone to errors and collisions, and larger sets of data are more likely to be split across multiple packets, which can delay your messages even further and cause more lag, especially at layers below TCP/IP that you're probably not even aware of. On top of that, things like NAT and proxies are a huge problem. Finally, games are real-time systems, and networks are by definition not real-time; there is ALWAYS a delay of some kind, so handling that can be very painful as well.
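The Nagle point is easy to demonstrate. The class itself will use winsock, but here's a quick illustrative sketch in Java (the concept is identical in any sockets API): Nagle's algorithm is on by default, and turning it off for a latency-sensitive connection is one socket option away.

```java
import java.net.Socket;
import java.net.SocketException;

public class NoNagle {
    public static void main(String[] args) throws Exception {
        // An unconnected socket is enough to show the option itself.
        Socket s = new Socket();

        // Nagle's algorithm coalesces small writes into fewer, larger
        // packets: great for throughput, death for game latency.
        System.out.println("Nagle on by default? " + !s.getTcpNoDelay());

        // TCP_NODELAY disables Nagling so small updates go out immediately.
        s.setTcpNoDelay(true);
        System.out.println("TCP_NODELAY set: " + s.getTcpNoDelay());

        s.close();
    }
}
```

The tradeoff is real: with Nagling off you get lower latency but more, smaller packets on the wire, which is exactly the call a game has to make.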

So, what's the hardest thing about teaching a networking class? So far, it's creating the fucking syllabus! I'm not talking about writing it out with all the class expectations, grading policies, and all that stuff; that's easy. The question is, how do I go from week one to week fifteen and teach people who may or may not know anything about networking how to program networked games? There are obviously a lot of topics I should cover, and I could probably spend 15 weeks on any one of them. The problem is that I have to figure out what isn't as important and what can be cut, as well as what academic information I need to include.

I've made a few choices so far. First, I'm only spending about an hour on the OSI model. It's not horribly useful and TCP/IP, which is the dominant protocol out there right now, doesn't really map very well onto it, despite what lots of people say.

Next, I'm going to jump straight to the top of the TCP/IP model and introduce winsock so that the students can start writing code, then we'll work our way down from there. I will have at least three projects, but I'd like to have four if my current plan for a third project goes the way I intend. I won't say what they are (in case some of my students are reading my blog. Yes, I know you are AND I know how you figured out that this is my blog AND I know who told you about it), but they're going to be a lot of fun and a huge challenge. Everything in networking is multi-threaded; it pretty much has to be, or you'll get stuck listening and unable to talk, or talking and unable to listen. Or maybe you'll take some of your data and start doing something with it while winsock's 8K receive buffer fills up, and all of a sudden you're dropping packets.
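To make the listening-while-talking point concrete, here's an illustrative sketch (in Java rather than winsock, and the names are mine): a dedicated reader thread drains incoming data so the receive buffer never fills up while the rest of the program is busy sending. A pipe stands in for the network connection.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.concurrent.atomic.AtomicInteger;

public class ReaderThreadDemo {
    // Push `chunks` writes of `chunkSize` bytes through a pipe while a
    // dedicated reader thread drains the other end; returns bytes read.
    public static int pump(int chunks, int chunkSize) throws Exception {
        PipedOutputStream remote = new PipedOutputStream();
        // The 8K pipe buffer stands in for winsock's receive buffer.
        PipedInputStream incoming = new PipedInputStream(remote, 8 * 1024);

        AtomicInteger bytesSeen = new AtomicInteger();

        // The reader thread keeps draining, so the buffer never fills up
        // while the rest of the program is busy talking.
        Thread reader = new Thread(() -> {
            byte[] buf = new byte[1024];
            try {
                int n;
                while ((n = incoming.read(buf)) != -1) {
                    bytesSeen.addAndGet(n);
                }
            } catch (IOException ignored) {
            }
        });
        reader.start();

        // Meanwhile the "game" keeps sending without getting stuck listening.
        byte[] chunk = new byte[chunkSize];
        for (int i = 0; i < chunks; i++) {
            remote.write(chunk);
        }
        remote.close();   // end of stream
        reader.join();
        return bytesSeen.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("received " + pump(100, 512) + " bytes");
    }
}
```

Take away the reader thread and the writer stalls as soon as the 8K buffer fills, which is exactly the failure mode described above.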

Beyond that, I won't say too much, although I will be posting about my experiences here periodically. Anyone who has any ideas or input for the class, please feel free to post up in the comments. I'm really looking forward to this and I hope that my students learn something valuable from it (and I hope they enjoy the class, too, but I'll take learning over enjoyment in this case).

Monday, September 15, 2008

Visual Studio quirk: it works on my box

I encountered an interesting Visual Studio thing today. Someone sent me a bug with a repro. I ran it on my machine and started stepping through the internals of the framework to see what the issue was. It worked fine. Hmm, that was interesting, I wonder why? So I ran it again, and this time I didn't step through it, and it failed. Ok, that's strange. When I just execute the code, it fails, but if I step into it and don't do ANYTHING except step through the code, it succeeds. WTF?

Well, I had forgotten about something: Autos and locals. In Visual Studio, when debugging, the debugger creates watches on local variables as well as a few things it just watches automatically. In order to get the values of these, it has to evaluate them, and therein lies the problem: If evaluating any of those variables causes any side effects that don't occur during the normal running of the application, it can cause unexpected behavior. Here's an example:

Let's say that I have some state property on my object that is initialized to null. I have a method that depends on this state property being set. That state property is set when you access another property somewhere. Assuming that property is not accessed in the code path that I'm executing, the state property will not be set. HOWEVER, if I trace through the code and evaluate the property that sets the state property, it will end up setting my state, thus changing the way my code executes. Let's look at a concrete example:

using System;

class testthing
{
    private string s = null;

    public string PropString
    {
        // Side effect: merely reading this property initializes s.
        get { if (s == null) s = "new"; return s; }
        set { s = value; }
    }

    public bool forceit = false;

    public bool DoSomething()
    {
        // When forceit is false, PropString is never read here...
        // unless the debugger evaluates it for you.
        if (forceit)
            Console.WriteLine(PropString);

        return s == null;
    }
}

class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine("Starting run");
        testthing t = new testthing();
        bool x = t.DoSomething();

        Console.WriteLine("result was: " + (x ? "true" : "false"));
        Console.ReadKey();
    }
}
So the class testthing has a property PropString that doesn't set the private field s until the getter is called. Therefore, if you never read PropString, it never sets the value of s, and DoSomething() will return true because s defaults to null. Run the example code and observe this; it's pretty straightforward.

Now, run it a second time, except this time put a breakpoint on the first line in DoSomething(). When it breaks, hover over Console.WriteLine(PropString) so that the debugger is forced to evaluate PropString. Now execute the rest of the code (F5) and observe that the output is false, because the debugger has executed the getter of PropString, which had a side effect.

So, the next time you debug an application in Visual Studio and it works in the debugger but not in the code, look at your variables within the method throwing the exception and see if any of them could possibly be changed through evaluation. If so, then you may have found the problem.

One final word: a unit test would catch something like this, since the test runner won't evaluate the property the way the debugger does. It's far easier to write a failing unit test around the method that has the bug and then figure out why it's failing than to step through the method and hope you can spot where it's going wrong.

Thursday, September 11, 2008

Let's just blame Microsoft!

This is a good one. Some guy named Steven J. Vaughan-Nichols is blaming the Sept. 10th London Stock Exchange crash on .Net. Wow, informative! It crashed, it runs on .Net, so that must be the reason. .Net isn't suited for real-time systems! Right? Not so fast, dude.

Full disclosure

Before I start, let me just say that I do work for Microsoft and I work on the .Net framework. Does this make me biased? Probably, but I'm going to attempt to focus on other things besides "Microsoft good, .Net good" here and draw a logical conclusion.

What happens

So, what's the scenario? Well, apparently (according to Steven) the LSE runs some software called TradElect, which is a C# application. It also runs on Windows Server 2003 with Sql Server 2000. Clearly, the weak point here is .Net; nothing else it could possibly be. Right?

You are full of fail


So Steven probably wrote all those "conclusions" down on a mat, placed it on the floor, and then "jumped" to them. He clearly has. Something broke, so it's Microsoft's fault, because .Net just sucks for real-time applications. So do Sql Server 2000 and Windows Server 2003, apparently. There's nothing else that could have gone wrong, right?


There's no way it could be human error. No way at all

What he doesn't say is that this could well be programmer error. There are thousands of ways a programmer could mess this up and just write crappy code. For network connections, the asynchronous programming model is not trivial and requires reasonably deep understanding before you can really make it work well for you. I see a lot of people mess this up, and unfortunately it's their fault and their problem most of the time, because the performance you get from asynchronous programming comes at the price of complexity and multiple threads, which is something a lot of people just don't understand.

Additionally, we don't know how they're doing their DB access here. Maybe they have some sort of transaction hell that's locking the shit out of their DB. Maybe they don't use stored procs (a BIG performance issue in Sql2k, fixed in Sql2k5, so not a big deal there). Maybe they don't know how to create an index. My point is that we don't know, so we can't say for sure. Most likely, though, this is where the issue lies.

Finally, the .Net framework itself has some interesting quirks if you don't really understand the CLR well. I don't usually recommend books on specific software technologies, but go out and get a copy of CLR via C# by Jeffrey Richter; I learned more about the CLR from that book in a month than I did in two years of using .Net every day. Granted, garbage collection takes away a lot of the complexities of memory management, which can be a big performance issue, but as a developer you STILL need to understand what the CLR is doing. Things like boxing and unboxing take time, misusing value types and reference types eats performance, and even how you allocate objects matters. For example, if you're using buffers for network traffic and you allocate a new buffer each time, you may trigger garbage collections that randomly hurt performance and are difficult to track down. If instead you allocate a pool of large buffers up front and reuse them, they'll live on the large object heap and you'll stop generating garbage on every receive, so collections become rarer and your app's performance will be more consistent.
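The buffer-reuse idea is easy to sketch. This is illustrative (in Java, and the BufferPool name is my own invention, not a library type): allocate every buffer once up front, then check them out and hand them back instead of allocating per receive.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Toy pool: pay the allocation cost once, then reuse the same arrays
// instead of creating garbage on every network receive.
public class BufferPool {
    private final BlockingQueue<byte[]> free;

    public BufferPool(int count, int size) {
        free = new ArrayBlockingQueue<>(count);
        for (int i = 0; i < count; i++) {
            free.offer(new byte[size]);   // one-time allocation cost
        }
    }

    public byte[] acquire() throws InterruptedException {
        return free.take();               // blocks if every buffer is in use
    }

    public void release(byte[] buf) {
        free.offer(buf);                  // hand the same array back
    }

    public static void main(String[] args) throws InterruptedException {
        BufferPool pool = new BufferPool(1, 8192);
        byte[] a = pool.acquire();
        pool.release(a);
        byte[] b = pool.acquire();
        // The same array object comes back: no new allocation, no garbage.
        System.out.println("reused: " + (a == b));
    }
}
```

The blocking `take()` also gives you natural backpressure: if every buffer is in use, the caller waits instead of allocating more.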

Blame Canada . . .um. . . er. . . .Net?

So do we blame .Net? With this much information, we really can't. It's far more likely that Sql 2000 is to blame (if anything), although I've seen shit databases built on open source just as often as on MS Sql, so it's entirely possible the database was just designed stupidly. It's equally likely that the people who wrote this screwed up, either in writing the code or in testing it improperly. Again, these things would happen just the same if the same programmers used open source software.

Wow, what a useful solution!!!

What does Steven suggest? Use linux. Wow, that will fix everything! I'll just go install it right now, with KDE and everything!!! Wait, no.

Next, he suggests Oracle. I've used Oracle, and in some ways I love it way more than MS Sql Server, but in other ways I hate it a lot. Oracle is better than Sql2k, but I have yet to see proof that it's better than Sql2k5, so I won't pass judgment on that yet. Maybe Oracle would be a better db choice. Not that Oracle's open source or anything. It also works with .Net. I've used it.

Next, he recommends Java. Java, with the worst threading model in the history of the world (more on that later), is his recommendation for a fix! I have yet to see a case where a Java application works significantly better than a .Net application doing the same thing. A lot of the tools are similar. The languages are similar.

In conclusion, Steven is jumping to the conclusion that open source software (plus Oracle) performs better. He has no evidence other than "it was running .Net and it crashed" to base this on. He is therefore wrong. I have an idea: you take this mat, write various "conclusions" on it, and put it on the floor, so you can "jump" to them. I'll send him one!

And I KNOW it wasn't a .Net networking issue because

I am on the NCL team at Microsoft. We own the System.Net namespace, which is what handles networking in the .Net framework. It was my turn to handle issues that came that week. If it had been a .Net issue with networking, I would have heard about it. I heard nothing.

Tuesday, July 29, 2008

Teaching Java in school is just as controversial as an interview with Justice Gray

I read a great article today here that was a follow-up to an interview here with Robert Dewar, a Computer Science professor at New York University. I'll stop for a moment while you go read the articles. Both of them. Like I did. All done? Good, now we can have an intelligent conversation about them, and I have to say that I absolutely agree with Prof. Dewar's main points about today's graduates. (Disclosure: I have a degree in Computing and Software Systems from the University of Washington. While Java was used for some things, the majority of my degree was in C++.)

One of the main arguments that he makes is that Java has a lot of libraries. A lot. To quote Prof. Dewar as quoted from the article (not sure how to cite a quote from a quote correctly so just pretend that I did):

“If you go into a store and buy a Java book, it’s 1,200 pages; 300 pages are the language and 900 pages are miscellaneous libraries. And it is true that you can sort of cobble things together in Java very easily…so you can sort of throw things together with minimal knowledge,” he says. “But to me, that’s not software engineering, that’s some kind of consuming-level programming.”


Now this is absolutely true, and it's one of Java's strengths: there are a lot of libraries for everything. The same can be said of PHP, .Net, and Perl. In fact, .Net has CLR implementations of functional languages as well (F#). There are also thousands of applications out there written in Java, .Net, PHP, and Perl. True, none of them keep airplanes in flight or help launch the space shuttle. However, they do help trade stocks, manage companies, guard private health data, run military equipment, and create useless social-networking sites. So, what's the problem then? Is it that Java isn't as popular anymore in the business world? (Hint: no.) Is it that Java is the wrong language to teach students how to program? (Hint: no.) Is it that we're not teaching computer science properly? (Hint: warmer.) Well, if you want my opinion, and since you're reading this blog I'm going to assume that you do, then my Answer-with-a-capital-A is:

WE'RE NOT TEACHING THE RIGHT THINGS WITH THE RIGHT TOOLS!!!11!1!!11!one!1

Allow me to explain a bit.

Computer Science is too hard so only nerds can take that class

CS is intimidating for a lot of people. Computers are scary. Computers are complicated. Computers are for nerds who stare at the command prompt all day and never see the sun. These are all things that I've heard from non-CS majors.

Except that people in Math, Chemistry, and Physics need to take that class too

When I was in school, there were also a number of degrees that required the introductory computer programming classes (that's CSE 142 and the follow-up CSE 143 at the UW). I took the CSE 142 equivalent at community college when I was in high school, through the Running Start program. The class was conducted in C (not C++), and I got credit for CSE 142 as well as a year of science credit for my high school, which freed up an extra period for a year to do nothing. The class was challenging but not overly so (in my opinion), and it helped that there were about 20-25 people in it, so there was a lot of opportunity for students to get individual help from the professor.

I took CSE 143 at the UW my freshman year (way back in 1999, so now you all know how old I am), and it was in C++ at that time. I already knew C++, but that class was still challenging, even for me. I recall the first quiz we had: I got something like 44 out of 100 and was still two full standard deviations above the mean. A lot of people dropped the class. I remember a project that was our first big exposure to objects; it was a DLL-hell type situation, and almost no one (including me) could get the code to actually build and link. The TAs couldn't get it to work. My friend Kevin, who was a graduating senior in CS, couldn't get it to work. The professor finally said that we should all just turn in what we had, and if it didn't build or link, he'd grade this assignment more easily. This almost made me hate computers. It did make a large number of students in that class say "fuck it, this sucks" and drop.

These are the problems that Java is trying to solve

Java doesn't have DLL hell. It has an IDE that is well supported. It's free. It runs on all platforms easily enough and doesn't require special changes to the code to build on different platforms. The syntax is reasonably friendly. It does have a lot of libraries and stuff, but it's still able to implement most of the data structures and algorithms you'd commonly find in such classes, such as linked lists, b-trees, heaps, and hashtables, as well as common searching/sorting algorithms. Now students only have to worry about their code and making it work. There is no (or minimal) frustration with things like getting the damn compiler to work or worrying about the environment. If a student likes solving problems in code, they may now choose to pursue that as a degree instead of getting overly frustrated with their build tool or IDE or whatever. The universities are clearly trying to make CS look less intimidating at first, and I think Java solves this problem about as well as it can be solved.

How Java creates at least 10 new problems (that's 10 in like base 50)

The first problem is how students are taught using Java. Just because Java has 100,000 different libraries doesn't mean you have to use them. Ultimately, most of those libraries are written in Java, right? So that's the first problem: when you start teaching data structures and algorithms, you must actually teach them; you can't just let people use the libraries to build applications. A good example for teaching hashtables would be to show the Java Hashtable class and write a working application that stores and retrieves values from one, but write that application against the Map interface (which Hashtable implements). Now write your own hashtable class that implements Map, and the output from your implementation and the Java Hashtable class should be the same. Repeat for other data structures and algorithms. Wow, problem solved.
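A sketch of the kind of exercise I mean (the ToyHashTable name is mine; a real assignment would implement the full Map interface and handle resizing): a minimal chained hash table whose answers can be checked against the library's.

```java
import java.util.HashMap;
import java.util.LinkedList;
import java.util.Map;

// Deliberately minimal separate-chaining hash table for teaching.
public class ToyHashTable<K, V> {
    private static class Entry<K, V> {
        final K key;
        V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    private final LinkedList<Entry<K, V>>[] buckets;

    @SuppressWarnings("unchecked")
    public ToyHashTable(int capacity) {
        buckets = new LinkedList[capacity];
        for (int i = 0; i < capacity; i++) {
            buckets[i] = new LinkedList<>();
        }
    }

    // Map a key's hash code onto a bucket index.
    private int indexOf(K key) {
        return Math.floorMod(key.hashCode(), buckets.length);
    }

    public void put(K key, V value) {
        for (Entry<K, V> e : buckets[indexOf(key)]) {
            if (e.key.equals(key)) { e.value = value; return; }  // overwrite
        }
        buckets[indexOf(key)].add(new Entry<>(key, value));      // new entry
    }

    public V get(K key) {
        for (Entry<K, V> e : buckets[indexOf(key)]) {
            if (e.key.equals(key)) return e.value;
        }
        return null;
    }

    public static void main(String[] args) {
        // The student's table and the library's should agree.
        ToyHashTable<String, Integer> mine = new ToyHashTable<>(16);
        Map<String, Integer> reference = new HashMap<>();
        String[] words = { "alpha", "beta", "gamma", "beta" };
        for (int i = 0; i < words.length; i++) {
            mine.put(words[i], i);
            reference.put(words[i], i);
        }
        for (String w : words) {
            System.out.println(w + " -> " + mine.get(w)
                + " (reference: " + reference.get(w) + ")");
        }
    }
}
```

Students write the data structure themselves but get an oracle for free, which is exactly the point: use the library to check your work, not to do it.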

So now that the students learned Java, we can just do everything in Java, right?

Wrong. You just can't teach everything in Java. You can teach some things in Java. I had an operating systems class that was taught about 50% in Java (the other 50% was c++ on a linux kernel). It was taught in Java because illustrating concepts with multi-threading is much more complex in c++. Doesn't mean that I don't know how to multithread in c++, but it was much easier to debug this crap in Java, which let me focus on the concepts of an OS. The same is true of file IO- we had to implement a virtual file system, which Java made easy by handling the actual file IO for me, however we only got one really big file, and inside that file we had to have our inodes and our data and we had to create a "file" implementation that would simulate reading and writing to our big "file." Again, this helped me understand the concept without having to worry about formatting an actual hard drive and interfacing with it. Here's the point:

Java allows you to focus more on the concept that you're trying to study without having to spend a lot of time working on the tools and environment. If focusing on the concept is not dependent on the tools or environment, then Java is an acceptable choice.
Learning about memory management in a language that manages memory for you is hard

This is probably the biggest point in which Java breaks down as a teaching tool. In order to be a developer, you have to have a solid understanding of how the machine handles things like memory and what implications it has on your program. You don't have to worry about memory at all in Java, so this makes it an inadequate tool for the job. A language like C is really the best teaching tool here since you have to do everything manually. In fact, I think the best way to really understand how memory works is in assembly, where you can actually look at the addressing modes and see the difference between them. This helped me understand pointers more than anything else, which brings us to:

You can't learn about machine language without an actual machine

Java is a virtual machine, but we want to know about actual machines. Having a strong working knowledge of the principles behind how computers work is critical, especially when something goes terribly wrong. You probably won't ever use assembly again in your career after college. You will probably never write a driver or a compiler. However, if you are using these tools (yes, I consider a compiler to be a tool), it's important that you know generally how they work because if they ever don't work, you'll never be able to figure out why. You need to learn this by actually looking at hardware and how it's built. You should be able to design a relatively simple logic-based circuit. You should know how these circuits are used to make up a computer. These things aren't that hard if taught well (and I was taught well so thanks Arnie if you're reading this). Assembly language is how the hardware and software interface, so it's pretty important that you learn it as well. I learned motorola 68000 assembly which I think is much simpler than x86 assembly but still illustrates the points well. I now know the difference between
int a = 5;      /* an immediate value into a variable      */
int *b = &a;    /* the address of a into a pointer         */
*b = 5;         /* an immediate value through the pointer  */
is really in what assembly instructions are emitted (hint: it's largely the same instruction, but the addressing mode changes). This helps me understand how memory works in programming, and that helps me write programs that don't leak memory (or references, in the case of managed code, which is a similar concept).

And now, the things that are missing from the education system.

What I (and many other people) think is missing is a good foundation in object-oriented programming. Most people get a week or two and a homework assignment on polymorphism. That's cool; now you sort of understand inheritance, but not really. What they don't teach is WHY to use inheritance and how to use it correctly. There is nothing on patterns, or refactoring, or just generally how to program with objects. There needs to be, as I think this skill is critical and most college grads don't have it because they were never taught it (I know I wasn't). I think Java would be a good tool for teaching this (although obviously not the only one).

And then there was testing

No one teaches how to test code or even how to make code testable. I'm not talking about running your app and checking inputs and outputs. I'm talking about unit testing, integration testing, and the automation of those things. It's not enough to just know that you have to test or that unit tests are important. You need to understand things like test doubles, test automation, and how to write code that can be tested in isolation. Java would probably be a good language for this.
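A minimal example of what I mean by a test double (all the names here are my own, not from any framework): the code under test depends on an interface, so a test can swap in a fake and verify behavior in complete isolation, with no waiting and no flakiness.

```java
// The dependency is an interface, so tests can substitute a fake.
interface Clock {
    long now();   // milliseconds
}

class SessionTimer {
    private final Clock clock;
    private long startedAt;

    SessionTimer(Clock clock) { this.clock = clock; }

    void start() { startedAt = clock.now(); }

    boolean isExpired(long timeoutMs) {
        return clock.now() - startedAt >= timeoutMs;
    }
}

public class TestDoubleDemo {
    public static void main(String[] args) {
        // The fake clock is fully controlled by the test: no sleeping,
        // no real time involved, so the result is deterministic.
        long[] fakeTime = { 0 };
        Clock fake = () -> fakeTime[0];

        SessionTimer timer = new SessionTimer(fake);
        timer.start();
        System.out.println("expired at t=0: " + timer.isExpired(1000));

        fakeTime[0] = 1500;   // advance time instantly
        System.out.println("expired at t=1500: " + timer.isExpired(1000));
    }
}
```

The design choice is the lesson: because SessionTimer takes a Clock instead of calling the system time directly, it's testable at all. That's what "writing code that can be tested in isolation" means in practice.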

And then there was all the "other stuff"

So how do you handle building big projects? How do you automate a build? How do you manage source code? What is a branch? These are all things that most developers with experience take for granted, but we all had to learn somewhere. Probably we either figured it out by experiencing the problems these things solve, or someone at our jobs showed us. This needs to start in schools. Don't focus on a specific technology for any of these, but again teach the concepts and why they're important. The code isn't all that important in this type of class, so I could argue that you could use any language you want. HOWEVER, remember that the purpose of this class is managing code, not writing it, so don't force students to do anything complex in the code. Use some existing code and make trivial changes to it that force the students to use version control and to change the build process to account for the more complex stuff.

The final thought

A lot of universities stick with Java because the students already know it and it's the lowest common denominator. That's fine if you want your students to come out being the lowest common denominators in the world of developers. One critical skill that developers have is the ability to learn new languages, particularly since new languages are developed all the time. This helps stay competitive in the workforce as technology changes. If you just teach the whole thing in Java, then that's a problem because students never get the opportunity to figure out a new language rapidly.

So my solution?

  1. Teach every class in the most appropriate language for the subject. Intro classes should be taught in something that has a minimum of extra crap to make the programs compile and run. Java is really ideal for this but I would be OK with c# also. The point of this class is an intro to programming, not an intro into fucking with the compiler.
  2. At a minimum, each student should be required to work in at least four programming languages while in school, one of which should be assembly and one of which should be object-oriented. HTML is not a programming language.
  3. Teach how to write good code. Comments != good code. This should be enforced in every class but there needs to be a specific class in how to do this and it needs to happen early in a student's career. Class should cover things like patterns, principles of OO design, unit testing, etc.
  4. Require version control to be used by every student for every class past the intro classes. Universities should provide access to a university-run vcs for each student. This isn't as hard to do as it sounds.
  5. Compiler, Hardware, and Operating Systems classes should be mandatory (sometimes some of these are not). I wrote a disassembler in assembly language as a final project in hardware. It was hard but not impossible and everyone in the class got at least something that sorta worked. Mine could disassemble itself accurately.
  6. Students should be forced to collaborate with each other in every class. Collaboration might include working together, but could also include code reviews or paired programming.
  7. Don't ever force a student to have their code reviewed in front of the class unless the student is ok with it, but anonymous code review or review by the professor in a private setting is fine. I realize that the business world will not conform to this but this is school and we don't want to alienate students. I think this is a compromise that will still teach a code review's value and how to conduct one without making people want to drop out of the program (or worse).
  8. Every class should involve writing at least some code.
  9. Professors should provide at least one well-written piece of code that demonstrates something that the class is teaching. It's helpful for students to read good code. It's equally helpful for students to read bad code and know why it's bad.
Finally, if you're a professor, college administrator, or anything similar and you want to talk to me or anyone else in more detail about this, I'd be happy to chat with you any time. I only rant about this because I passionately believe that it's important, and I will do everything in my power to make Computer Science education better. If you're reading this, I challenge you to make this a priority as well. Go talk to your local college. Email your professors. Offer to talk to classes at your local schools, particularly at the high school and community college levels. Encourage people to be CS students. You never know what kind of influence you'll have on someone, but you'll certainly have none if you do nothing.

Monday, July 21, 2008

Tag Soup sucks: Hey Jeff, here's a better way

Jeff Atwood of Coding Horror posted about "tag soup" in web development. I absolutely agree with him on this one: every web development framework currently in existence renders crap HTML. Remember my HTML wall of shame? Yes, that's a good example of crap HTML being rendered by frameworks. Jeff (Atwood) asks if there's a better solution. Luckily, Jeff (me) has one: it's called writing good HTML and separation of concerns in rendering. Wow, that's a long phrase. Let's try again: don't use frameworks because you don't know HTML; the people who wrote the framework don't know HTML either. No, still not good. Let's stick with the old favorite:

It's called HTML and it's not hard

That's right, HTML is not a complex thing and writing clean HTML isn't particularly difficult. In fact, you can leverage a framework and still write good HTML and I'm going to show you how. It's really as simple as using separation of concerns. Let's analyze the various parts of a web page.

HTML

What is the HTML for? It's really a place to store content. Your text, your menu bars, your stupid scrolling marquees, your <blink> tags, etc. All of this goes here. The way I think about it is that you're using HTML tags to create containers for content. A <p> tag is a container for some text. A <table> contains some related data in rows and columns. A <span> tag is going to hold some special line of content. A <div> is going to contain some special stuff inside of it. Notice that I haven't said a thing about formatting, style, or actual content yet. The reason would be that it DOESN'T BELONG IN YOUR STUPID HTML!!!!! One more thing I'd like to bring up here is the hell of nested tables. This occurs when someone wants to do some sort of complex formatting and doesn't know how to use the div tag with CSS. Nested tables are an anti-pattern called "Nested fucking tables" and should be avoided. It won't make your formatting better (Firefox and IE sometimes render different table elements differently so often this actually makes things worse). This brings us to:


Formatting and Style


So wouldn't it be nice if there were some sort of "style" thing you could use to store all your styles in one place (DRY, right?). Maybe some sort of "sheet" where your "styles" could go, and then they would "cascade" throughout the whole site for every page that referenced them. Maybe some sort of "cascading style sheet"? Oh wait, that already exists. Let's use it! Now you can let the HTML be only containers for content and let your CSS define how that content is presented and styled. Separation of concerns, right? Now you have only containers, plus maybe some information on the containers to identify them to your style sheets (id and class are the attributes you're looking for). This is good separation of concerns.
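To make this concrete, here's a minimal sketch of the split (all ids and class names are invented for illustration). The HTML knows nothing about presentation:

```html
<!-- page markup: containers and content only, no style attributes -->
<div id="menu" class="nav">
  <p>Menu items go here</p>
</div>
<div id="content">
  <p class="intro">Actual page content goes here.</p>
</div>
```

and the style sheet, referenced from the page's <head>, owns all of the presentation and cascades to every page that links it:

```css
/* site.css: all formatting lives here */
.nav { float: left; width: 12em; }
#content { margin-left: 13em; }
.intro { font-weight: bold; }
```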


So what about behavior?


This is where the client-side stuff comes in. Things get a little trickier here, but not that tricky. OK, I lied, it's not tricky at all if you actually know JavaScript and treat it as actual application code and not some bastardized client-side tag-hiding, style-manipulating crap. JavaScript is a language. It is subject to the same rules as all programming: separation of concerns, DRY, IoC, etc. It should also have its own unit tests. Finally, like CSS, it should be extracted into its own file so that every page can consume it.

So now you have containers that can be identified, styles that can be applied to them, and scripts that can determine their behavior, all in separate places. The IDs and classes of your containers help your styles and scripts know what to apply themselves to. A minimum of code exists in your HTML to bind these things together, and in those pages that really, really are one-offs, you have inline styles and inline JavaScript (this should REALLY be the exception, though).
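As a tiny sketch of what "treat JavaScript as application code" can look like (all names here are invented for illustration): the decision logic is a pure function with no DOM access, so it can be unit tested anywhere, while a thin binding layer is the only code that touches the page.

```javascript
// Pure behavior logic: no DOM access, so it can be unit tested in isolation.
function nextState(menuState) {
  // Toggle between "open" and "closed"; anything unexpected resets to "closed".
  return menuState === "closed" ? "open" : "closed";
}

// Thin DOM glue lives separately; only this part knows about the page.
function bindMenuToggle(button, menu) {
  button.addEventListener("click", function () {
    menu.className = nextState(menu.className);
  });
}
```

The glue function is deliberately too dumb to need tests; everything worth testing lives in `nextState`.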


But the server blah blah blah . . . .

This is where the example that Jeff shows really breaks down. I'm not going to post it up here, but go look at his post and check out the example code. You'll see something really stupidly obvious: YOU'RE DOING LOGIC IN THE DAMN MARKUP!!!! You're concatenating links, you're looping through stuff, you're doing all kinds of crap. Hell, as long as you're at it, why don't you query the database there too just so all your crap is at least in one file?

There is a simple solution to this problem: you already have containers defined by your HTML. Use them. Expose them to the server-side code and let that code render stuff inside them. For example, in ASP.NET one of my favorite tricks is to have a table on my page and actually use the <asp:table> object so that my code-behind can expose it to my controller (you're using MVC, right?) and my controller can populate it with data. Wait, controllers shouldn't populate data, so wtf am I doing? Am I breaking my own rules? No, I don't directly populate tables from the controller; typically I use an intermediary object to do that for me (more about this in a future post, I promise). This way, the controller is able to provide the model to the view via some other object that is responsible for doing complex formatting. I can reuse my formatting objects where appropriate. I can also change the formatting without changing the model or the view itself. I can change the view if I want to without caring how the formatting is created (as long as the contract between the view and my formatting object is fulfilled, i.e. if the view is expecting a table then the formatter had best be rendering one).
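As a rough sketch of that intermediary-object idea (all type names here are hypothetical, not from any real framework): the formatter is the only object that knows how to turn the model into table rows, so neither the controller nor the view contains rendering logic.

```csharp
using System.Collections.Generic;
using System.Web.UI.WebControls; // Table, TableRow, TableCell

// Hypothetical model type, just for illustration.
public class User
{
    public string Name { get; set; }
}

// The formatter owns the "model -> rows" logic and nothing else.
public interface ITableFormatter<T>
{
    void Populate(Table table, IEnumerable<T> model);
}

public class UserTableFormatter : ITableFormatter<User>
{
    public void Populate(Table table, IEnumerable<User> model)
    {
        foreach (User u in model)
        {
            TableRow row = new TableRow();
            row.Cells.Add(new TableCell { Text = u.Name });
            table.Rows.Add(row);
        }
    }
}

// The controller stays dumb about formatting; it just wires things together:
//     formatter.Populate(view.UsersTable, repository.GetUsers());
```

Swapping in a different ITableFormatter changes the presentation without touching the model or the view, which is exactly the flexibility described above.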


Here you go, Jeff-

A nice, happy, clean solution looks something like this:

1. The HTML provides containers for content and possibly some content as well.
2. The CSS provides style information and formatting for the containers.
3. JavaScript manipulates the containers client-side to create a client-side view when necessary.
4. The server-side code populates the HTML containers with content.
5. The server side uses helper objects to populate content that requires more complex rendering (tables with grouping levels are a good example here).

The only thing you need in order to pull this off is to know all of these different technologies. This isn't that hard, and as a web developer you really should know all of this stuff anyway. I think Microsoft started a horrible trend with ASP.NET that allowed application developers to write web apps without knowing anything about web technologies. This attitude has brought us the ViewState, page events, chatty controls, and a bunch of other crap that makes your HTML look like tag soup. Rails and MVC haven't helped this problem at all.

Thursday, June 26, 2008

Handling time zones in .Net is really easy

I have a project right now where we are going to implement a feature that allows an admin to specify the time zone that each user is in. When they view reports on things like system activity or similar that have a time component to them, they will see the times in their own time zone. When they request a report and specify a time, it will parse that time in whatever their own time zone is.

Time is a fairly difficult problem to handle since different areas have different time zones with different rules. Does a zone observe Daylight Saving Time? When does it start and stop? What if the dates it starts or stops on change? What about the 23-hour day you get when you spring forward and the 25-hour day you get when you fall back? There are a lot of edge cases and a lot of time zones to account for (for example, Arizona is on Mountain Time but does not observe Daylight Saving Time). Fortunately, .Net has this nifty thing called the TimeZoneInfo class that makes this a trivial problem.

Some useful methods

There are a number of static methods on this class that are useful to the feature that I'm trying to implement. The first story that I want to implement is "As an Admin, I want to see all available time zones so that I can select the appropriate one for a user." Not a problem. I want to load them all and store them in a listbox. Here's the code:


ListBox list = new ListBox();
list.DisplayMember = "DisplayName";
ReadOnlyCollection<TimeZoneInfo> systemTimes = TimeZoneInfo.GetSystemTimeZones();
foreach (TimeZoneInfo t in systemTimes)
{
    list.Items.Add(t);
}
This will give me a list box of all the time zones that are currently installed in the registry. The TimeZoneInfo class has a property called Id (a string) that identifies the particular time zone, so if I want to save the time zone that a user is on, I can persist that Id somewhere and then use this to retrieve it:

string tzId = MyTimeZoneRepository.GetTimeZoneIdForUser(someUser);
TimeZoneInfo tz = TimeZoneInfo.FindSystemTimeZoneById(tzId);
Alternately, I can serialize and deserialize my TimeZoneInfo class in order to be able to persist it and reconstitute it in situations where I may not want or have available the system time objects. The code for that would be this:

TimeZoneInfo tz = TimeZoneInfo.FindSystemTimeZoneById("Pacific Standard Time");
//tz now has some time zone value

string serialized = tz.ToSerializedString();
//serialized is some really long string that has all the time zone info in it

TimeZoneInfo tzNew = TimeZoneInfo.FromSerializedString(serialized);

Assert.AreEqual(tz, tzNew); //should pass


This also means that I can create my own custom time zone if I want to, by defining a set of adjustment rules and then creating a TimeZoneInfo from them with TimeZoneInfo.CreateCustomTimeZone (I'm not going to give an example here, but there are plenty of them out there).

Now, to actually use these things, I use the TimeZoneInfo.ConvertTime() method. So, if I want to fulfill my second user story, which is "As a user, I want to have all my reports reflect my local time zone so that I can see when things happened in relation to where I am," then I have a means to do so.

In order to do this, I need a UserTimeZoneRepository that gets the correct time zone for a user by their ID. First, some helpful interfaces. Let's assume that these all have implementations that do what they look like they should do:

public interface IUser
{
    int Id { get; set; }
}

public interface ITimeZoneRepository
{
    TimeZoneInfo GetTimeZoneForUser(IUser u);
}



Now, here's some code to handle the time zone problem. Let's assume that ALL times in my database are converted to UTC when they are stored there:

public class TimeZoneConverter
{
    private readonly ITimeZoneRepository _repository;
    private readonly IUser _user;

    public TimeZoneConverter(ITimeZoneRepository repository, IUser user)
    {
        _repository = repository;
        _user = user;
    }

    public DateTime ConvertTimeToUtc(DateTime t)
    {
        return TimeZoneInfo.ConvertTimeToUtc(t, _repository.GetTimeZoneForUser(_user));
    }

    public DateTime ConvertTimeFromUtc(DateTime t)
    {
        return TimeZoneInfo.ConvertTimeFromUtc(t, _repository.GetTimeZoneForUser(_user));
    }
}


So why all of this? Well, first, this gives me a centralized place to handle all time zone conversion issues (Don't Repeat Yourself). Next, it decouples the user and the user's time zone from the actual work of converting the times (Single Responsibility Principle), so I can test the conversion in isolation and I don't have to concern myself with where the time zone is coming from. Finally, this gives me a convenient place to hook date formatting into my application when I need to. Remember, I can't just use DateTime.Parse, because I don't know whether 04/05/2008 means April 5th or May 4th without having some sort of culture info associated with it. If I store the user's culture somewhere and have a similar repository, I can then combine this object with the DateTime parsing and formatting methods and create one unified DateTimeService for my application where a date string goes in and a DateTime comes out (and the other way around as well). It just so happens that I have another story that reads "As a user, I want to read dates and enter dates in a way that I am familiar with so that I can interpret the meaning correctly," so this is definitely going to come in handy in the future. Some other useful things that the TimeZoneInfo class has:

TimeZoneInfo.SupportsDaylightSavingTime - this instance property returns true if the time zone observes Daylight Saving Time. For Pacific Time it returns true; for Arizona it returns false.

TimeZoneInfo.DaylightName - this property returns the name of the time zone's daylight period, so for Pacific Time it returns "Pacific Daylight Time".

TimeZoneInfo.Local - this static property returns a TimeZoneInfo instance reflecting whatever time zone the computer you're running on is set to.

TimeZoneInfo.Utc - static property that returns a TimeZoneInfo instance set to UTC.

TimeZoneInfo.IsDaylightSavingTime(DateTime) - instance method that tells you whether a particular DateTime falls within Daylight Saving Time for that time zone.
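On the culture point above: the same date string really does parse to two different dates under two cultures, which is why the DateTimeService idea needs culture info attached. A quick sketch (the culture names are standard .Net culture identifiers):

```csharp
using System;
using System.Globalization;

class CultureDemo
{
    static void Main()
    {
        // "04/05/2008" is April 5 in the US, but May 4 in the UK.
        DateTime us = DateTime.Parse("04/05/2008", new CultureInfo("en-US"));
        DateTime uk = DateTime.Parse("04/05/2008", new CultureInfo("en-GB"));
        Console.WriteLine(us.Month); // 4 (April)
        Console.WriteLine(uk.Month); // 5 (May)
    }
}
```

Without knowing the user's culture, there is no right answer here, which is exactly why the parsing belongs behind one service instead of scattered DateTime.Parse calls.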

Anyway, this is my first exposure to having to deal with this stuff, and it's convenient that .Net provides some good functionality for dealing with it easily. If you want more info, there are a number of MSDN articles on the TimeZoneInfo class and on dealing with dates and times in general, but this post has enough info to get you started.

Alt .Net meeting- why do only developers attend?

At every Alt .Net event I've been to (all three of them but I'm still kinda new to this) as well as every conference, it seems like the vast majority of the attendees are developers. While we're there, we have sessions on things like "how to convince project managers to use Agile" and "how to talk to business people" and stuff like that. Then we return to our jobs and try to convince these people that we're right or that code quality is important or that unit tests help improve your product or other things that are obvious to us (especially after two hours of discussion). So here's my question:

WHY DON'T THE BUSINESS PEOPLE ATTEND OUR CONFERENCES?????

whoa, I just had an epiphany:

WHY DON'T WE SOFTWARE PEOPLE ATTEND THE BUSINESS CONFERENCES????

It seems like we're trying to figure out how to communicate with the "other side" that is the business. We are saying that we value communication, engagement, and people over process, however all we're doing is creating a process for trying to communicate with them. Isn't this a violation of our own principles? Certainly seems like it to me.

It works both ways, people

Instead of just talking to each other about how we communicate with the business people (and I assume the MBA conference probably has similar ideas about talking to software people), why don't we practice what we preach and bring some of these business people to our conferences and directly engage them in this whole communication thing? We can tell them how we communicate and what our needs are and what is important to us and most importantly, why these things are important. In turn, they can tell us the same thing from their perspective. I think that this would be infinitely superior to what we do now.

If you build it, they will come (and they've built it and we're sitting around)

On the flip side of this, how many of us developers actually attend any business-oriented conferences? Have any of us? What about presenting a session at them? Anyone do that? I certainly haven't and I have only myself to blame for this. I think it's time for us as a community to take action and not only bring the business folks and managers into our movement, but actively reach out to them and approach them in their community as well and let them bring us into their world for a change. Anyone out there agree with me? Anyone have any ideas on how we can do this?

Wednesday, June 25, 2008

New Job!

I'm excited to announce that I have a new job working at Microsoft! I will be on the networking team of the Core Operating Systems Division and I will be working on System.Net in the .Net framework. I'm really excited about this position as it will give me a chance to learn a lot of new things and to grow as a developer. I'd just like to thank everyone at Microsoft that I've worked with so far, especially my recruiter who has been very helpful and infinitely patient with my constant stream of questions. I'd also like to thank JP Boodhoo for his blog post which helped me to make this decision. He is absolutely right about setting goals and making sure that you're constantly raising the bar so that you are continuously improving yourself. If you don't have any goals for your life, career, family, and anything else that is important to you, then drop what you're doing and set some. There are some things that you can't change, however you can always change yourself and your situation if you make enough effort.

Monday, June 16, 2008

How I made it faster- the yearly windows reinstall

Every year my computer starts behaving very strangely. Programs get really slow, the start menu takes forever to load, the OS crashes randomly, and other strange things happen. This year's strangest thing was that, for reasons unknown, Visual Studio decided that Notepad++ was my default web browser every time I launched a web application. I have no idea how this happened and it has defied my efforts to fix it. As a result, I find it easier to just reinstall Windows and everything else from scratch. This usually takes me two days to get everything the way I like it, which is much less time than it usually takes me to go through and fix everything (if I even can fix everything, and yes, I have timed this). This time around, I decided to change things a bit, and I've noticed that some things are much faster.

Visual Studio
I develop web applications in C#. That's really all I do at work (besides read blogs and drink coffee). Therefore, when Visual Studio prompted me to do the "common" install, I hit "custom" for the first time ever. I then got rid of everything that wasn't C# and web application related. No C++, no C++ diagnostic tools, no VB, no installation packager or whatever it is, just C# and web stuff. I also got rid of things that I never use like the icons and other crap packages. I also gave up any plugins that I don't directly use. I just started using my fresh Visual Studio install, and this thing flies now. It loads really, really fast. It still takes forever to load my solution (it has 14 projects in it, but I promised that I wouldn't complain about my project setup in this post, and no, I wasn't the one who added those projects) but everything else is fast. I also have several plugins turned on, including ReSharper, CodeIt.Right (like FxCop but useful), and the TeamCity personal build thing. Also, unlike my previous computer, I only have one version of Visual Studio installed (2008) as opposed to four versions (I used to need all four of them too).

SQL Server
Again, the only things I care about in SQL Server are the database engine and the client tools. I don't use Reporting Services, Business Intelligence, the legacy DTS stuff, or any of that extra crap. I don't need it at this time. So I hit "custom installation" again and chose only those components that I actually use. I also found that I no longer need Oracle installed locally, so I don't have that anymore either, although I doubt this is speeding things up any because I had it turned off unless I was actively using it.

Other Software
So before I started randomly installing all the stuff from Hanselman's tool list, I made a list of all the software installed on my old computer. I then started crossing out things that I can't remember using in the last six months. I installed what was left, including Launchy, Nant, NUnit (sorry MbUnit, but some people at my work don't like you so I can't use you anymore. . . at work), DisplayFusion, ReSharper, Firefox, Subversion, and Notepad++. So far, I have yet to find something that I miss. Please also note that I didn't install Tortoise SVN this time around, which brings us to the next most useful thing that I installed:

Cygwin!!!!!!11!11!1!1!!one!11!!
I love this thing. It basically gives me a Bash shell on Windows. It has all the Unix commands that I'm used to using. It has alias and ln. It has hundreds of packages, including Subversion. Its find totally pwns Windows' find. It has grep, so I can do something like alias devenv=`find / -name devenv.exe | grep devenv.exe` and then I get to type "devenv" and it launches Visual Studio without me having to fuck around with my Path or other environment variables. I also have vi and nano, which rule (I use nano when I need to change one stupid line of a config file because I find it simpler than vi for that purpose, but vi is for anything more complex). I have only begun to figure out the coolness that is Cygwin. I also get Python and Perl and probably Ruby, although I haven't actually looked for a Ruby package yet. Be warned, however: if you are not familiar with Unix, including the commands, the file system, and concepts like fork(), it probably is not for you. Also, when you want to launch a program and not have the shell sit there and wait for it to terminate, the correct thing is to add a '&' to the end of the line, i.e. ./devenv.exe &
will launch Visual Studio and kick it into the background so I can do other stuff with the shell.

The results

My computer loads faster. A lot faster. This is probably due to a combination of factors, but I would say the biggest difference between this computer and the last is that SQL Server only has two services running now instead of like 1000 or something (I'm exaggerating, but you get the idea). I also don't have Tortoise slowing Explorer down, although I may reinstall it if I find that I'm not more productive using svn from the command line. Visual Studio starts faster. I also don't have to worry about which version of Visual Studio Launchy decides to launch for me. It also stops faster when I close it.

The moral of the story:

The moral here is really stupidly obvious: Don't install crap that you don't use or need. This really shouldn't need to be said, but I find that a lot of people just hit "default install" when they shouldn't be. They also install crap that they don't use and don't need. That being said, may I present:

Jeff's guide to the obvious: Volume 1
  1. Always use custom install when installing software. Pick only what you need. This applies to everything.
  2. You can always run setup again if you need to add something.
  3. Don't just install something because it "looks like you might use it."
  4. Any software that you haven't used in more than six months, get rid of it.
  5. Any software that you don't immediately start to love and use frequently, get rid of it.
  6. Don't just install everything off of Hanselman's tools list (uh, not that I ever did or anything. . . . )
Follow these simple instructions and I guarantee your computer will run faster and have less random behavior, although having Notepad++ be your default browser is useful sometimes.

Seattle Alt.Net open spaces

There was an open spaces event for the Seattle Alt .Net community on May 24th (I know, this post has been sitting around for a while; just be glad I have insomnia tonight). I think I learn more at these open spaces events than at anything else I've been to, so my first piece of advice is to find (or start) a local group and have these things.

Everyone has something to contribute

I think it was Dave Laribee (please correct me if I'm wrong) who said that "everyone here is a leader" at the last open spaces I went to. This is absolutely true. I find that everyone, no matter who they are, what their experience is, or what they work on, has something meaningful to say on something. That's one of the things I really like about these events.

The event

So this open spaces was at the offices of LexisNexis, so I would like to publicly thank them for generously donating their space and providing lunch. Employer support of Alt .Net and these types of events is critical in ensuring that they happen and that they are accessible to all. I hope that everyone's company will follow LexisNexis' lead in supporting these types of events, as they directly contribute to the education of their developers, and a company with good developers is more likely to be successful and deliver a better product.

We discussed a number of things at the event, most of which we were able to video, so I'm not going to spend a bunch of time recapping; however, I will come back and edit this post with links to the videos once they are available.

The biggest thing that we got out of this event was a good foundation for a Seattle Alt .Net community. It turns out that there are a lot of us devs out there, definitely more than I had suspected, that are interested in this whole Alt .Net thing. In fact, it's looking like our open spaces is going to be a monthly thing (last I heard we're going for the fourth Saturday of the month) so I'm excited about seeing where this is going to go. If you are interested (you should be) you should subscribe to our google group and start coming to these things. Remember, everyone is welcome and everyone has something to contribute. Seriously, you should come to our next event. Even if you disagree with everything Alt .Net is, come out anyway and see what we're all about (and if you do disagree with everything, by all means share your opinions because maybe you've thought of something that none of us have).

The next event will be on June 28th at:

Mantis Technology Group, Inc.
12413 Willows Road NE
Suite 300
Kirkland, WA 98034

I'll see you there!

Wednesday, May 28, 2008

Why Elegant Code doesn't get Agile

This blog post seems to insinuate that Agile does not work. I believe that David Starr's conclusions are flawed (correction: I originally directed this at all of Elegant Code when I should have been specific to the poster; apologies for the oversight on my part) and that he truly doesn't understand that Agile is an idea and not a process. A better title for the post would be "Why being dogmatic about Agile doesn't really work" or "Why doing things just because that's what you do in Scrum doesn't really work" or some such. By making this blanket statement about Agile not working, he is both ignoring Agile's many success stories and betraying his own misunderstanding of what Agile is.
"Agile supports the idea of frequent delivery of value to customers."

Let's look at shipping software for a minute. Where in the Agile Manifesto does it say "you have to ship software every two weeks" or anything like that? Go ahead, look for it. I'll wait. Didn't find it? That's because it isn't there, and it isn't there for a reason. That reason (I generally believe) is that shipping software to customers every two weeks is a spectacularly bad idea for all of the reasons that the article goes into. I'll summarize them here:
  • Sales can't keep up with that release cycle
  • Training can't keep up with that release cycle
  • Documentation is often not available at that frequency
I'll even add a few more:
  • Any type of release preparation can't be done at that pace (burning CDs, peer review, post-mortems, etc.)
  • Software with a high amount of upgrade overhead won't allow for it (think SAP or SharePoint or some such product)
  • Most customers don't want a new version every two weeks, because they'll have to upgrade and re-train their employees
  • This pace is not sustainable, which is something that actually is in the Agile Manifesto ("Agile processes promote sustainable development")
So what's the point? In Agile, we recognize that working software is the best measure of productivity. We also recognize that we want to deliver working software as frequently as possible. This brings us to:

DELIVERING WORKING SOFTWARE IS NOT THE SAME AS SHIPPING A PRODUCT!!!!!!!!!!!!!!!!!

That's right, you can deliver working code constantly. You can deliver it to sales so that they can preview what they're going to sell. You can deliver it to your customer proxy to verify that you have correctly implemented the features that you need. You can deliver it to your sprint demo so that the rest of the world can see it work and know that things are going well. You can deliver it to your project manager so that he doesn't have a heart attack every sprint because the software doesn't work. BUT YOU DON'T HAVE TO SHIP IT. Just because you CAN ship something doesn't mean that you should.

So why do we deliver working code in the first place if we're not going to ship it?

Because in Agile, we want rapid feedback. It is much, much easier to fix something right after you build it than it is to fix it three months after you build it. The sooner you know that you need to change something, the easier it is. For example, let's say that your QA department is one sprint behind your development in their testing. On my team, we average about ten builds per day (utilizing our CI tools at the time of checkin, but if you count personal builds it's probably a lot more). Our sprint is three weeks, which gives us about 14 days of development on average (adjusted for the time spent on code review, sprint planning, sprint demos, holidays, etc.). With ten builds per day, that means at a minimum, QA is finding a bug that has existed for 140 builds. Other functionality may be built on top of it. It may have dependencies on other parts of your code. Finally, where the hell is that bug occurring, given that maybe you've worked on a dozen different objects since the bug was introduced? You'll have to hunt for it, and that takes time.

So Elegant Code has missed the boat on this one: They're assuming that delivering working software means "shipping it to the customer" when this is absolutely not the case. They're jumping to that conclusion. Maybe I should write "delivering software means shipping it to the customer" on a mat on the floor, and they could "jump" to it. I'll make a million dollars!

"[The] organization [you're shipping to] must actually be able to receive the update without tipping over"

This is their second argument against Agile, and (coincidentally enough) their second mistake. I've added some words to clarify their point so that it isn't taken out of context, but I would encourage you to read the article so that you don't think I'm making this up. That is not my intent. Anyway, we aren't shipping the software every two weeks, so the customer only gets their release when they actually want it. The strength of Agile is twofold on this:
  1. We can show them what they're going to get more easily because the software is always working. We can give them a beta whenever they're ready for it so they can start coding against our API if they want to. If they don't like our API, we can change it easily and ship a new version. They can see if the features they want are actually going to work the way they want. They can give us rapid feedback, which is very important to our ability to deliver them the features that they want.
  2. We don't have to ship it to them "on time" either, because when you can ship every two weeks, "on time" starts to mean whatever you want it to mean. If they aren't ready for a month or two, we can delay and continue to add value (or just branch the code) and ship when they're ready to receive. If they suddenly get an upgrade to their database and our code is going to break it if we add Oracle support, we can release a working version prior to adding Oracle support. BizTalk no longer supports a feature? Prioritize that as a sprint item and ship two weeks later. Try doing that in waterfall.
In short, we gain a lot of flexibility over when we can ship. Again they're leaping to a conclusion (maybe two spaces on the mat and they can "jump" to both "conclusions") that they have to ship and it'll cause all kinds of problems for the customer if they do. In fact, the opposite is true: because you can ship at any time, you have the flexibility to deliver value whenever it is best for the customer. If you force them to upgrade, that's really an anti-pattern that I call "shoving functionality down their throats until they choke on it." This is never good.

In fact, the problems that they describe are actually readily solved by Agility: Documentation team needs another week or two? No problem. At least we have a working piece of software for them to document against. We either add something small that they can easily document, refactor the code against our //TODO list, or branch and keep going with the intent to release a branch. IT says you need a week-long burn in test? Or what about a month long peer-review process like my company? Not a problem either, we just branch and continue working on the trunk, fixing anything that comes up in the branch (which is what we're releasing) and merge those fixes with the trunk one at a time as soon as they're finished so that integration is easier.

Ultimately, David Starr has not missed the boat, but unfortunately the boat he caught was the failboat. His argument sounds good on the surface, but he has failed to realize that Agile's ability to handle change, combined with a rapid feedback cycle, actually deals quite nicely with all the problems he describes. I'd like to close with a question: if Agile doesn't work, what does? Huh? Can't find that in the article? It's in the same place in the article as "shipping software" is in the Agile Manifesto (i.e. it's not there). I think some more studying of what Agile is and what it truly means is in order, along with a look at some success and failure stories, before suggesting that Agile has failed because you can ship too frequently if you're dogmatic about it.

Tuesday, May 27, 2008

New concept: Code Taste

This came up at the Alt .Net Seattle conference last weekend. Code Smell doesn't necessarily mean that a particular piece of code is wrong; it just looks like it's probably wrong (more on this in another post). Code Taste is when you actually execute the Code Smell code and find out if it really is wrong.

Tuesday, April 29, 2008

I can haz spec# kthx

Greg Young has been talking about Spec# for a long time. I was skeptical. I was argumentative. I was interested in attending the Spec# talk at the Seattle Alt .Net open spaces. I was blind, but now I see.

DBC

DBC is Design By Contract. The Pragmatic Programmer dedicates a decent chunk of time to this concept (chapter 4, pp. 109-119 in my copy), but I'm going to give a brief overview. It probably won't be adequate to really explain it, but I hope it makes you want to go read about it.

The idea behind it is that it's hard to work with many different modules and their interactions with each other, because invariably you'll make mistakes. To make this easier, you specify contracts that a module will abide by. These are defined by preconditions, postconditions, and invariants. A precondition is something that must be true in order for a module to do its job. It isn't user input validation, since the user might make mistakes; it's a set of conditions that, if met, allow the module to guarantee a particular result. That result is defined by the postconditions. Invariants are conditions that must hold true once the method completes, regardless of what may have happened during the method call. So here's an example:

Let's say that I have a method that searches a sorted array for a particular value. Preconditions would include:
  1. The list must be sorted
  2. The list object can never be null
Postconditions would be:

  1. The output object will be the object that was searched for if found.
  2. The output object will be null if the input was not found.
Invariants might be:
  1. The order of items in the list will not change
  2. The list will never be null
So what does this mean in code? It means that in order for this method to succeed, the caller must sort the list first and must NEVER pass in a null list. It also means that the search method will always return the object searched for, or null if not found, and will NEVER return anything else. Finally, it means that although the method may do things to the list while it executes, the list that exists after the method terminates will be identical to the list that was passed in (assuming you're passing the list by reference).

So how do you enforce this?

Short answer: you can't. The only thing you can do is to put Asserts on your contracts and let them fail if somehow your contract is not met. This means that somewhere, your code is not correct. These constraints define things that should NEVER happen. A breach of contract is NOT a normal error condition to handle; it's a part of your code that is incorrect and needs to be fixed. At runtime, it's possible that something totally messed up might occur, like something in your list getting overwritten due to a buffer overflow, or an out-of-memory exception because SQL Server is consuming all of your available RAM (I've seen that happen). When that happens, your contracts will catch it and throw an exception with detailed error messages. That is, if you've left Asserts turned on in Release, which I know most of you don't. My lead would probably castrate me if I suggested this.
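To make that concrete, here's a rough sketch in Python of the sorted-list search from above with its contract enforced by asserts. The function name and structure are mine, purely for illustration; this isn't from any particular DBC library:

```python
def search_sorted(items, target):
    """Search a sorted list for target, honoring the contract above."""
    # Precondition: the list can never be null (None in Python).
    assert items is not None, "contract violated: list is null"
    # Precondition: the list must be sorted.
    assert all(items[i] <= items[i + 1] for i in range(len(items) - 1)), \
        "contract violated: list is not sorted"

    snapshot = list(items)  # copy so we can check the invariant afterward

    # A plain binary search.
    result = None
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if items[mid] == target:
            result = items[mid]
            break
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1

    # Postconditions: result is the value searched for, or null if not found.
    assert result == target or result is None, "contract violated: bad result"
    # Invariant: the order and contents of the list did not change.
    assert items == snapshot, "contract violated: list was modified"
    return result
```

Notice that the contract checks say nothing about how the search works; they only pin down what the caller must guarantee and what the method promises in return.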

Enter The Dragon (where Dragon = Spec#)


Spec# allows you to explicitly define contracts in your code that are then checked at compile time (this checking is optional). You can add preconditions and postconditions to methods, and invariant constraints to your properties and such. This does a few things for you:
  1. Spec# sets up runtime checks for violations of your contracts. It will automatically throw exceptions for you if they are violated, so there's no more manually checking whether an object is null; just put a precondition on it.
  2. Static code verification analyzes your code to see if you ever potentially violate your contracts. It will actually raise a compile error if you might be writing incorrect code (though this is an option you can disable).
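To give you a taste, a Spec#-style contract looks roughly like the following. I'm writing this from memory, so treat the syntax as approximate and check the Spec# documentation for the real grammar (the ! marks a non-null type, and requires/ensures declare the pre- and postconditions):

```
public static int Search(int[]! items, int target)
    requires forall{int i in (1 : items.Length); items[i-1] <= items[i]};
    ensures result == -1 || items[result] == target;
{
    // ...normal binary search here; the compiler generates runtime checks
    // for the requires/ensures clauses and tries to verify them statically.
}
```

The point is that the contract lives in the method signature where the compiler can see it, instead of being buried in Asserts inside the body.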
The halting problem

So, like me, you might have said "Well, what about the halting problem? It's provably impossible to write a program that can determine, for an arbitrary program, whether it will ever halt. What about that, huh? How do you get around that? Look at me, I know Computer Science buzzwords even if I can never actually show you the proof of that problem or illustrate it on a Turing machine or anything. I don't know what a Turing machine is but I read about it in Cryptonomicon and it makes me sound smart."

Well, that isn't a problem, because the static verification only proves your program under certain conditions. Some methods are too complex to prove, and there are conditions under which a proof is impossible. To help with this, you get the "Assume" keyword, which lets you state assumptions about the circumstances of your methods so that static verification can try to prove them. This may still let something break (although your contracts should still catch it at runtime).

In conclusion

Go download Spec# and see how awesome it is. Then email Microsoft and tell them to release it as a product. I told Scott Hanselman that if he blogged about it, and I blogged about it, then the 30,000 people who read his blog will all email Microsoft about it and the eight or so people who read my blog will also email Microsoft about it, and then they'll have 30,008 emails demanding Spec# and we'll get it released.

Monday, April 21, 2008

My brain is full and my liver is angry

So the Alt .Net open spaces event has concluded. This is the best conference I've ever been to (with DevTeach Vancouver being a close second) for a variety of reasons that I will get into later, but let me first start out with a brief description of what exactly this type of event is.

Format

First, there is no agenda, no speakers submitting sessions for approval, no keynote, and not even a fixed list of topics. Everyone has equal input, and decisions about what happens are made collectively. You start out by proposing topics and discussing them, and then people decide which topics they most want to talk about. These become the sessions that you can go to.

Each session can be whatever the people in it want it to be. If a session is small, everyone can just discuss the topic. If it's larger, a fishbowl format is used, where some number of people n sit in n+1 chairs such that one chair is always empty. Anyone may go sit in the empty chair at any time, and the person who has been in the fishbowl the longest has to get up. I like this format because everyone gets a chance to speak and contribute. Sessions aren't death by PowerPoint, and they aren't some company just trying to sell you something.


People


A lot of people were there. Some were new to Alt.Net and interested in learning more about the movement. There were quite a few people from Microsoft, including ScottGu, Scott Hanselman, and many others. A lot of the usual Alt.Net people were there too (most of whom are bloggers that I link to on my home page). I missed Justice Gray and hope he'll be there for the next one (study your VB 6 hard, Justice, so you can be an MVP). I also got to meet a few local developers that I hope to stay in touch with (including one from my own company, though from a different division than mine). To everyone I gave a business card: please keep in touch. To everyone who gave me a business card: I'll try to get back to all of you as soon as I can.

Topics

So many things were talked about that my head is spinning. There was nonstop discussion and exchange of ideas throughout this conference, so I'll try to go over a few of the highlights that I remember.

Spec#


This was an interesting demonstration. Greg Young has long hyped the coming of Spec# because it enables true design-by-contract, where contracts can be explicitly specified at design time and verified at compile time. This forces you to ensure that your code is 'correct' when it is built. I had my doubts (the halting problem comes to mind), but now that I've actually seen it work, I think it's going to be a great tool. I hope that all eight of you who read my blog will join the 30,000 or so who read Hanselman's in sending emails to Microsoft demanding the release of Spec#.

Are we innovating or just porting?

This talk was about all the new tools in the Alt .Net community, asking whether we're actually creating any truly "new" tools or just porting them from the Java community. It was a great discussion with a lot of people saying a lot of things that I wish I could write more specifically about. Someone videotaped this; I hope it gets posted somewhere.

How to talk to suits

A very interesting talk about how to sell Agile to management, with some good information about how to talk to business people in general. One idea that came up was having respect and trust for each other, which I think is lacking in a lot of places. Another idea for helping to "sell" Agile was to present different practices as solutions for existing pains that a company is feeling. Lots of good information here. I was also surprised by the number of consultants at this talk; they were well over 75% of the group.

Has software development failed?

A good talk about the current state of software development projects. There have been some very notable failures in the industry over the years. I brought up my question of "are we all wrong" and got some good discussion going on there.

Javascript- it's not just the bastard step-child of your web app

This was a cool talk. A lot of people tend to treat JavaScript like some sort of second-class citizen: it's there, but it isn't too important. As a result, there aren't a lot of test frameworks and established patterns out there for JavaScript, and its code quality is often sub-par. While there are some great frameworks out there (some of which I need to take a fresh look at), there is a real need for improving the tools, particularly now that JavaScript is becoming more of a tool for the presentation layer and contains business logic. It is a real programming language, so all the rules of good design apply to it, and I think people largely haven't realized that until recently. Justin Angel also had a great tool for running JavaScript unit tests at build time. It wasn't quite what I would want, but it's still very slick and would definitely be useful.

Education in the industry

This was a great point that Scott Hanselman mentioned at least a dozen times, and I still think he could have said it more without it being overkill, because it's that important. There is a significant lack of education in basic software engineering principles out there in schools. Computer Science programs teach algorithms, big-O analysis, data structures, and polymorphism. Those things are super-important to developing software. However, what's missing are things like good object-oriented design, how to test your software, unit testing, refactoring, patterns, and version control. He mentioned speaking to local schools about your career and about software development in general. I think he's right, and I intend to actually go out and do something about it. I'm not sure what yet, but I did talk to a few friends of mine who happen to be students at Digipen. I ran into them on the last day; they asked me about the conference, and we ended up talking for an hour or so. I'll post more in the future as I continue to follow up on this.

Overall, this was an awesome conference. There was a lot of good dialog and exchange of knowledge as well as a lot of respect for everyone there. Someone there said that we're all leaders (might have been David Laribee who said that) and so we need to go out and lead. I'm going to go out and lead, and I'm going to challenge everyone who is reading this to do the same.

Also, if you're not at the next open spaces, you'd better be dead or in jail, and if you're in jail then break out.

Also, if you are interested in seeing some more of it, Jeff Palermo made some videos that are posted over on his blog.

Tuesday, April 1, 2008

Checkout rules speed up TeamCity builds a lot

I discovered something interesting today: TeamCity allows you to customize which folders are checked out from your source control (we're using Subversion). Before, the actual build took about 5 seconds, but the total time was a minute and a half on average. People complained about this and didn't like doing personal builds because they took too long (personal builds let you send a build to TeamCity and have it try to build with your updates before committing your code; this helps stop people from committing breaking changes). I found that the reason it was taking so long is that we have a lot of stuff under version control, such as our database, the QA test projects, and some tools projects, as well as a bunch of other random crap that, while important to the whole project, is not directly used by the particular build configuration we care about for the website. So, after reading the manual, I discovered that I can restrict TeamCity to only check out certain directories. Here's what I did:

My svn root for my project (this is not my real svn tree, duh) is at https://127.0.0.1:8080/projects/trunk and the various parts of the application live in sub folders. I therefore have the following:

https://127.0.0.1:8080/projects/trunk/Application - our web application
https://127.0.0.1:8080/projects/trunk/Database - the database scripts
https://127.0.0.1:8080/projects/trunk/Externals - things like Rhino, MbUnit, etc.
https://127.0.0.1:8080/projects/trunk/Tools - useful tools that we write for application admin stuff
https://127.0.0.1:8080/projects/trunk/QA - our QA department's test stuff (BVT, regression, etc)

So in order to build our web application, everything we need is in Application and Externals. However, I can't just point TeamCity to https://127.0.0.1:8080/projects/trunk/Application because then I don't get Externals. I also can't just keep an externals directory on my build agents, because then I'd have to maintain it in multiple places (the DRY principle isn't just for code; it applies to project structure and build scripts too). So I have to check out https://127.0.0.1:8080/projects/trunk, which gives me about 100 MB of extra crap that I don't need to build the web application. How do I deal with this?

Under TeamCity's project configuration, in the Version Control Settings section, there is a link that says "edit checkout rules" that, when clicked, allows you to apply rules to what gets checked out. I clicked it, and here is how it works:

You put each rule on its own line. A rule starts with either a + or a -, where + means check the thing out and - means don't. Rules are relative to the root directory, so since my VCS root is set to https://127.0.0.1:8080/projects/trunk, I have to make all my rules relative to that path. Therefore, to exclude Database and save about 45 seconds of checkout time, I add a rule that excludes that directory. Since the directory is https://127.0.0.1:8080/projects/trunk/Database, I do this:

-:Database

and it now ignores the database (I believe these rules are case sensitive). I applied a few more rules, and now all that is checked out for my main application build is https://127.0.0.1:8080/projects/trunk/Application and https://127.0.0.1:8080/projects/trunk/Externals. This reduced my build time to about nine seconds on average. Much better. There are other things you can do with checkout rules too, but I'm not actively using any of them yet.
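For reference, the whole rule set for a tree laid out like the one above could be as simple as this (the directory names are from my tree, so adjust for yours; anything not excluded, i.e. Application and Externals, still gets checked out by default):

```
-:Database
-:Tools
-:QA
```

Remember that the paths are relative to whatever you configured as the VCS root.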

I have been using TeamCity for a while now, and I strongly suggest you check it out. There is a free version with a limited number of users (20), build agents (3), and build configurations (20). You can buy additional build agents individually, or you can buy the full version, which gives you more features. Check it out here: http://www.jetbrains.com/teamcity/

Monday, March 31, 2008

HTML wall of shame, volume 2



Ok, I'd love to comment on this post. I clicked on "Comments" and guess what? Take a look for yourself. Look where it says "Enter the code shown (prevents robots)" and tell me what you see:




If you said "What fucking code?" you are correct. If you said "Some HTML is allowed" you are partially correct (reason: what HTML code? Do I get to guess? What happens if I guess wrong?). If you said "Gee, the textbox is pretty big" you are incorrect and need help.

I guess now the world will never know what my comment would have been.

It's called HTML and it's not hard

In 1998, when I was a junior in high school, my friend Dan made a website. It was on GeoCities or something like that. He showed it to me, and I thought it was the coolest thing ever, with links to his favorite sites and some text about him and stuff like that. Wow. So I asked him to teach me how he did it, and he fired up Notepad and we started typing out HTML. It took me quite some time to understand tables, but eventually I had my own GeoCities page, and it was awesome. I have been writing HTML in text editors ever since. My point is that it's not hard to write HTML that can be rendered by a browser.

It's even possible to write HTML that is rendered correctly by multiple browsers. I have Firefox and IE on my machine, and I test in each of them. Even though IE is the only one I'm required to support, I still test in Firefox to make sure the site at least looks and functions similarly to the way it does in IE. You know what it takes to test your HTML in both IE and Firefox? Five minutes. Here's how:

  1. Open IE
  2. Navigate to your website
  3. Look at it
  4. Open Firefox
  5. Navigate to your website (optional: 5a. Copy and paste the URL from IE to save time)
  6. Look at it
If in steps 3 and 6 you find that something is significantly different, you might have a problem. With that said, let's move on to:

My HTML Wall of Shame

So here's a blog written by someone I won't identify, but his initials are Jeremy Miller. Let's now run steps 1-3.
Results:


Now, let's do the same steps in Firefox (that's 4-6 in case you've forgotten, and yes I ran step 5a and saved at least three seconds).
Results:



And now, our comparison. Do these look the same, similar, or different?

It turns out that they are similar but not the same: for reasons unknown (though crappy HTML is involved), the Firefox version obscures part of the post, mostly the part in which the punchline of the cartoon is visible, which is more than a little annoying. And because of the nice fluid interface, no amount of expanding the browser window fixes this.

So who to blame?


Well, I don't blame Jeremy for this; it's most likely an issue with codebetter.com, and it sounds like they need to "code better" when writing their web interface for blogging. But there are a few other blogs out there that have been guilty of this in the past (and two of them are on codebetter; coincidence? Must be. . .) and have been impossible to read at various times. (I would have liked to provide direct links to the offending posts, but I don't feel like searching through past posts to find them. I just copied from my IE history, because that's the only time I'd open these blogs in IE.)

And this isn't limited to bloggers

No, I can only wish it were just a few bloggers that have this issue. Unfortunately, a good chunk of the internet suffers from it. I've seen websites that cut off the sides, top, and even bottom (that's hard to do) of the page due to crappy HTML formatting, all in the attempt to make the site look "Web 2.0" or whatever. To try to help fix the problem, here are some guidelines:

  1. Test your site using the steps I outline above
  2. If it doesn't work, fix it.

If you follow these simple steps, you can't possibly fail (at least not in this way). Remember, if a high school junior could make a good web page in 1998, so can you. It's not that hard.

Monday, March 24, 2008

Visual Studio is not a build tool

Visual Studio (henceforth Visual Fucking Studio) is not a build tool. I have spent the last few days hating Visual Fucking Studio more than ever before. I have hated on it in the past for things like the "Add New Item" dialog box that has about 100 different options, in no particular order, that are all just names for a text file. I have hated the "Add Reference" dialog box that takes forever to load because it always has to parse every .Net assembly in the GAC and every COM object ever created before it can load (I know they fixed this in Visual Studio 2008, but it still takes a long time to load the first time, and I wish it would default to the "Project References" tab because that's what I always fucking use). Both of these things that I hate specifically, however, are nothing compared to the build issues I've had in the last few days.

How we build normally

So on my TeamCity server, I build the project using a NAnt script. This is really not hard, and it works well. You can also run the build script locally, and thanks to a few ideas I've borrowed from some people, it all works quite nicely with very few issues. In fact, the biggest problem is when someone adds or removes a file in the Visual Studio project and forgets to modify Subversion to include/remove the file, breaking the build, but this is easy enough to detect and fix.

However, people on the dev team absolutely have to have the capability to build the project using ctrl-shift-b and run it with f5 so that they can debug in the Visual Studio environment. This means that when I hit f5 it has to work. This is the way of pain.

Why oh why did I remove that project????

So our application needs to work with both SQL Server and Oracle with identical functionality. This is one of the most important requirements. As a result, we had to write a data access layer that was abstracted enough that our factories didn't need to know what platform they were connecting to (there is a reason we did this ourselves instead of using NHibernate or something similar, but more about that in a future post). This project and its associated dependencies are commonly referred to as the "Data Access Layer" for this project.

Now it turns out that we actually have three or four separate projects (and a few future projects) that will need the Data Access Layer in order to live. Initially, we wrote the Data Access Layer inside the main solution for the application, but as soon as we realized we'd need it in other places, I decided to factor it out into its own independent solution and build process (it took me about five minutes to modify TeamCity and our NAnt script to set up the new project). So I now have a separate Data Access Layer solution that I can reference in all the projects that depend on it. Then I tried to build the main application solution, and that's when the fun started.

You can't reference that


So you can't reference a project in another solution using Visual Studio. Ok, not a problem, let's just use a file reference. Well, where do we point the reference? There's the obj/Debug directory that has DataAccess.dll and its associated referenced DLLs; let's point it there. But wait, what about obj/Release? What if we change the build mode? Well, it's unlikely that developer builds won't be in debug mode, unless there's a bug that is only reproducible in release mode, or one of 1000000000 other scenarios that would necessitate it. That means you'd have to change the reference. What if we move that project around some more? You'd have to change the reference. What if we blah, blah, blah; you get the idea.

So, I get a great idea: I'm going to create an Artifacts directory in a common location for the build. Then I can just reference whatever DLL is in there so it will work in whatever mode I built the Data Access Layer in. I'll create a postbuild step that deploys them into the artifacts directory and then keep them out of version control. That way, all you have to do is make sure you build Data Access Layer and everything else will build. Cool.

Batch files have been around for years and they're not hard

So I write a simple copy command in the post-build step in Visual Fucking Studio. I run it. "Error: The command exited with code 1." WHAT THE FUCK DOES THAT MEAN? I check the output that Visual Fucking Studio tells me it's running, and it turns out that some of its macros don't actually mean what you think they mean and append an extra slash, so $(ProjectDir)\bin actually resolves to c:\some project\directory\\bin instead. I fix this, and it runs. I run it again a few minutes later: "Error: The command exited with code 9663." WTF, bitch? What do you want now? I finally start adding echo and dir commands and determine that the command is being run on the wrong directory, so no files were originally copied. Now I run it a second time and I get "Error: The command exited with code 9347" or something. This is about when I wished I hadn't quit smoking. I forget what fixed this, but suddenly I'm getting "0 files copied," which is not what I want, particularly when there should be 3 files copied, because the source directory has three files in it and copy *.* means copy all files, in this case THREE OF THEM.

Are you sure (y/n)?

It takes me a minute to figure out that the copy command wants to ask whether the files should be overwritten first, and if you don't answer, it assumes that you don't. Adding /Y to it only suppresses that message, but it still assumed that I didn't want to overwrite. Fuck you, DOS. I wish I had Cygwin installed about now, but that isn't helping me fix the build. So I add a delete command for my artifacts directory prior to the copy, and after another "exited with code 1" I determine that there's another one of those directory path issues, and I finally get it to work. Happiness.
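If you're fighting the same battle, a post-build event along these lines is the general shape that avoids the overwrite prompt. This is a sketch, not my exact commands: $(TargetDir) and $(SolutionDir) are the standard Visual Studio macros, and xcopy with /Y overwrites without prompting (/I makes it treat the destination as a directory):

```
rem Clear out the shared Artifacts directory, then copy the build output in.
if not exist "$(SolutionDir)Artifacts" mkdir "$(SolutionDir)Artifacts"
del /Q "$(SolutionDir)Artifacts\*.*"
xcopy /Y /I "$(TargetDir)*.dll" "$(SolutionDir)Artifacts\"
```

Watch the trailing slashes on the macros; as I found out the hard way, some of them already end in a backslash.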

We sure are lucky the build worked at all

So now I start looking at getting the remaining application solution to build. When I first try, it utterly fails for some reason. It turns out that no dependencies were set between projects and there was no build order, so the whole time we were working on this project it was a coincidence that it built correctly. I set some dependencies, changed the build order, put in a reference to my Artifacts directory, and the build worked. So I hit f5 to run the application, and what happens? App no work. Sadness. After a few more minutes, it turns out that since this is a web app and the UI is in its own project, all the DLLs need to be copied into the UI/bin directory, and of course they're not. Our project is structured so that each project only references the project directly below it. This keeps UI from referencing the Data Access Layer, which is a good thing. However, without a reference, Visual Fucking Studio won't copy the DLLs over, because it isn't smart enough to follow references in dependent projects. Not a problem, I'll add a post-build event to copy all the DLLs from Artifacts and the other output directories into the UI/bin folder. Go back and re-read the previous paragraph, because that's pretty much the same thing I went through to add this post-build step. I even copied and pasted the commands from the Data Access Layer, but it still didn't work; I don't even remember why. I think in total I spent almost four hours trying to copy like six files into a few directories, all for a build that we're never even going to use except to set breakpoints in the app and run it. I really hate Visual Fucking Studio right now.

Things to fix:

Here's what I'd like to see changed in Visual Fucking Studio:

  1. If I reference a project and that project has references to anything not in the GAC, chase down those references and copy every binary I need into the /bin folder where the app is going to run.
  2. It appears that Visual Fucking Studio just puts all the commands that I add to prebuild and postbuild events into a .bat file and runs them. I would like to see that .bat file. Better yet, I would like that .bat file not to exist and for it to just run the commands.
  3. I want to see WHICH command failed in my steps, not just some useless error code.
  4. I would like to be able to add a reference to another project in a different solution.

I don't foresee myself giving up NAnt any time in the future.