Accidental Scientist

Wednesday, May 06, 2009

Note to the Internet: Videos aren’t THAT cool

A quick note to the Internet, especially those out there who run tech information sites. (Channel 9? Asp.net? I’m looking at YOU guys).

Videos are compelling. They get your face out there. You get a modicum of celebrity. People hear your voice. Heck, you can even show your family.

Please, keep it to YouTube.

I’ll watch videos on the internet when I want to waste a few minutes looking at a drunken squirrel, or check out Zero Punctuation. When I want information – a science based interview, or a good tutorial on how to use, say, Dynamic Data Entities in ASP.NET – believe it or not, I actually want text.

Why do I want text?

  • I can read really fast. I can skim even faster. When I’m looking for tech information, I’m looking for a couple of very specific bits and pieces.
  • It’s searchable on Google or Live Search. And you know what? I know ALL the keywords I need to know to get the search result in the first couple of links usually.
  • I can cut & paste it into OneNote if I really need to.
  • I can read it on my mobile phone if necessary.
  • I don’t have to shut out the world and put on my headphones.
  • I don’t want to look like I’m wasting time surfing the net at work (which I look like when I’m looking for tech info usually anyway). Watching a video? It’s the same problem squared.
  • Returning to point 1 – reading really fast – I don’t want to waste 15 minutes to an hour watching something that I could get through in less than 3 minutes.

If you put your information in video-only format, you’re actually just stroking your ego. If it’s technical information and it’s NOT a Photoshop tutorial, or a very SPECIFIC demonstration of new features in, say, Windows 7 - then please, don’t do it. Don’t. Just don’t. Stop it. I don’t want to hear your voice. I don’t want to watch you mug for the camera. I just want the information I came for. Give me a break.

What’s worse is when what I’m watching is someone’s cursor wandering around Visual Studio for 15 minutes. DON’T DO IT. IT’S BORING.

[Image] Pointless ego stroking: hey look! It’s code! In a video! You can even hear them type!

Video is great for some things. It’s my preferred medium for stories. I love it for comedy. Just please, give information in the most suitable form. And for 90% of people in a hurry, just trying to get their day-to-day done, that form is text.

(And Powerpoint mavens? I’m watching you… don’t get too complacent)


Thursday, December 27, 2007

Bittorrent Bugginess... or how to use up 35% of your CPU doing nothing...

Looks like someone might need to add a little throttling to BitTorrent.

Build 4747 of version 6.0 of the client has a nasty little bug in it - one that should be really easy to fix, but a bug nonetheless.

On my laptop (a single core, old Toshiba Portege M200 Tablet PC), when I'm downloading stuff it will happily sit there, soaking up 35% of my CPU.

What's it doing when it's doing this and how did I figure it out?

Well, I looked at Task Manager and found that Explorer was taking up 35% of the CPU. I opened up Process Explorer (it's incredibly useful - so download it here; you can find other useful tools at the www.sysinternals.com site) to figure out what was going on in more detail, selected Explorer, and looked in the properties. What I got was this:

The Explorer.exe thread properties in Process Explorer

Hmmm... a single thread is soaking up a lot of CPU. I wonder why that is? Let's pop it open: select that thread and hit the Stack button to look at its stack.

The Explorer.exe thread stack in Process Explorer

Hmmm... well, nothing really useful at the top of the stack, and every time I break in, the thread looks the same. Which means it's most likely another app updating its tray icon so rapidly that Explorer is just getting hammered and soaking up a lot of CPU.

But how do we find out which app?

I wussed out on this, dear reader. I just started going through the tray and closed apps one by one until I found the one that was doing it. In this case, it looks like it's BitTorrent. I close it down, and the problem goes away. Open it back up, and lo and behold, 35% CPU.

Kinda sucks. Should be an easy fix though - just update the icon on a timer instead of every time something changes in BitTorrent. Hopefully they'll fix it soon.
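
The "update on a timer" fix could look something like the sketch below. This is not BitTorrent's code, and it is Python rather than Win32, purely to show the shape of the idea: every state change just records the newest value, and a periodic tick repaints at most once per interval. The names `ThrottledIcon`, `redraw`, and `tick` are made up for illustration.

```python
import time


class ThrottledIcon:
    """Coalesces rapid state changes into at most one redraw per interval."""

    def __init__(self, redraw, interval=1.0, clock=time.monotonic):
        self._redraw = redraw      # hypothetical callback that repaints the tray icon
        self._interval = interval  # minimum number of seconds between redraws
        self._clock = clock        # injectable so the sketch can run without sleeping
        self._pending = None       # newest state that has not been drawn yet
        self._dirty = False
        self._last_draw = None

    def state_changed(self, state):
        """Called on every change; cheap, because it only remembers the latest state."""
        self._pending = state
        self._dirty = True

    def tick(self):
        """Called from a timer; redraws only if something changed and the interval has passed."""
        if not self._dirty:
            return False
        now = self._clock()
        if self._last_draw is not None and now - self._last_draw < self._interval:
            return False
        self._redraw(self._pending)
        self._last_draw = now
        self._dirty = False
        return True
```

With this shape, a thousand calls to `state_changed` between two ticks cost one repaint instead of a thousand, which is the whole point.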


Sunday, December 16, 2007

New side project...

I've decided to say, well, frankly, screw it and throw my hat in the ring for the Web2.0 social media kinda thing.

So I've started up a site - pyrogrya (catchy name, huh?) that will eventually blossom into a wonderful social media networking site that you can all enjoy.

Why am I doing this?

Because frankly, it doesn't seem that hard to do. And there's a project I've wanted to tackle for about 7 years now (on various devices) that, now that we finally have smartphones that can handle 3G, no longer makes any sense whatsoever as a desktop or smartphone app. I can stick it on the web, and in fact, its usefulness should grow exponentially.

What is it?

Can't tell you yet. I want it to be somewhat of a surprise when it goes live. But I guarantee that a lot of you will find it handy. And heck, if I don't give too much away, it won't be so much of a pain when it doesn't go anywhere just like the rest of my side projects do.

Still, here we go. I bought a site, created a logo, and have started writing the database schema... this should be interesting :)

Developer blog can be found here.


Friday, October 26, 2007

The Visible Progress of Broadly Scoped Tasks

I've just gotten to a good place at work today. Finally, a long project is approaching its conclusion. So I've been thinking about what I know about project management, and working from my experience on tasks like this (and watching others tackle them) I came to some conclusions about how this kind of process works.

What is a broad task? Or a narrow one?

A broad task is one that cannot be easily broken down into pieces to attack. That's not to say you can't try; it's just that there are large numbers of unknowns at the start of the process.

A narrow task is one that is easily defined, narrow in scope (ie. tackles a very specific, small problem), and that can be estimated with reasonable accuracy.

Examples of broad tasks include any kind of integration work, any kind of porting work from one platform to another, and any time you need to build a new framework.

Examples of narrow tasks are those that could be described as "small features". For example, adding the ability to drop down a color-picker and select a color in an editor would be a narrow task. Adding the ability to specify that a mesh uses 16-bit floating point values for vertex positions instead of 32-bit floating point values would also be a narrow task.

The Graph of Progress over Time

Here is a graph of what a broad task looks like to the engineer doing it, and to the outside world, in terms of progress made. It's slightly scary, but that's ok, we're going to go through it in detail.


Graph of engineering progress over time on a broad task

We're focusing here on the experience of the engineer themselves (because it's important to understand what they're going through as they do the work in question), and the experience of their managers/investors... the management layer which has a strong interest in getting the task done, but might not care quite so in-depth about how it's achieved. Basically, the people with financial responsibility for the work. The ones who have to explain to investors what's going on - or indeed, the ones who have to go to investors and ask for more money for the project to continue on a regular basis.

Visible Progress

In the above diagram, the yellow body of the graph is the amount of visible progress. This is what everyone else sees - managers, executives, people on other teams - unless you're providing detailed status updates. And by detailed, I mean, showing what you did every single day. It's unfortunate, but perception really is king - you will hang or fly based on what other people can see you doing.

With broad tasks, it's crucial therefore to make sure people know what's going on - they won't understand if you're not making visible progress - all they have to measure you by is the visible progress alone.

Expected Activity

The red dashed line (the straight one, for those of you who are color-blind, and I apologise for my poor color scheme if you are) is the expected amount of progress. It's when this line doesn't match the visible progress that people get ... well... kind of nervous.

In a perfect world, this line is what your output would be. But people aren't machines. Even taking unforeseen circumstances (life intruding on work, discovering that the task is a bottomless pit that no one understood until now) into account, no-one ever works like this. Let's ignore the ebb and flow of up and down days, and assume that this line is an average. Even so, the shiny happy engineer will not match this line (unless they're an engineer in the category I like to call Bulldozer). They're human, and there's a stimulus-response thing you have to take into account here.

Either way, for planning purposes (both because you can't really plan any other way, except by padding... and also because they're not the ones doing the work, so they're removed from the immediate situation), this is what management normally expect of their warm bodies.

Actual Activity

This is the green, curvy dashed line. And it's curvy. Boy it's curvy. And it's a little badly drawn (view it more as an emotional line; if I'd done it completely right, it would always slope at least imperceptibly upwards; at worst, it'd be horizontal).

This line is the actual productivity and progress of an engineer on a given broadly scoped task. Why is it shaped like that, you ask? Well, let's break it down.

The Lifecycle of a Broad Task from an Engineer's Perspective

Ramp Up

Any broad task can be pretty reasonably expected to be at least somewhat new to the engineer performing it. A certain amount of book-keeping and surveying has to be done before the engineer can even begin, because you need to figure out where to start, and where you're going.

This part of the curve is a gentle, shallow ramp (far left of the graph). It soon peaks, however, as the work gets going.

Fatigue

Next up is our first dip. Literally, at this point, fatigue sets in on the developer. They've been bashing their head against their desk trying to break the task down into bits and pieces they can manage. They've been working a while, and churning on the problem, but at this point their work provides little gratification. Their own perceived progress level is excessively low compared to the amount of work left on the task - and that worsens the problem, because now they're feeling like they're useless and they still have a huge task ahead of them.

Disheartened, they step off the gas pedal, until either someone steps in and prods them, or they get their own internal second wind. They're in the middle of a death march at this point, and they know it.

First Flush of Success

Hey, what do you know? That curve does turn back up (unless the engineer in question quits - which, if it's a huge task with no end in sight, does happen). Why?

Broad tasks are basically - as far as perceived progress goes - exponential. You're tackling problems all over the place at first, and nothing works. Then, as you keep ploughing on through the problems, you start to gather steam as you get enough of the foundation problems fixed to start tackling specific problems.

The problem with foundation issues is that ultimately, you have to fix them. You can't put them off. If you don't fix them, nothing works. But fix enough of them, and that picture changes. You can start tackling specific problems. And specific problems have a number of great qualities:

  • They're narrow (aka specific). You can look at them, and see the shape of them. They fit in your brain.
  • They're finite in scope.
  • They're understandable.
  • They're typically jenga blocks in reverse. Take them out, and whole systems come online.

Do the problems actually change at all? No, you're actually fixing specific problems the whole time. It's just that they were buried in a sea of other problems, and so they were masked, and you were also thinking about all the other problems as well.

Think of it as a huge party, where hundreds of people are shouting as loudly as they can, and everyone's standing shoulder to shoulder... and not only that, but you're the host, so your responsibility is to go and talk to every single person - even if you can't even squeeze through the crowd. Once you get a little breathing room and throw a bunch of people out, you can start having a normal conversation, and spend some quality time with your guests.

Power Curve

Here's the fun part. OK, so you're starting to thin out the problem "herd" and you've finally hit the point where not only do you know most of the systems you're working on pretty well at this point (ie. you know the mental geography of the problem you're fixing), but the herd is really thinning out. You've reduced the sea of problems down to a few rivers.

And all of a sudden, your activity level goes up, because you're making real progress. At least, you can see that you're making it.

Back to the cocktail party analogy - you're looking for people who need a fresh drink. Much easier to spot when there's only about 10 people in the room, instead of 100. Not only that, but you start noticing similarities between them - maybe three people want martinis, so you can do all of those in one go and save yourself some time.

All of a sudden, you're fixing problems, and systems start to come online. Maybe they don't work yet, but you know what you have to do to get them to work. And you know what? That's a great feeling.

In the Zone

This is the great part. All of a sudden, you've reached the point where you're making regular, fast-paced progress. The brunt of the task has been done, and now you're at a point where you can get a good handle on your work. Here's the cool bit.

This is where the externally visible progress part of the graph takes off like a rocket. You've already slipped out of your funk, and have been grooving on the system for a while now. And with your happiness at the progress you've made, and all the positive feedback that gives, you're on a bit of a high. So your output increases.

But not only that... shortly after this point, not only have you nailed enough of the problems to the wall that you're happy about what you're doing and can make targeted attacks, but enough of the foundation work has been laid that the problems that remain aren't completely deadlocking your work from... well... working. All of a sudden whole systems are coming online with a few annoying hangnails here and there where things crash or don't work right. The important thing though is that until this point those systems didn't appear to work at all.

It's like an exponential curve. You've got to get past all of the long flat part until you hit the knee, and then it shoots off like a rocket.
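
To put rough numbers on that picture, here is a toy model. It is only a sketch: it assumes visible progress follows a normalised exponential and that expected progress is the straight line from 0 to 1, and the growth factor `k = 5` is an arbitrary pick, not something measured. The function names are made up for illustration.

```python
import math


def visible_progress(t, k=5.0):
    """Toy model: exponential visible progress, scaled to be 0 at t=0 and 1 at t=1."""
    return (math.exp(k * t) - 1.0) / (math.exp(k) - 1.0)


def knee(k=5.0):
    """Time at which the visible curve becomes steeper than the expected straight line.

    The slope of visible_progress is k * exp(k * t) / (exp(k) - 1); setting it
    equal to 1 (the slope of the straight line) and solving for t gives this.
    """
    return math.log((math.exp(k) - 1.0) / k) / k


for t in (0.25, 0.5, 0.75, 1.0):
    print(f"time {t:.2f}: expected {t:.2f}, visible {visible_progress(t):.2f}")
print(f"knee at roughly t = {knee():.2f}")
```

In this toy model the visible curve sits well under the straight line for most of the schedule, and only past the knee does it start climbing faster than the expectation.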

Suddenly, to the outside observer, you're making continued, amazing, daily (or even hourly!) progress. Has anything changed?

Not really. You hit that point personally about a week before, but no one could see what the fuss was about. But the slope of your expected activity curve is now looking mighty shallow compared to the visible progress. Vindication! All those hours where people were wondering if you should be considering an alternative career path are now worth the pain.

And what does that do?

It bumps your activity up even further. Now you're jazzed, because not only can other people see what you knew you'd been doing all along, but the task is now starting to look easy to you. The visible progress curve lags the actual progress curve; at this point, you're down to a cocktail party with about 3 people, and you're serving the coffee and brandy.

Back-Patting

It's inevitable that an engineer after a long, arduous task, is going to indulge in a little back-patting. After all, if this was a race, then they've just been sprinting non-stop - which was a wonderful change of pace after trying to run through the middle of a swamp.

The only issues left to fix are pretty much in two categories - minor ones, which aren't preventing immediate progress, or huge ones which were discovered as the work went along, and have been worked around for now, and need some serious TLC in the near future.

But not right now. Because right now, they need a bit of a rest.

This phase is pretty short. It comes as the number of issues remaining drops to only one or two. The top of the visible progress power curve is reached right at this point, and begins to slope down again - but now it's pretty much consistent progress going forward.

Refractory Period

Back patting is over, the engineer is a little mentally exhausted still (the low after the high of getting as far as they did). Now starts the ramp back up to regular productivity levels (the expected line now does have true meaning, because presumably we're back to narrow scoped tasks).

If you're going to give them some time off for good behavior, now is the time. They've milked their praise, and that little extra bit of buzz is wearing off. So now that they've done their work, and given their bows to their peers in the audience, let them go off stage gracefully and recoup their physical and mental energy.

And lo and behold... the task is done. It was a rocky road. And some rock stars were discovered.

The Secret

Some people can get past the initial curve by discipline and stiff upper lips alone (the aforementioned Bulldozers).

If you don't want to get fired, fight that visible progress curve by providing regular progress updates - never go "dark".

And the biggest secret of all?

All tasks work like this. Even the little narrow focused ones. It's just that it's over such a short period of time that you might not even notice the ups and downs, and the relative durations of each of the phases will change. About the only time it differs is if you've done something very similar many times before, in which case you might be able to avoid most of the doldrums at the start.

Even big, multiperson tasks work like this too (eg. Entire Projects). Different people in different disciplines might be at different points in the curve, but large projects do work like this.

Project Managers, Lead engineers (call them what you will) try to fight the visible progress curve with a number of tools - milestones being one of them - for a number of reasons. Executives and financiers don't like it when a project isn't making visible progress at a consistent rate. Potentially more importantly, people can feel the ebb and flow of a project, and this can affect their productivity. Best to keep a good shiny face on things, and continuous displays of concrete progress will do that for the team - and that's good for the health of the team.

The only trick is, don't ever, ever, ever jeopardize the foundation of your project (you know, that long shallow part) solely to satisfy the need for visible progress. Because that will come back and bite you in the ass, big time, as that foundation you skimped on crumbles around you later. If you must muck around with that part, make sure you have a plan, everyone agrees on that plan, and you have a written document which specifies how and when you're going to go back and fix things up. It'll be more painful than doing it right up front, but that's the trade off you make.

More on that in a later post.


Monday, August 13, 2007

Off to Gamefest

I'll be at Gamefest for the next couple of days... w00t :)

Nice change of pace... lots to learn... friends to reune with. (OK, so that's probably not a word, but how else would you turn reunion into a verb? :D)

Should be fun.


Tuesday, August 07, 2007

Touched...

A while back I wrote an article on CodeProject about a bi-partite circular buffer algorithm I came up with to handle asynchronous network IO. (It's also useful for other things - pretty much any scenario where you have to pass data in contiguous blocks to other APIs, yet you don't know exactly how much data you're going to be passing at any time).
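
For anyone who hasn't seen the idea, here is a stripped-down sketch of a two-region buffer in Python. To be clear about what this is: it is not the CodeProject code, the class and method names (`BipBuffer`, `reserve`, `commit`, `peek`, `consume`) are picked for this example, and it is single-threaded with no locking. It only shows the core trick: data lives in at most two regions, so both the writer and the reader always get one contiguous slice.

```python
class BipBuffer:
    """Two-region circular buffer: reads and writes always use one contiguous slice."""

    def __init__(self, size):
        self._buf = bytearray(size)
        self._a_start = self._a_len = 0  # region A: the data that gets read first
        self._b_len = 0                  # region B: always starts at offset 0
        self._r_start = self._r_len = 0  # the outstanding write reservation

    def reserve(self, want):
        """Return a writable contiguous view of up to `want` bytes, or None if full."""
        if self._b_len:
            # B exists: only the gap between the end of B and the start of A is free.
            start, free = self._b_len, self._a_start - self._b_len
        else:
            after_a = len(self._buf) - (self._a_start + self._a_len)
            if after_a >= self._a_start:
                start, free = self._a_start + self._a_len, after_a
            else:
                start, free = 0, self._a_start  # wrap: this space becomes region B
        if free <= 0:
            return None
        self._r_start, self._r_len = start, min(want, free)
        return memoryview(self._buf)[start:start + self._r_len]

    def commit(self, used):
        """Publish the first `used` bytes of the last reservation."""
        used = min(used, self._r_len)
        if used:
            if self._a_len == 0 and self._b_len == 0:
                self._a_start, self._a_len = self._r_start, used
            elif self._r_start == self._a_start + self._a_len:
                self._a_len += used  # the reservation extended region A
            else:
                self._b_len += used  # the reservation was in region B
        self._r_start = self._r_len = 0

    def peek(self):
        """Return the oldest committed data as one contiguous read-only view."""
        end = self._a_start + self._a_len
        return memoryview(self._buf)[self._a_start:end].toreadonly()

    def consume(self, n):
        """Release the first `n` bytes of region A; B takes over once A is empty."""
        if n >= self._a_len:
            self._a_start, self._a_len = 0, self._b_len
            self._b_len = 0
        else:
            self._a_start += n
            self._a_len -= n
```

The usual loop is `reserve`, let the other API fill the view, `commit` however many bytes it actually wrote, and on the reading side `peek`, hand the view off, then `consume` what was used.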

Well.. I just came across this blog post from someone:

A Pure Programmer
I read an article about circular buffer and related code, which written by Simon Cooke. It’s very good. I never heard of Simon Cooke before. I was moved by the last words in his article: “If you do find it (the code he write) useful, or use it in any of your code, all that I ask in return is that you drop me an email and let me know how the code is being used. It’s nice to know that it's out there, alive, and doing cool things."
Oh,what a pure programmer!

Awww... bless. It warms my heart that someone appreciates my work, it really does.

I must admit, I was rather proud of that little piece of code. It worked out to be pretty fast too - the only way to do anything faster would have been to use the virtual memory mirroring technique I laid out in the article - but unfortunately, that doesn't work on some architectures. (I'm looking at YOU, XBOX 360). Not sure if it's multiproc safe on other systems too. Damn cache coherency, I stab at thee.

Followup:

So I got a bit narcissistic and did a Google search on the Bip Buffer... and lo and behold, people are using it. One guy's looking at it as a way of performing least-wear writing to flash memory (now that's a cool application I never even thought of!). So glad this code's getting some use!!!


Sunday, July 15, 2007

Simon's Law of Software Development (and Evolution)

OK, I know, a bit egocentric to name a law after yourself, but here goes. It applies in all kinds of places, but it's really for the most part a Software Development law.

Architecture is persistent.

The moment you sit down and start writing code, you're laying down the architecture for your software. It doesn't matter what you write, it has infrastructure. All code is infrastructure, with other code that does something to the data. Most of the code you will ever write is spent navigating that infrastructure to gather data and perform operations on it, and then store that data out.

This gives us a number of interesting little results that you see in the real world all the time:

Once you write some code, the architecture you decided on sticks around.

Seriously. The more code you write, the more the architecture you specified becomes embedded in every piece of code you write. You can't help it - it's part of the code's DNA. Every choice you make will pervade everything your code does forevermore.

For example... You write a large piece of code which has a scripting system which is used for everything - header generation, you name it. Script based objects are garbage collected.

Now, later, you decide "Hey, wait a minute... this GC is slowing me down... the script language isn't completely fleshed out so it's hindering me, and it's not suitable for all of the tasks I envisioned at the start".

Tough cookies. At this point, a year later, you now have so much code written based on this architecture, that even if you find ways around it, your code will still taste of the existing architecture. You can refactor all you want, but until you deliberately start a second branch of code that is an isolated and sterile environment, that only talks to the original code through very strictly specified APIs - kind of a software firewall - you'll have that architecture still. Even if you rip out every single line of code that uses the original architecture, unless you effectively develop the new code in a vacuum, your code will still have that architecture embedded deep inside it.

People who have worked on large projects involving 3rd party code which doesn't work as advertised should recognise this one.
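
As a rough illustration of that "software firewall", here is one way the boundary might be drawn. Everything in it is hypothetical: `LegacyScriptEngine` is a stand-in for the old script-driven code, and `Mesh`, `MeshSource`, and `LegacyMeshSource` are names invented for this sketch. The point is only that new code sees a small typed interface, and exactly one adapter knows the old architecture exists.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence


class LegacyScriptEngine:
    """Stand-in for the old architecture: everything is a script, everything is a string."""

    def run(self, script: str) -> dict:
        # Pretend every mesh has 12 vertices; a real engine would do real work here.
        return {"name": script.split(" ", 1)[1], "verts": "12"}


@dataclass(frozen=True)
class Mesh:
    """The only data type allowed to cross the boundary into new code."""
    name: str
    vertex_count: int


class MeshSource(Protocol):
    """The strictly specified API that new code is written against."""

    def load(self, name: str) -> Mesh: ...


class LegacyMeshSource:
    """The firewall: the one place that knows the legacy engine exists."""

    def __init__(self, engine: LegacyScriptEngine) -> None:
        self._engine = engine

    def load(self, name: str) -> Mesh:
        raw = self._engine.run(f"load_mesh {name}")
        # Translate the legacy shapes (script strings, string-valued dicts) right here.
        return Mesh(name=raw["name"], vertex_count=int(raw["verts"]))


def total_vertices(source: MeshSource, names: Sequence[str]) -> int:
    """New code: sees only MeshSource and Mesh, never a script string or a raw dict."""
    return sum(source.load(n).vertex_count for n in names)
```

If the legacy engine is ever replaced, only `LegacyMeshSource` should need to change, which is a reasonable test of whether the boundary is actually holding.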

If you don't specify an architecture, you get one anyway.

It might not be the architecture you want, but it's certainly an architecture. The problem is that the code will generate structure as it goes, and that structure is architecture. It'll persist just as much as the original architecture, and all you can hope for is that this hackery wasn't part of your main code path. (Which means you might, just might, be able to snip it out).

Never write code ad hoc. Always at least spend a moment thinking about the architecture of the code you're writing. If you're dealing with 3rd party code, try to analyze the architecture of their system before committing your own code to it - misunderstanding their design is bad, as you'll end up creating incompatible architecture around it, and doubling your work - you'll have wrapper code which does more than it needs to, and then that'll spawn its own kind of architecture.

If you're dealing with 3rd party code which hasn't got any architecture, and was developed ad hoc, you're kind of screwed. It'll be difficult to get the architecture to sit in your brain.

Most software development - from a human perspective - is the building of abstract models of how something works. The easier this is for someone, the more easily used that code will be. If your architecture is clean, easy to understand, logical and consistent, it'll sit well in someone else's brain because the number of details they need to remember goes down rapidly. They don't have to remember all of the function names in your code, or all of the nitty gritty details. All they need to remember is roughly what shape that code has (its model), and how that relates to other similar code they've seen, and other design patterns they've experienced.

Take WxWindows. It shares a number of architectural similarities with ATL Windowing code, WTL, MFC and .NET Windows Forms. (Personally, I don't like working with it as much as any of the aforementioned libraries for a variety of reasons, but that's beside the point). This means that anyone with experience in Win32 UI development should be able to very quickly pick up WxWindows programming. The models are the same - the architecture is similar. Basically, it's a message passing system, with event hooks, and some resource handling for the creation of controls. The function names may be different, and some of the implementation details may change, but for the most part it's an easily understood model - because it's a similar model to other environments.

Things that can make code easier to understand for other engineers tend to also be things that give a solid architectural base:
  • Design your code first. Make a laundry list of the different things you're going to need to make your app, and figure out how you want to handle everything on that list. It should be as exhaustive as possible. You can always add later, but you want to be able to find any fundamental architectural pieces that you need to implement so that the rest of your code will use it consistently. You don't want any surprises.
  • Try to write code from the top down. When writing any class, decide how you - as a programmer - would want to write code against that class. Write that code first. Then write the class to fit. Stub out functions if you need to (but make sure you mark them so that you know you need to finish them up).
  • Write comments at the function and class level. These describe the model you're using, and separate out implementation specific details (eg. how exactly you write a serialization function) from the structure of the application (ie. the fact that it actually performs serialization in a standardized fashion).
  • Once you've decided on a pattern for handling a scenario in your code, stick with it. Don't alternate patterns. For example, if you return error codes in one place, return them everywhere. Don't switch to exceptions somewhere else. If you need to use them, wrap them in a try catch at the lowest level you can, and return an error code as a result.
  • Read as much code as you can. Try to find out what works well, and what doesn't. More importantly, try to understand if there was a good reason behind the way that code was written, or if it was created ad hoc.
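
The "stick with one pattern" advice from the list above can be sketched like this, assuming a codebase that has settled on error codes. The names `Err`, `parse_settings`, and `window_title` are invented for the example, and it assumes settings arrive as JSON text holding an object. The exception raised by the JSON library is caught at the lowest level and turned into a code, so nothing above it ever needs a try block.

```python
import enum
import json


class Err(enum.IntEnum):
    OK = 0
    BAD_FORMAT = 1
    MISSING_KEY = 2


def parse_settings(text):
    """Return (Err, settings). The library's exception stops here and goes no further."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return Err.BAD_FORMAT, None
    if not isinstance(data, dict):
        return Err.BAD_FORMAT, None
    return Err.OK, data


def window_title(text):
    """A caller that follows the same convention: check the code, pass it up unchanged."""
    err, settings = parse_settings(text)
    if err is not Err.OK:
        return err, None
    title = settings.get("title")
    if title is None:
        return Err.MISSING_KEY, None
    return Err.OK, title
```

Error codes are not the idiomatic default in Python, so treat this as an illustration of consistency rather than a recommendation for Python code; a codebase that has chosen exceptions should apply the same rule the other way round.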

In summary?

Other programmers aren't stupid. They tend to be smart, driven, passionate... and under tight deadlines. And the nasty side effect of tight deadlines is that architecture drops by the wayside. Try to keep your architecture as clean as you can, no matter what the deadline. The extra time you spend now will pay itself back later when you need to maintain or extend that code. And other developers who have to work on it will thank you for it.

What's more, if you have a clean architecture, it'll fit in your brain better, and you'll spend less effort. What's better than that?
