Sometimes Users do know what they want
Monday, June 25th, 2007
When gathering specs, the first rule to keep in mind is:
Or rather, what they think they want is usually not what they need. This is why it is important to study the process you are trying to automate, rather than simply ask the users what the software will need to do for them. Chances are that they will concentrate on superficial features and omit big chunks of crucial business logic. Also, most users think in what I call Flat Table or Excel mode.
When the users describe the data they collect and manage, most of us will immediately try to categorize it into entities with relationships. Some people do it in their head, others (like me) break out a notepad and start drawing E/R or UML diagrams. This way of thinking is actually the natural, correct way we should think about data. Unfortunately, most human beings do not see data like that because they were never taught how to store information in databases. For most people the primary tool for storing and tabulating large quantities of data is Excel - which is essentially one large flat table. Users simply don’t know that there is a different way. They don’t know about normalization, and they do not realize that data redundancy is bad.
Database design is just a small example - but there are more pitfalls like this at almost every stage of development. This is why you sit down with the users, and watch what they do instead of just asking them what they want.
Of course, when you are implementing something new - something that is not part of their current work process - you are back to asking questions. And in those cases, sometimes users do know what they want.
Recently I fell into the trap of over-engineering a simple solution. My users wanted to track information about their clients. It was about 30 fields that could be easily divided into two entities - or three. Part of the data was purely informative - addresses, phone numbers, and names of contacts and directors of client companies - both lenders, who would request audits, and borrowers, who would be audited. Then there was a bunch of numbers collected during the actual audit at the borrower site.
My boss exhibited the classic Excel approach here - he wanted to dump everything into a single table. But I knew better. I could easily see that the audits are done on a recurring basis, so for each lender and borrower pair you could have multiple audits performed at different times. So I designed my database with 3 entities - lender, borrower and audit, where each audit is associated with exactly one lender and one borrower, and both lenders and borrowers can have multiple audits. It made perfect sense, and as an added benefit, it allowed you to track the progression of our numbers across several audits.
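To make the three-entity design concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names follow the entities described above, but every column name, the sample rows, and the helper functions `build_db` and `earnings_history` are assumptions made up for illustration; the real project had about 30 fields and was not necessarily built on SQLite.

```python
import sqlite3

# Hypothetical schema: column names are illustrative stand-ins,
# not the fields from the actual project.
SCHEMA = """
CREATE TABLE lender (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE borrower (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE audit (
    id          INTEGER PRIMARY KEY,
    lender_id   INTEGER NOT NULL REFERENCES lender(id),
    borrower_id INTEGER NOT NULL REFERENCES borrower(id),
    audit_date  TEXT NOT NULL,  -- ISO date, so text ordering matches date ordering
    earnings    REAL            -- stand-in for the numbers collected on site
);
"""


def build_db():
    """Create an in-memory database with one lender, one borrower, two audits."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO lender (id, name) VALUES (1, 'Example Lender')")
    conn.execute("INSERT INTO borrower (id, name) VALUES (1, 'Example Borrower')")
    conn.executemany(
        "INSERT INTO audit (lender_id, borrower_id, audit_date, earnings) "
        "VALUES (?, ?, ?, ?)",
        [(1, 1, "2006-06-30", 100.0), (1, 1, "2007-06-30", 120.0)],
    )
    return conn


def earnings_history(conn, lender_id, borrower_id):
    """Return (audit_date, earnings) for one lender/borrower pair, oldest first."""
    rows = conn.execute(
        "SELECT audit_date, earnings FROM audit "
        "WHERE lender_id = ? AND borrower_id = ? "
        "ORDER BY audit_date",
        (lender_id, borrower_id),
    )
    return rows.fetchall()
```

Calling `earnings_history(build_db(), 1, 1)` returns one row per audit, which is the "progression across several audits" that this design makes possible.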
But of course this progression is tracked in the actual audit reports - in much greater detail and scope. What my boss really wanted was to have a database of clients, periodically updated with the latest data about their earnings and loan details. I found this out during a demo of the almost-completed project. I also found out that the numeric data should be extracted directly from Excel work-papers for a given audit, and that it should overwrite the previous data so that we store only the latest info.
So now I have a choice. I can keep my current schema and just deal with the fact that borrower and audit will always have a pointless 1-1 relationship, which will force me to do unnecessary joins and update data in multiple steps. Or I can merge the two tables and rewrite much of the existing code where necessary.
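For comparison, the merged option might look roughly like the sketch below, again using Python's sqlite3 module. The column names and the helpers `open_db`, `record_latest_audit` and `latest_numbers` are hypothetical, and the lender side is left out to keep the example short. The point is only that when the newest audit overwrites the previous one, a single UPDATE on a single table is enough and no join is needed to read the data back.

```python
import sqlite3

# Hypothetical merged table: the audit numbers live directly on the borrower row.
MERGED_SCHEMA = """
CREATE TABLE borrower (
    id              INTEGER PRIMARY KEY,
    name            TEXT NOT NULL,
    last_audit_date TEXT,  -- ISO date of the most recent audit
    earnings        REAL   -- stand-in for the numbers from the work-papers
);
"""


def open_db():
    """Create an in-memory database holding a single example borrower."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(MERGED_SCHEMA)
    conn.execute("INSERT INTO borrower (id, name) VALUES (1, 'Example Borrower')")
    return conn


def record_latest_audit(conn, borrower_id, audit_date, earnings):
    """Overwrite the stored numbers with those from the newest audit."""
    conn.execute(
        "UPDATE borrower SET last_audit_date = ?, earnings = ? WHERE id = ?",
        (audit_date, earnings, borrower_id),
    )


def latest_numbers(conn, borrower_id):
    """Read the client record and its latest numbers from one table, no join."""
    return conn.execute(
        "SELECT name, last_audit_date, earnings FROM borrower WHERE id = ?",
        (borrower_id,),
    ).fetchone()
```

Recording two audits in a row for the same borrower leaves exactly one row behind, holding only the second set of numbers. That is the behavior my boss asked for, and it is also the reason the history is gone.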
Either way, I would have been better off listening to the users in the first place. Because, as it turns out, they sometimes do know exactly what they want. So I suggest that you make the second rule of gathering specifications:
You may think that you know what users want or need - but chances are you will be wrong. The specifications must come from the users, and reflect their process and their needs - not your educated guess about what the process should look like. Because ultimately, they are the ones who will be using this software - and despite the fact that they are usually unable to communicate it well, they do have some sort of end-product in mind when they ask you to develop something.