
Copyright © 2008 by Creekwood Digital Solutions. All rights reserved. No part of this web site may be reproduced without permission.

Cleaning up your data

Turning Data into Information

Having lots of data is nice, but on its own it doesn't do you much good. What most businesses need is information. Which five products generate the highest profit? Which members haven't paid their dues? Which clients on the catalogue mailing list have purchased so little that the profit was less than the cost of mailing?

To get good answers out of data, we need to collect the right data, be able to relate different sets of data (e.g. sales information with profit and with clients), know how accurate the data is, and have it in a format that allows for processing and analysis. Only then can it be turned into information that can be used to make business decisions (like reducing catalogue mailings to some customers).
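As a rough illustration of relating sales data to clients, here is a minimal sketch that answers the catalogue question using Python's built-in sqlite3 module. The table names, column names, client names, profit figures and the mailing cost are all assumptions made up for this example; they are not taken from any real system.

```python
import sqlite3

# Assumed cost of mailing one catalogue (hypothetical figure).
MAILING_COST = 4.00


def build_demo_db():
    """Create an in-memory database with invented clients and sales."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE clients (
            client_id INTEGER PRIMARY KEY,
            name TEXT,
            on_mailing_list INTEGER
        );
        CREATE TABLE sales (
            sale_id INTEGER PRIMARY KEY,
            client_id INTEGER REFERENCES clients(client_id),
            profit REAL
        );
        INSERT INTO clients VALUES
            (1, 'Avery', 1), (2, 'Blake', 1), (3, 'Casey', 1), (4, 'Drew', 0);
        INSERT INTO sales (client_id, profit) VALUES
            (1, 25.0), (1, 10.0), (2, 1.5), (4, 0.5);
    """)
    return conn


def unprofitable_clients(conn, mailing_cost):
    """Return (name, total_profit) for mailing-list clients whose total
    profit is less than the cost of mailing, lowest profit first."""
    query = """
        SELECT c.name, COALESCE(SUM(s.profit), 0) AS total_profit
        FROM clients AS c
        LEFT JOIN sales AS s ON s.client_id = c.client_id
        WHERE c.on_mailing_list = 1
        GROUP BY c.client_id, c.name
        HAVING COALESCE(SUM(s.profit), 0) < ?
        ORDER BY total_profit
    """
    return conn.execute(query, (mailing_cost,)).fetchall()


if __name__ == "__main__":
    for name, profit in unprofitable_clients(build_demo_db(), MAILING_COST):
        print(name, profit)
```

The LEFT JOIN matters here: a client on the mailing list who has never bought anything has no sales rows at all, and still needs to show up in the answer.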

Data Structuring

There are a number of elements to data structuring. The most obvious one is collecting all of the required pieces of the puzzle and having them entered into a computer. Less obvious is collecting the data in a form that is useful for more than one purpose. I fixed one case where some poor secretary had keyed in name and address information for hundreds of people, but had not been given instructions on the format. While it printed nicely as a name and address list, it was useless for the required mail merge. I spent a couple of hours with the data and restructured it so it was useful, saving the day it would have taken to re-key and re-proof the data.

Read "How to avoid an ugly mailing list", an article inspired by this incident.
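To give a feel for this kind of restructuring, here is a minimal sketch in Python. It assumes a layout that is purely hypothetical (not the format from the incident above): entries separated by blank lines, the name on the first line, street lines in the middle, and a last line of the form "City, Province PostalCode". Real lists are rarely this tidy, so treat it as a starting point only. The sample names and addresses are invented.

```python
import csv
import io
import re

FIELDS = ["first_name", "last_name", "street", "city", "province", "postal_code"]


def parse_entry(block):
    """Split one free-form address block into separate fields.

    Assumes: first line is the name, last line is
    'City, Province PostalCode', anything in between is the street.
    """
    lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
    if len(lines) < 3:
        raise ValueError(f"Entry needs manual review: {block!r}")
    # Everything before the last space is treated as the first name(s).
    first_name, _, last_name = lines[0].rpartition(" ")
    city, _, rest = lines[-1].partition(",")
    province, _, postal_code = rest.strip().partition(" ")
    return {
        "first_name": first_name,
        "last_name": last_name,
        "street": ", ".join(lines[1:-1]),
        "city": city.strip(),
        "province": province,
        "postal_code": postal_code.strip(),
    }


def restructure(text):
    """Parse a whole list of blank-line-separated entries."""
    blocks = re.split(r"\n\s*\n", text.strip())
    return [parse_entry(b) for b in blocks]


def to_csv(rows):
    """Render the parsed rows as CSV text, ready for a mail merge."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


SAMPLE = """John Doe
12 Example Street
Oakville, ON L6H 1A1

Mary Ann Smith
Unit 4
99 Sample Road
Burlington, ON L7L 2B2
"""

if __name__ == "__main__":
    print(to_csv(restructure(SAMPLE)))
```

Entries that don't fit the assumed layout raise an error rather than being guessed at, so they can be set aside and fixed by hand.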

Relational Data

In a nutshell, this refers to making sure the data in one "bucket" can be related to data in another "bucket". For instance, a web site might have a members-only section, enforced by a list of user names and passwords. Suppose this is set up by creating a list of current members from the membership list and assigning encrypted passwords. Someone writes a macro to change "John Doe" into "DoeJ", and if there is another John Doe, he is assigned "DoeJ2". That works fine for a year, until one of them fails to renew. OK, which one do we delete from the access list? Without some unique identifier to distinguish one John Doe from the other, you'd have to delete them both and assign the remaining John a new password. It's nice to have someone who can think this out in advance and foresee (most of) the problems, even when the need to relate data is not obvious. That's what I do.
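The John Doe problem above can be sketched in a few lines of Python. This is only an illustration under assumed names: the Member and MemberRegistry classes are hypothetical, and passwords are left out entirely. The point is that the access list is keyed by a member ID that never changes, while the user name is just a label derived from the person's name.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Member:
    member_id: int  # unique identifier shared with the membership list
    name: str
    username: str   # derived label, e.g. "DoeJ" or "DoeJ2"


class MemberRegistry:
    """Access list keyed by member_id instead of by the derived user name."""

    def __init__(self):
        self._next_id = count(1)
        self._members = {}  # member_id -> Member

    def add(self, name):
        # "John Doe" -> "DoeJ"; a second John Doe becomes "DoeJ2".
        first, _, last = name.rpartition(" ")
        base = last + first[:1]
        taken = {m.username for m in self._members.values()}
        username, n = base, 1
        while username in taken:
            n += 1
            username = f"{base}{n}"
        member = Member(next(self._next_id), name, username)
        self._members[member.member_id] = member
        return member

    def remove(self, member_id):
        # The ID says exactly which John Doe failed to renew.
        return self._members.pop(member_id)

    def usernames(self):
        return sorted(m.username for m in self._members.values())


if __name__ == "__main__":
    registry = MemberRegistry()
    john_a = registry.add("John Doe")
    john_b = registry.add("John Doe")
    registry.remove(john_a.member_id)
    print(registry.usernames())
```

Because the membership list and the access list share the same ID, removing the lapsed member leaves the other John Doe's login untouched.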

Data Analysis

Once the data is collected, accurate, well-structured and properly related, getting reports out is a piece of cake. Heck, you could do it. In fact, that's what I aim for in the systems I write. Not that there isn't money to be made in writing endless reports and charts, and I'd certainly do it for you if you want, but do you think that waiting for me to do it is the right way to run your business? One of the nicest things a client said to me about a system I developed was: "I prepared the reports for this meeting in three minutes, but don't tell the Board, they think it takes days!" Sounds to me like we got the specs for that system just about right.

Do you have messy data that needs cleaning up to get it in usable shape? I might be able to help you, so please feel free to contact me by phone or e-mail.

More Details
Software Tools: Access, Paradox, Excel, Quattro Pro, Delphi
Technologies used: Relational database, object-oriented, event-driven
Specialties: Data-driven design, data conversion, data clean-up, relating diverse data sources
Locality: Typically, I communicate via email and occasionally by phone, and this works fairly well for many activities. For face-to-face contact, you'd need to be in Oakville or possibly Burlington or Milton. I tried commuting, and didn't like it.
Ideas: Bulk mailing, membership administration, cataloguing, address lists, patient information, inventory, client lists, contacts, funding, donors, volunteers