Archive for June, 2007

Parsing Excel Files with Perl

Wednesday, June 27th, 2007

My company likes to store tons of useful information locked away in excel files. I understand that not everyone understands how databases work, or how to use them. I have no clue how this happened but at some point Excel became the de-facto standard for your every day data storage needs - despite the fact that flat text files are often much better for this.

For example, given a simple tabulated list in a plain text file, I can grep through it, sort it, re arrange it or analyze it using a myriad of mature and time tested text parsing tools. I can also import it into just about any kind of software, and easily write scripts against it. Excel on the other hand, is much less flexible. But there is something about the neat rows and columns of a spreadsheet that draws people to it.

I much prefer to issue a quick command in bash than to open a bulky office application to get some basic info about an employee, or a client. So I decided to write a quick Perl script to parse through Excel files located on a network share (this probably is a common scenario in most offices). Surprisingly, it was very easy.

First you will need to mount the network share on your box. Next you will need the Spreadsheet::ParseExcel package from CPAN.

Here is the script I hacked up to extract data from a sheet which has a unique (searchable) identifiers (here people’s names) in column D, and relevant data in columns E, F, and H:

#!/usr/bin/perl -w
use strict;
use Spreadsheet::ParseExcel;
 
my $FILE = "/path/to/File.xls";
my $SHEETNAME = "Sheet1";
 
# the column that contains searchable key
my $KEY_COLUMN = 3;
 
my $searchstring = $ARGV[0];
 
my $excel = Spreadsheet::ParseExcel::Workbook->Parse($FILE);
my $sheet = $excel->Worksheet($SHEETNAME);
 
foreach my $row ($sheet->{MinRow} .. $sheet->{MaxRow})
{
  my $key	= $sheet->Cell($row,$KEY_COLUMN);
 
  if($key)
  {
    my $f1	= $sheet->Cell($row,4);
    my $f2	= $sheet->Cell($row,5);
    my $f3	= $sheet->Cell($row,7);
 
    if($key->Value() =~ m/$searchstring/)
    {
      print "\n\n";
      print "Key: " . $key->Value() . "\n";
      print "Field 1: " . $f1->Value() . "\n" if($f1);
      print "Field 2: " . $f2->Value() . "\n" if($f2);
      print "Field 3: " . $f3->Value() . "\n" if($f3);
      print "\n\n";
    }
  }
}

This is of course not the most efficient script since I’m looping through all the rows in the spreadsheet. Can you say O(n)? But it’s good enough for what I need it to do.

How do you indent your code?

Tuesday, June 26th, 2007

Talking about indentation styles is like talking about religion. Everyone thinks there is only one true way to indent code, and everyone else is wrong. But let’s do it anyway. I want to know if my readers are “doing it rite“.

Personally, I use the BSD/Allman style:

if(a==b)
{
	// do something
	foo();
}
else
{
	// do something else
	bar();
}

I find that it produces very clear and readable code. Braces align with each other - for every open brace, there should be a closed brace on the same indentation level. This makes it easy to identify blocks of code. I like the extra white space created by placing the opening brace on a line of it’s own.

The other extremely popular style seems to be K&R:

if(a==b) {
	// do something
	foo();
}
else { 
	// do something else
	bar();
}

I fucking, hate, hate, hate this one. I always feel compelled to fix it by putting a newline before each opening brace. It just looks ugly and does not contribute to code readability. In BSD/Allman, an empty line with the opening brace is very visible cue letting you know that a code block has been opened. It’s hard to miss it - even if you are just skimming through the code. And even if the indentation gets messed up in some deeply nested statements, you can easily figure out what belongs where by simply counting braces.

In K&R counting braces is not that easy - especially if the opening statement is really long and the left brace either drops off the right edge of the screen, or word-wraps around messing up your indentation.

So, which way do you indent your code? Or perhaps you use another style not mentioned here? I know there are bunch of other ones like BSD/KNF, Whitesmiths, GNU, Banner style and etc. But they are all pretty much variations on the two styles above. For example the GNU style is like BSD/Allman but you indent the opening and closing brace by 2 spaces. Go figure.

Anyway, feel free to disagree with whatever I said here, and defend your favorite indentation style. I also dare everyone who uses one of the oddball styles (ie, not BSD/Allman or K&R) to stand up and represent. mrgreen

Windows within windows

Tuesday, June 26th, 2007

I just found this amusing:

Nested Windows
click to enlarge

Let me explain what is going on in this picture:

  • Windows 2000 Profesional is running in VirtualBox under Kubuntu
  • Inside the emulated windows I opened a Remote Desktop connection to Windows XP Professional Workstation
  • Using the Windows XP workstation I connected to a Windows 2000 box via Crossloop
  • Finally I used the remote Windows 2000 workstation to connect to a Windows 2003 server via Remote Desktop

I was bored. Sue me.

PerfMonG 0.2.5 Released

Tuesday, June 26th, 2007

I finally got around to rolling up the installer for the newest version of PerfMonG. Release 0.2.5 has bunch of small updates that improve the usability. I added small tweaks to the interface that now allow you to control the threshold beyond which the CPU counter changes color to blue or red. Mike made sure that the tool works well in dual screen configuration, and doesn’t disappear off the screen when you disconnect one of the monitors.

I merged the release_0.2.5_candidate branch into the HEAD of the trunk, and also created a new tag for this version. At this moment all 3 locations contain the same code, but the trunk will probably start changing soon. If you want the bleeding edge code, check out from the trunk. Otherwise, get it from the 0.2.5 tag.

We still have a long way to go, and much tweaking and refactoring to do before I’m truly happy with it. But, it is slowly starting to shape up into a nifty little tool.

If you are to lazy to click any links, and you have no clue what I’m talking about, let me explain. PerfMonG is my small open source project - a minimalistic performance monitor for windows written in .NET. I mentioned it once before. If you are a windows user, or a .NET programmer definitely check it out, and contribute if you can. )

Linux Fuckup of the Day

Monday, June 25th, 2007

I had to set up a virtual machine on one of the Ubuntu boxes today. I heard good things about Virtualbox so I decided to give it a shot - especially since it’s distributed under GPL, it can can be installed directly via apt by simply adding this line to your sources-list:

deb http://www.virtualbox.org/debian dapper non-free

You will need to add their public key before you install.

During the installation Virtualbox creates a user group called vboxusers. To run emulation you need to be a member of that group. So what did I do? I decided to quickly add myself into the group:

sudo usermod -Gvboxusers username

I happily typed in the password, and hit enter, and decided to log out to make the change go into effect. Only halfway through the log in process I realized what I just did. I forgot -a.

Yep. I just removed myself from all the groups except for vboxusers. Brilliant! I absolutely hate when I do stupid shit like that. It’s not like this was hard to fix - I just didn’t remember of the top of my head what groups I was supposed to belong to. Of course since I was no longer part of the sudo and admin groups I could no longer sudo. Luckily enough, back in the day I decided to enable the root password. So I was able to su to become root, and then usermod myself to admin, and bunch of other groups I needed like audio, video, tty, lp and etc… I wonder what would happen if I did this on a default Ubuntu box without root account. I wonder if I would be able to recover from this that easily.