Python: Open the Most Recent Log File

Lately I have been on a Python kick. You know, just in case you haven’t noticed it based on the new outcrop of Python-centric posts around these parts. In addition to the Google App Engine related stuff I’ve been doing, the big P sort of became my new go-to language for random-ass scripting needs. Perl used to be that language for me, but… Well, let me tell you a story.

Recently I needed to update an ancient Perl script that I wrote a few years ago. At the time I was actually taking a Bioinformatics course, where we used Perl to process genetic sequences. Needless to say, the language and the regexp syntax were fresh in my mind. Fast forward a few years of me working almost exclusively in Java and PHP, and most of that 1337 knowledge got filed away in some distant corner of my brain. Then it was covered up by intellectual mold and cobwebs, and someone hung a sign saying “Beware of the Leopard” on the door. Imagine me happily opening a file and going, “Hey, I just need to change a couple of… Oh… Wait… What? What… I don’t even… What the fuck in hell was I thinking?”

It turns out that former me was a dick, and he was fond of playing Perl Golf. You know, as in “I bet I can do this in three fucking lines or less”. Yeah, I hate that guy. I eventually figured it out, but I decided that I should tell myself to:

  1. Comment more
  2. Fight the urge to write incomprehensible spaghetti code
  3. Oh, and try not to hard-code so many things

So I figured that I might as well take some of my old Perl scripts and rewrite them in Python where needed. For example, I had a script that would periodically check a certain log file, searching for error codes that I wanted to be notified about. Recently, however, the app that generates said log files has changed its behavior. Now, when a log file reaches a certain size, it starts a new one, giving it a name like “log_file_12345” where 12345 is some semi-random garbage probably based on the timestamp or whatnot. My old Perl script of course had the log file name hard-coded.

I replaced it with a Python script that did the same thing, and added the following logic to find the most recently modified file in the directory:

import os

def find_most_recent(directory, partial_file_name):

    # list all the files in the directory
    files = os.listdir(directory)

    # keep only the file names that contain the partial_file_name string
    files = filter(lambda x: x.find(partial_file_name) > -1, files)

    # create a dict that maps each file name to its modification timestamp
    # (os.path.join adds the path separator if directory lacks a trailing one)
    name_n_timestamp = dict([(x, os.stat(os.path.join(directory, x)).st_mtime) for x in files])

    # return the name of the file with the latest timestamp
    # (max raises ValueError if no file name matched)
    return max(name_n_timestamp, key=lambda k: name_n_timestamp.get(k))

You guys know Python lambdas, right? They are basically inline functions. They take a list of arguments and a one-liner expression. Whatever that expression evaluates to is returned as the result. So as you can see above, I pass a lambda that checks whether a file name contains partial_file_name into the filter function.
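To make that concrete, here is a minimal sketch of the same filter-plus-lambda trick on a hard-coded list. The file names are made up for illustration, and the list() call only matters on Python 3, where filter hands back a lazy iterator instead of a list.

```python
# hypothetical file names, not taken from any real directory
files = ["log_file_12345", "notes.txt", "log_file_67890"]

# a lambda is an anonymous function: arguments before the colon,
# a single expression after it, and that expression's value is returned
matches_log = lambda name: name.find("log_file") > -1

# filter keeps only the items for which the function returns a true value
log_files = list(filter(matches_log, files))

print(log_files)  # ['log_file_12345', 'log_file_67890']
```

I only gave the lambda a name here so it is easier to poke at. Passing it inline, the way the function above does, works the same.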

The last line is interesting because it reveals a particular Python quirk. Iterating over a dictionary gives you its keys, so when you run the max function on a dictionary it will return the highest key without ever looking at the values. If you want to get the key of the highest value, however, you have to do the above and pass a key function that looks up the value for each key.
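A quick sketch of the difference, using invented file names and timestamps rather than real os.stat output:

```python
# made-up modification times (seconds since the epoch), for illustration only
name_n_timestamp = {
    "log_file_99999": 1000.0,  # highest name, oldest timestamp
    "log_file_12345": 3000.0,  # lowest name, newest timestamp
    "log_file_55555": 2000.0,
}

# iterating over a dict yields its keys, so plain max compares the names
highest_key = max(name_n_timestamp)

# with key=..., max still returns a key, but ranks the keys by their values
newest_file = max(name_n_timestamp, key=lambda k: name_n_timestamp.get(k))

print(highest_key)  # log_file_99999
print(newest_file)  # log_file_12345
```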

Have you ever been in this situation? Have you ever wanted to travel back in time and smack yourself upside the head for writing overly cryptic code? Or you know – stupid code? In a production environment? Yeah, I have monumentally stupid code in production right now. How about you? The script above was an easy fix because it was running just for my benefit and nothing else depended on it. Some of the production stuff… Not that easy to fix.

9 Responses to Python: Open the Most Recent Log File

  1. Sam Weston UNITED KINGDOM Google Chrome Linux says:

    Everyone’s coding style changes over time. If I look at code I wrote a year ago I cringe and facepalm :P I really need to get around to learning python; it looks like such a versatile language.

  2. Zel FRANCE Mozilla Firefox Windows Terminalist says:

    name_n_timestamp = dict([(x, os.stat(diexctory+x).st_mtime) for x in files])

    Shouldn’t it be directory and not diexctory ? I think this is the second or third time I notice an obvious typo in the code you post, is it intentional ? You know, so that it won’t work if you don’t actually read, understand and fix it ?

  3. road UNITED STATES Mozilla Firefox Mac OS says:

    I’m a biologist and I do a lot of bioinformatics. Perl is the only language I know intimately. Lately I’ve been engaged in a debate with a co-worker about whether there’s value to learning Python. It sounds like a swell language, but I already know Perl. Your points about Perl golf and spaghetti code are well taken, but you didn’t go on to describe if/why you thought Python was superior. Perhaps you don’t actually think it’s superior… but, and I don’t intend to start an argument here, this is an honest question: can you explain to someone who only knows Perl if/why they should be interested in Python? I mean, you could also just write *better* Perl code, no?

  4. Luke Maciak UNITED STATES Mozilla Firefox Windows Terminalist says:

    @ Zel:

    No, that’s just me cleaning up the code at the last second inside WordPress. I think originally the arguments were dir and pnam, files was fl and the dictionary was just fdi.

    I do that sometimes. I copy and paste code into wordpress and then next day I’m like – hey, I can clean this up before I post it. That’s where the typos come from.

    @ road:

    I don’t think Perl is better or worse. It’s different – and since I haven’t used it for a while, I got a bit rusty. My main problem was my own use of:

    a) long and cryptic regexes which I had to actually work out on paper because I did not remember what they were supposed to do

    b) Use of the default variables ($_ and @_) all over the place which made the code harder to read than if it had explicit and appropriately named arguments.

    Not really a problem with the language itself – just with me. You can write perfectly readable code in Perl. Python just lends itself to slightly cleaner and more explicit code. On the other hand it has that whitespace rule that a lot of people hate.

    I think I mainly wanted an excuse to mess around with Python some more.

  5. usecide POLAND Google Chrome Linux says:

    Isn’t it easier to use a one-line bash script?
    I think that:
    $ tail -f `ls -c *.log | head -n 1` -n 10
    should do the trick :)

  6. jambarama UNITED STATES Mozilla Firefox Ubuntu Linux Terminalist says:

    I think these posts are very interesting, but this comment is really just to see where your country locator puts my IP address. I’m in a plane over the Southeast US.

  7. jambarama UNITED STATES Mozilla Firefox Ubuntu Linux Terminalist says:

    Ah, smart enough to put me in the US. Geobytes IP locator put me in Hong Kong.

  8. Luke Maciak UNITED STATES Mozilla Firefox Linux Terminalist says:

    @ usecide:

    Yes. That would be easier. :)

    Although if you do use a Perl or Python script you can extract the data you need and do stuff with it. Like send an email. You can do this with shell scripts as well, but the more complex the “stuff” is, the more difficult it becomes to create and maintain that shell script.

    @ jambarama:

    LOL. Geobytes seems to be a tiny little bit off the target. :P Just a bit!

  9. Ross UNITED STATES Google Chrome Mac OS says:

    This is great, but where you make the dict, instead of doing simple string concatenation (with the plus), you should:
    os.stat(os.path.join(directory, x)).st_mtime

