Archive for April, 2009

Small Programming Projects

Thursday, April 30th, 2009

Let me know if this has ever happened to you. Your boss walks up to you and tells you he has a tiny little, itsy-bitsy project for you. He wants you to build a small online application. Nay, a small online form – just a single form with some database back end. Nothing more. A single page that would allow the employees to submit their TPS reports (or whatever) online. It’s only going to run on the intranet, it will never face the web and there are only 4 people who will really be using it, so there is no point worrying about authentication, user management and all that stuff – just do the bare-bones minimum necessary to tell the users apart and keep them from overwriting each other’s data.

You implement it, and everyone loves it. The boss pats himself on the back for clever use of technology and tells you to add a tiny little feature to make the thing accept quarterly evaluations as well, and extend the user base to like 20 people. That’s about it. And you shouldn’t really worry about expanding it. It’s not going to get any bigger than this. It’s just a tiny change. No need to add anything else. Don’t spend any time improving the design. It’s not necessary, and you will be wasting time.

Next week you get another request. Then another. And another. Each time it is a tiny little change – and you are explicitly told the application is not going to grow, and it does not need a redesign. Six months down the road your application is the main online hub of the company. It’s facing the web, it’s tracking just about every little bit of data your boss could think of, it stores a few gigs of data and every employee, client, applicant and visitor must interact with it at one point or another. It is a monster, and the code is a labyrinthine maze of hacks, patches, workarounds and hastily added modifications aimed at adding stability and security to something that was not designed for it in the first place. And better yet – no one, including you, can believe how huge it got and how quickly it happened. No one saw it coming. No one could have predicted it would actually be taken this far.

I’ve seen this happen multiple times, and heard about similar scenarios from others. Very small, simple projects have a tendency to blow up and grow exponentially into huge enterprise systems. You never know which project is going to bloat this way. In fact, you probably won’t know that one of your projects is on this destructive path until it’s too late. It happens in small incremental steps, spread over a long stretch of time. But it happens.

And since we know it happens, we can be prepared for it. The easiest way to keep exponential growth of this type from becoming an issue is to always code as if you were designing something 5 times as big. Always modularize your code, build your applications as a classic 3-tier system and always use MVC or a similar paradigm. If your app bloats into some sort of a monster, you will have the infrastructure in place to support it and build it up. At least for the most part.

Then again, this approach flies in the face of the KISS principle. Sometimes coding this way is indeed overkill. Sometimes a single form will just be a single form, and building a 3-tier architecture and creating/deploying some sort of a framework to support it may be a huge waste of time and resources. Sometimes a quick, dirty and direct approach can be vastly superior to the roundabout enterprise way.

So I guess the trick here is to do something in between these two extremes:

  1. Always assume your code is going to be facing the real internet, even if the initial spec says it won’t. Build with security in mind.
  2. Always assume you will need multiple access levels for your users and flexible access controls.
  3. Never assume your data won’t bloat out of proportion. Never use SQLite or SQL Server Express when you could be using something more scalable.
  4. Design your database for extensibility – normalize your design and be aware that you might be adding more tables and more complex relationships into the mix in the future. This shouldn’t be much effort since your tiny project will probably need only a handful of tables.
  5. Try to do some data modeling and object-relational mapping, as this will help with scaling the code later. Since you are starting small, this should be easy to do and it will keep your code clean and organized.
  6. Design or steal a robust login / user authentication module. One day your application may become the main login to the intranet or a myriad of tacked-on services. Don’t half-ass it. Also think about how you can handle authentication from robots – since you may need to set up complex web services one day.
  7. Use an off the shelf module or solution if possible. Why? Chances are the author already spent a considerable amount of effort to ensure extensibility and scalability of their code. So when that inevitable feature request comes in, you have that much less work to do. And even if it is not very extensible, you may still be ahead. It is much easier to justify the need to do some re-writing or re-designing when you can blame it on inferior 3rd party code.
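To make tip 2 a bit more concrete, here is a minimal Python sketch of what "multiple access levels" could look like from day one, even when the spec swears there will only ever be 4 users. The role names, the action names and the `can()` helper are all hypothetical and made up for illustration; treat it as one possible shape, not a recommended implementation.

```python
from enum import IntEnum


class Role(IntEnum):
    """Hypothetical access levels, ordered from least to most privileged."""
    VIEWER = 1
    EMPLOYEE = 2
    MANAGER = 3
    ADMIN = 4


# Hypothetical action names mapped to the minimum role allowed to perform them.
REQUIRED_ROLE = {
    "view_report": Role.VIEWER,
    "submit_report": Role.EMPLOYEE,
    "edit_any_report": Role.MANAGER,
    "manage_users": Role.ADMIN,
}


def can(role: Role, action: str) -> bool:
    """Return True if `role` may perform `action`; unknown actions are denied."""
    required = REQUIRED_ROLE.get(action)
    return required is not None and role >= required
```

The point of the table is that when the boss asks for "just one more tiny feature", you add a row instead of sprinkling new `if user == "bob"` checks all over the code. Denying unknown actions by default also fits tip 1: assume the thing will face the real internet someday.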

Feel free to add your own tips to this list. Did this sort of thing ever happen to you? I speak from experience here – I still maintain a system that started like that. I was a young, naive and green undergrad and I got a teensy little web project. And now here I am, many, many moons later – still maintaining the damn thing. It grew into a behemoth – an overwhelming mountain of code, some of which is so crappy that I am the only brave soul that dares to touch it.

I also inherited a system like that once. I rewrote most of the UI (that was actually my assignment) and left most of the back end intact – trying to avoid the hairy code on that side of the application. I then happily passed it on to the next brave soul.

How about you?

Youtube Rot

Tuesday, April 28th, 2009

Anyone who has been blogging for at least a few years now can attest to this: Youtube videos sometimes go away after they have been up for a while. Most of the video hosting services do not have a policy of deleting old and inactive content, but it still happens. There are many reasons for this. Sometimes the author deletes the video himself. Sometimes he or she deletes their whole account, nuking all the content they have ever submitted in one swift click of a button. Other times Youtube forcibly deletes their account for a TOS violation. Finally, many videos are lost to broken and often abused anti-piracy compliance procedures (delete video first, ask questions later… or never). Regardless of the reasons, the slow fading away of embedded videos is a fact of life.

If you don’t believe me, you can easily verify this for yourself. Pick a blog that you like and dig into its archives from 3-4 years ago. Look for video posts. I can almost guarantee that you will find at least one (but likely more) broken embed that looks like this:

[Image: nbcnolongeravailable.gif]

Chris Wellons (btw, go read his blog – good stuff) was recently browsing through my own archives and found quite a few dead videos there. These vids just vanished. They were swallowed by the cyberspace void, never to be seen again. They became ex-videos. They have ceased to be!

What’s worse, I forgot which videos they were to begin with. Some of my old posts would tangentially discuss a video that I embedded but at no point would even hint at what the content was – and I did not remember. So there was no way for me to even go back and re-embed it or post some explanatory blurb. These posts are now mysteries.

Of course, the web being what it is, all online resources are prone to sudden disappearances like that. Broken links to articles and broken images are also a frequent sight on pages that have not been updated in a while. But I find that Youtube videos are much more volatile than anything else.

This is probably due to the fact that most of the other content people tend to link to is self-hosted rather than tied to some free service that needs to react to 10 million wanton copyright infringement complaints each day. For example, when I put an image or an article here on Terminally Incoherent it will stay up as long as I keep paying my hosting fee. It may go down every once in a while because my host can’t keep their shit together, but it won’t disappear overnight one day. Even if Dreamhost decides to delete my file due to a complaint (or for fun) I can effortlessly restore it under the same URL if I need to.

Furthermore, we already have established ways to prevent this breakage from happening – or to work around it. Hot-linking to image files is, for example, considered rude and inconsiderate (if not dangerous – what if the original author of the kitten.jpg you hot-linked replaces it with goatsey or tubgirl out of malice?). It is considered a common courtesy to copy an image, host it yourself and then link back to the original source. This way the author gets the credit, but does not have to suffer the costs of serving the image to your readers.

Similarly, you can avoid making your post incomprehensible after an article you linked to goes away, by simply quoting the source. Textual quotes give the readers the much needed context without forcing them to leave your page to read some long article they may not be that interested in to begin with.

Video hosting these days is almost exclusively done through Youtube-like services. You can still self-host videos just like images but it is just not as easy. To host a picture, all you need to do is upload it. To host a video, you will usually need to convert it into an FLV, upload it, set up a Flash-based FLV player somewhere and then combine the two. It takes some work. Or you can just post the video in its original form for download, but it will be huge, inconvenient to handle and it will kill your bandwidth. Posting your stuff to Youtube or a similar place is really the easiest and quickest way you can get your stuff out there.

Once you host the video on their servers, however, you are at their mercy. If your vid is taken down, you can’t easily restore it. You can beg them to bring it back, and dispute the takedown, but that does not always work. If you re-edit the video (for example in response to a complaint, or just to fix it up) you can’t re-upload it with the same URL. Your new upload will get its own unique address, and all the existing links will be pointing to the old one – or to a 404 error page if the old vid was taken down for some reason.

Locally caching videos the way we commonly do with images is impractical. Youtube currently does not provide us with an easy way to download the videos it is hosting. While there are many sites that specialize in extracting FLV files out of Youtube pages, they are all unofficial. Some video services frown upon that practice, try to interfere with it and explicitly ask you in the TOS not to grab their source files. At the same time, most of them go out of their way to make embedding their content super easy for the end user. They provide you with an embed code (often in several formats such as HTML, BBCode, etc.) and syndication buttons for popular services.

Because of this, embedding is the accepted method of sharing videos these days. So we have a broken system in place, used by millions of people. We know that this system often results in broken links and that there are no easy-to-apply, viable alternatives. So what do we do about it?

There exist services that specialize in allowing you to watch “deleted” Youtube videos, or which cache videos on demand or automatically, but they don’t help us. For one, they are not reliable – some videos are lost to them as well. And in either case, we trade one service for another which may be even less reliable in the future. But even if a video is available via such a service, it does not change the fact that the original link or embed is broken. That’s really what we are trying to prevent here.

There is really no way for us to stop or avoid Youtube rot – it just happens randomly. You can try to be cautious about what you link to, but you just never know which video will stay up forever, and which will be gone after 2 weeks. The best thing we can do is to try to work around it. Now that we know Youtube rot exists, we can take steps to make our posts and articles less vulnerable to its effects.

Here is my proposal: each time you embed a video, include the following in the body of your post somewhere:

  1. The exact title of the video as it appears on Youtube so that your readers can easily google it once it is gone
  2. The name or nickname of the original creator, to help narrow down the search
  3. A brief description of the video contents to provide context for your readers if the video can’t be found elsewhere

When I say brief description, I mean brief. You don’t need to do a House of Leaves type transcript of every scene and every dialog. Just give your readers some idea of what happens in the movie. You can even phrase it in the form of a comment as in “I never knew that dropping Mentos into Coke and watching it explode could be this much fun”.
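If you want to turn the three-item checklist above into a habit, a tiny helper can do the formatting for you. Here is a rough Python sketch, assuming you keep the three pieces of information in a small record and paste the output under the embed code. The `VideoNote` class, the `format_video_note()` function and the output layout are all hypothetical, invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class VideoNote:
    """The three things worth recording next to every embedded video."""
    title: str        # exact title as shown on the video page
    creator: str      # name or nickname of the uploader
    description: str  # one or two sentences of context


def format_video_note(note: VideoNote) -> str:
    """Build a two-line plain-text caption to paste below the embed."""
    return (
        f'Video: "{note.title}" by {note.creator}\n'
        f"What it shows: {note.description}"
    )


if __name__ == "__main__":
    # Made-up example values, not a real video.
    example = VideoNote(
        title="Example Title",
        creator="example_user",
        description="Someone drops Mentos into Coke and it explodes.",
    )
    print(format_video_note(example))
```

Plain text is a deliberate choice here: it survives theme changes, feed readers and the eventual death of the embed itself.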

Most people don’t do this. They just post the embed code and then comment on it, but they do not take the time to properly describe, tag or attribute it. I know, because I’m as guilty of this as anyone else. Think about it though. Next time you include a video in your post, take 10 seconds to copy and paste the title and the author below it. Take a minute to comment on its contents. This will pay off in the long run. In a few years, someone will stumble upon that old post of yours and will still be able to make sense of it. Better yet, thanks to the full title and author they may even be able to locate the vid you mentioned on some other service.

Blog fixed via Upgrade

Monday, April 27th, 2009

Just wanted to let everyone know why the blog looks like crap right now. Or rather, why you see the ultra spartan Kubrick theme instead of the bastardized, orange version you are used to.

Last night my blog broke for some unspeakable reason. I was unable to post or edit posts – each time I would hit the “Save” or “Publish” button I would be redirected to my homepage without any changes being saved. Just like that.

I spent the whole Sunday ripping my hair out, cursing, banging my head against the wall and tweaking every possible setting on the blog I could think of. I disabled and enabled cache and anti-spam plugins. I upgraded the PHP version on the server. I tried disabling mod_security as per this thread which was describing an eerily similar issue. I edited and re-edited my .htaccess files about a million times.

The only thing I didn’t actually do was to update Wordpress… Cause you know – that’s just crazy talk. As my last resort I contacted Dreamhost support, but since it was Sunday they only got back to me today. Do you know what they said?

“We’d love to help ya but you must upgrade your wordpress first cause your version is way old.”

I’m paraphrasing here – the support guy was actually much more courteous and used expressions like “I’m very sorry to hear about your problem” and “I will be happy to help you” but that line above was pretty much the gist of the email.

At that point I got pissed off and did something reckless. I logged into my CPanel and clicked on the big button that said “Upgrade Your Wordpress, Stupid!”. That thing has been taunting me for the longest time but I never had the guts or inclination to click it. I’m one of those… What do they call them… What is the polar opposite of an early adopter? And no, it’s not late adopter. I’m like the anti-adopter.

I run obsolete software and hardware until it breaks. I’m a firm believer in the old adage “if it ain’t broke, don’t fucking touch it less you want to spend the next 6 hours fixing it”. But since my blog just broke I felt compelled to press that button. That was reckless and stupid, mind you, cause I happened to be running Wordpress 2.0.5.

Yeah, shut up – I’m that bad. I’m gonna turn in my geek badge at the end of the day. The thing is, when you upgrade from that far back, things are bound to get broken. I know this from experience – I’ve seen many broken upgrades in my day. That’s probably why I’m deathly scared of them. But I was desperate, pissed off and I felt reckless. So I pressed the button.

Lo and behold – the blog fixed itself, and without any major fuckups. At least none that I can see right now. I mean, you can read it and I can post again so that means we are back in business. The custom theme went to hell during the upgrade, but I have it backed up so I can easily restore it later today or tomorrow. The important thing is that the damn thing did not break into a million pieces like I expected – which is great.

So that’s why the blog looks like shit today. But look at the bright side – until I fix the theme you get an ad-free Terminally Incoherent experience for the day. Or at least those of you who are not running adblock do.

Anyways, I’m happy and relieved and I can go back to writing ridiculously long, and badly spelled posts for your enjoyment. Oh and Wordpress 2.7 is kinda spiffy. My only problem is that I can’t find anything now. But I’ll get used to it. :)