Archive for February, 2008

It is not theft!

Tuesday, February 19th, 2008

I said it before, and I’ll say it again: downloading copyrighted content without the author’s consent is not theft. It’s copyright infringement! That doesn’t make it any less illegal, but it is a very different crime. Please get it right, otherwise any discussion about copyright is impossible!

Claiming that infringing copyright can be equated to theft is essentially building up a straw man argument. After all, everyone knows that theft is a relatively serious crime and one that affects the victim in a very real way. Infringement is a very different crime, one which really requires a different set of laws and a different mindset. If you try to convince people that file sharing is harmful while basing your argument on the claim that it is “like theft”, you are simply doing it wrong. You are building a straw man and toppling it with impeccable logic, but none of that applies to copyright. Unless we change the law to treat intellectual property the same way we treat physical property, your argument is flawed.

It is really simple to explain, but most people on the other side of the fence (pro copyright) just don’t get it. When you download copyrighted content you are simply violating someone else’s exclusive distribution right. There are many real world examples people use to illustrate how infringement works, but I like this one:

Imagine there is a big game in the local ball park. A bunch of local kids decide to watch the game by climbing a tree growing on public property right outside the fence. They get caught. Should they be charged with theft? Do you think their crime is equal to, for example, that of a pickpocket who swiped game tickets from an unsuspecting sports fan?

They were able to watch the game because the tree was there to provide them with a good viewing platform. If it wasn’t there, would all of them buy tickets? Could you with any degree of confidence estimate how much money each of them would spend on tickets, hot dogs and refreshments?

If you haven’t figured out this analogy yet, let me break it down for you. The ball park is the official distribution channel for the multimedia content. The ball park owner is the RIAA, the MPAA, etc. The sports team playing inside are the artists. The kids are “the pirates”, and the tree symbolizes the internet. Downloading movies and music is like climbing that tree and watching the game for free. It’s not right, naturally. You ought to buy a ticket to watch the game - so it can be said that the owner didn’t earn any money on you. But it can’t be said he actually lost any money, because no one can prove whether or not you would actually have bought a ticket if the tree was not there.

You Wouldn't Steal a Car

Infringement has nothing to do with stealing. This is why the MPAA’s “You Wouldn’t Steal a Car” campaign is nothing more than muddying the waters. It’s propaganda - a calculated distraction. The entertainment industry does not want to discuss the real issue here because it is not in their best interest. What they want to do is to sow fear, uncertainty and doubt. And it’s not that they do not have an argument to begin with. The law is on their side. But we have to agree that infringement is a much lesser crime than theft.

Then again, you have to wonder what good is a law that is virtually unenforceable. When I meet new people the topic of file sharing invariably comes up sooner or later. I have to say that I have yet to meet a person below 40 who can honestly say they have never downloaded a song or a movie from the internet. It is simply unheard of. I’m counting both males and females of all races, creeds and education levels. In fact, many times I was honestly surprised to find out that a given person engages in this activity. No one respects the copyright law. Copying is as natural as breathing to us. This is what computers and the internet were designed for.

And there is no way to stop it. Technological means to curb copying (DRM) don’t work. And statistically, the chances of getting caught are close to zero. Only a very small percentage of people will ever get sued by the RIAA or the MPAA.

So what is the point of an unenforceable law that everyone breaks without even thinking about it? Protecting revenue streams of distributors might be a reason here, but there is something wrong with this picture. Let’s step back and think about it for a second.

I just said that I don’t know a single person below 40 who doesn’t do at least a little bit of file sharing. This means that a huge chunk of the population is involved here. And yet, I haven’t heard about a single movie or music studio closing its doors because of losses incurred due to “piracy”. Movies, music and software keep getting made, so the alleged losses cannot be that significant. The entertainment industry can talk all they want about “lost sales” and projections of lost revenue, but most of us know these are just wild guesses that cannot be substantiated with any concrete data. There is just no evidence to support them.

If this was theft, we would see real, very tangible evidence of loss. But we have no such evidence - just fuzzy, optimistic estimates. And this is the gist of the problem that the pro RIAA and pro MPAA advocates try to avoid discussing.

So please, next time you get into this discussion, get it right. Otherwise you are just regurgitating a flawed argument that has no basis in reality.

Client Side Table Sorting with jQuery

Monday, February 18th, 2008

For a while now, I have been searching for a good JavaScript framework to mess around with. I think I finally found one that I really like. jQuery is a 15 KB bundle of pure fucking magic. It is tiny compared to some of the other frameworks out there, and at the same time very powerful. What can it do? Tons of stuff. The first thing you notice is of course the eye candy: sliding and hiding divs, easily manipulating tons of data on the page, etc.

The best thing about jQuery, however, is the plugins created by its very active community. For example, there is a small script out there that lets you sort HTML tables on the client side. Normally when I dumped data out of the database onto the screen I would implement sorting by simply making another database query with a modified ORDER BY clause and then redrawing the whole table with the new dataset. But why should I use such a database-intensive method when jQuery makes it possible to quickly sort a large table using client side resources, without even reloading the page?

Go check out the demo on the tablesorter plugin website. It is really a great piece of work. Christian Bach deserves major props for putting this together. To tell you the truth, I never really considered sorting tables in JavaScript, mainly because of the amount of effort it would take on my part. Implementing the sorting algorithm and the transformations would just be so much more time consuming than simply writing a quick SQL query. It somehow never seemed worth the effort to me, until now.

The way I see it, this method of sorting has two major benefits over the old fashioned server side method. First, as I mentioned earlier, it takes some load off the database. Second, the sorting will seem much faster to the user because we are not reloading the page. It should even be faster than an asynchronous call, as these tend to have a visible delay. Tablesorter manipulates DOM objects which are already loaded in the client’s memory - there is no loading or waiting required.
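To make the idea concrete, here is a minimal sketch of what sorting on the client boils down to: the rows are already in memory, so reordering them needs no round trip to the database. This is not how Tablesorter itself is implemented. The rows array and the sortRows function are made-up names for illustration, and the sketch works on plain arrays of cell text rather than real DOM nodes.

```javascript
// Hypothetical table data: each inner array holds the cell text of one row.
const rows = [
  ["2", "b", "bar"],
  ["10", "c", "baz"],
  ["1", "a", "foo"],
];

// Return a new array of rows sorted by the given column index.
// If every cell in the column looks numeric, compare as numbers;
// otherwise compare as text. The input array is left untouched.
function sortRows(rows, column, descending = false) {
  const numeric = rows.every(
    (row) => row[column].trim() !== "" && !isNaN(Number(row[column]))
  );
  const compare = numeric
    ? (a, b) => Number(a[column]) - Number(b[column])
    : (a, b) => a[column].localeCompare(b[column]);
  const sorted = rows.slice().sort(compare);
  return descending ? sorted.reverse() : sorted;
}

// Sort by the first column, which is detected as numeric.
console.log(sortRows(rows, 0).map((row) => row[0]).join(",")); // 1,2,10
```

A real plugin would then put the table's row elements back into the tbody in the new order, but the comparison step is the same idea.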

It even comes with a neat pager functionality which could come in handy. Then again, I’m not absolutely convinced that paging should be done on client side for very large tables but it is a nice feature to have just in case you need it.

Just to illustrate how easy it is to apply it to an existing table, here is what you need to do. First download both jQuery and Tablesorter and dump them somewhere. Then simply add the following lines to the header:

<script type="text/javascript" src="/path/to/jquery.js"></script>
<script type="text/javascript" src="/path/to/tablesorter/jquery.tablesorter.js"></script>
<script type="text/javascript">
	// Once the DOM is ready, make the table with id "myTable" sortable.
	$(document).ready(function() { $("#myTable").tablesorter(); });
</script>
<link rel="stylesheet" href="/path/to/tablesorter/themes/blue/style.css" type="text/css" media="screen, print">

Then simply give your table the id “myTable” and the class “tablesorter”. The id is mandatory for the sorting logic to work, because it is what the $("#myTable") selector in the script above looks up. The “tablesorter” class is defined in the CSS stylesheet you link to and it will create nice headers and table layout for you:

<table id="myTable" class="tablesorter">
	<thead>
		<tr>
			<th>Column 1</th>
			<th>Column 2</th>
			<th>Column 3</th>
		</tr>
	</thead>
	<tbody>
		<tr>
			<td>1</td>
			<td>a</td>
			<td>foo</td>
		</tr>
		<tr>
			<td>2</td>
			<td>b</td>
			<td>bar</td>
		</tr>
 
	</tbody>
</table>

That’s it. You are done. Your table is now completely sortable by every column, and is even nicely formatted. It couldn’t be easier than this. And yes, you do have to use the thead and tbody tags.

There is one caveat though. When you download the current release, be aware that the jquery.tablesorter.pack.js file is broken for some reason. It will work just fine in Firefox and Opera, but in IE6 it will throw some sort of cryptic error. It might be some artifact introduced in the packing process or something like that. If you use jquery.tablesorter.min.js, which is only marginally larger in size, this issue goes away. Go figure. It’s still a great tool despite this little glitch.

I think I’m sold on jQuery. It is a tiny and yet extremely powerful toolkit, and I highly recommend checking it out.

E-Books and File Sharing

Saturday, February 16th, 2008

Have you noticed that while book scans are as abundant on file sharing networks as other media, people in the publishing industry are not really crying that much about piracy? Ok, I take that back - they do cry about it, but only when we bring up the subject of e-books or book scanning projects (like Google Books for example). Then they get as silly as movie and music distributors. I’m not sure why though.

At the moment the book publishing industry is in the comfy area where their product is “tastiest” in its papery, physical analog form. Frankly speaking, e-books suck. There are many reasons for this - and only some are technology related.

For example, what is the best e-book file format out there? It seems that everyone has their own proprietary one tied to a single platform with some custom DRM attached to it. Some people distribute things in PDF format but they usually don’t bother to format the text for screen reading. Most PDFs are designed for printing, and thus use letter sized pages which force you to scroll up and down. Personally I’m partial towards the Microsoft LIT format used by the Microsoft Reader. The format is designed for screen reading (no scrolling or zooming necessary), and has built-in bookmark, highlighting and note features. It re-pages your book based on screen size, so it still looks good on a PDA. And the files are relatively small. But the format is proprietary and only supported by Microsoft, which means you wouldn’t be able to use it on your iPhone or a non-Windows based reader.

The second problem is that no one really knows how to use e-books, which is more of a behavioral issue. When do people usually read? In most cases it’s not when they are sitting at their computer desk. They read in their bed, in a bathtub, on the toilet, on a train/bus, on the beach, in the park, on their lunch break in the cafeteria, etc. And it so happens that these places are actually less than perfect environments for a laptop or an electronic reader. Who is going to take their $400 Kindle or a $200 PDA into the bathtub with them? Or to the beach? Until we get a reader that is very robust, very portable and very cheap, paper books will be immensely more practical.

Besides, ask any bibliophile what he thinks about e-books and he will start waxing poetic about the sublime tactile experience of actually holding a book. People enjoy silly things like the feel and smell of the book - the way you crack open a brand new volume, or the way a very old book bears marks left by previous owners. Without knowing it, publishers have been capitalizing on one of Kevin Kelly’s 8 generatives: the embodiment factor.

Scanned books are abundant on file sharing networks, but despite that fact people keep buying paper versions because they are just better, more practical and nicer to handle. Personally I have several scanned books in PDF format sitting on my hard drive - most of them are old RPG rulebooks that are long out of print now. I hardly read any of them because they are just so impractical. I also have a bunch of novels downloaded from a bunch of places on the internet. I don’t think I have read any of them in the digital format though. I had a bunch of stuff by Cory Doctorow but I ended up buying most of it, mostly because I wanted to support the guy (patronage is, btw, another one of the 8 generatives). I had Card’s Ender’s Game and Starship Troopers by Heinlein in LIT format on my PDA back in the day, but at one point I just gave up and went and bought the books. Reading from my crappy Dell Axim was not a great experience.

A paper book (be it hardcover or paperback) simply provides the average customer with more value than an e-book. This is just how it is for now. This is precisely how you make money in the digital era - you provide your customer with tangible added value on top of the easily digitized data. E-books may be easy to copy, but they are not a real threat to the publishing industry, and they won’t be one for quite some time yet.

The Foreach Inconsistency

Friday, February 15th, 2008

It is interesting that most programming languages have almost identical syntax for the most commonly used set of flow-of-control statements. There are variations of course, but for the most part the if/else blocks, and the for, while and do loops are almost identical no matter what you use. It is actually quite surprising that the for loop is this consistent, as it has the most complex syntax of all of these statements. Yet you can jump from C to D to Java to C# to Perl to PHP and the syntax remains virtually the same. It is always a variation on the following theme:

for(declare iterator; evaluate condition; increment)

It might be because all these languages try to mimic each other’s common features for a reason. Or it might be because the for loop is by far the most popular loop out there. If you don’t believe me, look back on some code you wrote recently. I can guarantee you that the keyword for will show up much more frequently than a while or a do. For that matter, it seems to me that do loops are the least favored breed out there.
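As a quick illustration of that three-part theme, here is a sketch in Javascript (one of the C-family languages that shares the syntax). It shows the classic for loop next to the equivalent while loop, where the same three pieces end up scattered across three places. The array and variable names are made up for the example.

```javascript
const items = ["a", "b", "c"];

// Classic three-part for loop: declare iterator; evaluate condition; increment.
const viaFor = [];
for (let i = 0; i < items.length; i++) {
  viaFor.push(i + ":" + items[i]);
}

// The same loop written as a while: the declaration, the condition and the
// increment are all still there, just no longer on a single line.
const viaWhile = [];
let j = 0;
while (j < items.length) {
  viaWhile.push(j + ":" + items[j]);
  j++;
}

console.log(viaFor.join(" "));   // 0:a 1:b 2:c
console.log(viaWhile.join(" ")); // 0:a 1:b 2:c
```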

What is interesting however is that while all these languages have an identical syntax for the for loop, their take on the related foreach loop varies greatly. Everyone seems to be writing it differently. I mean look at this:

D:

foreach(element; array) {
  // do something to item
}

Java:

for (Object element: array)
	// do something

Javascript:

for (element in array)
	// do something (note: element holds each index or key here, so the value is array[element])

VB .NET:

For Each element In array
 ' do something
Next element

C#:

foreach (object element in array)
	// do something

PHP:

foreach ($array as $element)
	// do something

Perl:

foreach $element (@array) {
	# do something
}

Python:

for element in array:
	# do something

Ruby:

array.each do |element|
	# do something 
end

It is a mess. Some languages use the foreach keyword, while others simply add functionality to the standard for loop. Some, like Python and Ruby, completely do away with the standard counting loop and use the for … in … syntax to do all looping. The in keyword seems to be relatively popular but once again, not everyone likes it. There isn’t even any consistency as to whether the collection/set should be specified before or after the temporary element.
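The Javascript entry in the list above adds its own twist to the mess: for … in walks the property names of the array (its indices, as strings) rather than the elements themselves. The sketch below contrasts it with the forEach method of arrays, which passes the element to a callback. The variable names are made up for illustration.

```javascript
const array = ["foo", "bar", "baz"];

// for...in visits the enumerable property names, so each key is
// an index in string form ("0", "1", "2"), not an element.
const keys = [];
for (const key in array) {
  keys.push(key);
}

// forEach hands the callback the element itself.
const elements = [];
array.forEach(function (element) {
  elements.push(element);
});

console.log(keys.join(","));     // 0,1,2
console.log(elements.join(",")); // foo,bar,baz
```

So to get the usual foreach behavior out of for … in, you have to index back into the array with array[key].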

Since I tend to jump between languages a lot lately, I’m perpetually confused by the foreach loops. I end up looking it up every other day because I keep forgetting which one is which. I keep putting the element outside the parens in PHP, I use the as keyword in Perl, etc. I never really had to look up a while loop in a language before - they all work pretty much the same. But it almost seems as if the foreach loop was THE feature which you use to show the world how your language is clever and superior. It’s like a contest really. My foreach is better than yours!

There is nothing we can do now. We already have all these different ways of expressing this loop floating out there. But perhaps we could try to navigate towards some common ground here. Personally I think that the foreach(element in array) syntax is possibly the most readable one out there. I imagine that Java folks are shuddering at the sight of that superfluous in keyword in there, but for me it really helps. This is the way I would expect it to be done - put the element first, the array reference second and have it read almost like English. For each element in the array do this. How does the PHP version sound when you try to read it? For each array as element? What does that mean? Come on people, the foreach-in version wins on readability. But then again maybe it’s just me. Maybe I’m suffering from the “my foreach loop is better” syndrome too!

Teaching Web Application Design is not Easy

Thursday, February 14th, 2008

Here is a hypothetical situation: I introduce you to a total n00b - some dude who is completely green, and has no programming experience whatsoever. Let’s give this hypothetical construct some generic name - for example Guy. I bring this guy into your office, sit both of you down and say that you have 3 months to make Guy into a kick ass Java programmer. Or Python programmer. Or Ruby. Whatever floats your boat. I’m going to use Java for the sake of consistency but feel free to mentally substitute it with your favorite language of the month. Either way, you have 3 months to take him from absolute zero to a point where he can be given a moderately challenging assignment to work on. He will be working on the back end, writing complex algorithms doing some sort of scientific analysis, or will end up writing some crazy GUI stuff. Here is the language, here is the deadline, make it happen! This is all you will be doing, and you are free of all your other obligations until this teaching project is done. Could you do it?

Let’s assume that Guy is a fairly bright student who has a good grasp of technology in general, and is fairly good at abstracting ideas and thinking in abstract terms, which is kinda required here. Other than that, he is a blank slate - he never programmed before but he is eager and excited to start learning. I think this could be done - I think you could take someone like that, and train him effectively in a relatively short amount of time.

Our aim is to teach him Java but we will really be teaching him programming from scratch. He will become a programmer, and once we are done he should be able to pick up another language and learn it on his own. But we have to start with something to teach him concepts, so we pick one. Right from the get go we can build a consistent curriculum that will start with the simplest “Hello World” example and slowly build him up, teaching him the basic, then advanced concepts. It would be a fairly linear progression in a consistent environment. We teach him a new concept, then augment it, add another one, tie them together and so on.

The only time you actually have to shift gears is when you teach him about databases. You have to briefly take him out of the Java context to teach him SQL syntax, but at this point he should be able to grasp it quickly. You have to take him out of the OO paradigm to teach him the relational model. But that’s about it. As long as Guy doesn’t have to do any front end stuff for the web, it’s all a clear and consistent linear progression. It is by no means an easy or trivial task, but it is relatively straightforward. There are few variables to worry about, and you always know where you are and where you need to go. When you finish, Guy will have learned all the important concepts, and developed all the good programming habits he ought to have.

Now if I turn around and say, make Guy into a Web Designer capable of creating dynamic Web 2.0 applications, the story changes a little bit. It is not an outrageous request to begin with - at least not much worse than the Java programmer one. It is still doable, but I submit that it is a much more haphazard and confusing process. Web design is one of those areas that looks deceptively simple from the outside. I mean, how hard can it be? You write a simple back-end, slap together some HTML templates, dump it all on the server and you are ready to go. And it’s true - a lot of web apps are relatively simple to build. But taking someone from zero to a competent Web 2.0 designer dude is really quite complex because of all the variables involved. For example, where do you start?

You could start with teaching Guy HTML which is easy. Every semester I teach basic HTML to my CMPT-109 students and I can assure you that all of them are from the statistical 60% population sample that will never be good at programming. But they all can manage to write simple websites that usually do not validate but look fine on your screen. I joke around that I teach them the wrong way to use HTML, but it’s kinda true. I usually don’t touch CSS because of the lovely caveat of browser support.

Anyone can write down a bunch of HTML tags, but it takes patience and perseverance to learn all the quirky ways in which different browsers render certain things, and work around them. So the whole HTML experience is like a ride on an escalator that terminates in a vertical wall that must be scaled to progress any further.

You also need to teach Guy client side JavaScript, which means you need to dive into the imperative and functional programming paradigms with some OO thrown in for shits and giggles. You have to make a choice now - do you use JavaScript to teach him programming principles? Or do you just make him learn a few things by rote, and wait for the server-side language to really give him a solid programming lesson?

If you decide to stick with JavaScript for a bit, you will hit another snag. It’s not consistent! Even simple things like walking the DOM often vary from browser to browser. So you are not only dealing with programming concepts but also with the weird quirks of the client side language.

Then you have to make yet another huge leap and teach him server side stuff. We could do Java but we are on a deadline, right? And we still have a long way to go. So let’s pick something that is easier to pick up - a scripting language. PHP is a popular choice - mainly because it is deceptively easy to learn. Unfortunately writing good, maintainable code in PHP is actually a form of art in itself because similarly to Perl, PHP really makes it easy for you to develop really bad coding habits, and really ugly code. But you can probably take Guy from “Hello World” to dynamic HTML generation to DB access very rapidly so it’s good enough for our purposes.

In fact, it is probably better to start with the back-end language, teach him basic programming principles and then put him on a crash course of HTML, CSS, JavaScript and XML. It’s probably more digestible this way. Either way though, you end up with a lot of context switching.

You are teaching Guy 6 distinctly different technologies: HTML, CSS, JavaScript, XML, the server side language and SQL. They all have different rules, they all have their quirks and at least 3 of them are wildly inconsistent. The modern web application design field is an amalgamation of these very different technologies haphazardly thrown together. It’s a mess! A good web designer who can envision and implement a working Web 2.0 application must wear many hats, so to speak. He has to be a good back end programmer, a good client side programmer, he must have a knack for web design with CSS, and he has to have at least a cursory understanding of databases. But most of all, he must be able to effortlessly switch between these domains of knowledge. When he finds a bug, he needs to be able to locate it - is it in the back end code? Is it in the JavaScript code? Is it an HTML layout issue? Or perhaps it is a problem with the way you generate JavaScript code using PHP? This sort of multi-dimensional thinking is something very hard to teach.

In contrast, Guy the Java dude who doesn’t work with web apps usually wears a single hat. When he finds a bug, he fires up the debugger in his Eclipse and steps through his code until he figures out what is wrong. It is a much more straightforward process which requires almost no context switching at all.

So can you take Guy from zero to Web 2.0 hero in 3 months? You probably could, only it is going to be a messier, grittier and less elegant process. Unless of course you cheat and use a framework like Rails or one of its many clones.

This is why Rails is such a huge hit these days. It’s because it automates a lot of this petty bullshit that web devs deal with on a daily basis, and brings some consistency into the equation. It allows a single developer to whip out a decent, functional application using a consistent methodology, without switching context every 5 minutes. 90% of the code he will write will be in Ruby. And while Rails does a lot of stuff behind the scenes, the programmer hardly ever has to deal with it. Nevertheless, Rails-like frameworks are still just a crutch - they are code generators. They spit out relevant markup and SQL when and where it is needed. The whole environment is still the same haphazard mess, it is just partially hidden from the developer. It is a step forward though.

I was watching some interview with Steve Yegge (if you don’t know who he is, look him up) recently and something he said in there really struck a chord with me. He said that perhaps we just need to come up with a brand new language for the web. Something designed from the ground up with Web 2.0 in mind. Not a framework, but a language which would encompass and unify all these distinct domains under a single, consistent syntax and implementation. Something that would let us use that good old linear progression teaching model, instead of trying to force incompatible and often contradictory ideas into poor Guy’s head. You would define a display template, the model and controller logic, and client side processing and validation using the same uniform basic syntax. Then you would slap it on the server, and the relevant HTML, CSS and JavaScript would be generated dynamically. Naturally I’m paraphrasing it here, but wouldn’t something like that be nice?