Archive for May, 2008

Why Software DRM Doesn’t Work

Friday, May 9th, 2008

I talk about DRM a lot, but I never really found a perfect way to explain it to people who are clueless about technology. When I teach a class on digital media, and go over DRM I usually briefly cover different methodologies, and then put up dates when they were cracked at the end of each slide. I found that it really drives the point home when I go through this buildup stage, explaining these extremely complex systems and conclude each series of slides with “Oh, and this scheme was actually cracked 3 days after release”.

Then I usually reuse a few slides from my cryptology lecture and give them the Bob, Alice and Eve example, emphasizing how in the DRM world Alice and Eve are the same person. This is a great example that really speaks to people who understand and care about cryptography, but Fluency in Technology students usually find it a bit confusing. I’d love to have something that explains the absurdity of DRM in a way that is clearer, funnier and doesn’t involve the 3 most hated people in my classroom (sorry Bob, Alice and Eve).

I think I might have found an allegory that might just work on that level. This is how Shamus from Twenty Sided described DRM in his recent post:

In the original Monkey Island, at one point you are captured by natives who lock you in a simple bamboo hut. There is a trap door in the floor through which you may escape. If you’re dumb you can walk over to the natives once you’re out, and they will grab you and throw you back into the hut. The second time they throw you in, they add chains to the door. The next time the door is made of metal. This keeps going until eventually (if you keep going back) they have a bamboo shack with a massive steel vault door on the front, a timed lock with an alarm system on it. It looks like the front of Fort Knox.

“How he keeps getting out is almost as mysterious as why he keeps coming back.“

In a lot of ways these DRM schemes are a bamboo hut with a vault door on the front. They keep using a bigger and bigger lock and a more complex system of authentication, but it still has to run on a machine where you can edit the executable, and all the hacker has to do is go in and disable the part that says, “Do the security check.” It doesn’t matter how secure or complex or devious the security check is, if the machine’s not doing it, it’s not doing it.

I played that game, and I remember that part but I never connected the two! But it is a perfect fit! It’s vivid, funny and really gets the point across. I really don’t expect my students to actually know what The Secret of Monkey Island was. Most of them are just too young to have played it when it was still on the market, and too clueless to download it from one of the abandonware sites and play it via ScummVM. Well, maybe one or two would actually know about it. Still, the story is silly enough to work so I’m totally stealing it.

Here is a youtube video of that scene for your reference:


escape from the hut © pheedbaq

Shamus is right of course - it is very difficult to design copy protection software in a way which will be difficult to crack by a 15 year old kid armed with a debugger and a hex editor. Anything that is running on the client machine can and will be tampered with. The only way to make the game uncrackable is to have the copy protection run on a remote server and have the client simply forward over user authentication. Still, that doesn’t prevent people from sharing accounts and hacking the client to do weird things. Not to mention the costs of running an operation like that. Most video games that are not MMOs are really client based applications and as such will always be vulnerable.
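To make the "disable the security check" point concrete, here is a toy sketch in Python. Everything in it (the serial number, the function names, the messages) is made up for illustration, and a real crack would patch machine code with a debugger rather than swap a Python function. The principle is the same though: the check lives on the client, so the client can replace it.

```python
import hashlib

# Hypothetical serial number; only its hash ships with the "game".
EXPECTED = hashlib.sha256(b"ABCD-1234").hexdigest()


def security_check(serial: str) -> bool:
    """The vault door: however elaborate it gets, it runs on the user's machine."""
    return hashlib.sha256(serial.encode()).hexdigest() == EXPECTED


def start_game(serial: str) -> str:
    """The bamboo hut: the game only asks whether the check passed."""
    if not security_check(serial):
        return "Please insert the original disc."
    return "Game started."


def crack() -> None:
    """The 'crack': replace the check with one that always says yes."""
    globals()["security_check"] = lambda serial: True
```

Before calling crack(), start_game("WRONG") refuses to run. After calling it, any serial is accepted, and it makes no difference how strong the hash inside the original check was.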

So the second best thing you can do is to mislead and confuse the potential attacker and make his job difficult. Adamantyr posted a good example of this practice in the linked thread over at Twenty Sided:

Concerning executable cracking, Chris Crawford has a VERY good write-up of how he protected one of his games in his book “Chris Crawford On Game Design”.

In particular, he uses obfuscation techniques such as:

  • Burying work code inside of recursive loops, so reading the active process stream has a ton of noise the hacker has to wade through to find the ONE interval that does something.
  • Code over-writing, in other words, the program overwrites parts of itself while running in memory. This is actually really bad from a security standpoint nowadays, but it’s fiendishly clever and sadistic for the poor hacker whose world view has just been demolished by code that changes when he’s NOT LOOKING.
  • Dummy variables with obvious names that draw the hacker away from the actual important ones.
  • Storing actual data in the stack garbage and fetching it in a clandestine way, like an “accidental” buffer over-run.
  • Deliberately breaking the game so the legitimate version would “fix” one element of data. Otherwise the game can’t be finished.

He actually hired a professional hacker to try and break his program after he’d finished it, and the guy never got past the first level of defense he set up. He later found cracked versions online, but none of them were actually completable as his “flawed” data element wasn’t fixed.

I haven’t read Chris Crawford’s book but the techniques mentioned above would indeed make the life of your average teenage cracker very difficult. However, they would make the life of your average game developer a nightmare as well. Some of these things are really bad practices. Storing data in garbage, controlled buffer overflows, cryptic spaghetti code - this stuff is just bad software development plain and simple. If you are a single programmer on the project, you can probably get away with scattering stuff like that all over your code. When you are working as part of a team, this is the kind of stuff that will get you beaten up by an angry mob of coworkers who have to debug your cryptic code.

These methods do not really seem to fit well into the modern software development model. The only way to make this sort of copy protection work is to have it tightly woven into the very fabric of your software. The copy protection checks should be tightly coupled with real processing code, overlapping and hiding behind real data in as many places as possible. But who the hell is going to test and maintain that kind of stuff? No one really does copy protection this way anymore.
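Here is a rough idea of what "tightly coupled with real data" could look like, loosely inspired by the "deliberately broken game" trick from the list above. This is my own toy sketch and not Crawford's actual scheme. The serial, the level names and every function in it are hypothetical. The point is that there is no separate pass/fail branch to disable, because the serial itself is what repairs the flawed data.

```python
import hashlib


def _mask(serial: str, n: int) -> bytes:
    """Derive n masking bytes from a serial number."""
    return hashlib.sha256(serial.encode()).digest()[:n]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


# Assumption: computed once at mastering time with the real (hypothetical)
# serial, so that only the masked blob ships with the game.
MASKED_EXIT = _xor(b"credits", _mask("ABCD-1234", 7))


def load_levels(serial: str) -> dict:
    """Load the level graph; the last exit is flawed on purpose."""
    exits = {"level1": "level2", "level2": "level3", "level3": "nowhere"}
    # No explicit check here: a wrong serial simply "repairs" the exit
    # with garbage, so the game stays broken.
    exits["level3"] = _xor(MASKED_EXIT, _mask(serial, 7)).decode("latin-1")
    return exits


def can_finish(exits: dict) -> bool:
    """Walk the level graph from level1 and see whether we reach the credits."""
    room, seen = "level1", set()
    while room in exits and room not in seen:
        seen.add(room)
        room = exits[room]
    return room == "credits"
```

With the right serial, can_finish(load_levels("ABCD-1234")) is true. With any other serial the last level leads nowhere useful. Notice how even this tiny example is already unpleasant to read and debug, which is exactly the maintenance problem described above.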

These days most companies think of DRM as a security layer or a module you can buy or license and then slap onto a wide range of your products. They view it as installing a lock on the bamboo hut, because that makes sense and is economical. Once you build an awesome lock, you can use it on any hut you want. Sadly, a hut is still a hut. It is made out of bamboo which can be defeated with a hacksaw, and you can always tunnel under it since it has no floor. What Chris Crawford seems to be proposing is building a Cube-like environment instead of a hut. But that is a hard job which requires not only dedication but also experience. The problem is that most game developers are not really experts in obfuscating their code and building copy protection mechanisms into it. In fact they are usually the exact opposite of that. They are trained to write clean and understandable code that is easy to test, easy to debug and conforms to best practices. Game development studios just want to make games.

Who insists on DRM then? The publishers of course. They are the major driving force behind the copy protection industry because “piracy” cuts into their profits the most. And they are not experts on writing obfuscated software either. What they want is something simple like this:

  1. Get a nice black box containing precompiled binaries for the game from the developer studio
  2. Purchase another box with a complete, end-to-end DRM solution
  3. Pay some low skilled employee to wrap the game proper in the DRM container, creating a master package to be burned onto CDs or DVDs
  4. ???
  5. Profit

Neither game developers nor publishers are really interested in building these protection systems. They are interested in buying them, and thus a whole new industry grew as a response to this. Companies started specializing and building DRM systems as separate products. DRM is now a piece of software that is generic and modular, designed to fit with as many different products as possible to maximize profits. It can’t blend seamlessly into the game it is protecting or hide behind live data. By necessity, the number of places where the game code intersects with DRM code is limited. The more you try to integrate the two, the more custom code and modifications you need. And of course, DRM makers charge a premium rate for this kind of stuff. So the direction which the game industry seems to be taking is building really complex and impressive locks to use on their bamboo huts because it is really the only logical and economical way to do this. The other route is just plain nutty - exorbitantly expensive, and potentially creating huge maintenance problems in exchange for what? They can’t guarantee you success - no one can. If something runs on the client machine, it can and will be tampered with - any part of it can be overwritten or modified.

But as you can see the whole system is deeply flawed. Sometimes I wonder how executives who make the decisions to use DRM systems such as SecuROM or StarForce react when they find out that a cracked version of their product hit the torrent sites 3 hours after the release. How do they justify the expenses they incurred to license the protection technology? Perhaps they don’t. Perhaps no one tells them these things. Perhaps they live out their lives oblivious to the truth, thinking that the millions of dollars spent on licensing some DRM product actually made their software invulnerable. More likely though they hide behind company policy so they can then justify low sales to their stockholders, telling them stories about how evil pirates are still robbing them blind despite the strong counter-measures they took.

Anyway, if you don’t mind Shamus, I’m gonna use your Monkey Island allegory next semester when I’m teaching my class about digital media and DRM.

A Few Words on LaTeX Fonts

Thursday, May 8th, 2008

LaTeX documents have a very distinctive look. First of all, they do look very pretty in print since they actually use professional typesetting techniques to make the text flow nicely on the page, which makes them stand out when juxtaposed against stuff generated by Microsoft Word. But there is another reason why it is easy to spot a LaTeX document from a mile away. It’s the default font known as Computer Modern which was created by Yoda Donald Knuth himself. Explanation for the joke can be found here. It is very distinctive, and gives the documents that TeX look and feel:

Computer Modern

A lot of people I know affectionately describe this default look as “fucking ugly”. I do not share that sentiment. Personally I think the font has its own unique charm, but nevertheless it is not suited for everything. Sometimes you are required to use the ubiquitous Times New Roman font. How do you do that in LaTeX? There is nothing simpler - just stick the following two lines in your preamble:

\usepackage[T1]{fontenc}
\usepackage{times}

The result is subtle, but the difference is clearly visible. Which font you prefer is really a matter of taste:

Times New Roman

Sometimes you want your document to stand out from the sea of papers written in Times. If you want that older book look, you could try to use something like the Bookman font:

\usepackage[T1]{fontenc}
\usepackage{bookman}

See how it compares to Computer Modern and Times above:

Bookman

Naturally, when I start talking about fonts, someone immediately starts asking me how to use Arial. You don’t. Why would I want to infect my pretty LaTeX documents with Arial? Nevertheless they keep asking. Arial is the Times New Roman of Sans Serif fonts, despite the fact that it is merely a cheap knock-off of Helvetica. But yes, you can do it just as easily as any of the examples above:

\usepackage[T1]{fontenc}
\usepackage[scaled]{uarial}
\renewcommand*\familydefault{\sfdefault}

Here is the catch - by default LaTeX renders your text in Serif fonts. You will need that last line to switch your whole document over to Sans Serif mode. Here is a sample:

Arial

Don’t use Arial though. Use Helvetica, which is the de-facto king of the Sans Serif world and a far superior font. Syntax is the same as above:

\usepackage[T1]{fontenc}
\usepackage[scaled]{helvet}
\renewcommand*\familydefault{\sfdefault}

Quick note about the scaled parameter - you can use it to resize your font. For example, using scaled=0.92 will render the font at 92% of its nominal size, which in a 10pt document comes out to roughly 9pt.

Please compare the sample below with Arial, and decide for yourself:

Helvetica

The dirty little secret here is that when you do this, you might not be getting the true Helvetica but rather its very close cousin Nimbus Sans. That might depend on your system and LaTeX setup though. Nevertheless, Nimbus is still a far better typeface than Arial.

If you want to experiment with fonts, or you are simply searching for that unique look of your own, check out the LaTeX Font Catalogue. It lists all the most popular supported packages, along with usage examples and sample images.

Update 05/08/2008 07:59:11 PM

By popular demand, I switched up the images a little bit to make them clearer and so that they don’t require you to click on them.

On Linux Hardware Compatibility

Wednesday, May 7th, 2008

I love how anti-Linux advocates and Windows fanbois always pick on Linux for hardware compatibility, or rather lack thereof. Just about every rant about Linux I have seen so far includes a gripe about it not supporting new or exotic hardware out of the box. Funny thing is that neither does Windows.

Here is an experiment, and I encourage everyone to conduct it at their leisure. First, grab your Windows XP CD (preferably the one with SP2 slipstreamed in) and do a clean install on a formatted drive. Once it is done, pull up the device manager and count the yellow question marks (these are the devices that failed to initialize because they are not supported out of the box). Try to figure out what they are (good luck on that), and write them down on a piece of paper. Once you do that, grab your favorite Linux distro (I recommend Ubuntu) and repeat the exercise. Once you have your Linux installed, run lshw or an equivalent command and see how many of the devices from your “yellow question mark” list were detected and configured during the installation. I suspect that you will be able to cross off at least a few of them from your list. Your results may vary.

I did this experiment several times on fairly standard, and widely deployed (at least in my company) Inspiron 600m hardware. Both Dapper Drake (6.06) and Gutsy Gibbon (7.10) have booted into fully operational machines and installing optional “proprietary” drivers was as easy as clicking on a button in one of the system menus. Windows XP SP2 on the other hand booted in low resolution mode, and without any working network device forcing me to install 4 or 5 driver packages off an OEM “Drivers & Utilities” CD that was shipped with the machine.

I had a very similar experience when I installed Hoary on my old Inspiron 4000 laptop back in the day. Not to mention that one time when I pulled the HD out of the aged 4000 and installed it in an Inspiron 4150 which actually had a different motherboard, different video card, sound card and network devices… And it still worked. Don’t ask me how - but I was using that machine for over two years without a hitch.

These are just the examples which I have documented, but in my experience every time I pitted Linux (or rather Ubuntu) against Windows, the former always turned out to be more robust and more user friendly (at least during installation and setup) than the latter. Perhaps I’m biased, but I implore you to test this yourself.

Note that I didn’t talk about Vista here, because I have yet to do a clean install of that monster. Hardly anyone that I know is running it, and those who are are usually more interested in downgrading to XP than re-installing it when the time comes. Honestly, that’s the sentiment around here. I can’t tell you how many people approached me asking if I can downgrade their Dell or HP computer to XP. But that’s beside the point. Perhaps the new OS from Redmond can really match Ubuntu in its ability to detect and configure hardware out of the box. Still, that doesn’t change the fact that Ubuntu has had Windows XP (the most widely deployed OS in the world so far) outmatched and outclassed for years now, all the while Microsoft fanbois were ragging on Linux for lack of hardware compatibility.

So I ask you, which operating system is better in this area? The main difference between the two is that all the hardware being sold out there is guaranteed to work on Windows. So while a clean install of XP will often have your machine limping in a half crippled, low resolution mode, with no sound, no network connection and no working modem, you can always get it working with the proprietary 3rd party drivers. You just need to find and install them - which may or may not be difficult, depending on whether or not you managed to lose the OEM CD with the drivers.

So what is the main difference between Windows and Linux? Windows always has access to 3rd party drivers - Linux, not so much. Is this something we should blame on the Linux community or its developers? No, not really - just like we can’t credit Microsoft and their dev teams with making all these drivers. They are made by hardware manufacturers who are at liberty to pick and choose which operating systems they are willing to support. How do they choose them? I guess they look at adoption and deployment rates - and many of them find Linux to be too small of a target to commit their resources to supporting it.

So we end up with an endless loop scenario. Hardware vendors are not supporting Linux because too few people are using it. Few people are using Linux because of the lack of support from the hardware vendors. In such an environment the only thing the Linux community can do is to hack and reverse engineer everything they can get their hands on, and support it out of the box. And this is what they have been doing for years now.

The Twitter Threading Problem

Tuesday, May 6th, 2008

I love Twitter, but ever since I started using it I felt that the way it implemented the reply system was a bit lacking. On one hand they made it very simple to reply to people. Simply start your message with @name and it will automatically link it to the most recent tweet of that person and then put it on their “replies” page. There is a problem here which is probably best illustrated by this conversation:

Twitter Threading Conversation

As you can see, we have a nice 3 way, asynchronous conversation going on here. Now, how hard do you think it was for me to generate the image above and make it thread like that? Can it be done automatically? Nope. The way I did this was to fish out and screen-cap the particular messages one by one. Also, they were not all in one place. I had to go and grab some of them from Miloš’s and Billy’s twitter pages.

Quotably is a valiant attempt to automatically generate stuff like the manually constructed conversation above. Unfortunately due to the fact that Twitter’s threading model is broken by design it doesn’t always work. For example, here is what it caught from this conversation:

Quotably

As you can see, it’s almost there. It’s just that a lot of the tweets simply got dropped for various reasons. For example, using the @name notation in your message more than once will confuse the system and it won’t link to anything. The other issue was that at one point Miloš and Billy were talking to each other, and while I still saw their conversation happening on my updates page, and it was still regarding the same topic, Quotably wouldn’t associate it with my name and include those tweets in this thread.

Billy is right, putting the StatusID value in the message itself is probably a bad idea as it would quickly start eating away our precious 140 characters. But Twitter already does some behind the scenes magic, when it links your message to the top tweet of the person you replied to. So in theory you should be able to re-construct the whole thread by simply clicking on the “in reply to” links. This hardly ever works though. Most conversations are asynchronous so by the time you replied to something, the author of that tweet might have posted 3 or 4 new messages and yours gets linked to the most recent one.
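To see why the “in reply to” links go astray, here is a small Python sketch of the linking behaviour as I described it above. It is only a model. The Tweet class, the user names and the timestamps are all made up, and I have no idea how Twitter actually implements this behind the scenes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Tweet:
    id: int
    user: str
    time: int  # minutes since the conversation started
    text: str


def link_reply(reply: Tweet, timeline: List[Tweet]) -> Optional[Tweet]:
    """Attach a reply to the newest earlier tweet by the addressed user."""
    if not reply.text.startswith("@"):
        return None
    target = reply.text.split()[0][1:].lower()
    earlier = [t for t in timeline
               if t.user.lower() == target and t.time < reply.time]
    return max(earlier, key=lambda t: t.time, default=None)


# Hypothetical timeline: the reply is about tweet 1, but billy has
# posted something unrelated in the meantime.
timeline = [
    Tweet(1, "billy", 0, "Twitter needs real threading"),
    Tweet(2, "billy", 5, "Lunch time"),
    Tweet(3, "me", 6, "@billy agreed, replies should point at a message"),
]
linked = link_reply(timeline[2], timeline)
```

In this made-up timeline the reply ends up linked to tweet 2 ("Lunch time") rather than to the tweet it was actually answering, which is the failure mode that breaks the thread.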

There really ought to be a way to link to a message rather than to a person. We are already halfway there - each tweet has a reply button associated with it. Now all we need to do is to make these buttons do something - and preferably without losing any characters in the message area.

Before that happens though, we need a better threading app. Is it possible to do it with the current Twitter setup? Perhaps, but we can’t really rely on the in-reply-to implementation currently in place. This is what Quotably is doing right now, and it is not really working for conversations such as the one above. So how do we thread messages?

Perhaps we could try to reconstruct a conversation by simply doing a full text search on the @name pattern and then arranging the tweets chronologically. So a user could select 2-3 user names, and then have the Quotably-like service fetch all their status updates in which any of the chosen users includes an @name pattern of any of the others. No indentation or other thread structure would be attempted here, because it is pretty much impossible to infer it from the data at hand. But if we list them chronologically we at least capture them the way they originally appeared in your Twitter stream.
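A minimal Python sketch of that idea could look like the following. It assumes the status updates have already been fetched somehow and are available as (timestamp, user, text) tuples. That tuple layout, the function name and the sample data are my own invention and have nothing to do with the real Twitter API.

```python
import re
from typing import Iterable, List, Set, Tuple

# (timestamp, user, text) - a made-up representation of a status update
Status = Tuple[int, str, str]

MENTION = re.compile(r"@(\w+)")


def reconstruct(statuses: Iterable[Status], users: Set[str]) -> List[Status]:
    """Keep updates by the chosen users that mention another chosen user,
    then list them in chronological order."""
    wanted = {u.lower() for u in users}
    picked = []
    for ts, user, text in statuses:
        author = user.lower()
        if author not in wanted:
            continue
        mentioned = {m.lower() for m in MENTION.findall(text)}
        if mentioned & (wanted - {author}):
            picked.append((ts, user, text))
    return sorted(picked, key=lambda s: s[0])


statuses = [
    (3, "billy", "@milos @me putting the status id in the text eats characters"),
    (1, "me", "@billy threading is broken"),
    (2, "milos", "@me @billy agreed"),
    (4, "billy", "off to lunch"),
    (5, "stranger", "@billy hi"),
]
thread = reconstruct(statuses, {"me", "billy", "milos"})
```

Tweets with more than one @name are kept, since all we do is search the text. The unrelated "off to lunch" update and the message from the outsider are dropped, and what is left comes out in the order it was posted.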

It’s not perfect but it could be a nice complement to a service like Quotably. I do not propose using it as the main threading system, but rather as an alternative which could help users track more complex, many sided conversations between a group of users.

Update 05/07/2008 01:27:03 AM

Heh, it appears that Google Adsense is having an interesting take on this thread. Observe:

Twitter Threads According to Adsense

Send a HTTPS POST request with C#

Monday, May 5th, 2008

The other day I wrote about my little Drag and Drop application and mentioned I wanted it to send HTTPS POST requests to an existing PHP web application. I thought it would be a relatively trivial task, but it turned out not to be as easy as I thought. First let me show you the code, and then discuss the pitfalls.

Sending the request by itself is actually pretty easy. Observe:

// namespaces this snippet relies on
using System;
using System.IO;
using System.Net;
using System.Text;
 
// this is what we are sending
string post_data = "foo=bar&baz=oof";
 
// this is where we will send it
string uri = "https://someplace.example.com";
 
// create a request
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
request.KeepAlive = false;
request.ProtocolVersion = HttpVersion.Version10;
request.Method = "POST";
 
// turn our request string into a byte stream
byte[] postBytes = Encoding.ASCII.GetBytes(post_data);
 
// this is important - make sure you specify type this way
request.ContentType = "application/x-www-form-urlencoded";
request.ContentLength = postBytes.Length;
Stream requestStream = request.GetRequestStream();
 
// now send it
requestStream.Write(postBytes, 0, postBytes.Length);
requestStream.Close();
 
// grab the response and print it out to the console along with the status code
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
Console.WriteLine(new StreamReader(response.GetResponseStream()).ReadToEnd());
Console.WriteLine(response.StatusCode);

Piece of cake, right? Even if you don’t know C# you should be able to follow this. As a side note, can you see how freakishly verbose this shit really is? I mean, how come ruby can implement this like this:

require 'net/http'
Net::HTTP.post_form(URI.parse('https://someplace.example.com'), 
   {'foo'=>'bar', 'baz'=>'oof'})

Or python for that matter:

import urllib  # Python 2; in Python 3 these calls live in urllib.parse and urllib.request
data = urllib.urlencode({"foo" : "bar", "baz" : "oof"})
urllib.urlopen("https://someplace.example.com", data)

Is it really necessary for me to declare all that shit and go through all the motions to perform something that other languages do in a single line? This is one of these reasons why I no longer think that the traditional Java/C# approach of static typing, and overwhelming verbosity is the way to go. The functionality here is the same - and yet Ruby and Python totally win on expressiveness and readability hands down.

But I didn’t really start this post (just) to rag on the verbosity of C#. The problem with the code above is that it doesn’t work if your certificate is not valid. Why would I be posting to a web page with an invalid SSL certificate? Because I’m cheap and I didn’t feel like paying Verisign or one of the other jerk-offs for a cert for my test box, so I self signed it. When I sent the request I got a lovely exception thrown at me:

System.Net.WebException The underlying connection was closed. Could not establish trust relationship with remote server.

I don’t know about you, but to me that exception looked like something that would be caused by a silly mistake in my code that was causing the POST to fail. So I kept searching, and tweaking and doing all kinds of weird things. Only after I googled the damn thing I found out that the default behavior after encountering an invalid SSL cert is to throw this very exception.

Fortunately, there is a quick and dirty workaround. You simply need to implement the ICertificatePolicy interface from the System.Net namespace (the X509Certificate type it uses lives in System.Security.Cryptography.X509Certificates) and rig it so that it always validates the certificate, no matter what:

using System.Security.Cryptography.X509Certificates;
using System.Net;
 
public class MyPolicy : ICertificatePolicy
{
  public bool CheckValidationResult(ServicePoint srvPoint, 
    X509Certificate certificate, WebRequest request, 
    int certificateProblem)
  {
    //Return True to force the certificate to be accepted.
    return true;
  }
}

Once you do that, you just need to toss the following line somewhere in your code, before you actually send the request. This will swap your default CertificatePolicy class for our fake one with the validation hack:

System.Net.ServicePointManager.CertificatePolicy 
    = new MyPolicy();

Note that the compiler may whine about this approach being obsolete, but since this is a pretty ugly hack in itself, I paid no heed to it. There might be a better way to do this in .NET 3.5 but since it worked I let it be for now.
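For comparison, the same "trust whatever certificate the server shows me" hack can be sketched in Python 3 with nothing but the standard library. The build_insecure_post helper is a name I made up, and the URL is the same placeholder as before. Just like the C# version, this throws away all certificate checking, so it only makes sense against a test box you control.

```python
import ssl
import urllib.parse
import urllib.request


def build_insecure_post(url: str, fields: dict):
    """Prepare a form POST and an SSL context that accepts any certificate."""
    data = urllib.parse.urlencode(fields).encode("ascii")
    request = urllib.request.Request(url, data=data, method="POST")

    context = ssl.create_default_context()
    context.check_hostname = False       # has to be switched off first
    context.verify_mode = ssl.CERT_NONE  # accept self-signed or invalid certs
    return request, context


request, context = build_insecure_post(
    "https://someplace.example.com", {"foo": "bar", "baz": "oof"})

# Nothing has been sent yet. To actually fire the request you would call:
# urllib.request.urlopen(request, context=context)
```

The order of the two context lines matters, since Python refuses to set verify_mode to CERT_NONE while hostname checking is still on.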

So that is how you do it. A more relevant exception message would be a lot of help in this circumstance. Perhaps something along the lines of “invalid certificate”? But I’m just thinking out loud here. If you ever run into this issue, here is how to solve it.