Do you ever feel that siren call of code that needs to be written? Sometimes I get an idea into my head, and then spend the next few days thinking about little else. I’m thinking about the code in the shower, on the toilet, in bed before sleep, while I sleep. Half the time it’s not even that interesting of a project… But it is a project, and I want to get it done. This is what happened to me this weekend.
My university used to allow students to create personal websites via the Novell NetDrive service. It had a rather clunky but perfectly usable web interface that allowed anyone to log in and manage files in their PUBLIC_HTML directory from anywhere in the universe (provided they could get an internet connection). I used that service extensively for the HTML lab and final project assignments. But alas, the OIT decided to phase out all the Novell stuff and replace it with something much more difficult to use for the average student.
The new system requires you to mount the networked drive using WebDAV, which is already a hurdle much too high for most of my students. But to add insult to injury, the difficulty is compounded by two additional issues:
- For some strange reason you must change the password of your campus-wide ID for this new system to even bother talking to you
- Computers in the lab are locked down so tightly that without an admin password students can’t mount shit
I figured that I might be able to skirt around these issues somehow, but despite my best attempts I haven’t been able to set up the damned drive on my own system for like three days. So having filed a tech support ticket into the black hole that is the OIT support system, I got an idea: I could write a bare-bones NetDrive replacement over the weekend.
I’m not sure how I settled on using Google App Engine for this. I think I just didn’t want students to put their filthy files on the same server as my blog, I didn’t want to run a home server seeing how I don’t have a spare box, and I didn’t want to pay for the pleasure. So somewhere along the way I decided that App Engine is a great idea, even though it does not actually have a real file system. But App Engine is free, and you can easily save files in its Blobstore.
So the idea is simple: the user comes along and uploads a file to Blobstore. We save his username, the file name and the Blobstore reference into the Datastore. Then we allow the user to retrieve his file using a neat URL that looks something like /pub/username/filename.ext.
How do we do that? First let’s set up our handlers like this:
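from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app

def main():
    # Map URL patterns to handlers; regex groups are passed to get() as arguments
    application = webapp.WSGIApplication(
        [('/', MainHandler),
         ('/upload', UploadHandler),
         ('/list', ListHandler),
         ('/pub/([^/]+)/([^/]+)?', ServeHandler),
        ], debug=True)
    run_wsgi_app(application)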
So the upload form will sit at /, the upload action is gonna happen at /upload and we will be serving the files at /pub/username/filename.ext. So far so good.
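To see what that last pattern actually captures, here is a minimal sketch using plain `re`, outside of App Engine. It assumes the pattern gets anchored at both ends before matching (that is my understanding of what webapp does, but treat it as an assumption). The `PUB_ROUTE` and `split_pub_path` names are made up for illustration.

```python
import re

# Same pattern as in the handler list, anchored at both ends (assumption).
PUB_ROUTE = re.compile(r'^/pub/([^/]+)/([^/]+)?$')

def split_pub_path(path):
    """Return (username, filename) for a /pub/ URL, or None if it does not match."""
    match = PUB_ROUTE.match(path)
    if match is None:
        return None
    return match.groups()
```

Calling `split_pub_path('/pub/luke/index.html')` gives `('luke', 'index.html')`. Because the second group is optional, `/pub/luke/` also matches, with a filename of `None`, so the serving handler has to cope with a missing file name.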
This is how we are going to store our file information:
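from google.appengine.ext import blobstore, db

class MyBlobFile(db.Model):
    userName = db.StringProperty()  # owner of the file
    fileName = db.StringProperty()  # name the file is served under
    blobstoreKey = blobstore.BlobReferenceProperty()  # reference to the blob itself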
Apparently you have to use BlobReferenceProperty to store the unique blob key in the Datastore. Initially I set this field as a StringProperty but App Engine was complaining like a little bitch. So I changed it.
Setting up a form is easy, but you need to remember the silly create_upload_url call. Make sure you include it. Otherwise it won’t work.
class MainHandler(webapp.RequestHandler):
    def get(self):
        # placeholder: get the real username from your session/login handler
        username = 'whatever'
        upload_url = blobstore.create_upload_url('/upload')  # don't forget this
        self.response.out.write('<form action="%s" method="POST" enctype="multipart/form-data">' % upload_url)
        self.response.out.write('<input type="hidden" name="username" value="%s"><br>' % username)
        self.response.out.write("""Upload File: <input type="file" name="file">
            <input type="submit" name="submit" value="Submit"></form>""")
Here is the actual upload code:
from google.appengine.ext.webapp import blobstore_handlers

class UploadHandler(blobstore_handlers.BlobstoreUploadHandler):
    def post(self):
        upload_files = self.get_uploads('file')  # 'file' is the file upload field in the form
        blob_info = upload_files[0]  # get_uploads() returns a list; take the first upload
        # that's it, we're done; now save the metadata in Datastore
        f = MyBlobFile()
        f.userName = self.request.get("username")
        f.fileName = str(blob_info.filename)
        f.blobstoreKey = blob_info.key()
        f.put()
Finally, this is how we serve the file. It’s also very, very simple:
class ServeHandler(blobstore_handlers.BlobstoreDownloadHandler):
    def get(self, ffolder, ffile):
        # look up the Datastore entry for this user and file name
        q = db.GqlQuery("SELECT * FROM MyBlobFile WHERE userName = :1 AND fileName = :2",
                        str(ffolder), str(ffile))
        result = q.get()
        if result is None:
            self.error(404)  # no such file for this user
            return
        # blobstoreKey holds the reference to the blob; send_blob does the rest
        self.send_blob(result.blobstoreKey)
If you are perceptive, you probably noticed a flaw in my logic here. What if the user uploads two files with the same name? The Blobstore and Datastore won’t care. They will simply assign a new random key to the new entry and call it a day. This is indeed an issue, but I got around it by running the same query when a file is uploaded to check whether it already exists. If there already is a Datastore entry that matches this username and filename, then I delete it along with the associated Blobstore entry. This mirrors what would normally happen in a filesystem: the file would get overwritten.
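The overwrite logic is easier to follow stripped of the App Engine plumbing. This is only a sketch under loud assumptions: a plain dict and list stand in for Blobstore and Datastore, and `fake_blobstore`, `fake_datastore` and `save_upload` are hypothetical names, not anything from the real app.

```python
import uuid

fake_blobstore = {}   # blob key -> file contents (stand-in for Blobstore)
fake_datastore = []   # list of dicts (stand-in for MyBlobFile entities)

def save_upload(user_name, file_name, data):
    """Store an upload, replacing any earlier file with the same user and name."""
    for entry in list(fake_datastore):
        if entry['userName'] == user_name and entry['fileName'] == file_name:
            # Delete the old blob first, then the entry that pointed at it.
            fake_blobstore.pop(entry['blobstoreKey'], None)
            fake_datastore.remove(entry)
    blob_key = uuid.uuid4().hex  # new random key, like the real stores assign
    fake_blobstore[blob_key] = data
    fake_datastore.append({
        'userName': user_name,
        'fileName': file_name,
        'blobstoreKey': blob_key,
    })
    return blob_key
```

Uploading `index.html` twice for the same user leaves exactly one entry and one blob behind, while another user's `index.html` is left alone. In the real handler the lookup would be the same GqlQuery used for serving, and the deletes would go through the Datastore and Blobstore APIs instead of a dict.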
This, ladies and gentlebirds, is how you do it.
In retrospect, I probably did not need Blobstore for this at all. You see, Blobstore is a “billing only” feature of App Engine. I did not know that when I started this project but it bears mentioning: you will need to enable billing in order to use it. So if you can get away with it, it is probably a better idea to store your files in the Datastore using BlobProperty. But it is nowhere near as nice: you essentially have to implement the send_blob() function yourself, send the correct MIME type headers, and so on.
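The header part of that do-it-yourself approach could look roughly like this. It is a minimal sketch, assuming the MIME type is guessed from the file name with the standard `mimetypes` module; `build_download_headers` is a hypothetical helper, and writing the headers and bytes into the actual response is left out.

```python
import mimetypes

def build_download_headers(file_name, data):
    """Return the HTTP headers to send along with a file's raw bytes."""
    content_type, _encoding = mimetypes.guess_type(file_name)
    if content_type is None:
        # Unknown extension: fall back to a generic binary type.
        content_type = 'application/octet-stream'
    return {
        'Content-Type': content_type,
        'Content-Length': str(len(data)),
    }
```

For an `index.html` upload this yields a `text/html` content type, which is what the browser needs in order to render the page instead of offering it as a download.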
I’m actually considering rewriting my code to do it that way. For the time being, I enabled billing for my app, but set the daily budget to $0 which should keep me in the free quota range of 1GB of storage space. Since I’m only going to expose this app to 26 users I’m hoping this will be enough. It will be interesting to see if they will blow through the bandwidth and concurrent access quota during the lab session. They shouldn’t but then again, you never know.
Quick note: you can’t set your quota to $0 when you first enable billing. You have to set it to $1 first, and give them your credit card information (nothing is charged up front though). Then you wait 15-20 minutes till their system makes up its mind about the whole billing change, and go back in to change it down to $0. At this point it will accept 0 as a valid input value.
Thursday will be a genuine test by fire for my code. I will let you know the damage on Friday or next week.
In the meantime, if you want to mess around with the code, I have it up on GitHub.
Please note that I opted not to use the built in user/session handling mechanisms and instead bolted on custom session handling with GAE Sessions. If you wanted to use my code for your project, this might be something you would want to change. My justification for doing this is that Google account registration is a pain in the ass sometimes. I actually don’t know how many of my students have Gmail accounts and I don’t feel like walking 20 people through Google registration pages during the lab. So I went with something quick, easy and hackish.
And yes, I’m storing passwords as unsalted md5 hashes. That’s like 4 WTF’s all rolled up into one right there. Sue me. I just didn’t care enough to write something more robust. If Thursday is not a complete disaster, and I decide to continue using this tool, I will probably fix this.
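For what it’s worth, the fix does not take much code. Here is a minimal sketch of salted password hashing using only the Python standard library (`hashlib.pbkdf2_hmac`). This is one possible approach, not what the app currently does; the iteration count and the `hash_password` / `check_password` names are assumptions picked for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 100000  # assumed work factor; tune it for your hardware

def hash_password(password, salt=None):
    """Return (salt, digest) for a password, generating a fresh salt if needed."""
    if salt is None:
        salt = os.urandom(16)  # a new random salt for every stored password
    digest = hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'), salt, ITERATIONS)
    return salt, digest

def check_password(password, salt, expected_digest):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

You would store both the salt and the digest next to the username, and call `check_password` at login. Since every user gets a different salt, two students with the same password no longer end up with the same hash.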
Also, can someone explain to me why almost all non-programmers assume that it is a great idea to try to chit-chat with you when they see you have code on your computer screen? It’s like “Oh, I see you are busy writing code. Let me interrupt you by telling you about my day, and asking irrelevant questions about the weather forecast for Monday.” That shit is getting ridiculous lately.