Archive for the 'programming' Category

PHP: File Download Script - Straming Binary Data to the Browser

Monday, October 13th, 2008

I don’t think I have posted this snipped of code here yet. It is old as hell, but I use it all over the place lately so I figured I just post it here for future reference.

The script below solves the age old problem of making bunch of files accessible to your users without dumping them in a publicly accessible directory (files, not users). For example, you have bunch of PDF documents you don’t want to be indexed by Google, or directly linked to from other websites. What you want is to allow certain users to download them after logging into your web app and checking their cridentials. How do you do that?

Well, the simplest thing to do is to put them in some directory outside of your web root. Set up .htaccess (or whatever you use) to disallow any connection to that folder from the outside world. The only machine with access to these files should be localhost. Then you do some PHP magic in your application to stream the file into the browser upon successful authentication.

The streaming part is actually pretty straightforward - it is a simple combination of the print and fread commands. The convoluted part is convincing your browser to actually initiate file download instead of just dumping ASCII gibberish onto the page for binary files. This is accomplished by sending bunch of headers to the browser prior to streaming the files. The headers are different for IE and the rest of the world but this is pretty much what I ended up with after a lot of trial and error:

// change these to whatever is appropriate in your code
$my_place = "/path/to/the/file/"; // directory of your file 
$my_file = "filename.ext"; // your file
 
$my_path = $my_place.$my_file;
 
header("Pragma: public");
header("Expires: 0");
header('Cache-Control: no-store, no-cache, must-revalidate');
header('Cache-Control: pre-check=0, post-check=0, max-age=0', false);
header('Last-Modified: '.gmdate('D, d M Y H:i:s') . ' GMT');
$browser = $_SERVER['HTTP_USER_AGENT'];
 
if(preg_match('/MSIE 5.5/', $browser) || preg_match('/MSIE 6.0/', $browser))
{
  header('Pragma: private');
  // the c in control is lowercase, didnt work for me with uppercase
  header('Cache-control: private, must-revalidate');
  // MUST be a number for IE
  header("Content-Length: ".filesize($my_path)); 
  header('Content-Type: application/x-download');
  header('Content-Disposition: attachment; filename="'.$my_file.'"');
}
else
{
  header("Content-Length: ".(string)(filesize($my_path)));
  header('Content-Type: application/x-download');
  header('Content-Disposition: attachment; filename="'.$my_file.'"');
}
 
header('Content-Transfer-Encoding: binary');
 
if ($file = fopen($my_path, 'rb'))
{
  while(!feof($file) and (connection_status()==0))
  {
     print(fread($file, filesize($my_path));
     flush();
  }
  fclose($file);
}

I found these headers to work for me. Your millage may vary. I tested the script above in IE 6 and 7, Firefox 2.x and 3.x, Konqueror 3.5.8 and Chrome 0.2.149.30 and experienced no problems. As usual, I’m always open to constructive criticism and better solutions in the comments. Let me know what you think!

JQuery Tablesorter: List of Builtin Parsers/Sorters

Monday, September 29th, 2008

On of my users tried to explain a bug to me today. Apparently the results on the search page would not sorting properly. Or they were sorting but not by date but by half or a quarter of the date. Or at all. Or they would sort correctly, but sometimes they didn’t and only half sort them. Needless to say, I was thoroughly confused and since this was not one of those “just read the error message to me” issues, I decided to visit the desk of the person who was complaining.

When I got there, I got a small demonstration. “Watch this!” she said, as if she was going to do some trick and I got scared for a second that she will do something that we could not have predicted, and inadvertently crash the whole application due to some hidden bug. You know the type of the bugs - the ones that are completely missed in code review, overlooked in testing and only come out when a user starts clicking buttons you didn’t even put into the design somehow. Fortunately she just typed a query into a search form, and tried to sort the results by date. Then she triumphantly pointed at the screen: “See! That’s what I mean”.

She was right. The results were indeed getting sorted according to the date column, but the algorithm was wrong. Instead of sorting by date, the table was sorted alphabetically/numerically with the obvious results. So I went back to my desk to figure out what went wrong.

The sorting was done by the Tablesorter Plugin so I assumed that the algorithm was right started digging in our code first. I soon figured out what was causing the issue: blank date values!

It’s simple, to avoid clutter missing dates are simply not displayed. So a table will look a bit like this:

<tr><td>05/25/08</td> <!-- snip --> </tr>
<tr><td>08/01/08</td> <!-- snip --> </tr>
<tr><td></td>         <!-- snip --> </tr>
<tr><td>12/11/07</td> <!-- snip --> </tr>
<tr><td>11/10/07</td> <!-- snip --> </tr>

If there are not blank cells in the column, tablesorter script correctly recognizes it as a date column. If it sees blank cells, it reverts back to a text sorting algorithm. Since only some queries would produce blank cells like that, this issue went unnoticed for quite a while. I guess we trusted tablesorter to do the identification thing properly.

The fix was trivial - force the tablesorter to treat date columns as date columns no matter what they contained. You pretty much have to specify the data type for each column:

  $(document).ready(function() 
  { 
    $("#myTable").tablesorter(
    {
       headers:
       {  
         0 : { sorter: "shortDate"  },
         1 : { sorter: "shortDate"  },
         2 : { sorter: "shortDate"  },
         6 : { sorter: "shortDate"  },
         13 : { sorter: "shortDate"  },
         14 : { sorter: "shortDate"  },
         16 : { sorter: "shortDate"  },
         17 : { sorter: "shortDate"  },
         19 : { sorter: "shortDate"  }
       }, 
       widthFixed: true,
       widgets: ['zebra']
    }); 
  });

Btw, guess how I knew that I needed to use the “shortDate” keyword? Because I looked at the tablesorter.js code naturally. Initially I tried “date” but of course that did not work. Next I stared at the online documentation for 20 minutes before I decided it was pointless, so I just downloaded un-minified version of the script, and scanned through it looking for parsers and their names.

Tablesorter is a great plugin, but it really could use more in-depth documentation. While I was digging around in the code, I decided to write down the names of all the parsers for future reference. Here they are:

  • text
  • integer
  • currency
  • floating
  • ipAddress
  • url
  • isoDate
  • percent
  • usLongDate
  • shortDate
  • time

The auto detection in the script works pretty well most of the time. It can however fail for simple reasons such as blank lines. I’m surprised that the list above was nowhere to be found on the page. Oh well… It’s here if you need it.

VBA Hacks and Trickery Grab Bag

Thursday, September 11th, 2008

Hi, I’m here to torture you with more VBA! Run away! Run away! This is part two of the tales from the Programmer’s Purgatory. And once again, no it is not hell. When you are in Programmer’s Hell you spend all your time in meetings, and/or doing team building exercises *shudder*.

This time I’m unloading few random snippets of code that I produced during my sentence in the purgatory. For example, there is a vba macro that is supposed to extract some data from an excel file with ~20 worksheets. The users need to be able to add new worksheets to it, or move and modify existing ones (that is an absolute requirement, and arguing about it is pointless) so you can’t really protect the workbook. There are really two ways to access any given sheet:

  1. By it’s name: ThisDocument.Sheets(”Accounts Receivable 10″)
  2. By position: ThisDocument.Sheets(10)

There is of course an issue here. If you try to call worksheets by name, the users will promptly rename every single one. If you try to get it by position, they will rearrange the whole workbook. It is a lose-lose situation and almost every day our helpdesk has this very conversation:

luser: “Hi, the import macro is broken”
helpdesk: “Did you rename any of the worksheets in your excel file?”
luser: “Well, yes…”
helpdesk: “Ok, did you see the big warning in red letters that says DO NOT RENAME OR DELETE THIS WORKSHEET on top of each sheet?”
luser: “Well, yes…”
helpdesk: “So why did you rename them…”
luser: “Well… You know… So, um… What do I need to do now?”
helpdesk: “Rename them back?”
luser: “Ok, I’ll email you the file and you do it for me cause I don’t know how to do that”
helpdesk: *shoots self*

We decided to be tricksy Hobbits and added the following snippet to every single sheet in the document that should not be renamed:

Sub Worksheet_Deactivate()
    Me.Name = "The Name of the Worksheet"
End Sub

It’s funny because it is so subtle. They can rename the worksheet all they want, but as soon as they click on another sheet the name will just quietly revert to what it was supposed to be in the first place. No matter what they do to it, it won’t keep the new name. I imagine they will eventually give up, or call the Helpdesk and then we can tell them in our best BOFH voice “Read what it says in red on top of the worksheet. Do not rename!”

Of course techno-idioticus luseratis is the most ingenious animal on the planet. While they are unable to perform the simplest tasks in Windows or Office without assistance, they show almost limitless resourcefulness when it comes to circumventing the tricks we use to stop them from breaking things.

So if they can’t rename the worksheet they will delete and recreate it or do something equally silly. So we came up with an insidious way to make their life harder. Naturally, it couldn’t be easy because the fuckers at Microsoft decided that there is no reason to fire some sort of an event when a worksheet is deleted. I mean, who would ever want to capture such an insignificant event? No one, that is who.

So there is no real way of preventing the user from deleting a worksheet other than protecting it, or protecting the workbook but that naturally won’t work for two reasons. One reason is that they would often need to un-protect it in order to add/modify certain sheets. The second reason is that most of the time you can remove protection using tools like this one.

So we figured that we’ll just disable the Delete button on relevant Sheets. The user will still be able to get in trouble by moving the worksheet to another workbook, but at least the most obvious way to delete things will be disabled. This is a bit convoluted, but I blame Microsoft for not making a Worksheet_Delete method.

' Put this in it's own module or Workbook module
Sub DeleteButton(ByVal enable As Boolean)
  Dim CommBarTmp, Commbar As CommandBar
  For Each Commbar In Application.CommandBars
    Set CommBarTmp = Commbar.FindControl(ID:=847, _
       recursive:=True)
    If Not CommBarTmp Is Nothing Then _ 
      CommBarTmp.Enabled = enable
  Next
End Sub
 
' On each sheet put the following two subroutines:
 
' When browsing away, enable the Delete button
Private Sub Worksheet_Deactivate()
    DeleteButton True
End Sub
 
' When activated, disable the Delete button
Private Sub Worksheet_Activate()
    DeleteButton False
End Sub
 
' Add these two in the Workbook module:
 
' Enable the button before closing excel
Private Sub Workbook_BeforeClose(Cancel As Boolean)
    DeleteButton True
End Sub
 
' Enable the button when you switch to another workbook
Private Sub Workbook_Deactivate()
    DeleteButton True
End Sub

Note that you could use the Workbook_SheetActivate method to activate this behavior for all worksheets instead of using separate Worksheet_Activate and Worksheet_Deactivate methods. We didn’t do that because only some sheets need to be protected this way, while others are free game.

Finally, here is another frequent Helpdesk conversation that happens at least twice a day:

luser: “Um… Is there a way to make Word recover a document that wasn’t saved?”
helpdesk: “Not sure what you mean… What happened?”
luser: “Well, I was working on this file, then I closed Word and it asked me if I want to save changes so I clicked no. Now all my changes are gone. I lost 6 hours of work.”
helpdesk: “Why did you click on no?”
hluser: “Um… Because I realized that I forgot to save the document and wanted to go back to do Save As”
helpdesk: *bangs head against the keyboard*

So here is yet another trick - we will forcefully save the document even if the user resists:

' Put all of these in ThisDocument module
Sub AutoOpen()
    WaitAndSave
End Sub
 
' Wait 5 minutes
Sub WaitAndSave()
    Application.OnTime Now + TimeValue("00:5:00"), "Saver"
End Sub
 
' Save the doc and wait again
Sub Saver()
    ThisDocument.Save
    WaitAndSave
End Sub

This way, the lusers can’t claim they lost 5 hours of work because Word sucks or because some macro crashed their Office or because there was a power outage. Now they can only lose up to 5 minutes of real work. I’m also considering combining this with the versioning feature of Word 2003 (and higher). This way if the user fucks something up they will be able to roll back their changes in 5 minute increments.

I hope you enjoyed this brief excursion into the terrifying world of madness which is VBA. Let’s hope there won’t be part 3, because even right now I feel that my sanity is slipping.

Making a Better Use of Script Tags

Wednesday, September 10th, 2008

Admittedly, I’m a certified Stage 3 Javascript fanboi. There is a lot of things I love about the language, but one thing I do not particularly care for is the script tag syntax. Let me show you what I mean. Pretty much every page I created recently has something like this near the top:

<script type="text/javascript" src="jquery.js"></script>
 
<script type="text/javascript">
 
   $(document).ready(function(){
      // Your code here
   });
 
</script>

In other words you have two script tags in your code. First one imports the external js file (in this case the JQuery library) and the second one calls some functions from that file. If the first script tag fails to import the external code for some reason (for example due to a network problem) then the second tag will blow up in your face immediately. Granted this is how embedding of javascript in HTML was done since the begging of time, but you can’t say this is a graceful behavior.

John Resig (the man behind JQuery) noticed this, and he proposed to fix this behavior. He proposed to change the behavior of your script tags in order to allow you to write your code this way:

<script type="text/javascript" src="jquery.js">
 
   $(document).ready(function(){
      // Your code here
   });
 
</script>

Embedding the code operating on the library in the very same script tag has many benefits. For one, it logically groups the code so the calls to library functions occur in the same place that library is imported. Furthermore, it prevents network related failures. If you can’t import jquery.js, the code related to it will not get executed. This is a much more graceful than what we have now.

The best part is that these are not just some theoretical musings that will never be implemented by any of the mainstream browsers. John achieved this effect by adding 3 lines of code to the end of jQuery.js file ultimately proving two things:

  1. That John Resig is the man
  2. That jQuery is FUCKING AWESOME

In fact he takes it one step further. For example, roughly 99% of code you will write against the framework will be inside the $(document).ready closure. So while we are using shortcuts, why not just make that code optional in our modified tags:

<script type="text/javascript" src="jquery.js">
      // Your code here
</script>

Doesn’t that look much clearer, and straightforward? I love it! But don’t get too excited yet. This functionality is not included in jQuery yet and there is no guarantee that it will ever make into it. John says it will need some through testing before it gets deployed.

Once more I’m amazed, and humbled by what you can do with Javascript, and exactly how much power is packed into that 16kb jQuery script. Mind boggling!

Excel: Adding Checkboxes the Easy Way

Thursday, September 4th, 2008

Apparently I have trespassed against The Big Programmer In the Sky and I’m currently suffering in Coder’s Purgatory where they make me code in VBA. It could be worse. I could be in Programmer Hell (also known as the infamous 0xf th Circle of Hell) where you don’t get to code at all but instead attend meetings and team building events. I figured that since I’m suffering, I will make you suffer as well by exposing you to some VBA.

For example, how about we create something like this in Excel:

excel1.png

You know, like a checklist. The users would go down, the list, hit the appropriate checkboxes and then we would have a script that extracts this data, and builds an exception list. How many questions? Oh, not many - maybe like a 100 in total, spread across 4-5 worksheets. And the good news is that the questionnaire already exists in Word document so you can just like copy and paste it.

Each checkbox is linked to a cell and toggling it will toggle the value of the cell between TRUE and FALSE. Extracting data is trivial. The problem here is creating the checklist itself - it is a tedious data entry job you give to your intern after he/she finishes filing, and does the Starbucks run. There is no easy way to copy these checkboxes. Or rather you can copy them, but their cell references do not change. So you have to change the linked cell property for every single checkbox manually.

So I said “fuck this shit” and wrote a macro to do this for me. The following script will prompt the user for 3 things: the Cell Range which should be given like so A1:A10 which is the range where the checkboxes will be inserted, a Linked Column to be used for linked cells (the row number of the linked cell will be the same as the row number of where the checkbox was inserted) and a Checkbox Label (the text to be displayed next to the checkbox). Then it will generate the right number of checkboxes. Here is the code:

Option Explicit
Sub insertCheckboxes()
 
  Dim myBox As CheckBox
  Dim myCell As Range
 
  Dim cellRange As String
  Dim cboxLabel As String
  Dim linkedColumn As String
 
  cellRange = InputBox(Prompt:="Cell Range", _
    Title:="Cell Range")
 
  linkedColumn = InputBox(Prompt:="Linked Column", _
    Title:="Linked Column")
 
  cboxLabel = InputBox(Prompt:="Checkbox Label", _
    Title:="Checkbox Label")
 
  With ActiveSheet
    For Each myCell In .Range(cellRange).Cells
      With myCell
        Set myBox = .Parent.CheckBoxes.Add(Top:=.Top, _ 
          Width:=.Width, Left:=.Left, Height:=.Height)
 
        With myBox
          .LinkedCell = linkedColumn & myCell.Row
          .Caption = cboxLabel
          .Name = "checkbox_" & myCell.Address(0, 0)
        End With
 
        .NumberFormat = ";;;"
      End With
 
    Next myCell
  End With
End Sub

So here you go, inserting checkboxes the lazy way. Welcome to my purgatory, I hope you can feel my opain. Now excuse me while I go shoot myself in the face with a shotgun.

End With.