The other day this well known Excel issue trickled down to me. If you are to trigger happy applying formating to your excel cells, you eventually hit some sort of hard coded limit. Excel can only handle 4k of formating annotations. I suppose this is an inherent flaw of their binary file format, but then again, who knows. The bottom line is that this has been an issue in every single Excel version up to, and including Excel 2003. Microsoft has
a long standing knowledge base article which gives no suggestions on what to do once you hit that forbidden threshold. Their advice is essentially “don’t use so many different formats”. That’s about it.
The problem is that once you hit the magical number that oscillates somewhere in the 4k range Excel will block you from doing any new formating adjustments by popping up the “Too many different cell formats” error message. In theory, simplifying your worksheet will make the error go away, but to do that you will need to either remove or modify formating on some of the cells or cell ranges. But you can’t do that because of the error. So you end up in an endless loop. To solve the problem, you have to do the precise thing that it prevents you from doing.
The poor guy who fell victim to this bug went through 3 support drones who all told them the same thing – don’t use so much formating. Fortunately he CC’d some managers on the support emails, and eventually it ended up in my mailbox. Which was a good thing since I’m not as immersed in the Microsoft monoculture as the lowly tech support drones. The truth is that there is a solution to this problem, and it’s very easy. But Microsoft won’t even tell you about it because it involves using a competing product.
If you ever run into this issue do the following:
- Open the document in Open Office
- Save in the Microsoft XLS file format
80% of the time this is enough to solve the issue, because Open Office organizes the formating information slightly differently due to the differences in implementation. For the most part it is a very exact port, and it faithfully mirrors Excel functionality, but there might be few minor glitches when you do this. However, it probably beats re-creating the document from scratch via copying and pasting.
The problem here is that none of the low level guys knew that, or even considered trying it. Living immersed in the monoculture has blinded them to alternative solutions. This is the price we pay for letting Microsoft run our lives.
I got this in my inbox today – a very interesting read regarding this issue. I figured I’ll share it here for reference as it directly pertains to the problem at hand:
I was Googling the phrase “Too many different cell formats” today, to check that my web site (which offers a solution) was still sitting high up Google’s list, and came across your blog at http://www.terminally-incoherent.com/blog/2008/04/02/excel-too-many-di fferent-cell-formats/, with the suggestion of using Open Office to get rid of the error.
I was aware of the Open Office solution, but am also aware of its disadvantages:
The main disadvantage is that once you have transferred the defective Excel workbook to an Open Office environment, you have to repair the problem there (which is obviously easier, since OO doesn’t keep on reiterating the error message every time you try to do anything). Most people are in a situation where they cannot call for a complete switch to OO in their organisation, and they want to be able to put the mended workbook back into Excel. How long it is before they hit the error again is almost impossible to forecast, and they are right to worry.
The second is that there is reason behind the formatting difference that makes OO cope when Excel can’t. I think this error turned up in Excel 97, which was a fairly substantial re-write of the object model; but perhaps it came earlier. The intention seems likely to have been a need to speed up the rate at which Excel could repaint the display when it changes. That rate could be painfully slow if running Excel across the sort of telephone line that was common at the end of the 80s. (I remember having to rejig the arrangements in a Delphi database application I had programmed, when I discovered that pressing the down arrow key once while looking at a display caused the server to retransmit the whole of the database’s contents over the ‘phone line to my desktop and then repaint the visible display about 2 minutes after the arrow had been pressed). I have never compared the rate at which OO and Excel perform large operations of that sort, but would not be surprised if Excel wins the race, as a result of enhancements like this one.
So, though the OO solution is effective, not everyone is going to want to commit themselves to using it. A rather more painful way of getting rid of the error message, so that slow repair work can start, is to identify a row or column which must contain a uniquely formatted cell (which might be a row or column containing the cell[s] most recently changed, that caused the crash), and delete it completely. It will be worthwhile to note all its contents. It could be sensible to delete it by Cutting it (with Ctrl-X) and then paste it from the clipboard into an empty workbook in a separate instance of Excel. Then the deleted range can eventually be pasted back into the problem workbook after it has been repaired.
Why did Microsoft invent the format combinations array? Someone probably worked out that it would save time if every cell in the UsedRange had a pointer to an array of format combinations which could be applied directly. I expect they gave the programming job to a Seattle high school pupil on vacation job experience, and then went off to the West Indies rather than supervise the work. Certainly his/her implementation is woefully short of any logic to recognise in advance that the array is nearly full, issue warnings of the need to clear things up, and finally refuse to accept any more formats, but leave enough array space to allow the user to make changes which will reduce the number of format combinations, and not crash the workbook.
You may not realise that the reason why the limit is said to be “approximately 4000” is that it seems to vary. Some workbooks crash when the number of different format combinations is as low as 2500. I have one with over 4400 different format combinations which only crashes when the user tries to add one more. The reason why it varies is that the list that eventually gets filled up in practice nearly always seems to contain duplicate entries, so that it has more formats in it than one can actually count.
I think that duplication occurs because the Excel object model must be programmed to carry out list housekeeping as a background task. If I format a cell using a font (eg Monotype Corsiva) which in fact is previously unused, the basic task for Excel must be to determine whether the cell has a new format combination, and if so add it to the list, then insert the relevant pointer value in the cell data. When the list has several thousand entries in it, and I have just reformatted a cell, the repetitive searching could take ages, even with some clever logic. Easier to add the new format to the list, and later search in background to see if it is a duplicate and remove it if so. If the workbook is saved and closed at the end of a session, any de-duplication that has not yet been done just gets lost. The duplicates problem is another indication that this whole aspect is not the finest portion of the Excel object model. If the error had ever hit Bill Gates in person, the Chief Software Architect would have spent billions to rebuild this facility completely.
There is a lot of misguided advice available on the internet (eg Leo Heuser’s regularly-published VBA routine to DeleteUnusedCustomNumberFormats which may be a solution to some problem, but definitely not to this problem). But there are really only two solutions to the formats problem.
One is provided by the tool at http://xlsgenreduction.arstdesign.com/index_en.html which actually manipulates and rewrites the contents of the workbook file, to eliminate duplicated format combinations and, if required, to reduce the number of combinations further by imposing the replacement of formats which are very similar to others in the list. (Interestingly, that web site uses the same terminology as the official Microsoft contribution, by talking of Styles. The Style object is a completely different aspect of the Excel object model, and both papers confuse the reader). The aspect of Xlsgen that my customers criticise is that it gives the user no control of the automated reduction process; it can be very frustrating to find that it has wiped out all cases of a format which differs only slightly from its replacement but was created just to deliver the difference required.
The other solution is my much more laborious tool, QAid at http://www.total.rowing.org.uk/quarrell/QAid/ which, being itself an Excel 97 workbook, ploughs through all the problem workbook’s UsedRange, identifying, counting and listing the format combinations. It can take ages, but produces a good list. Then it offers a user-controlled elimination process. That presents the user with one of the cells that each have a unique format combination, and a list of the 11 cells which have the fewest format differences from it. The user chooses one of the 11 (or skips if none are suitable), and its format is applied to the unique cell. A few hours’ work reduces the number of formats away from the danger level, and a second run of the analysis confirms what the safety margin is.
However, QAid does not attack the problem of duplicates. None of my users has ever complained about this. I think that is because most Excel spreadsheets large enough to hit the error have several hundred unique formats in them (the record is 1250 unique cells), and eliminating those gives a large safety margin. Everyone who has ever hit the limit develops a more defensive approach, and few see it again.
I don’t know whether any of this is of interest to you – but it’s all knowledge.
Thanks Peter! OO.org is indeed a quick and dirty fix here. It worked for me, but as you said it may not resolve the issue completely. Thanks for elaborating on this.