Have I mentioned that I love Vim? It is such a useful little tool. Let me give you an example. The other day someone gave me a messy list – a dump of email addresses from some database as a comma-separated file. Here is a sample of what it looked like – note that instead of actual emails I’m using names of vegetables, fruits and other stuff (like Rupert, or poo for example):
test, poop, boob, apple, carrot, mango, kiwi, apricot, banana, apple, tomato, prune, cranberry, raspberry, orange, lemon, potato, pudding, lemonade, pants, spoon, flax, dogmeat, poison, pee, hamburger, rupert, apple, mango, sunflower, bee, pumpernickel, puddle
Of course the actual list had over a thousand emails and did not include Rupert (or his poo for that matter). It was equally messy, full of duplicate emails and basically looking like a wall of text. Someone requested it expecting a sorted, itemized list that they could print out and look at for reference. What they got was a text blob.
So I grabbed the file, opened it in trusty old Vim, and issued three commands. The first one was:
:%s/, /\r/g
This is a single regexp substitution. It replaces every occurrence of a comma followed by a space with a line break (in the replacement part of a Vim substitution, \r starts a new line). This unrolled my CSV into a file with a single item on each line:
test
poop
boob
apple
carrot
mango
kiwi
apricot
banana
apple
tomato
prune
cranberry
raspberry
orange
lemon
potato
pudding
lemonade
pants
spoon
flax
dogmeat
poison
pee
hamburger
rupert
apple
mango
sunflower
bee
pumpernickel
puddle
To sort it, I simply did:
:sort u
This sorted my list and removed the duplicates (the ‘u’ stands for unique):
apple
apricot
banana
bee
boob
carrot
cranberry
dogmeat
flax
hamburger
kiwi
lemon
lemonade
mango
orange
pants
pee
poison
poop
potato
prune
pudding
puddle
pumpernickel
raspberry
rupert
spoon
sunflower
test
tomato
The last touch was to add a line number to every single line. Yes, I know – I could print the file with line numbers enabled, but the person who would be using this file was barely capable of using Notepad. So the numbers had to be hard-coded. This is actually a new trick that I just learned, and it goes like this:
:%s/^/\=line('.').". "/
The \=line('.') bit does the actual line numbering, while the .". " bit simply appends a dot and a space to each number so they stand out nicely from the actual items. The end result looks like this:
1. apple
2. apricot
3. banana
4. bee
5. boob
6. carrot
7. cranberry
8. dogmeat
9. flax
10. hamburger
11. kiwi
12. lemon
13. lemonade
14. mango
15. orange
16. pants
17. pee
18. poison
19. poop
20. potato
21. prune
22. pudding
23. puddle
24. pumpernickel
25. raspberry
26. rupert
27. spoon
28. sunflower
29. test
30. tomato
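If you ever want to double-check a result like this outside of Vim, here is a rough Python sketch of the same three steps: split, sort with duplicates removed, and number. It is not what I actually ran, just an illustration. The function name tidy_list and the shortened sample string are made up for this example, and the sketch trims whitespace around each item, which is a little more forgiving than matching a literal comma plus space.

```python
def tidy_list(raw: str) -> list[str]:
    """Turn a comma-separated blob into a sorted, de-duplicated, numbered list.

    Hypothetical helper for illustration; it only approximates the Vim steps.
    """
    # Step 1: one item per entry (assumption: we also trim stray whitespace
    # and skip empty entries, which the literal ", " substitution would not do).
    items = [part.strip() for part in raw.split(",") if part.strip()]

    # Step 2: drop duplicates and sort. Python compares by code point, so
    # this orders plain lowercase ASCII words alphabetically.
    unique_sorted = sorted(set(items))

    # Step 3: hard-code the numbering as "N. item", starting from 1.
    return [f"{n}. {item}" for n, item in enumerate(unique_sorted, start=1)]


# Shortened sample in the spirit of the list above (not the real data).
sample = "test, apple, carrot, mango, kiwi, apple, mango"
numbered = tidy_list(sample)
print("\n".join(numbered))
```

With this sample the two repeated entries collapse, so five numbered lines come out, starting with "1. apple" and ending with "5. test".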
I’m putting this here as a useful tip – more for myself than anyone else. Chances are I will forget the line numbering trick in a few weeks and will need to look it up again. Hopefully some of you may find it useful as well.
To summarize: Vim is awesome. It is like a Swiss Army Knife for text files. Use it, learn it, love it!