I have noticed that video is increasingly becoming the preferred communication medium on the web, especially for the younger generations. This is most noticeable amongst the newly formed gator/puppy set which spawned in the August that never ended, but it is not limited to just them. Any time these folks get some thought in their head that they feel is worth sharing with the world, they turn on their webcam, ramble off the cuff for anywhere from 15 minutes to 3 hours and then promptly upload the whole thing to YouTube without editing.
Back in my day (only a decade ago, but that’s like a million internet years) we might have called this “vlogging” but I haven’t really seen that word used in ages. Personally, I always thought of vlogs as prepared essays with visual components. To me the whole point of doing a video is to show viewers examples of the stuff you are talking about. What Anita Sarkeesian is doing is a good example: well researched, well edited, succinct, to the point visual essays with concrete examples of game-play and game dialogs. On the other hand, someone just talking “off the cuff” into the camera for twenty minutes in a single unbroken cut is…
Well, to me it just seems lazy. Here is the thing: I can read faster than you can talk. Therefore, if you have a message you want to get out there, the most efficient way of doing this is via text. Text can be absorbed very rapidly, even if it is an unstructured stream of consciousness jumble. I can skim long articles pretty quickly without losing too much information, but there is simply no way to skim a video. You can skip around, but that’s not the same. Skipping feels lossy. When I skim I can still look at the length and shape of the paragraph, check the opening and closing lines, scan for relevant keywords within and so on. The best YouTube can do for me at the moment in this respect is to show me still thumbnails of what I can expect to see on the screen when I skip to that point. Which, if I’m watching a 20 minute unbroken rant, is always going to be your face.
Writing things down takes some effort. The very process of arranging words into sentences, sentences into paragraphs and so on forces you to think about structure and flow. You can’t just vomit words in the exact order they pop into your mind. The written word has rules, and ignoring them yields an unreadable and confusing mess. But if you use video, you can just ramble, talk in circles, get tongue tied, correct yourself and go on tangents without losing too much coherence. Our brains are pretty good at making sense of unorganized, jumbled speech, because that’s how we communicate on a daily basis. So you can talk to a camera the way you would talk to your friend, and chances are most of your viewers will at least get the gist of what you’re saying. But the fact that people can comprehend what you’re saying doesn’t mean you are coherent, or that you are not wasting their time. Because you are.
If you turn on a webcam and hurl words at it for an hour without at least an outline, and without at least some basic editing to remove filler words (umm…, err…) and stuttering, you are saving yourself time while wasting mine. Considering that, according to some estimates, over 20% of the sounds we make during regular, conversational speech are non-lexical vocables, false starts and corrections, this is rather inconsiderate. You don’t usually see people speaking this way on TV or in movies (save for maybe, you know, mumblecore stuff, which consciously mimics “natural” conversation patterns) because for the most part it’s just noise. Useless, pointless interference that is not conducive to getting your message across.
So if you have some thoughts you want to share, write them down, kinda like I’m doing here. Put these words on Medium, or Twitlonger, or one of the other five million sites designed to facilitate exactly that. Ranting into a camera is just lazy.
Then again, maybe I’m just getting old. Perhaps there is a generational shift away from textual communication happening right now. And why not? It has never been easier to publish video online, and with ubiquitous broadband and storage we don’t have to aggressively edit for size, like we used to. So people are taking advantage of this.
There is this vision of the future that worries me quite a bit: one in which text is dead. In this future we input data into all interfaces using touch and speech, and all output is visual and verbal. Humanity is mostly illiterate (save for a handful of historians and archivists who study old text) but not uneducated. Poets and writers simply dictate their books to machines, because we perfected speech processing algorithms, and we have them read to us by descendants of Siri, who have perfect cadence and inhumanly soothing voices. Scientists and engineers dictate their papers and equations. Math is done in silico…
But would that even work? Can you read and write scientific papers without the ability to skim? Can you write good code, without actually… Writing? Up until now, education and literacy were inseparable: one depended on the other. But can technology disentangle the two? Can it help to create a society of highly educated analphabets, and would that even be a desirable thing? I’m inclined to think that this future simply won’t happen, because text is just too fast, efficient and convenient. It compresses insanely well, can be searched and indexed with frightening speed and efficiency, can be absorbed much faster than audio and can be translated without artifacts and side effects (such as lip movement being out of sync with dubbed speech on video). I just don’t see us ever giving up all the benefits of text without getting anything in exchange. Because even if we get perfect speech recognition software, and machines can interpret our commands with flawless accuracy, talking is still slower, less accurate and less focused than writing. It just would not make any sense to abandon text.
But Spike Jonze’s movie Her does provide a vision of the future in which no one ever types anymore, but people still do read. And that is potentially something that could happen one day. And that’s my worst nightmare, because I can only ever properly organize my thoughts when I write. Which is one of the reasons I never felt compelled to make this sort of stream of consciousness type video. Vocalizing my thoughts adds another layer of abstraction and takes me that much farther away from my message. I feel that dictation is nowhere near as flexible as typing. For example, have you ever tried to tell someone how you want them to re-format a document?
Can you copy that sentence… No that’s too much… No, actually I meant this sentence, and the short one afterwards. Now cut them out, and put them… Wait, scroll up a bit. No too much. Lower. Third paragraph… Sorry, I guess technically that’s fourth if you count that single word over there as a paragraph. So we put it here, but now we have to change it up to fix the flow…
It usually takes five minutes to explain to a human something you could do yourself in five seconds. Now imagine parsing all of this in an unambiguous way that can be understood by a machine. Editing text with speech would be a nightmare. In fact, editing anything with speech seems like an uphill battle. I think we would literally have to invent new, unambiguous sub-dialects just to efficiently interface with machines. Or maybe learn Lojban.
I think what we’re seeing here is just laziness, and not some generational paradigm shift.
Then again, I have been wrong on things like these in the past. If this is the way of the future, I will have to adapt to that new, nightmarishly inefficient world. I don’t want to be the bitter old man who doesn’t get the new technology and refuses to get with the times. And at the very least, this strange future without reading and writing would result in more engaging and visually pleasing PowerPoint presentations without bullet points…