# Pitfalls of WYSIWYG: Self Publishing Hell

At the beginning of March, Shamus Young published a very interesting blog post about his adventures in self publishing. It gives you a glimpse of exactly what it takes to publish your own book these days the internet way. And by that I mean without the aid of big publishing magnates who take a cut of your income in exchange for their services.

In essence, self publishing is hell. Hell on wheels. It is a train of torture and the conductor is that annoying kid from the Game of Thrones. You want to punch him in the face, but he goes “LOL, nope!” and ruins your shit, over and over again. That’s more or less a TL;DR of Shamus’ article right here. If you are planning to self publish a book, prepare yourself for pain.

Why does publishing hurt so much? It’s because most of the services that help you push out your work to all the different online stores standardize on Microsoft Word. I mean, it is nice they have a standard at all, but Word is possibly the worst thing to standardize on. It just does not work.

You can read the post yourself to see the multiple levels of failure, all of which can be traced back to using the completely wrong software platform. And this is not just my old song and dance about WYSIWYG. Word fails as a text exchange platform too. Its layout depends on installed print drivers, fonts and styles. Its engine reformats and reshapes your text by covertly changing quotation marks, ellipses and hyphens to special characters. It handles simple formatting by wrapping your text in miles and miles of XML tag soup.
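If you are stuck cleaning up text that has already been through Word, the substituted characters can at least be mapped back. Here is a minimal Python sketch, assuming the usual suspects are curly quotes, the single-character ellipsis and the two long dashes; the table and the function name `normalize_punctuation` are my own illustration, not a complete list of what Word may insert.

```python
# Hypothetical mapping: "smart" characters that word processors commonly
# substitute, paired with the plain ASCII a writer most likely typed.
SMART_PUNCTUATION = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark / apostrophe
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2026": "...",  # single-character ellipsis
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
}


def normalize_punctuation(text: str) -> str:
    """Replace each smart character with its plain-text equivalent."""
    return text.translate(str.maketrans(SMART_PUNCTUATION))
```

Whether you actually want this depends on the destination: a print layout may prefer the typographic characters, while a plain web form usually wants the ASCII ones.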

It is virtually impossible to copy and paste a Word document into a WYSIWYG web form without introducing some weirdness. In general, when transferring Word documents to the web, you have two choices:

1. Lose all formatting and just use plain text
2. Introduce weird formatting artifacts and “tainted” HTML code into an otherwise clean web page

There is simply no way around it. As we move towards more open-ended, more web-driven environments, Word is an archaic lead anchor that drags everyone down and restricts their movements. Word hurts productivity by forcing you to fiddle with it every time you are trying to collaborate or publish your stuff online.

You probably expect me to segue into a spiel about LaTeX right about now, but I won’t. LaTeX is great, and it solves 80% of the issues Shamus mentioned in his post. This was precisely why Donald Knuth invented TeX, the system LaTeX is built on – because publishing was such a monumental pain. But LaTeX is not a solution for everyone.

Let’s face it – LaTeX has a byzantine syntax, a steep learning curve and a stack of dependencies. The default LaTeX package for Windows is a few hundred megabytes in size. The installation is not difficult, but it does not work straight out of the box. Not to mention that the features of LaTeX are overkill for most people.

I used it for my Master’s thesis because it had very, very strict formatting requirements. It took me several hours to actually set up my document the right way – but I only had to do it once. For my classmates who went with Word, this was a daily struggle. Their layout would break every time they embedded a new figure, pasted some text, or hit the undo button the wrong way.

But if you are writing articles, blog posts, short stories, or even novels, you usually don’t need all that power. You need an efficient, carefree way of editing text. But not just text – text and some minor formatting.

If you read Shamus’ blog post, you will notice that he likes to use italics and bold for emphasis. That’s more or less the main reason why he used a WYSIWYG setup rather than, say, Notepad. Sure, he could have gone with HTML but it has a similar problem to LaTeX – the syntax sometimes obscures the content.

We do have something better though: it’s called Markdown (or Textile, or whatever). No, seriously – I’m not joking here. Look at it this way: a lot of online publishing platforms already support Markdown (or Textile, or whatever) as a valid input format.

Markdown is plain text – it has none of the problems Word has. It does not pollute the content with special characters, symbols, invisible formatting marks, etc. It is not susceptible to version/formatting related issues (.doc files vs .docx files). It does not concern itself with line spacing, layout, margins and other such details. Those things are usually handled by the “output” format. So if Smashwords or other such services used Markdown as their standard, they could standardize on layouts too. They could make assumptions based on the character count and their own fonts to see how to re-flow text for specific page sizes required for print and e-devices. Currently they ask the users themselves to conform to a set of standards, and then hope things work out for them.
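To make the re-flow idea a bit more concrete, here is a rough Python sketch using the standard `textwrap` module. It assumes line width can be approximated by a plain character count, which ignores real font metrics; the helper `reflow`, the sample sentence and the widths of 30 and 60 characters are made up for illustration.

```python
import textwrap


def reflow(paragraph: str, chars_per_line: int) -> list[str]:
    """Re-wrap one paragraph to a target line width measured in characters."""
    return textwrap.wrap(paragraph, width=chars_per_line)


paragraph = (
    "Markdown leaves layout to the output format, so the same source "
    "can be re-flowed for any page size."
)

# Hypothetical widths: a narrow e-reader screen and a wider print page.
for width in (30, 60):
    print(f"--- {width} characters per line ---")
    print("\n".join(reflow(paragraph, width)))
```

The point is only that the source text carries no layout of its own, so the service is free to pick the width per output device.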

Markdown is a markup language – albeit a very subtle one. Shamus hates the LaTeX formatting:

This is \emph{italicized} and \textbf{bold}.

It is complex, because it needs to be. It was developed by computer scientists for other scientists, without regard for syntactic sugar. It evolved organically from TeX. It is what it is – and what it is, is not necessarily user friendly.

Compare it to the very subtle and human-readable Markdown formatting:

This is *italicized* and **bold**.

This is actually how people sometimes emphasize their text in forum posts or chats where formatting is not available. It is organic, sensible and it does not pollute your content with verbose code tags or keywords. Same goes for headings:

This is heading 1
=================

This is heading 2
-----------------

#### This is heading 4

Markdown is simple and easy to learn. The last 2 paragraphs more or less taught you 80% of all the Markdown syntax you will ever need for writing blog posts and short stories. Especially if you are not a programmer and you don’t need to make code snippet sections (which are also dead easy). You can “learn” the ins and outs of Markdown in about 10 minutes or so.
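To give a sense of how little machinery that syntax needs, here is a toy Python sketch that turns the emphasis and `#` heading examples above into HTML. It is emphatically not a real Markdown engine: it assumes one block per line, handles no nesting, escaping or underlined headings, and the names `render_inline` and `render_line` are my own.

```python
import re


def render_inline(text: str) -> str:
    """Convert **bold** and *italic* spans to HTML tags."""
    # Bold goes first so its double asterisks are not eaten by the italic rule.
    text = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)
    text = re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)
    return text


def render_line(line: str) -> str:
    """Render one line: '#'-style headings become <hN>, anything else a <p>."""
    match = re.match(r"(#{1,6})\s+(.*)", line)
    if match:
        level = len(match.group(1))
        return f"<h{level}>{render_inline(match.group(2))}</h{level}>"
    return f"<p>{render_inline(line)}</p>"


print(render_line("This is *italicized* and **bold**."))
print(render_line("#### This is heading 4"))
```

For anything beyond a demo you would reach for an existing Markdown library rather than regular expressions, but the sketch shows why the source stays readable: the markup is a handful of punctuation conventions.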

Word files are a relic of the past – a thing that is best relegated to the musty accounting offices and business dungeons where people slowly but surely lose the will to live, under the yoke of pointy haired MBA carrying space cadets. Creative folks don’t need the behemoth of Microsoft Word 2010. What they need is a clean, lean writing environment – something like WriteRoom or CleanWriterPro. And they need a light, lean and unobtrusive markup language to go with it.

Mark my words: in the upcoming decade we will turn back the wheel of time, and banish the dead-end WYSIWYG paradigm from our lands. The only thing we need to do to get there is improve the support for these lean formats. And by support I don’t mean just syntax highlighting – I mean treating Markdown documents as first class citizens of the document world. We need editors for non-programmers that respect and acknowledge the dual nature of documents written in lightweight markup languages – that they have a raw-code layer for editing, and a pretty display layer for how it will look when published.

In essence we need a handful of two-pane editors such as Mou that will let you type your document on the left, and see how it will look on the right.

Granted, none of what I have written here would help Shamus, because his suffering was not really his fault. Even if he chose to write his novel in LaTeX or Markdown, his publishing service still required Word. So he would still have the same headache – plus the added hassle of converting that format into Word.

Why am I writing all of this? Because I hope I can change a few minds out there. The creative types are sick and tired of Word – they want to wash their hands clean of it. If you run a self publishing service and you require submissions in Word, you are doing it wrong. Most of your technical difficulties, support tickets and your users’ failures are caused by your choice of platform. So why not add support for alternative methods of submission – why not aggressively advertise it to your users and see if they like it better? And most importantly, why not Zoidberg?

Agree? Disagree? Let me know in the comments.


### 12 Responses to Pitfalls of WYSIWYG: Self Publishing Hell

1. i agree
that’s why i included markdown-support for comments in my blog (well, that and because it was as easy as modifying 2 lines and doesn’t introduce large amounts of security-problems as html would)

Nowadays i see markdown everywhere: GitHub, Diaspora, blogs (pretty much everything apart from WordPress and Blogger seems to use it now). There are libs for most languages and it’s easy to use.

But: another really great format i noticed some years ago, YAML, didn’t quite catch on as i would have expected, so i will refrain from stating “it’s inevitable!”. I don’t know if this is because “writing structured data” is a use-case no one besides me needs or if everyone just misuses something like poorly readable JSON already.

2. Peter says:

Markdown is great for blog posts. There’s probably an app for converting it to Word’s format, so you could use it to write a book. However, I’d like to be able to use mathematical formulas in Markdown, similarly to LaTeX.

3. says:

Diaspora has markdown? That’s pretty cool. One of these days I will need to check it out. One can never be in enough social networks, eh?

@ Peter:

I didn’t love MarkdownPad. Mou seems more polished. Personally, I would be perfectly happy with a read-only Markdown “viewer” that you could open the files with. I write in vim anyway so no need for another editor.

Pandoc and MultiMarkdown both do Markdown to ODF conversion. ODF files can be opened in MS Office starting with 2010 so that works. And if you really need a .doc file you can always take a single extra step and use Open Office to convert it for you.

In fact, I wish Open Office had a command line switch that basically said “fuck off, stay in the background and just convert this input file to .doc for me” for that very purpose.
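For what it’s worth, the two-step route can be scripted. The Python sketch below rests on two assumptions: that `pandoc` is on the PATH, and that you have LibreOffice rather than Open Office installed, since it is LibreOffice’s `soffice` binary that offers the `--headless` and `--convert-to` switches. The function names and the file name `book.md` are purely illustrative.

```python
import subprocess
from pathlib import Path


def pandoc_command(source: Path, target: Path) -> list[str]:
    """Build the pandoc call; pandoc infers the output format from the extension."""
    return ["pandoc", str(source), "-o", str(target)]


def soffice_command(source: Path, fmt: str = "doc") -> list[str]:
    """Build a LibreOffice call that converts without opening the GUI."""
    return [
        "soffice", "--headless",
        "--convert-to", fmt,
        "--outdir", str(source.parent),
        str(source),
    ]


def markdown_to_doc(markdown_file: Path) -> Path:
    """Markdown -> .odt via pandoc, then .odt -> .doc via LibreOffice."""
    odt_file = markdown_file.with_suffix(".odt")
    subprocess.run(pandoc_command(markdown_file, odt_file), check=True)
    subprocess.run(soffice_command(odt_file), check=True)
    return odt_file.with_suffix(".doc")
```

Calling `markdown_to_doc(Path("book.md"))` should leave a `book.doc` next to the source, assuming both tools are installed; how well the formatting survives the round trip is a separate question.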

4. Eric says:

Especially concerning e-books, it is important to have a format that the reader can adapt to what it needs. When I am writing I use plaintext and a ‘markdown’ that I change later when it goes into formatting. Normally I use //italic text// and **bold text**. Chapters I mark with ###

5. I’m still not really satisfied with any of the existing solutions. You’re right that using Word documents is a horrible way to go about it. It proves that the publishing industry is still highly computer illiterate. It means there’s no knowledge of source control and clean source transformations (i.e. tidy change diffs).

I’ve wanted to write nice documents in clean markup for work and for my own projects. Even if I ignore the fact that most of my co-workers really only know WYSIWYG and know very little about source control, so I’d often be trapped anyway, every system has something that I don’t like. They’re all 80% solutions.

You talk about some of the problems with LaTeX, which I agree with. The markup is heavy and ugly. The system is huge. In my experience, if I need to do anything non-standard, it takes a huge amount of tweaking and boilerplate.

I’ve also looked into groff. With the mom package, it’s really quite usable and produces great results. Unfortunately, its markup is the worst of all. Not all of it can be done in-line, and the in-line markup it has is huge and whitespace sensitive. And, to a greater degree, doing anything non-standard takes lots and lots of tweaking and hacking.

I use Markdown a lot for my blog these days, as you know. It’s my favorite kind of markup. It’s clean and looks really good in plain text, as you pointed out. As long as I’m not getting fancy, I love writing in it. However, the spec is unfortunately incomplete. Each Markdown engine has its own extensions to help deal with it, but you can’t count on the extensions being there. It’s too inflexible, making some sorts of things impossible.

For example, sometimes I’d really like to use markup inside a code block. There’s no way to do this in Markdown. I’ve also wanted to use a heading inside a list item. Also not possible.

For my blog, I often end up dropping back to HTML to do things Markdown can’t do. The Markdown engine I’m using, Maruku, isn’t extensible, so I can’t make interesting modifications to its output, like automatically adding an attribute to certain HTML tags it outputs.

So I end up falling back to HTML — something that’s a bit too low level. The markup is a little heavy, though generally not too unbearable. CSS adds tons of flexibility, though it’s quite complicated, and some parts of it aren’t fully supported everywhere. It’s imprecise, which, to circle back around, is one of the limitations of Word: the exact look of your document depends on your fonts, screen resolution, etc.

My ideal system would look a lot like Markdown, but with a well-designed, fully extensible engine — kind of like LaTeX.

6. Matt says:

In essence we need a handful of two pane editors

Not so sure on this one – 2 pane is fine, but it can be clunky sometimes. Especially if there’s a compile/typeset step between changes in the code pane and visible changes in the pretty pane.

Would be nice to have a system that pretended, some of the time, to be WYSIWYG, inserting sensibly formatted syntax marks for you at the push of a friendly GUI button and showing you the results in realtime, but made it as easy as possible to jump back and forward between the two so you can hack on the code directly, maybe showed a few non-printing marks in the ‘preview’ mode to avoid the problem of accidentally clicking inside of hidden syntax tags, and had the option to break it out into 2 panes if you so desired.

Seems like all the editors have their little grumbles, and that we’re still waiting for the magical one that answers all problems, but I guess some of those grumbles are born of mutually contradictory demands. That said, there are plenty of editors I have no experience or even knowledge of, so there may already be one that includes all the features I want and made all the same design choices as I would. If not, I’ll just have to write one some day. Maybe.

7. says:

Agreed. As nice as Markdown is, it is very, very limited. Now, I like to pretend that this is by design. It’s “markup light” and if you need something heavier you use HTML or big-boy tools like LaTeX.

But yeah – we are seeing a mass exodus from the Word platform, as evidenced by the proliferation of “creative writing editors” such as WriteRoom. Also, Macs now have pretty good versioning built into the OS. I mentioned it in my review of the CleanWriter Pro editor some time ago.

I think Scrivener is pushing creative writing software in a new direction. They are trying to make a tool specifically for creatives, not business users with their memos and fax coversheets. So Scrivener has snapshots (for versioning and tagging), multi-document merging, high level manipulation features (abstracts chapters/sections as sticky notes on cork board), syntax formatting for screenplays and stage plays, etc. I believe Neal Stephenson (a long time Emacs user) was quoted praising it.

Granted, I never used it, and I’m not sure how good the export features are. Neal Stephenson might have written a book in it, but he is a high profile author with a real publisher that can sit there all day and massage his prose into the right format in Word if need be. So I’m not sure if this is the solution.

So there is definitely movement away from Word and that is good.

I like the idea of the simplicity of Markdown married to the power of LaTeX, but is that even possible? The more features you add to a markup language, the more complex it becomes.

@ Matt:

Markdown to HTML is a single pass compile and it is usually really fast. It can be done at a satisfactory pace even in JavaScript and it works pretty well – see this online two pane editor.

But yeah, ideally I would want to use my editor of choice and some stand alone viewer. Right now when I write Markdown I set up vim to run it through the Markdown.pl script, and open the file in a web browser for me as a preview. It works fine for me, but it’s definitely not user friendly.

8. STop says:

I’ve been playing with reST (reStructuredText) recently. It’s more featureful than Markdown, but still very similar (http://docutils.sourceforge.net/rst.html).

ReText is a basic two-pane editor for *nix (very much like Mou), which supports both Markdown and reST (http://sourceforge.net/p/retext/home/ReText/).

9. bowerbird says:

much has happened in the last 9 months,
— if you live in the mac world anyway —
so let me catch all of you up on the latest.

check out “multimarkdown composer”…
multimarkdown is a version of markdown
that addresses all the original shortcomings.
and “composer” is a dedicated editor that
shows you a formatted preview window,
for the two-pane format described above.
in the mac app store, for less than $10…

if you like the idea of a formatted preview,
but prefer to use your favorite editing app,
you must grab “marked” from the app store.
it’s the “second pane” for _any_ text-editor.
unfortunately, the latest version is lion only.
but in the mac app store, and less than $5…

there are lots of other markdown editors
around today, and always more coming,
but those two will take care of you nicely.

but let’s get back to the top of the article…

i’ve invented a form of light markup that is
even lighter than markdown. i call my baby
“zen markup language” — z.m.l. for short —
and i’ll be releasing apps for it very shortly.
my apps are cross-plat (mac, p.c., and linux),
and i will also have a version up on the web…
(it’ll go live at “zen magic love dot com” soon.)

my workflow is aimed squarely at doing books,
and the needs of books (rather than websites),
so the output formats are .html (for the web),
as well as .pdf and .epub and .mobi (kindle)…

-bowerbird

10. I’m definitely a fan of the plain text markup languages also. They can be extended to long-form publishing. For a book I’m currently working on, I’ve set up to generate LaTeX from reStructuredText source, with Erik Hetzner’s extensions mixed in to produce citations from Zotero. It’s taken a lot of work to set up, but having direct, well-defined control over the text has freed me to play some tricks in the typesetting (I hope successfully) that would not have been practicable otherwise; and with the original text in reStructuredText, it won’t be a huge burden, once the print version is out, to produce ebooks from the same source.

11. Morghan says:

MS Word also has one of the worst systems for making a ToC I’ve ever seen. I think that’s part of the reason so many Kindle books, even the well edited ones, have a poor or nonexistent ToC. Now as to why they lack any formatting to separate chapters with the page break being easily handled is quite beyond the range of what I’m willing to guess at.

12. k00pa says:

I never really liked the WordPress editor. I usually edited my text in Google Docs and then copied it to WordPress. I added formatting and links in WordPress.

Now I have switched to Jekyll and I write all my posts in Markdown. Best thing about this is that I can use Vim to edit my posts.