Keep it Clean by Cally Phillips


Over the past year I’ve regularly formatted ebooks (for myself and other people) and while I’m no ‘expert’ I do think I’ve got something of a grip on the topic and would like to take the opportunity to share my thoughts. This seemed an appropriate place.

In my experience most people have an uneasy relationship (at least in the beginning) with the concept and practice of eBook formatting. It can cause tantrums, nightmares and not a few fits of the vapours. It leaves one battered, baffled and re-considering whether publishing ebooks is even worth it in the long run (jury of course is still out on that one).  But it is a nettle which indie writers who want to publish ebooks must grasp. And like clutch control, once you’ve ‘got it’ it can become a seamless part of your working life.  (I will not refund money on that promise however!)

We can talk endlessly about formatting packages, methods and strategies BUT my first tip for anyone is: find a format that works for you and stick with it. Learn its ins and outs and don’t be tempted by other ways. Nothing is compatible with anything in the world of ebooks and all of them offer a steep learning curve. You have to use what works for you best.

But before you even get to that stage I advise that you KEEP IT CLEAN. By which I mean, of course, one’s original source document. For most of us this is Word. If you don’t use Word you can ignore the rest of this because it’s Word I’m talking about. Not ‘the word’ you understand, just Microsoft Word (and other word processing options are available.) All the crying and wailing and gnashing of teeth CAN be put aside if you learn and understand how to get a REALLY CLEAN Word document. Without it you’ll never know where the problem lies (unless you are competent in HTML. I’m not!)

Incredible things are possible. I grew this in 6 years from a coffee bean!

Know how to use your tools.
So, Wordies, here we go. There’s something about Word you really need to know. For ebook formatting, the OLDER the version of Word you use the better. Why? Because the less HIDDEN CODE there is. When things changed from .doc to .docx (Word 2007 vintage), the x came to stand for xml (I think) or in layman’s terms: HIDDEN CODE. Which we DO NOT WANT. You can use any version but for building ebooks you want to AVOID hidden code at all costs.

The fool’s paradox?
The paradox essentially seems to be that to build an ebook you need to build an html file. But as soon as you use the hidden code bits of Word you are building an html file that may be incompatible with the html file that you NEED in formatting. So you need to avoid the html elements within Word. These are things like headers and headings and all those cool style things that make your document look GOOD (in the first instance, while you are looking at it on a screen.) Some of them are okay, bold and some paragraph indenting for example. Others are not. Anyone who’s ever tried to grapple with an ebook TOC (Table of Contents) will know that things can go horribly wrong, and it’s because of hidden code. IMHO html is easily confused, and when it becomes confused, like all things computer, its artificial level of intelligence spits the dummy. The safety first angle and the CLEAN option is to avoid all the formatting you possibly can (and be aware of what formatting you are using.)
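
For anyone who is comfortable running a small script, here is a minimal sketch of one way to get a rough feel for how much hidden code a document carries once it has been saved as html. It is an illustration only, not part of the method described in this post. The names `MarkupCounter` and `hidden_code_report` are made up for the example, and the assumption is that Word marks its built-in styles with class names beginning with "Mso" (such as "MsoNormal").

```python
from html.parser import HTMLParser


class MarkupCounter(HTMLParser):
    """Counts opening tags, inline style attributes and Word-style class names."""

    def __init__(self):
        super().__init__()
        self.tags = 0
        self.style_attrs = 0
        self.mso_classes = 0

    def handle_starttag(self, tag, attrs):
        # Every opening tag is a piece of markup the reader never sees.
        self.tags += 1
        for name, value in attrs:
            if name == "style":
                self.style_attrs += 1
            # Assumption: Word's own style classes start with "Mso".
            elif name == "class" and value and value.startswith("Mso"):
                self.mso_classes += 1


def hidden_code_report(html_text):
    """Return simple counts describing the markup found in html_text."""
    counter = MarkupCounter()
    counter.feed(html_text)
    counter.close()
    return {
        "tags": counter.tags,
        "style_attrs": counter.style_attrs,
        "mso_classes": counter.mso_classes,
    }


if __name__ == "__main__":
    sample = (
        '<p class="MsoNormal" style="margin:0cm">'
        '<span style="font-size:12pt">Hello</span></p>'
    )
    print(hidden_code_report(sample))
```

The lower the numbers for a given amount of text, the cleaner the file is likely to be. A single word wrapped in several styled tags is a sign that formatting has crept in.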

So how do I know how? 
It is totally counterintuitive, but in order to prepare an ebook for publication you need to STOP all those things you think are really important like indents and headers and page breaks and tabs and the like (unless you are simply sending a Word doc to Kindle, which I WOULDN’T recommend and in which case you are on your own – it’s akin to bungee jumping or sky diving!) and CLEAR ALL FORMATTING. There are lots of ways to do this and I’ll put more tips up on my personal site. (Where you can also find tips for using Sigil, which is my formatting editor of choice. I won’t bore you with the reasons why.)

Sometimes you can be surprised - this Jasmine is one year old and flourishing through winter.

The thing you need to be aware of is that you can happily THINK your Word doc is CLEAN and it ISN’T because of the hidden code. You need to get your head round that to succeed. If you have print publishing experience (anything from making newsletters to writing articles, chapters or complete works) you have to throw away all the things you think you know. Think of them not so much as bad habits as ‘inappropriate’ habits for the ebook world.

The Unified Theory of Everything.
I’m striving for a way to make things simple (aren’t we all) and I won’t say I’m completely there yet but here are my observations for what they are worth:
Goal:
The simplest way to write a document that can be used and re-used both as an ebook AND as a paperback. 
Current method:
1)      I write in Word using a NORMAL template. But I DO NOT put in any headers or page breaks or page numbers or such like. I don’t worry about centering. I DO however use a global paragraph indent – because I like to have a first line indent on paragraphs and you CAN do this without upsetting the html formatting gods. (I think.)
2)      I rewrite/edit using this.
3)      When ready to start the publishing process I save this file as the MASTER (or whatever word you want to use) .doc
4)      Then I am faced with a couple of choices:  

If I want to create an ebook, I save the MASTER as filtered html (the ‘Web Page, Filtered’ option in Word’s Save As dialog) and then import it into Sigil (the rest is described in the Sigil tutorial). Basically you format within this programme and it saves the result as an epub, which I can then convert to a mobi file for Kindle using Calibre. I’ve got it down to an art form, and starting with a CLEAN master file I can do this in less than 10 mins (once the formatting has all been put back in. That takes as long as it takes depending on the project.) Without a clean file it can take DAYS. Which leads me to conclude that starting with a CLEAN file is beyond price. Building a TOC, putting in cover/metadata and converting to all formats CAN take less than 10 minutes, believe me.
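
The last step, turning the epub into a mobi, can also be scripted. Calibre ships with a command line tool called `ebook-convert` that takes an input file and an output file. What follows is a minimal sketch of calling it from Python, assuming Calibre is installed and `ebook-convert` is on the PATH. The helper names and the file name `my_book.epub` are hypothetical, chosen only for illustration, and this is not necessarily how the conversion is done in the workflow above.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(epub_path, output_format="mobi"):
    """Build the ebook-convert command for an epub without running it."""
    source = Path(epub_path)
    # Same file name, new extension, e.g. my_book.epub -> my_book.mobi
    target = source.with_suffix("." + output_format)
    return ["ebook-convert", str(source), str(target)]


def convert(epub_path, output_format="mobi"):
    """Run the conversion and return the path of the new file."""
    # Assumption: Calibre is installed and its tools are on the PATH.
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("Calibre's ebook-convert was not found on PATH")
    command = build_convert_command(epub_path, output_format)
    subprocess.run(command, check=True)
    return command[2]


if __name__ == "__main__":
    # Hypothetical file name; prints the command that would be run.
    print(build_convert_command("my_book.epub"))
```

Keeping the command building separate from the running makes it easy to check what will happen before any file is touched.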

If on the other hand I want to produce for POD or short run print, I download a template from the ‘distribution partner’ or printer.  Then I format to that template.

The benefit of this system is that you have ONE master document which you can then use for print or ebook. But you have to understand that your MASTER is just that. It’s the bare bones, suitable for everything, NOT a finished document ready for print or publishing. That’s okay, because the amount of copy editing and proof reading you need to do, whether you are creating ebook or print copy, means that the time you spend ‘adding in’ the formatting is part of that editing/proof reading process. Getting consistency across headings and styles is in fact a lot easier when you do it as part of the publishing process, not as part of the initial ‘writing’ process. I didn’t always believe this but I’ve been converted simply by experience of self/indie publishing.

So folks, the moral of the story is: learn to create CLEAN Word files in the first place. Accept that putting right the hidden demons of html code, if you’ve got them, WILL take time and effort. Do you really need to see BOLD HEADERS in your draft work so much that you are prepared to give yourself a headache right through the process of publishing an ebook? I don’t. Once you can adapt to writing a simple, code free Word doc, life becomes so much easier.

Recognise your limitations! This twig will NEVER be a wonderful Magnolia tree!

But what about the stuff you’ve previously written using the ‘inappropriate habits’? There is what (I think) the games industry calls a ‘cheat’. To make sure that your Word doc really is CLEAN you can convert it to a .txt file or a .rtf file and then back into a Word .doc. You’ll see how all the formatting has disappeared. (There are other ways in other versions of Word to STRIP the formatting globally.) You can also open it in Open Office (you could write it in Open Office to start with and that might help – I haven’t tried that).
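
The same ‘strip everything’ idea can be sketched in a few lines of Python for text that already exists as html. This is a rough illustration under stated assumptions, not a replacement for the round trip through .txt. It assumes the input is a fragment of body text (no script or style blocks), and the names `PlainTextExtractor` and `strip_formatting` are invented for the example.

```python
from html.parser import HTMLParser

# Tags treated as paragraph boundaries; everything else is simply dropped.
BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6", "li", "br"}


class PlainTextExtractor(HTMLParser):
    """Collects the visible text of each block, discarding all markup."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._current = []

    def _flush(self):
        # Join the collected pieces and collapse runs of whitespace.
        text = " ".join("".join(self._current).split())
        if text:
            self.paragraphs.append(text)
        self._current = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        self._current.append(data)


def strip_formatting(html_text):
    """Return the text of html_text as plain paragraphs separated by blank lines."""
    parser = PlainTextExtractor()
    parser.feed(html_text)
    parser.close()
    parser._flush()  # pick up any text after the last block tag
    return "\n\n".join(parser.paragraphs)


if __name__ == "__main__":
    messy = '<p class="MsoNormal"><b>Chapter</b> One</p><p>It was <i>late</i>.</p>'
    print(strip_formatting(messy))
```

Bold, italics and styles all vanish, which is exactly the point: what is left is the bare bones text, ready to have the formatting put back in deliberately.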

The good news is that apart from Word all the tools you need to build a cracking ebook are FREE (and now all the tools to build a POD book are too!) Sigil is free. Calibre is free. Open Office is free. Createspace offers free templates which you can adapt to suit your own needs for POD. But you do, repeat, do need to stop hanging on to the ‘it looks prettier if I put in all the styles etc while I’m writing’ mentality. It simply costs you time later on. Remember, like driving a car, it’s easier with daily practice, which is one reason I’ve converted so many texts to ebook style this year. I like to learn the technology. I find it saves time in the end. You need to know what it can do and what you can do and what can be done. And what’s appropriate. But most of all, you need to keep it clean.

Good luck and I wish you happy and pain free publishing!



Comments

Kathleen Jones said…
Thanks Cally - I, too, have learned this to my cost. And, because I write biographies it becomes very tricky. There are quotes to handle, references (end notes render Neil suicidal) and bits of poetry here and there (more hair-tearing). Having a 'clean' word file is an absolute necessity.
I used to use WordPerfect, which is a dream to write with, but the conversion to Word results in all sorts of glitches. Sadly I now have to use the dreaded Microsoft.
Some writers are reporting that Scrivener and YWriter are more efficient at converting to e-pub than Word, but I don't like either of them to work in. Others may though.
Thanks for sharing the nitty gritty baseline! You will save a lot of people a lot of angst!
Chris Longmuir said…
Very useful post, Cally, and I totally agree that the word document has to be clean!
Lee said…
I wish I'd have had this to read about five years ago!
Lynne Garner said…
Useful post, thanks! I've found that if I use TextEdit (I'm a Mac user so not sure if you can use it on a PC), then go into Format and choose Make Plain Text, my previous issues are resolved.
Dan Holloway said…
very useful. A rather daft-sounding thing i recently stumbled upon and find very useful is pasting your Word Doc into the "compose" screen of blogger and then flipping to the "view html" screen - that instantly uncovers the ridiculous amount of hidden code you have
Jan Needle said…
is there anything you don't know, phillips? i think i'm beginning to hate you. xx
CallyPhillips said…
Jan - I didn't know that. Ha ha (had me suspicions!) Tip of the iceberg me old mucker.

Lee - you and me both!

Dan - I love the 'share'. I've learned a lot of tips from watching and sharing with you! Thanks buddy.

Lynne - Mac, I know nothing about except I can't afford their products - sadly. Too deeply embedded in PC culture ever to shift.
Now in a joyous irony I'm going to have to prove I'm not a robot to comment on my own post - what is THAT all about?

Lydia Bennet said…
Great advice Cally! Though the Meatgrinder was gruelling, the Smashwords downloadable guide on how to prepare your doc for it does take you through the 'cleaning' process very carefully too. You are so right, if only every indie did this properly from the get-go there'd be fewer duff ebooks with formatting issues still being sold. I learned a great deal from formatting for Kindle and Smashwords. I didn't know about 'em dashes' and 'en dashes', for example!
Dennis Hamley said…
Am I doing something wrong or is Kamal doing a lot more than he tells me when he converts my books to Mobi? I just take my Word version (saved as 1997-2003 because having the latest version is a real poisoned chalice) and go through it with the paragraph sign, making sure the spacing is OK and new chapters start at the top of the next page. And, though it's real 'line of least resistance' stuff, it all seems OK: the books look all right to me and even a very picky reviewer on Amazon.com hasn't found much to complain about.

This robot-proving thing gets on my nerves. Half the time I can't even read the word, which always consists of a smudgy photograph with two vague symbols in a square which might be a house number covered in mud. I waste hours writing comments which just don't appear. Luckily I don't have to prove my manhood today.

In Spirit of the Place, by the way Cally, you'll see some problems with indenting, at least six typos which are my own fault and the CE in PLACE on the title page has slipped on to the next line. However, when you hold the screen horizontally on our new Kindle Fire, it all looks good again. The Fire, by the way, is GREAT. My one year-old Kindle now seems clunky, geriatric and tired. However, I have to stick to it because Kay has free run of the new arrival.

When we get back from NZ, I'm going to put all the slips to rights, unpublish the old S of the P and replace it with the new.
Hmm, I still have Word 2000 (mainly because I'm too stingy to upgrade software that still works), and when I convert my manuscripts for Kindle they do seem to slip through quite well!

But I've heard Word 2001 is the best version for a clean conversion... anyone know why?
CallyPhillips said…
Dennis - Are you still here? Get thee to NZ with your mighty Kindle Fire and stop worrying. It's interesting to hear how great the Fire is because weren't we all being sold how IMPERATIVE it was to have the e-ink screen? And now the colour seems to be better and the reading just as good as on e-ink. Sigh. Always something better round the bend.

Katherine - I think it's just because of the lack of xml code. 2000 and 2001 (up to 2003) didn't embed stuff like 2007 onwards did. That dreaded change from doc to docx was small but significant. It's proof that NEW isn't always IMPROVED.