
Thursday, 1 May 2014

How to import your marked-up MS Word file into your ebook using Sigil

Sigil is set up to import an MS Word file in its entirety, including all the styles and formatting. This might sound like a good idea until you actually do it and look at the file in code view. You will see literally hundreds of lines of code: the entire MS Word stylesheet, in fact, re-coded in html and CSS. For an epub this might well render properly, but there is so much that could go wrong that something probably will. And if even one tiny bit of it does go wrong, mending it would involve unpicking the entire MS Word stylesheet: a task well beyond my limited skills, and one which would also depend on how logically the original MS Word document styles were constructed in the first place. As for Kindle, it supports so few html tags and even fewer CSS styles that the chances of this working on a Kindle are, in my opinion, negligible.

So my preferred strategy is to keep the e-book as simple as possible, and that way eliminate any possible problems from the beginning and ensure I know exactly what is happening. This means importing your document as plain text (marked up to indicate where your formats go) and then re-applying the formats using CSS and html which you know will work on a Kindle and an e-pub. (See my post about which CSS works with Kindle and which does not.) Persuading Sigil NOT to import the MS Word stylesheet is not as straightforward as it might at first sound. Even then a little bit of tidying up of the file is necessary, but this can be done in a matter of minutes using find and replace. This post outlines how I go about it.

Once you have marked up your book text in MS Word, remove any comments you might have in it. And then, when it is ready, save it as unformatted text in a .txt file. To do this, click the ‘Office’ button (top left) and select ‘Save As…/Other Formats’:

In the ‘Save As…’ dialog, select ‘Plain Text (*.txt)’ from the ‘Save as Type:’ pop-up menu:

You will get a dialog asking for details of how the file is to be converted to plain text. It should default to these settings: ‘Windows(default)’ and ‘CR/LF’ at the end of lines. Use these defaults and click ‘OK’:
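If you want to check what Word actually wrote, the exported file can be read back outside Word. The following is only a sketch: it assumes that ‘Windows (default)’ means the Western European code page cp1252 (this depends on your Windows settings), and the function name read_word_text is made up for illustration.

```python
def read_word_text(raw: bytes, encoding: str = "cp1252") -> list[str]:
    """Decode a plain-text export and return one string per non-empty paragraph."""
    # Assumption: 'Windows (default)' is cp1252 on this machine.
    text = raw.decode(encoding)
    # Word ends each paragraph with CR/LF; normalise to LF before splitting.
    text = text.replace("\r\n", "\n")
    # Drop empty lines and trim stray spaces around each paragraph.
    return [line.strip() for line in text.split("\n") if line.strip()]


# Two paragraphs with a blank line between them; \x93 and \x94 are
# the cp1252 bytes for curly double quotes.
sample = b"Chapter One\r\n\r\n\x93Hello,\x94 she said.\r\n"
paragraphs = read_word_text(sample)
print(paragraphs)
```

If the curly quotes or dashes come out as odd characters, the file was probably saved with a different encoding from the one assumed here.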

You could alternatively create a blank .txt file in Word and then use ‘Paste Special …’ to paste your book text into it. Select the whole document and copy the contents to the clipboard. Then click in your new, blank .txt document. Select ‘Paste Special …’ from the ‘Clipboard’ pane in the ‘home’ group of the ‘Ribbon’:

Then select ‘unformatted text’ from the dialog and click ‘OK’:

You will want to save the file in the folder you are using for your ebook project.

It is important to distance yourself from MS Word at this point, so CLOSE the .txt file you have created. When you do, Word may throw up some dialogs asking you if you really want to save it as a .txt file and warning you that doing so will mean the loss of some formatting information. Select the options to lose AS MUCH formatting information as possible! Now RE-OPEN it using something like Notepad (Windows). Perhaps the easiest way to find a suitable program is to right-click (Windows) or Control-click (Mac) on the filename and then choose a program to open the file with from the pop-up menu:

In Notepad, the file now looks like this:

Once the file is open, select the entire text and copy it to the clipboard. Now create a blank e-pub file or open one you made earlier using Sigil. (See my post on how to get started by making a blank e-pub with Sigil.) Your blank epub should contain a single empty chapter. Open this in the main Sigil window in code view. Find and select the blank paragraph which Sigil has placed in the <body> of the page. NB ‘&#160;’ is an alternative html code for a non-breaking space:

DELETE this blank paragraph and ensure the blinking insertion point is positioned on a blank line between the opening <body> tag and the closing </body> tag:

SWITCH back to ‘Book View’ and paste the text into the chapter. (If you don’t switch to book view before pasting you will get the whole thing as one v-e-r-y l-o-n-g paragraph.) When you are finished, switch back to code view again and you should see each paragraph of the original document enclosed by opening <div> and closing </div> tags:

Now to the most important bit. You will need to use find and replace to change all the closing </div> tags to </p> tags and then change all the opening <div> tags to <p> tags. What you want to achieve is this:
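For anyone who prefers to see the two replacements written out, here is a minimal sketch of the same swap done on a string outside Sigil. It assumes the pasted text uses bare <div> tags with no attributes, as described above; the function name divs_to_paragraphs is hypothetical.

```python
def divs_to_paragraphs(xhtml: str) -> str:
    """Swap bare <div>...</div> wrappers for <p>...</p>."""
    # Closing tags first, then opening tags, in the same order as
    # the two Find and Replace passes described in the post.
    return xhtml.replace("</div>", "</p>").replace("<div>", "<p>")


pasted = "<div>First paragraph.</div>\n<div>Second paragraph.</div>"
converted = divs_to_paragraphs(pasted)
print(converted)
```

A plain text swap like this only matches <div> exactly. If any opening tag carried an attribute it would be left alone while its closing tag was changed, so check the result before going further.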

Next, SAVE the file. Sigil should ‘clean up’ the code placing the <p> tags in the same line as the text they enclose:

If the file doesn’t automatically clean itself up, turn clean up ON by selecting ‘Preferences …’ from the Edit menu to bring up the Preferences dialog:

And in the preferences dialog, click ‘Clean Source’ and make sure the ‘Open’ and ‘Save’ check boxes are ticked:

You will need to do a bit of cleaning up of the file using Find and Replace to make sure there are no spaces before the opening <p> tags or after the closing </p> tags. There might also be some blank <p> tags and maybe some non-breaking spaces (&#160; or &nbsp;) left in the file which need to be deleted.
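The tidying described here comes down to a handful of search patterns. The sketch below applies them to a string with Python's re module; it is an illustration of the idea, not what Sigil does internally, and the function name tidy is made up. Note that it deletes every non-breaking space, so only use this approach if none of them were put there on purpose.

```python
import re


def tidy(xhtml: str) -> str:
    """Remove non-breaking spaces, blank paragraphs and stray spaces around <p> tags."""
    # Delete non-breaking spaces in either spelling.
    xhtml = xhtml.replace("&#160;", "").replace("&nbsp;", "")
    # Delete paragraphs that are empty or hold only whitespace,
    # together with the line break that follows them.
    xhtml = re.sub(r"<p>\s*</p>\n?", "", xhtml)
    # Delete spaces and tabs before an opening <p> tag ...
    xhtml = re.sub(r"[ \t]+<p>", "<p>", xhtml)
    # ... and after a closing </p> tag.
    xhtml = re.sub(r"</p>[ \t]+", "</p>", xhtml)
    return xhtml


messy = "  <p>&#160;</p>\n  <p>Text</p>  \n"
clean = tidy(messy)
print(repr(clean))
```

The order matters: the non-breaking spaces are removed first so that a paragraph containing nothing else is then recognised as blank.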

Find and Replace in Sigil is helpfully in a panel below the main window and is self-explanatory:

In the screenshot above, ‘&#160;’ is being replaced with nothing (i.e. it is being deleted).

As I said at the beginning of this post, unless you save the original MS Word file as unformatted text, when it is imported into the ebook, Sigil will copy the entire MS Word stylesheet along with the text and convert the styles into embedded CSS which it will put into the <head> section of the document. This will usually amount to several HUNDRED lines of code. You want to be sure that your ebook displays exactly the way you want it to, and so importing the MS Word stylesheet is NOT a good idea. It is WAY too complicated and finding and fixing any problems will be well-nigh impossible. Keep your ebook coding simple and basic and there will be less to go wrong!
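One quick way to confirm that no Word stylesheet has crept in is to count the lines of CSS sitting inside <style> tags in the chapter file. The sketch below does this with a simple regular expression, which is good enough for a rough check but is not a real html parser. The function name embedded_style_lines and the sample markup are illustrative only.

```python
import re


def embedded_style_lines(xhtml: str) -> int:
    """Count the non-blank lines found inside <style>...</style> blocks."""
    blocks = re.findall(
        r"<style\b[^>]*>(.*?)</style>", xhtml, flags=re.DOTALL | re.IGNORECASE
    )
    # Ignore blank lines so that only actual CSS lines are counted.
    return sum(
        1 for block in blocks for line in block.splitlines() if line.strip()
    )


# Illustrative sample: a page carrying two lines of embedded CSS.
with_styles = (
    '<html><head><style type="text/css">\n'
    "p.MsoNormal { margin: 0; }\n"
    "span.a { color: red; }\n"
    "</style></head><body><p>x</p></body></html>"
)
without_styles = "<html><head></head><body><p>x</p></body></html>"

print(embedded_style_lines(with_styles))
print(embedded_style_lines(without_styles))
```

A file imported as plain text should give a count of zero, or close to it. A count in the hundreds means the Word styles came along for the ride.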

Next Steps: You need to create and link a CSS stylesheet in your epub. Then you are ready to replace the markup with the CSS styling you want and to split the one long chapter you have just created up into the individual chapters.

Index to ‘how to …’ posts:

How to ‘unpack’ an epub file to edit the contents and see what’s inside.
How to understand what is inside an epub
How to link the html table of Contents in a Kindle e-book
How to restructure the html table of contents for a Kindle
How to delete the html cover for a Kindle ebook
How to link the cover IMAGE in a Kindle e-book
How to clean up your MS Word file before you get started
How to markup an MS Word file to identify the formats before importing it into an epub
How to create a new blank e-pub using Sigil
How to import your marked-up MS Word file into your ebook using Sigil
How to create and link a CSS stylesheet in an e-book using Sigil
How to replace the markup with CSS styles in your ebook using Sigil
How to style an e-book so it works with the limited CSS styling available to Kindle e-readers
How to understand the syntax of CSS
How to style Small Caps in an e-book
How to split your ebook up into chapters using Sigil
How to sequence your e-book
How to phrase the copyright declarations etc. in an e-book
How to generate the logical table of contents using Sigil
How to understand toc.ncx in an e-book
How to generate the html table of contents in an e-pub
How to style the html table of contents using CSS
How to create an html cover for your epub using Sigil
How to present references and notes in a book
How to use Mark Up to link notes in your e-book
How to present a bibliography in a book
How to use markup to link entries in a bibliography with the notes section
How to index an e-book
How to use the tools in MS Word to create an index
