
Thursday, 20 March 2014

How to restructure the html table of contents for Kindle

The html table of contents in your ebook needs to be formatted in a rather odd way if it is to be accepted by kindlegen. In essence, in a normal page of html the <p> tags will enclose the <a> tags like this:

<p><a> … </a></p>

In the html table of contents for the Kindle, these tags need to be the other way around so that the <a> tags enclose the <p> tags, like this:

<a><p> … </p></a>

But let’s look at this in a bit more detail:

An entry in the html table of contents in an epub might look a bit like this:

<p class="contEntry"><a href="Chapter1.xhtml">Chapter One</a></p>

notes: class="contEntry" is the way the CSS styling is applied to the paragraph. (See my post on understanding the syntax of CSS for more information.) Chapter1.xhtml is the href to the file containing the first chapter and is the location the e-reader jumps to when the link is clicked. As the files for the html table of contents AND all the chapters are located within the same folder, there is no need to include the path to the folder in the href, just the filename for the chapter will suffice. Chapter One is the text which will be displayed for the hyperlink in the html table of contents and would normally be the same as the chapter heading in the chapter it links to, but it does not have to be.

(* note 27 August 2014: I have changed the name of the style from "indexEntry" to "contEntry" to make this post consistent with later ones.)

In order for this entry to work properly in the Kindle, the tags need to be swapped around, like this:

<a href="Chapter1.xhtml"><p class="contEntry">Chapter One</p></a>

If you do not swap the <p> and <a> tags around, kindlegen will throw out a bunch of warnings to the effect that it cannot resolve the hyperlinks when it tries to convert the epub into a mobi file. BUT DO NOT try to do this using Sigil! It won’t like it and will try to swap the tags back around, buggering up your ebook in the process. What you are doing is tinkering with an already valid epub file, and you really need to use an html editor such as Komodo Edit to do that. First, unpack the epub, then find the html table of contents in the Text folder inside OEBPS and open it using Komodo Edit. You can save yourself a LOT of time if you make careful use of Find and Replace to do this:

FIRST delete all instances of <p class="contEntry"> by replacing them with nothing
THEN replace all instances of .xhtml"> with .xhtml"><p class="contEntry">
FINALLY replace all instances of </a></p> with </p></a>.
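
If you would rather script the change than click through the dialogs, the three steps above can be sketched in Python. This is only an illustration: the function name swap_toc_tags and the sample entry are made up for this example, and it uses plain string replacement in the same order as the Find and Replace steps.

```python
def swap_toc_tags(html: str, style: str = "contEntry") -> str:
    """Apply the three Find and Replace steps, in order, to the TOC markup."""
    open_p = f'<p class="{style}">'
    # FIRST: delete every opening <p class="contEntry"> tag.
    html = html.replace(open_p, "")
    # THEN: put the <p> tag back straight after each chapter link's opening <a> tag.
    html = html.replace('.xhtml">', '.xhtml">' + open_p)
    # FINALLY: swap the closing tags so </p> comes before </a>.
    html = html.replace("</a></p>", "</p></a>")
    return html


# Hypothetical sample entry, copied from the example earlier in this post.
entry = '<p class="contEntry"><a href="Chapter1.xhtml">Chapter One</a></p>'
print(swap_toc_tags(entry))
```

Like ‘Replace All’, this changes every match in one go, so the warnings below about backups and unexpected matches apply just as much to a script.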

WARNING! It is sensible to save a copy before starting. It might even be a good idea to make incremental backups as you go along, using ‘Save as …’ and saving the epub with different filenames after EACH step such as ebookA.epub, ebookB.epub, ebookC.epub, … etc. Find and Replace can make a complete mess of your file if you either make a mistake or don’t think it through properly before getting started. But it is worth considering, as it can save you a LOT of time. Using ‘Find Next’ and then ‘Replace’ to make the changes on a case-by-case basis instead of ‘Replace All’ to do them all in one go might be a safer, if slower, option.
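
The lettered backups can also be made with a few lines of Python instead of ‘Save as …’. This is a rough sketch, assuming the epub sits in a folder you can write to; the function name save_step_copy is invented for this example.

```python
import shutil
import string
from pathlib import Path


def save_step_copy(epub_path: str, step: int) -> Path:
    """Copy the epub to a lettered sibling: step 0 gives ...A.epub, step 1 gives ...B.epub."""
    source = Path(epub_path)
    letter = string.ascii_uppercase[step]  # 0 -> "A", 1 -> "B", and so on up to "Z"
    backup = source.with_name(f"{source.stem}{letter}{source.suffix}")
    shutil.copy2(source, backup)  # copy2 copies the contents and keeps the timestamps
    return backup
```

Calling save_step_copy("ebook.epub", 0) after the first step would leave a copy called ebookA.epub next to the original, save_step_copy("ebook.epub", 1) would leave ebookB.epub, and so on.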

ALSO: be alive to the possibility that Find and Replace might find the extension .xhtml in parts of the page header, so keep an eye on the results of your Find and Replace. Your editor should show which replacements have been made and report how many. If you have just TEN chapter links to modify and you see that ELEVEN replacements have been made, then you’ll have to find the mistake and correct it. Either use ‘Undo’ (ctrl-Z) or else copy the code from the header of another, unaffected, chapter in the e-book.
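
The same sanity check can be done in a script by counting the matches before replacing them. The sketch below is an assumption-laden example: the helper name replace_and_count, the two-line sample and the expected count of two are all invented for illustration.

```python
def replace_and_count(text: str, old: str, new: str):
    """Replace every occurrence of `old` and report how many were replaced."""
    count = text.count(old)  # counts the same non-overlapping matches that replace() changes
    return text.replace(old, new), count


# Hypothetical table of contents with two chapter links, after the first step.
toc = (
    '<a href="Chapter1.xhtml">Chapter One</a></p>\n'
    '<a href="Chapter2.xhtml">Chapter Two</a></p>\n'
)
expected_links = 2

toc, made = replace_and_count(toc, '.xhtml">', '.xhtml"><p class="contEntry">')
if made != expected_links:
    print(f"Expected {expected_links} replacements but made {made}: check the header")
else:
    print(f"{made} replacements, as expected")
```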

IF your html table of contents also links to chapter subheadings, it will contain entries a bit like this (if you have no subheadings, skip ahead to the last two paragraphs of this post):

<p class="contSubEntry"><a href="Chapter1.xhtml#subheading1">Subheading</a></p>

notes: ‘Chapter1.xhtml’ is the name of the html file containing the chapter. ‘subheading1’ is an arbitrary id given to the subheading in Chapter1 by adding id="subheading1" to the tag containing the subheading. Each id used to reference a subheading must be unique within its file and must be a single text string with no spaces. ‘#’ is a required part of the html syntax. The text ‘Subheading’ is just what will be displayed in the table of contents and, whilst it will most likely be the same as the text of the subheading in Chapter1, it does not have to be. class="contSubEntry" applies the styling you defined in your stylesheet using CSS.

(*note 27 August 2014: I have changed the name of the style from "indexSubheading" to "contSubEntry" to make this post consistent with later ones.)

What you need to do is to change the tags around so they look like this:

<a href="Chapter1.xhtml#subheading1"><p class="contSubEntry">Subheading</p></a>

Unfortunately, it will be a bit more complicated to achieve this using Find and Replace. Perhaps the safest way is to delete all instances of <p class="contSubEntry"> and then manually paste this back after the <a> tags.

A possible strategy might be to end the labels with something like ‘¦’ or some other unique text. i.e. ‘subheading1¦’, ‘subheading2¦’, etc. Then you could replace ‘¦">’ with ‘¦"><p class="contSubEntry">’. I haven’t tried it, but it should work.
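
Another route, if you are happy to run a short script, is to use a regular expression that captures the parts of each entry and puts them back in the new order. The Python sketch below is an illustration rather than a recipe tested on every ebook: it assumes each entry sits on one line in exactly the pattern shown above, with class as the only attribute on the <p> tag and href as the only attribute on the <a> tag. The names ENTRY and swap_entry_tags are invented here.

```python
import re

# Matches <p class="..."><a href="...">text</a></p> and captures the three parts.
ENTRY = re.compile(
    r'<p class="(?P<style>[^"]+)">'
    r'<a href="(?P<href>[^"]+)">'
    r'(?P<text>.*?)'
    r'</a></p>'
)


def swap_entry_tags(html: str) -> str:
    """Rewrite every matching entry so the <a> tag encloses the <p> tag."""
    return ENTRY.sub(
        r'<a href="\g<href>"><p class="\g<style>">\g<text></p></a>',
        html,
    )


# Hypothetical sample: one chapter entry and one subheading entry.
toc = (
    '<p class="contEntry"><a href="Chapter1.xhtml">Chapter One</a></p>\n'
    '<p class="contSubEntry"><a href="Chapter1.xhtml#subheading1">Subheading</a></p>\n'
)
print(swap_entry_tags(toc))
```

Because the pattern captures the whole href, it handles chapter entries and subheading entries in one pass, and anything that does not match the pattern is left untouched.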

When you are finished, the html table of contents will pass kindlegen and the links will work properly. Note that the <a> tags can be within the <p> tags elsewhere in your Kindle ebook; the problem seems to be confined to the html table of contents.

Next Step: Once you have restructured the html table of contents, you will also need to link the html table of contents for the Kindle, otherwise the ‘table of contents’ button on the Kindle e-reader won’t work.

Index to ‘how to …’ posts:

How to ‘unpack’ an epub file to edit the contents and see what’s inside.
How to understand what is inside an epub
How to link the html table of Contents in a Kindle e-book
How to restructure the html table of contents for a Kindle
How to delete the html cover for a Kindle ebook
How to link the cover IMAGE in a Kindle e-book
How to clean up your MS Word file before you get started
How to markup an MS Word file to identify the formats before importing it into an epub
How to create a new blank e-pub using Sigil
How to import your marked-up MS Word file into your ebook using Sigil
How to create and link a CSS stylesheet in an e-book using Sigil
How to replace the markup with CSS styles in your ebook using Sigil
How to style an e-book so it works with the limited CSS styling available to Kindle e-readers
How to understand the syntax of CSS
How to style Small Caps in an e-book
How to split your ebook up into chapters using Sigil
How to sequence your e-book
How to phrase the copyright declarations etc. in an e-book
How to generate the logical table of contents using Sigil
How to understand toc.ncx in an e-book
How to generate the html table of contents in an e-pub
How to style the html table of contents using CSS
How to create an html cover for your epub using Sigil
How to present references and notes in a book
How to use Mark Up to link notes in your e-book
How to present a bibliography in a book
How to use markup to link entries in a bibliography with the notes section
How to index an e-book
How to use the tools in MS Word to create an index
