Saturday, 1 November 2014

How to understand the manifest in content.opf

Content.opf is the most important part of your e-book. This post covers what you will find between the opening <manifest> and closing </manifest> tags in content.opf. Earlier posts cover understanding content.opf and understanding and editing the <metadata> in content.opf. I will go on to explain the remainder of this file: the <spine> and <guide> sections.

Each line of content.opf between the <manifest> tags refers to one of the parts of the e-book. Each distinct item within the e-book must be listed in the manifest (with the important exception of content.opf itself, which should NOT be included). Sigil will create a valid entry in the manifest for each item as you add it to the e-book. However, you WILL need to make some manual changes to the manifest when linking your html table of contents and cover image for kindle, so it is important that you understand how it is constructed.

Syntax of an item tag:

The manifest consists of a series of <item /> tags, one for each … well … item in the e-book. Note that these tags have no separate closing tag; instead each tag is closed by the ‘ />’ at the end. (The examples here have a space before the slash, but XML itself accepts the tag with or without it.) Here is an example:

<item href="Text/Section0001.xhtml" id="Section0001.xhtml" media-type="application/xhtml+xml" />

This is the entry which Sigil has created automatically for the blank first chapter of the e-book. The first thing in the tag is href="Text/Section0001.xhtml". This is the URL of the item, and tells the e-book reader where to find the file. In this case the filename is Section0001.xhtml and the path from content.opf to the file is ‘Text/’. (The ‘Text’ folder is in the same place as content.opf and the path tells the e-reader to look in that folder to find the file.) For reference, the OEBPS folder holds content.opf and toc.ncx alongside the ‘Text’, ‘Images’, ‘Styles’ and ‘Fonts’ folders.

Coming back to the <item /> tag we are discussing:

<item href="Text/Section0001.xhtml" id="Section0001.xhtml" media-type="application/xhtml+xml" />

The second thing in the tag is an id or label for that file: id="Section0001.xhtml". Notice that Sigil has recycled the filename (Section0001.xhtml) and used it again as the label, but logically the filename and label are two different things. The filename must exactly match the name of the file, whereas the label can be any text you wish. Sigil just happens to have used the filename for the label in this case. Whatever is used for the label, it must be unique (i.e. all labels should be different) and should be a single text string with NO SPACES. The names are case SeNSiTiVe.

I suppose it would in principle be possible to make the filename and label different from one another by editing the entry in the manifest, but I cannot see what would be achieved by this and have not tried to do it. And, as the label is used to identify the file in the spine, you would need to edit that as well, which is getting messy and leaves you open to making errors, which you do not want.

The final thing in the tag is a ‘media-type’ attribute, which in this case has the value: ‘application/xhtml+xml’. This tells the e-reader that the file is an xhtml file.
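To see the three attributes side by side, here is a minimal sketch in Python 3 (my choice of language for illustration; nothing in Sigil requires it). It parses the example tag above with the standard library and pulls out href, id and media-type. The function name describe_item is made up for this example.

```python
import xml.etree.ElementTree as ET

# The manifest entry quoted in the post, parsed on its own as a tiny XML document.
ITEM = (
    '<item href="Text/Section0001.xhtml" id="Section0001.xhtml" '
    'media-type="application/xhtml+xml" />'
)


def describe_item(item_xml):
    """Return the three attributes of a single <item /> tag as a dict.

    describe_item is a hypothetical helper written for this illustration.
    """
    element = ET.fromstring(item_xml)
    return {
        "href": element.get("href"),              # where the file lives, relative to content.opf
        "id": element.get("id"),                  # the label used to refer to the file
        "media-type": element.get("media-type"),  # what kind of file it is
    }


if __name__ == "__main__":
    for name, value in describe_item(ITEM).items():
        print(f"{name}: {value}")
```

A misspelt attribute name simply comes back as None from this helper, which makes a typo in the manifest easy to spot.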

Media Types:

Here is a table detailing the principal media-types you will need to know about:

kind of file                        media-type to use
.xhtml file                         application/xhtml+xml
.jpg file                           image/jpeg
.png file                           image/png
ncx table of contents (toc.ncx)     application/x-dtbncx+xml
.css stylesheet                     text/css
true type font (.ttf)               application/x-font-ttf

A complete list of all media-type attributes listed according to the file extension can be found here: http://reference.sitepoint.com/html/mime-types-full. NB I couldn’t find .ttf in that list! Sigil will know about all the valid media-types and so if you import a file into Sigil it will look up and enter the correct media-type for you when it creates the entry in the manifest.
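If you ever need to check an entry by hand, the table above can be written as a simple lookup. This is only a sketch in Python 3: the dictionary holds just the six rows from the table (including the .ttf value that Sigil generated), and the function name media_type_for is invented for the example.

```python
import posixpath

# Extension -> media-type, copied from the table in this post.
MEDIA_TYPES = {
    ".xhtml": "application/xhtml+xml",
    ".jpg": "image/jpeg",
    ".png": "image/png",
    ".ncx": "application/x-dtbncx+xml",
    ".css": "text/css",
    ".ttf": "application/x-font-ttf",
}


def media_type_for(href):
    """Look up the media-type for a manifest href.

    Returns None when the extension is not one of the rows in the table.
    media_type_for is a hypothetical helper, not part of Sigil.
    """
    extension = posixpath.splitext(href)[1].lower()
    return MEDIA_TYPES.get(extension)


if __name__ == "__main__":
    for href in ("Text/Section0001.xhtml", "Images/Cover.jpg", "toc.ncx"):
        print(href, "->", media_type_for(href))
```

Anything not in the table comes back as None, which is a prompt to look the type up rather than guess it.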

Linking Images:

An Image would be linked like this:

<item href="Images/Cover.jpg" id="Cover.jpg" media-type="image/jpeg" />

The syntax is identical to that used for an xhtml file, however note that the path is different: ‘Images/’. This is because the image is in the folder called ‘Images’, NOT in ‘Text’. Beware of the difference between the extension: .jpg and the final bit of the media-type: jpeg.
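The way the path in href is read relative to content.opf can be sketched in a few lines of Python 3. I am assuming here that content.opf sits in a folder called OEBPS inside the epub, as in a book created by Sigil. The function name path_in_epub is made up for the example.

```python
import posixpath


def path_in_epub(opf_path, href):
    """Resolve a manifest href relative to the folder that holds content.opf.

    opf_path is the location of content.opf inside the epub (assumed here to
    be "OEBPS/content.opf"). path_in_epub is a hypothetical helper.
    """
    opf_folder = posixpath.dirname(opf_path)
    return posixpath.normpath(posixpath.join(opf_folder, href))


if __name__ == "__main__":
    print(path_in_epub("OEBPS/content.opf", "Text/Section0001.xhtml"))
    print(path_in_epub("OEBPS/content.opf", "Images/Cover.jpg"))
```

An href with no folder in front of it resolves to a file sitting right next to content.opf.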

Linking the ncx Table of Contents:

The ncx table of contents is an important file within the ebook and is generated and linked by Sigil for you. It should be listed in the manifest with a media-type of: ‘application/x-dtbncx+xml’. This file is in the same folder as content.opf, and so no path need be specified:

<item href="toc.ncx" id="ncx" media-type="application/x-dtbncx+xml" />

For some reason, Sigil has chosen to label the ncx table of contents ‘ncx’, rather than re-use the filename, which is different from the way it labels every other item in the manifest. I cannot see any explicit reason in the opf specification why the label in the manifest has to be ‘ncx’, and I would have thought another label could be used, provided the toc attribute of the <spine> tag refers to the same label. However as this would achieve nothing, just be aware that Sigil labels the ncx differently from other items in the e-book. The choice made by Sigil follows the example given in the opf specification, but as far as I can see that is just an example. Still, best left alone. It works, so don’t fix it!

Be very careful to distinguish between the ncx table of contents (toc.ncx) and the html table of contents (named by you but let’s call it Contents.xhtml). This last file is a section in the e-book which contains hyperlinks to the various chapters within it. The ncx table of contents on the other hand is used by the e-reader to generate a completely different table of contents according to its own programming. This will most likely be accessed when the user presses a button somewhere on the e-reader, which then displays a table of contents which it has generated from the ncx table of contents file. You will need BOTH files for your kindle e-book. The html table of contents is entered in the manifest just like any other chapter in the e-book. See my posts on how to generate the html table of contents and how to generate the logical table of contents for further information.

html cover and Kindle cover IMAGE:

There are important differences between an epub and a kindle e-book for the front cover image and html table of contents and these are covered by a number of other posts. The foregoing relates only to epub. To get started converting a valid epub to Kindle, see the posts on the Kindle cover and the Kindle html table of contents listed in the index at the end of this post. Other information is linked from these posts.

For both kindle and epub you will have loaded a cover image, and this will be linked as indicated above, just like any other image file. For the e-pub, you will have an html cover file, and that will be linked just like any other chapter in the e-book text. It falls at the beginning of the e-book, but the location of its entry in the manifest is NOT important.

Linking Fonts:

I don’t necessarily recommend this, but should you have any embedded fonts, they will be in the ‘Fonts’ folder and will be entered in the manifest as follows:

<item href="Fonts/NewCenturySchoolbook.ttf" id="NewCenturySchoolbook.ttf" media-type="application/x-font-ttf" />

Notice again that the path to the file is different because the fonts live in the ‘Fonts’ folder. To avoid generating warnings in epubcheck, it is advisable that the names of the font files are single text strings with NO SPACES and these filenames MUST also be used consistently in the CSS stylesheet. For the sake of completeness I will post on how to embed fonts later, although I would advise against it. As I said earlier, I couldn’t find a media-type in the online list corresponding to an extension of .ttf. The code in the example above was generated by Sigil when I imported the font file into the epub and works just fine. Importing a font file into your e-book using Sigil should result in the correct media-type being entered in the manifest, whatever the type of the font.

Linking a CSS Stylesheet:

If you have a css stylesheet, it will have been linked like this by Sigil:

<item href="Styles/Style0001.css" id="Style0001.css" media-type="text/css" />

Once again, as this will have been created in the ‘Styles’ folder the path is different. The filename, Style0001.css, can of course be anything you like. Style0001.css is just the default filename Sigil gives it when it first creates it. Renaming it in Sigil will update the entry for it in content.opf.

The stylesheet will need to be linked to each chapter in the e-book and any embedded fonts will need to be referenced in the stylesheet. See my post on how to create and link a css stylesheet for more information on how to link it. I will cover embedding fonts in a future post and link it here when it is published.

The id/label given to each item in the manifest is used to reference it in the <spine>, NOT the filename and path. The order in which the items appear in the manifest is of no importance. What is important is that each item within the ebook MUST appear once and once only in the manifest and be assigned a unique id (except content.opf, which should NOT be included). The manifest ends with a closing </manifest> tag.
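The rules in the paragraph above can be checked mechanically. What follows is a rough sketch in Python 3, not a substitute for epubcheck. The sample content.opf is cut down to a bare manifest built from the entries shown in this post, and the function name check_manifest and the wording of its messages are invented for the example. It reports ids or hrefs that appear more than once, ids containing spaces, and an entry for content.opf itself.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace used by the elements of content.opf.
OPF_NS = "{http://www.idpf.org/2007/opf}"

# A cut-down content.opf containing only a manifest (for illustration).
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <manifest>
    <item href="toc.ncx" id="ncx" media-type="application/x-dtbncx+xml" />
    <item href="Text/Section0001.xhtml" id="Section0001.xhtml" media-type="application/xhtml+xml" />
    <item href="Images/Cover.jpg" id="Cover.jpg" media-type="image/jpeg" />
    <item href="Styles/Style0001.css" id="Style0001.css" media-type="text/css" />
  </manifest>
</package>
"""


def check_manifest(opf_text):
    """Return a list of problems found in the <manifest>; empty means none found.

    check_manifest is a hypothetical helper covering only the rules in this post.
    """
    root = ET.fromstring(opf_text)
    items = root.findall(f"{OPF_NS}manifest/{OPF_NS}item")
    problems = []

    # Every id must be present, unique and free of spaces.
    ids = Counter(item.get("id") for item in items)
    for item_id, count in ids.items():
        if item_id is None:
            problems.append("an item has no id")
            continue
        if count > 1:
            problems.append(f"id used {count} times: {item_id}")
        if " " in item_id:
            problems.append(f"id contains a space: {item_id}")

    # Every file must be listed once only, and content.opf not at all.
    hrefs = Counter(item.get("href") for item in items)
    for href, count in hrefs.items():
        if href is None:
            problems.append("an item has no href")
            continue
        if count > 1:
            problems.append(f"href listed {count} times: {href}")
        if href == "content.opf":
            problems.append("content.opf should not be listed in the manifest")

    return problems


if __name__ == "__main__":
    print(check_manifest(SAMPLE_OPF) or "no problems found")
```

The sketch deliberately ignores the order of the entries, since the order of items in the manifest does not matter.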

Sigil will create the entries in the manifest for you but you are going to have to edit it to link the cover image and the html table of contents properly for kindle, so it is as well to be completely familiar with how the manifest is constructed.

Next Steps: My next post will complete this sequence by explaining the syntax of the <spine> and <guide> sections of content.opf. I will then go on to cover how to test your completed e-pub e-book and begin converting it to kindle.

Index to ‘how to …’ posts:

How to ‘unpack’ an epub file to edit the contents and see what’s inside.
How to understand what is inside an epub
How to link the html table of Contents in a Kindle e-book
How to restructure the html table of contents for a Kindle
How to delete the html cover for a Kindle ebook
How to link the cover IMAGE in a Kindle e-book
How to clean up your MS Word file before you get started
How to markup an MS Word file to identify the formats before importing it into an epub
How to create a new blank e-pub using Sigil
How to import your marked-up MS Word file into your ebook using Sigil
How to create and link a CSS stylesheet in an e-book using Sigil
How to replace the markup with CSS styles in your ebook using Sigil
How to style an e-book so it works with the limited CSS styling available to Kindle e-readers
How to understand the syntax of CSS
How to style Small Caps in an e-book
How to split your ebook up into chapters using Sigil
How to sequence your e-book
How to phrase the copyright declarations etc. in an e-book
How to generate the logical table of contents using Sigil
How to understand toc.ncx in an e-book
How to generate the html table of contents in an e-pub
How to style the html table of contents using CSS
How to create an html cover for your epub using Sigil
How to present references and notes in a book
How to use Mark Up to link notes in your e-book
How to present a bibliography in a book
How to use markup to link entries in a bibliography with the notes section
How to index an e-book
How to use the tools in MS Word to create an index
How to alphabetise an index or bibliography
How to adapt the print index in your MS Word file for an e-book using markup
How to adapt cross-references in your print index for e-book and how to use markup to make the links
How to understand content.opf
How to understand and edit the Metadata of an ebook using Sigil
How to understand the manifest in content.opf
How to understand the spine and guide in content.opf
How to test your e-pub using flightCrew in Sigil
How to test your e-pub using epubcheck
How to convert an e-pub to Kindle using kindlegen
