
Saturday, 18 October 2014

How to understand content.opf

OPF stands for Open Packaging Format and is one of the standards adopted by epub. It is used to construct a file called content.opf which describes the content of the ebook. This file is the most important part of an e-book. I am going to devote at least four posts to it including this one. In the process, I will also cover how the metadata is entered in content.opf and how to edit this using the tools in Sigil. You will need to do this to complete your e-book.

This post is purely informative. You do not need to change, and SHOULD NOT change, anything described in this post. Any necessary changes to content.opf for an epub can be made by Sigil. There ARE changes you need to make manually to content.opf to convert that epub to Kindle, but these are fully described in other posts.

Perhaps the most accessible way to examine content.opf is to open it using Sigil. This post is devoted to the header part of the file (the first two lines of the example below). In a new, blank, epub created by Sigil, content.opf looks like this:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<package xmlns="http://www.idpf.org/2007/opf" unique-identifier="BookId" version="2.0">
<metadata …>
<dc:identifier id="BookId" opf:scheme="UUID">urn:uuid:…</dc:identifier>
<dc:identifier …>…</dc:identifier>
…
</metadata>
<manifest … >
…
</manifest>
<spine …>
…
</spine>
<guide />
</package>


(I have replaced details not covered in this post with an ellipsis (…) and the text picked out in red is like that just for the purposes of illustration within this post.)

By way of explanation, in the first line:

xml version="1.0" tells the ebook reader which version of the language (xml) the file is written in.

encoding="utf-8" tells the ebook reader how the characters are ‘encoded’. ‘utf-8’ is the most widespread system for encoding web pages.

standalone="yes" tells the ebook reader that the file does not rely on any markup declarations held in another file: it is complete in and of itself. As a separate matter, it IS possible for an epub to contain more than one OPF file, but this is strongly NOT recommended, is not necessary, is hardly ever done, and Sigil isn’t set up to do it.
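For the curious, the three items in that first line can be read back with a few lines of Python. This is only a rough sketch using the standard xml.dom.minidom module, and the SAMPLE text is a hypothetical, cut-down stand-in for a real content.opf rather than a complete file. You do not need any of this to use Sigil.

```python
from xml.dom import minidom

# Hypothetical, minimal stand-in for content.opf (not a complete OPF file).
SAMPLE = (
    b'<?xml version="1.0" encoding="utf-8" standalone="yes"?>'
    b'<package xmlns="http://www.idpf.org/2007/opf" '
    b'unique-identifier="BookId" version="2.0"></package>'
)

doc = minidom.parseString(SAMPLE)

# minidom records the three parts of the XML declaration on the document.
print(doc.version)     # version of xml the file is written in
print(doc.encoding)    # how the characters are encoded
print(doc.standalone)  # True when standalone="yes"
```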

The next line is:

<package xmlns="http://www.idpf.org/2007/opf" unique-identifier="BookId" version="2.0">

It contains a single opening <package> tag.

Within that tag, xmlns="http://www.idpf.org/2007/opf" names the default xml namespace used in the file. The name is written as a web address, but it works purely as an identifier and the e-reader does not need to fetch anything from it. Some terms in xml can have more than one meaning. An xml namespace prefix followed by a colon tells the e-reader which meaning is intended. An example below is dc:identifier. Here the prefix dc: specifies the intended meaning of identifier. Another example in the metadata is opf:role, where opf: specifies the intended meaning of role. The value of xmlns MUST be exactly as entered above. All very technical. Leave it alone; Sigil will have got it right.

version="2.0" tells the e-reader software that the package is an OPF 2.0 package.
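If you want to see how software actually reads the package tag, the following is a minimal sketch in Python using the standard xml.etree.ElementTree module. The SAMPLE text is a hypothetical, cut-down content.opf, and the identifier value urn:uuid:example is a placeholder I have made up for illustration. The dc namespace address is the standard Dublin Core one, which this post does not otherwise show.

```python
import xml.etree.ElementTree as ET

# Hypothetical, cut-down content.opf: the two header lines plus one identifier.
# "urn:uuid:example" is a made-up placeholder, not a real UUID.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<package xmlns="http://www.idpf.org/2007/opf" unique-identifier="BookId" version="2.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="BookId">urn:uuid:example</dc:identifier>
  </metadata>
</package>"""

root = ET.fromstring(SAMPLE)

# ElementTree writes each name as {namespace}name, so the default
# namespace from xmlns="..." appears in front of "package".
print(root.tag)
print(root.get("version"))
print(root.get("unique-identifier"))

# The dc: prefix is expanded to the namespace it stands for.
identifier = root[0][0]
print(identifier.tag)
```

The point to notice is that the prefix itself (dc:) disappears once the file is read: what the software keeps is the full namespace name, which is why the value of xmlns has to be exactly right.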

Now to go back to the other part of the package tag:

unique-identifier="BookId"

The metadata (as shown in the example above) might contain more than one identifier for the e-book:

<dc:identifier id="BookId" opf:scheme="UUID">urn:uuid:…</dc:identifier>
<dc:identifier …>…</dc:identifier>

ONE of these identifiers has been labelled with id="BookId" by Sigil.

The statement: unique-identifier="BookId" tells the e-book reader which of these identifiers is the unique identifier for the e-book (the one with the label BookId).

The text in red (BookId) must match in both the metadata and the package tag. One and only ONE identifier in the metadata must be labelled as the unique identifier. There must be at least one identifier in the metadata. By default, Sigil creates a Universally Unique Identifier or UUID for the e-book. (I will explain UUID in my next post.) If you think about it, Sigil HAS to do something like that when creating a new, blank, epub file, as an ISBN or other identifier cannot have been entered at that stage. In principle, you could edit content.opf so that the ISBN, once entered, is the unique identifier but, as the unique identifier is never seen by the user, I can’t see the point. Best left alone.
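The matching rule described above can be sketched in Python as follows. This is an illustration only, not how Sigil or any particular e-reader does it. The function name unique_identifier is my own invention, and the UUID and ISBN values in SAMPLE are all-zero placeholders rather than real identifiers.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core namespace, in the {namespace}name form ElementTree uses.
DC = "{http://purl.org/dc/elements/1.1/}"


def unique_identifier(opf_xml):
    """Return the text of the identifier that the package tag points to."""
    root = ET.fromstring(opf_xml)
    label = root.get("unique-identifier")
    matches = [el for el in root.iter(DC + "identifier") if el.get("id") == label]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one identifier labelled {label!r}, found {len(matches)}"
        )
    return (matches[0].text or "").strip()


# Hypothetical content.opf with two identifiers; both values are placeholders.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<package xmlns="http://www.idpf.org/2007/opf" unique-identifier="BookId" version="2.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/"
            xmlns:opf="http://www.idpf.org/2007/opf">
    <dc:identifier id="BookId" opf:scheme="UUID">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:identifier opf:scheme="ISBN">0000000000000</dc:identifier>
  </metadata>
</package>"""

print(unique_identifier(SAMPLE))
```

Only the first identifier carries id="BookId", so that is the one returned. If the label in the package tag matched no identifier, or more than one, the sketch raises an error instead of guessing.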

The part of content.opf enclosed by the opening <metadata> and closing </metadata> tags defines the metadata for the e-book, such as the title, author, ISBN etc. It is best manipulated using Sigil. How to do this is addressed in my next post, which I will link here when published.

The part enclosed by the opening <manifest> and closing </manifest> tags lists each item in the e-book and gives it a label (or id). It is explained in another post.

The part enclosed by the opening <spine> and closing </spine> tags lists the items in the order in which they will be displayed, using the labels defined in the <manifest>.
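How the spine borrows its labels from the manifest can be sketched in Python like this. Treat it as a rough illustration under stated assumptions: the element and attribute names (item, itemref, id, idref, href) come from the OPF 2.0 standard rather than from anything shown in this post, and the file names in SAMPLE are hypothetical.

```python
import xml.etree.ElementTree as ET

# Default OPF namespace, in the {namespace}name form ElementTree uses.
OPF = "{http://www.idpf.org/2007/opf}"

# Hypothetical manifest and spine; the file names are made up for illustration.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<package xmlns="http://www.idpf.org/2007/opf" unique-identifier="BookId" version="2.0">
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="chapter2" href="Text/chapter2.xhtml" media-type="application/xhtml+xml"/>
    <item id="chapter1" href="Text/chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="chapter1"/>
    <itemref idref="chapter2"/>
  </spine>
</package>"""

root = ET.fromstring(SAMPLE)

# The manifest gives every item a label (id) and says where the file lives (href).
hrefs = {item.get("id"): item.get("href") for item in root.iter(OPF + "item")}

# The spine lists labels only; the order of the itemrefs is the display order.
reading_order = [hrefs[ref.get("idref")] for ref in root.iter(OPF + "itemref")]
print(reading_order)
```

Note that the manifest lists chapter2 before chapter1, yet the reading order follows the spine, not the manifest.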

The final part should be enclosed by opening <guide> and closing </guide> tags and points to specific items in the e-book, such as the html table of contents. Note that in a blank Sigil-created epub the guide contains no entries and is entered as a single empty <guide /> tag.

The spine and guide are described in a forthcoming post, which I will link here.

content.opf ends with a closing </package> tag.

Whilst Sigil will take care of setting up content.opf for you, you will need to edit it manually when you are converting your epub e-book to Kindle. It is also useful to be able to make manual changes at a late stage of preparation without having to go back and re-work it in Sigil. So it is useful to have a working knowledge of the syntax of this important file.

In particular, if you want to change the unique identifier from the UUID created by Sigil to an ISBN, you will also need to know your way around content.opf. The detail of how to change the unique identifier is properly a part of my next post, which explains the syntax of the <metadata> section of content.opf and how to edit the metadata using the tools in Sigil.

Next Steps: My next post explains the metadata section of content.opf and how to edit the metadata using Sigil. I will go on in further posts to explain the manifest, spine and guide. Then, once the metadata has been completed, you will be able to go on to test your epub e-book and begin converting it to Kindle.

Index to ‘how to …’ posts:

How to ‘unpack’ an epub file to edit the contents and see what’s inside.
How to understand what is inside an epub
How to link the html table of Contents in a Kindle e-book
How to restructure the html table of contents for a Kindle
How to delete the html cover for a Kindle ebook
How to link the cover IMAGE in a Kindle e-book
How to clean up your MS Word file before you get started
How to markup an MS Word file to identify the formats before importing it into an epub
How to create a new blank e-pub using Sigil
How to import your marked-up MS Word file into your ebook using Sigil
How to create and link a CSS stylesheet in an e-book using Sigil
How to replace the markup with CSS styles in your ebook using Sigil
How to style an e-book so it works with the limited CSS styling available to Kindle e-readers
How to understand the syntax of CSS
How to style Small Caps in an e-book
How to split your ebook up into chapters using Sigil
How to sequence your e-book
How to phrase the copyright declarations etc. in an e-book
How to generate the logical table of contents using Sigil
How to understand toc.ncx in an e-book
How to generate the html table of contents in an e-pub
How to style the html table of contents using CSS
How to create an html cover for your epub using Sigil
How to present references and notes in a book
How to use Mark Up to link notes in your e-book
How to present a bibliography in a book
How to use markup to link entries in a bibliography with the notes section
How to index an e-book
How to use the tools in MS Word to create an index
How to alphabetise an index or bibliography
How to adapt the print index in your MS Word file for an e-book using markup
How to adapt cross-references in your print index for e-book and how to use markup to make the links
How to understand content.opf
How to understand and edit the Metadata of an ebook using Sigil
How to understand the manifest in content.opf
How to understand the spine and guide in content.opf
How to test your e-pub using flightCrew in Sigil

TinyURL for this post: http://tinyurl.com/nls22ew
