Wednesday, 24 September 2014

How to adapt the print index in your MS Word file for an e-book using markup

This post has been radically reworked and updated 30 October 2014. I have corrected some errors and worked through at least one very substantive issue. I hope this revision is more logically set out.

In this post I outline how to adapt a print index for an e-book. It builds on my previous three posts on how to set out an index, how to use the tools in MS Word to do this and how to alphabetise an index. A further post on how to adapt cross-references in an index for an e-book has now also been updated.

I am assuming you have a print index already prepared; this post covers how to convert it for your e-book version, including how to make the links using markup. There are a number of significant differences between the print and e-book formats which you will need to take into account, not the least of which is that the page references in the print index will need to be replaced with hyperlinks in your e-book, for the simple reason that there are no pages as such in an e-book.

And this is immediately problematic: a print index has page ranges, which convey the scope of the reference in a way which is impossible if a link is substituted: pages 12–27 obviously represent a more significant chunk of information than page 6, for example. A link, even though its text gives some description of the information it relates to, cannot convey the scope of that information in the same way.

One solution I have seen suggested is simply to continue with the print book page references as the link text. These authors suggest embedding the print book page numbers in the e-book text by scattering it with tags such as <a id="page1"></a>, <a id="page2"></a>, etc. (Although I would have thought a safer alternative was to use span tags: <span id="page1"></span>, and so on.) However, I am not sure what such tags are intended to achieve. Yes, you could link to them from the index, but these links would be to the top of wherever the print page was, rather than to the actual location of the indexed item which, for my money at least, rather misses the point of hyperlinks. Even worse, if the indexed material falls near the bottom of the print book page, linking to the top of that page might result in the material lying on the NEXT page when displayed on an e-reader.
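
To make the difference concrete, here is a rough sketch of the two kinds of target. Everything in it is invented for illustration: the ids (page66, apostrophe-contractions), the file name chapter4.xhtml and the placeholder sentences are assumptions, not taken from any real book.

```html
<!-- Main text, option 1: a marker where print page 66 happened to begin -->
<p><span id="page66"></span>Placeholder text from the top of print page 66.</p>
<p>Placeholder text where the indexed discussion actually begins.</p>

<!-- Main text, option 2: an id on the indexed material itself -->
<p id="apostrophe-contractions">Placeholder text where the indexed discussion actually begins.</p>

<!-- Index: the first link lands at the top of the old print page,
     the second lands on the material the reader is looking for -->
<a href="chapter4.xhtml#page66">66</a>
<a href="chapter4.xhtml#apostrophe-contractions">contractions</a>
```

With the first kind of link, the reader still has to hunt down the screen (or onto the next one) for the material; with the second, the link opens at it.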

Furthermore there is no obvious way, in an epub2 e-book at least, of indicating the extent of the target of the link after it has been clicked, and so the reader would jump to page 4 (say) and then be left guessing as to where page 8 (or wherever the page reference ended) was.

An e-book has no pages as such; the text can be made as big or as small as the user wants and the text ‘pages’ are scaled to fit. Some e-readers do have some sort of ‘page’ numbering systems, but these systems are not uniform over all e-readers, and so it would be unwise to build your index ‘page’ numbers on a numbering which was only valid on ONE particular e-reader. What you want is a system which works in the same way on ALL e-readers. And, alright, I suppose it may be possible – although UNFEASIBLY time-consuming – to make several editions of your e-book, each of which was tailored to work with the ‘page’ numbering implemented on a particular e-reader. Worse, it would be a potential nightmare making these different products available on the websites appropriate to the e-reader in question. Do you really want to be selling (say) a Kobo version and an iPad version and a Sony version and …? It’s bad enough keeping up with a Kindle and an epub version! Not to mention the need for different ISBNs for each of the different versions at upwards of £10 each.

One frequent moan I have come across in researching this post is that the lack of page numbers makes citing material in an e-book problematic at best. Well, if the e-book is an academic text with section numbers, then these can always be used: Hart §11.16.1, for example. And you could always use these section numbers in your index. However, the same limitation would still apply. Turning (say) §11.16.1–3 into a hyperlink would still only take the reader to the beginning of the referenced material when the link was clicked. Wherever the end of the material lay would not be readily apparent. A possible solution for epub3 e-readers could be to use the CSS3 :target selector to highlight the material referenced by the link in the index. BUT, assuming you are coding your e-book to work on epub2 AND epub3 e-readers, this is unfortunately NOT the answer, as epub2 e-readers cannot be relied upon to support CSS3.
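
For the curious, a :target rule might be sketched roughly as follows. The class name indexTarget, the id sec-11-16-1 and the file name chapter11.xhtml are all hypothetical, and how well any particular epub3 e-reader honours the rule is something you would have to test for yourself.

```html
<style type="text/css">
  /* CSS3 only: applies to whichever element the clicked link points at */
  .indexTarget:target { background-color: #ffffcc; }
</style>

<!-- Index (in practice this link lives in a separate index file) -->
<a href="chapter11.xhtml#sec-11-16-1">§11.16.1–3</a>

<!-- Main text: one wrapper around the WHOLE referenced extent,
     so that all three sections are highlighted, not just the first line -->
<div class="indexTarget" id="sec-11-16-1">
  <p>Placeholder for §11.16.1</p>
  <p>Placeholder for §11.16.2</p>
  <p>Placeholder for §11.16.3</p>
</div>
```

On a reader without CSS3 support the rule should simply be ignored, so the link still works but nothing is highlighted, which is exactly the limitation described above.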

I can only conclude that in an e-book intended to function in the same way on all possible e-reading platforms the only way to reference material in the index is to turn each page reference into a hyperlink directly to the start of the material it points to.

And that raises another issue. Once the reader has clicked the link in the index, he or she is going to have to use the ‘back’ button on their e-reader to return to the index. As far as I can see there really IS no other way to do this. Putting a return link in the main text, quite apart from being difficult (but not, of course, impossible), would not only disfigure the text but runs into the intractable problem that the indexed material could be pointed to from more than one entry in the index and so the target of such a return link would be ambiguous: which of the potentially many places in the index would the link return to?

I was horrified, when researching this post, to find a statement that certain e-readers may NOT have a physical OR a logical back button. Well, there’s one on the Kindle and on the two epub2 e-readers I own, and I am writing this post on the assumption that having a back button is such an obvious requirement that almost all e-readers will too.

Which brings me to the actual text to use for the link in the index. You are, I am afraid, going to have to work out a short phrase from the context of each page reference in the index describing the nature of the information it relates to. There are some subtle requirements for this link text, which I will go into later on. Meanwhile, let’s begin with the principles of replacing the page references with links.

I am considering a set-out index in these posts, as I feel a fully run-on index made from headings and link text only would be far too cramped in an e-book. I am using an extract from the index in New Hart’s Rules as the basis for my examples, set out in a box with a dark Oxford blue border. Alongside this I am presenting the same index adapted to follow Chicago style in a box with a maroon border, following the colours of the University of Chicago. Blame Wikipedia if I’ve got them wrong!

Index entries with only ONE page reference:

Consider this example, of a main entry with just ONE page reference after it (note the enspace after the headword in Oxford style compared with the comma in Chicago):

Oxford style:
abbreviations  167–78

Chicago style:
abbreviations, 167–78

The most elegant way to deal with this in an e-book index is to turn the entire heading into a hyperlink directly to the main text:

Oxford style:
abbreviations

Chicago style:
abbreviations

I would do this for ALL index entries with just the one page reference after them, either a main entry or a sub-entry.
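
In markup terms this might look something like the fragment below. It is only a sketch: the file name chapter10.xhtml and the id abbreviations are invented for illustration, and the indexEntry class is assumed to be defined in your stylesheet.

```html
<!-- Index: the heading itself is the link; there is no page number left to show -->
<p class="indexEntry"><a href="chapter10.xhtml#abbreviations">abbreviations</a></p>

<!-- Main text (chapter10.xhtml): the id sits on the start of the indexed material -->
<h2 id="abbreviations">Abbreviations</h2>
```

Because nothing follows the heading any more, the Oxford and Chicago versions of such an entry come out identical.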

Index entries with more than one page reference:

If there is more than one page reference after the index entry, for example:

Oxford style:
abbreviations  167–78
  apostrophe in  66–7, 170, 175
  books of the Bible  140–2
  capitalisation  126, 127, 170–2, 192

Chicago style:
abbreviations, 167–78
  apostrophe in, 66–7, 170, 175
  books of the Bible, 140–2
  capitalisation, 126, 127, 170–2, 192

Decide on a suitable piece of text which characterises the nature of the information each page reference points to, replace the page reference with this text and turn it into a hyperlink to the relevant place in the main text. In the example below, the linking text describes the information to be found in New Hart’s Rules at the location indicated by the page reference it replaces (note the ‘extra’ hanging indent for follow-on lines):

Oxford style:
abbreviations
  apostrophe in  contractions, elision, possessives
  books of the Bible
  capitalisation  acronyms, full points with, small capitals in, type treatment

Chicago style:
abbreviations
  apostrophe in, contractions, elision, possessives
  books of the Bible
  capitalisation, acronyms, full points with, small capitals in, type treatment

Immediately you will notice that the effect is for the linking text now to look more like RUN-ON sub-entries. The index has become a kind of hybrid of a set-out and a run-on index, reflecting the fact that even in a set-out index the page references always were run-on. As a bare minimum you will have to ensure that the linking text is alphabetised properly. Just replacing the page references in the order in which they occur will NOT achieve this, unless entirely by accident. You will have to adjust the alphabetisation manually. I have silently done this in the example. (See further important comments below regarding the niceties of alphabetising and styling the links.)

Personally, I find the outcome unnecessarily cramped, and prefer setting the index out fully, with each link on a new line. I will cover this option later on. For now, let’s call this style of index a RUN-ON E-BOOK index.

For such a RUN-ON e-book index, I would still follow the conventions of Oxford or Chicago style (or whatever style you have opted for). Just use the links instead of the page references as direct replacements (albeit re-ordered to fall in the correct alphabetical sequence).

In Oxford style, the first page reference is separated from its headword by an enspace. This character is NOT supported on the Kindle. HOWEVER this handy web page: www.cs.tut.fi/~jkorpela/chars/spaces.html explains that an enspace is usually ½ em wide, compared with a normal interword space of ¼ em. BUT normal spaces are sometimes adjusted, whereas nonbreaking spaces are normally NOT adjusted. So using two nonbreaking spaces might look like the better solution. BUT I found when composing this post that the very best way is to use a nonbreaking space followed by a normal space, as this allows the text to break at a better point. (Type &nbsp; to get a nonbreaking space in code view in Sigil.) And, whilst in theory you could use the code &ensp; for the enspace in an epub e-book, the line of least resistance would be to use a nonbreaking and a normal space in your epub e-book as well. After all, you are going to use the epub as the source for your Kindle version, so why make work for yourself? You also ensure BOTH will most likely display the same way and guard against someone making an epub e-reader which, like the Kindle, doesn’t support the enspace.
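
As a rough sketch of how the two separators might appear in code view, here is the ‘apostrophe in’ entry in both styles. The file name and ids in the href values are hypothetical, and the indexSubEntry class is assumed to be defined in your stylesheet.

```html
<!-- Oxford style: &nbsp; followed by a normal space stands in for the enspace -->
<p class="indexSubEntry">apostrophe in&nbsp; <a href="chapter4.xhtml#apostrophe-contractions">contractions</a>,
<a href="chapter4.xhtml#apostrophe-elision">elision</a>,
<a href="chapter4.xhtml#apostrophe-possessives">possessives</a></p>

<!-- Chicago style: a comma and a normal space after the heading -->
<p class="indexSubEntry">apostrophe in, <a href="chapter4.xhtml#apostrophe-contractions">contractions</a>,
<a href="chapter4.xhtml#apostrophe-elision">elision</a>,
<a href="chapter4.xhtml#apostrophe-possessives">possessives</a></p>
```

The nonbreaking space keeps the gap attached to the heading, while the normal space after it gives the e-reader somewhere to break the line. The commas between links sit outside the a elements so that each link is underlined separately.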

‘Reading’ from/to the main entry:

One principle in indexes is that subentries should ‘read’ from or to the main entry in a consistent way. The text you elect to use for the links should follow this general principle, as if the links were sub- or sub-sub-entries (in this case, run-on). So, in the example, under ‘capitalisation’, the entries would ‘read’ like this: ‘full points with → capitalisation’ and ‘small capitals in → capitalisation’, whereas under ‘apostrophe in’ the entries ‘read’ the other way, like this: ‘apostrophe in → contractions’, ‘apostrophe in → elision’, etc. Try to use link text which follows this convention, as it will make the index more easily readable. I think it will be impossible for the entire index to ‘read’ in the same direction, but it will help if entries within the same block ‘read’ consistently one way or the other. This is the method adopted in the actual index in New Hart’s Rules.

Obviously this is not appropriate in all cases. Chicago (16th edn §16.10) describes this style of index as ‘the subheadings form[ing] a grammatical relationship with the main heading whereby heading and subheading combine into a single phrase’, BUT also says ‘other subheadings [can] form divisions or units within the larger category of the heading’, and that both methods ‘can be mixed in the same index’. It is not said explicitly, but I would have thought it obvious that it would be better* not to mix the two methods within the same block (i.e. after the same heading).

The comments above relate only to sub-entries and, whether in Hart or the Chicago Manual of Style, were made in the context of a print index. In an e-book index of the kind I am proposing, the links would be presented as though they were sub- or sub-sub-entries. The discussion above about the relationship between the sub-entries and the main entry is predicated on the organisation of the print index into categories. Your link text will probably NOT always fall neatly into one pattern of relationship between it and the heading it lies beneath. After all, the headings in the print index would have been thought out carefully to organise the information, whereas the page references (and hence the links) will inevitably have a more haphazard relationship with their heading. So it is most likely that this will favour the former method: organising them into a single phrase which reads from/to one to/from the other*.

(*Although mixing the two may sometimes be unavoidable. But make sure, at least, that the entries within the same block which do ‘read’ to/from the heading do so in the same direction.)

It is possible that you could further subdivide your index to group the links into more logical sections under sub-sub-headings of their own, and so ensure each block relates to its heading consistently. This might give the index greater cohesion. The general guidance about avoiding further levels of sub-sub-sub-headings is in relation to print indexes where the narrow columns would get unfeasibly cramped as the indentation got progressively deeper. In an e-book index, where there can be no columns as far as I can see, this is less of an issue and so you could tolerate more sub-levels without any difficulty, provided you keep the link text short enough. The comments in the Chicago Manual of Style (16th edn §16.26) about formatting sub-sub-entries are not relevant to the style of e-book index I am proposing.

Alphabetisation of the link text:

The example above is fairly simple. In particular, it glosses over an important detail: the link text will need careful alphabetisation. And there is a conflict between making it readable and enabling the reader to find the items in it easily. Items above such as ‘apostrophe in’ present no problem. The meaningful part of the link comes first. But an item such as ‘in stages of book preparation’, which occurs in the example below, would need alphabetising on ‘book preparation’. Inverting the entry, to read: ‘book preparation, in stages of’ would make the index unnecessarily complicated. Hart says you should alphabetise your index ‘ignoring any leading prepositions, conjunctions and articles’, which is essentially the same as what I am suggesting. The actual index in New Hart’s Rules is alphabetised in the way I have outlined above. You might have to be a bit more flexible than Hart stipulates in an e-book index in order to accommodate all possible contexts for the link text, but do remember to keep the link text concise.

The Chicago Manual of Style also says to ‘ignore leading prepositions, conjunctions and articles’ but advises against having ‘substantive introductory words at the beginning of sub[entries], in order to preserve the alphabetic logic of the keywords’. And then says that, ‘in indented [set-out] style, where alphabetising functions more visually, it may be better to dispense with such introductory words [in sub-entries] or to invert the [entry]’ (16th edn §16.68, my italics). My feeling is that your link text will more often than not require such introduction to make it meaningful. And so, for an e-book index, I would disregard this advice from Chicago.

Where to begin the link:

I think, on balance, that with a RUN-ON E-BOOK index, the link text will need linking from the start to the end of the text. This is a bit unfortunate because, as discussed immediately above, the link may be prefaced by an introductory phrase of some kind, which ought not to be included in the alphabetisation. An option is to begin the link from the start of the alphabetised portion of the link text: in ‘in stages of book preparation’, for example, only ‘book preparation’ would be linked. But the commas which separate the links are visually not as obvious as the underlining for the link, and so I do think the link has to extend across the link text in its entirety for a RUN-ON E-BOOK index, otherwise the links become too difficult to distinguish from one another. As outlined below, this is not a problem in a fully set-out index.
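
The two options might be marked up along these lines. This is a sketch in Chicago style; the file name chapter2.xhtml and the ids are invented for illustration.

```html
<!-- Option set aside: the link starts at the alphabetised portion,
     so 'in stages of' is left as plain text between the commas -->
<p class="indexEntry">copy-editing, in stages of
<a href="chapter2.xhtml#copy-editing-stages">book preparation</a>,
<a href="chapter2.xhtml#copy-editing-headings">coding of headings</a></p>

<!-- Option preferred for a run-on index: the whole link text is linked -->
<p class="indexEntry">copy-editing,
<a href="chapter2.xhtml#copy-editing-stages">in stages of book preparation</a>,
<a href="chapter2.xhtml#copy-editing-headings">coding of headings</a></p>
```

In the first version the unlinked ‘in stages of’ can be mistaken for a fragment belonging to neither neighbour; in the second, every underlined run is exactly one link.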

Putting all of the above together, my advice for converting a print index to a RUN-ON e-book index is:
  • for entries followed by ONE page reference:
    • turn the headword into a hyperlink directly to the appropriate place in the main text
  • for entries followed by MORE THAN ONE page reference:
    • decide on a short piece of text describing the information each page reference refers to
    • substitute this text for the page reference
    • turn the text into a link directly to the appropriate place in the main text
    • run the link from the start to the end of each piece of link text
    • re-order the links to place them in the correct alphabetical order
    • begin the alphabetisation after any linking phrase
    • ensure the link text reads to/from the headword/sub-heading it follows in a consistent direction if appropriate
  • retain all the formatting required by the style you have adopted (Oxford or Chicago), working the hyperlink into the style as direct replacements for the page references

Here is an extended example, taken from the actual index in New Hart’s Rules. The link text describes the information which each page reference points to in Hart. Cross-references have been removed from the example and will be dealt with in my next post, as they are stylistically more complex:

Oxford style:
abbreviations  167–78
  apostrophe in  66–7, 170, 175
  books of the Bible  140–2
  capitalisation  126, 127, 170–2, 192
accents  123, 361, 362
afterword  16
aliases  102, 103
articles:
  electronic  350–1
  titles  86, 125

brackets  81–3, 281–2
  angle  82–3, 260, 278, 281, 351, 352
  braces (curly)  82
  double  281
  nested  83
  and punctuation  83
  in scientific texts  82–3, 260, 274, 275, 278, 281–2

copy preparation  25–41
copy-editing  12, 25, 26–7, 31–4

dots:
  leader  8
  medial  274, 275
  superscript and subscript  198

end matter  1, 8, 9, 10, 11, 13, 16–18
eth (ð)  229

Latin  227–8
  abbreviations  175–7, 320–2
  alphabet  227, 237
  in citations  320–2
  legal terms  246

references  129, 312–27
  abbreviations  168, 319, 345
  numbering system  327
  page numbers  345
  unpublished works  129, 132–3, 318, 340

Chicago style:
abbreviations, 167–78
  apostrophe in, 66–7, 170, 175
  books of the Bible, 140–2
  capitalisation, 126, 127, 170–2, 192
accents, 123, 361, 362
afterword, 16
aliases, 102, 103
articles:
  electronic, 350–1
  titles, 86, 125

brackets, 81–3, 281–2
  angle, 82–3, 260, 278, 281, 351, 352
  braces (curly), 82
  double, 281
  nested, 83
  and punctuation, 83
  in scientific texts, 82–3, 260, 274, 275, 278, 281–2

copy preparation, 25–41
copy-editing, 12, 25, 26–7, 31–4

dots:
  leader, 8
  medial, 274, 275
  superscript and subscript, 198

end matter, 1, 8, 9, 10, 11, 13, 16–18
eth (ð), 229

Latin, 227–8
  abbreviations, 175–7, 320–2
  alphabet, 227, 237
  in citations, 320–2
  legal terms, 246

references, 129, 312–27
  abbreviations, 168, 319, 345
  numbering system, 327
  page numbers, 345
  unpublished works, 129, 132–3, 318, 340

When converted into a RUN-ON E-BOOK index, the example becomes:

Oxford style:
abbreviations
  apostrophe in  contractions, elision, possessives
  books of the Bible
  capitalisation  acronyms, full points with, small capitals in, type treatment
accents  alphabetisation, assimilated into English
afterword
aliases  born versus né(e), nicknames versus pseudonyms
articles:
  electronic
  titles  paraphrased, quote marks for

brackets  general principles, in science and maths
  angle  in electronic addresses, for electronic sources in citations, versus greater/less than sign, in maths, for tags, usage
  braces (curly)
  double
  nested
  and punctuation
  in scientific texts  in chemical formulae, for complex ions, clarity in marking copy, Dirac bra and ket, general principles

copy preparation
copy-editing  in stages of book preparation, coding of headings, responsibilities, editorial style

dots:
  leader
  medial  for co-ordinated species, for single bonds
  superscript and subscript

end matter  acknowledgements, acknowledgements for illustrations, in book, work published in fascicles, itemised discussion
eth (ð)

Latin
  abbreviations  in citations, e.g., i.e., etc. & et al.
  alphabet  versus Cyrillic in Slavonic languages, usage
  in citations
  legal terms

references  chapter on, introduction
  abbreviations  in citations, for locations in citations, for series
  numbering system
  page numbers
  unpublished works  dating, published versus, typesetting of

Chicago style:
abbreviations
  apostrophe in, contractions, elision, possessives
  books of the Bible
  capitalisation, acronyms, full points with, small capitals in, type treatment
accents, alphabetisation, assimilated into English
afterword
aliases, born versus né(e), nicknames versus pseudonyms
articles:
  electronic
  titles, paraphrased, quote marks for

brackets, general principles, in science and maths
  angle, in electronic addresses, for electronic sources in citations, versus greater/less than sign, in maths, for tags, usage
  braces (curly)
  double
  nested
  and punctuation
  in scientific texts, in chemical formulae, for complex ions, clarity in marking copy, Dirac bra and ket, general principles

copy preparation
copy-editing, in stages of book preparation, coding of headings, responsibilities, editorial style

dots:
  leader
  medial, for co-ordinated species, for single bonds
  superscript and subscript

end matter, acknowledgements, acknowledgements for illustrations, in book, work published in fascicles, itemised discussion
eth (ð)

Latin
  abbreviations, in citations, e.g., i.e., etc. & et al.
  alphabet, versus Cyrillic in Slavonic languages, usage
  in citations
  legal terms

references, chapter on, introduction
  abbreviations, in citations, for locations in citations, for series
  numbering system
  page numbers
  unpublished works, dating, published versus, typesetting of

I should add that when I was creating this example it threw up a few issues around deciding on the link text. Firstly, I was always able to decide on a suitable piece of text to describe the information. I guess this shows the indexer had chosen the items to reference in the index with some care. In the event that you cannot decide on a reason why an item is included in your index, then maybe it is better left out? There were a very few page references for which, as far as I could see, there was absolutely nothing on the page(s) referred to which related to the heading they appeared under in the index. I have assumed these are typos and have deleted them from the example. And then there were also a few cases where the index contained two page references, mostly consecutive, which repeated the same information. The best way to deal with this would be to delete the second reference altogether, which is what I have done. The reader would jump to the first reference and then read on, rapidly finding the second one. IF exactly the same information is presented in two places in the same book which are a long way apart, then you have a problem. I suppose you could always use something like: ‘and also here’ or: ‘repeated here’ as the text for the second link, but that looks amateurish. I would suggest deleting the reference to the place in the text which presents the material in the least depth, leaving the more significant one alone in the index. As with all aspects of indexing, it’s a judgement call.

So that covers how to adapt a print index to a RUN-ON E-BOOK index. The small matter of making the links using markup is covered below.

A FULLY SET-OUT E-BOOK index:

I am not really satisfied with the outcome of the run-on style. I have already said I find it far too cramped for my liking and think it will be difficult for the reader to find the item(s) they want in the index. In my opinion, what is needed is a fully set-out style, with each link on a line to itself, as though it were a SET-OUT sub-heading in its own right. And I’ll call this style of index a FULLY SET-OUT E-BOOK index.

This style is also immediately problematic: setting the links down a line and indenting them in the same way as a new sub-heading (i.e. by ONE level) leads to the links being indented in the same way as any real sub-headings which might follow. It is then impossible to find a satisfactory solution for alphabetising the links and sub-headings without either mixing them in with each other OR accepting that the alphabetical sequence be in some way broken (which is obviously unacceptable).

And so after CONSIDERABLE reflection, I have come to the conclusion that the ONLY way to make a set-out index of this kind is to set the links in by TWO levels. This does have the unfortunate consequence that the indentation is wrong, but it is the least worst solution. As a knock-on effect of this there will have to be an extra-extra hanging indent for follow-on lines. That is to say, if sub-entries are indented by ONE level, links will need indenting by TWO levels and the hanging indent will have to be THREE levels. Suitable styles would be:

.indexEntry {text-align: left; margin-left: 3em; text-indent: -3em; font-size: 0.9em;}
.indexSubEntry {text-align: left; margin-left: 4em; text-indent: -3em; font-size: 0.9em;}
.indexSubSubEntry {text-align: left; margin-left: 5em; text-indent: -3em; font-size: 0.9em;}
.indexSubSubSubEntry {text-align: left; margin-left: 6em; text-indent: -3em; font-size: 0.9em;}


As the original print index would have had only TWO levels of indents: entries and sub-entries, the SET-OUT E-BOOK index would have a maximum of FOUR levels of indentation. Because the tools in MS Word cannot readily handle more than two levels of indentation in an index I am limiting these posts to this scenario.

To increase the indent, just multiply the margin-left and text-indent numbers in the styles above by a uniform scale factor. The indent I have used is that recommended by Chicago for print indexes and seems about right to me.

For a RUN-ON E-BOOK index you will need different styles, with a hanging indent of 2em, rather than 3, as shown above. Suitable styles in this case would be:

.indexEntry {text-align: left; margin-left: 2em; text-indent: -2em; font-size: 0.9em;}
.indexSubEntry {text-align: left; margin-left: 3em; text-indent: -2em; font-size: 0.9em;}


and so on, i.e. reducing the margin-left by 1em and the hanging indent (the negative text-indent) by 1em in all cases.

Adapting a print index to a fully set-out e-book index:

Obviously, for index entries with just the one page reference you would proceed as above, turning the headword into a link directly to the main text.

For index entries with more than one page reference after them, I would put a colon after the heading and drop the link text down onto a new line, one line per link, indenting the links by TWO further levels. The example I gave above would look like this when fully set-out following the style I am recommending:

abbreviations
  apostrophe in:
      contractions
      elision
      possessives
  books of the Bible
  capitalisation:
      acronyms
      full points with
      small capitals in
      type treatment

Cross-references aside, Oxford and Chicago style differ only in the way the heading is separated from the page references. The comma following the heading in Chicago style and the enspace following the heading in Oxford style would BOTH be replaced by a colon in my proposal and so, in a SET-OUT E-BOOK index there would be NO DIFFERENCE between Oxford and Chicago style AND no need to fake the enspace. Because this style is neither Oxford nor Chicago, I have used a different colour for the box.
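
As a rough sketch, the first part of that example might be marked up as follows. The class names follow the styles given earlier in the post, written out in full here so the fragment is self-contained; text-align: left is an assumption about what those styles are meant to do. All file names and ids in the href values are invented for illustration, and in a real epub the rules would normally live in the stylesheet rather than in a style element.

```html
<style type="text/css">
  /* Each level steps in by 1em; every level keeps a 3em hanging indent */
  .indexEntry          {text-align: left; margin-left: 3em; text-indent: -3em; font-size: 0.9em;}
  .indexSubEntry       {text-align: left; margin-left: 4em; text-indent: -3em; font-size: 0.9em;}
  .indexSubSubEntry    {text-align: left; margin-left: 5em; text-indent: -3em; font-size: 0.9em;}
  .indexSubSubSubEntry {text-align: left; margin-left: 6em; text-indent: -3em; font-size: 0.9em;}
</style>

<!-- Main entry with one reference: the heading itself is the link -->
<p class="indexEntry"><a href="chapter10.xhtml#abbreviations">abbreviations</a></p>

<!-- Sub-entry with several references: colon, then each link TWO levels further in -->
<p class="indexSubEntry">apostrophe in:</p>
<p class="indexSubSubSubEntry"><a href="chapter4.xhtml#apostrophe-contractions">contractions</a></p>
<p class="indexSubSubSubEntry"><a href="chapter4.xhtml#apostrophe-elision">elision</a></p>
<p class="indexSubSubSubEntry"><a href="chapter4.xhtml#apostrophe-possessives">possessives</a></p>

<!-- Sub-entry with one reference: again the heading is the link -->
<p class="indexSubEntry"><a href="chapter7.xhtml#bible-books">books of the Bible</a></p>
```

Links under a main heading would take the indexSubSubEntry class instead, which leaves the indexSubEntry level free for any real sub-headings that follow.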

The comments about the need for an extra-extra hanging indent for follow-on lines, whilst important, will mostly NOT make much difference, as each entry in such an index ought (unless your linking text is unusually lengthy) to be confined to a single line, as in my example. But of course you should still include the extra-extra hanging indent to take care of the possibility that the user increases the point size on their e-reader by more than you have anticipated.

The comments above about the way the links read to/from their headings are still relevant.

HOWEVER setting out the index provides a welcome opportunity to make the scheme of the alphabetisation more apparent to the reader. I would recommend beginning the link from the start of the alphabetised portion of the link text, leaving any introductory phrase OUT of the link. I DID try ending the link after the alphabetised portion, ignoring any linking phrase at the end of the link text, but found this actually made the index harder to follow. I think this is because we read from left to right, or because the index is conventionally set flush-left (i.e. ragged right) and so there is no fixed point for the eye to refer to on the right hand side. Whatever the reason, I recommend continuing the link to the END of the line.
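
In markup, leaving the introductory phrase out of the link might look like the sketch below, using the ‘copy-editing’ entry. The file name chapter2.xhtml and the ids are hypothetical; the classes are assumed to be defined in your stylesheet.

```html
<!-- Main entry with several references: colon, links set in by TWO levels -->
<p class="indexEntry">copy-editing:</p>

<!-- 'in stages of' stays outside the link; the link begins at the
     alphabetised word and runs to the end of the line -->
<p class="indexSubSubEntry">in stages of <a href="chapter2.xhtml#copy-editing-stages">book preparation</a></p>

<!-- No introductory phrase here, so the whole line is the link -->
<p class="indexSubSubEntry"><a href="chapter2.xhtml#copy-editing-headings">coding of headings</a></p>
```

Read down the page, the underlining then starts at ‘book’ and ‘coding’, which is the sequence the alphabetisation follows.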

I would summarise the way to do this as follows:
  • for entries followed by ONE page reference
    • turn the headword into a hyperlink directly to the appropriate place in the main text (exactly as for a run-on e-book index)
  • for entries followed by MORE THAN ONE page reference
    • end the heading with a colon
    • drop each page reference down onto a new line and indent it TWO further levels so:
      • following a main heading, each page reference is indented as though a new sub-sub-heading
      • following a sub-heading, each page reference is indented as though a new sub-sub-sub-heading
    • decide on a short piece of text describing the information each page reference refers to
    • substitute this text for the page reference
    • turn the text into a link directly to the appropriate place in the main text
    • re-order the links to place them in the correct alphabetical order
    • begin the alphabetisation after any introductory linking phrase
    • begin the link from the start of the alphabetised part i.e. missing out any introductory linking phrase
    • end the link at the END of each line
    • ensure the link text reads to/from the headword/sub-heading it follows in a consistent direction if appropriate
  • the styling is now the same, whether the original print index followed Oxford or Chicago style
  • each link has a line to itself
  • a colon at the end of a line indicates a relation between it and the line below and is now the ONLY punctuation in the index (apart from that embedded within the headings or link text)
  • follow-on lines (if any) will need a hanging indent THREE times larger than the indent used for sub-entries

My extended example re-worked as a FULLY SET-OUT E-BOOK index:

abbreviations
apostrophe in:
contractions
elision
posessives
books of the Bible
capitalisation:
acronyms
full points with
small capitals in
type treatment
accents:
alphabetisation
assimilated into English
afterword
aliases:
born versus né(e)
nicknames versus pseudonyms
articles:
electronic
titles:
paraphrased
quote marks for
 
brackets:
general principles
in science and maths
angle:
in electronic addresses
for electronic sources in citations
versus greater/less than sign
in maths
for tags
usage
braces (curly)
double
nested
and punctuation
in scientific texts:
in chemical formulae
for complex ions
clarity in marking copy
Dirac bra and ket
general principles
 
copy preparation
copy-editing:
in stages of book preparation
coding of headings
responsibilities
editorial style
 
dots:
leader
medial:
for co-ordinated species
for single bonds
superscript and subscript
 
end matter:
acknowledgements
acknowledgements for illustrations
in book
work published in fascicles
itemised discussion
eth (ð)
 
Latin
abbreviations:
in citations
e.g., i.e., etc. & et al.
alphabet:
versus Cyrillic in Slavonic languages
usage
in citations
legal terms
 
references:
chapter on
introduction
abbreviations:
in citations
for locations in citations
for series
numbering system
page numbers
unpublished works:
dating
published versus
typesetting of

I do think this style is a significant improvement over the run-on style. The information is set out visually and the relationships of the various parts to each other are readily apparent from the indentation. The individual links are more easily distinguishable, as each is now on a single line. This also means there is a reduced likelihood of a link being split over two lines, which is a further drawback of the run-on style. And I hope by now it is readily apparent that using linking text instead of page numbers is also a huge improvement. The unintended and unexpected bonus is that every difference between Oxford and Chicago style has been eliminated! Along the road, all punctuation has also vanished, other than that embedded within the headings and link text, resulting in a very clean look. I have added the odd colon, but this has a consistent meaning within this style and I do think it is necessary.

So we now come to the mechanics of making the links. I initially prepared this material for the set-out style. As that style is discussed immediately above, and as the system you will need to follow is easier to explain for it, I will cover it first.

Converting a print index into a fully set-out e-book index:

I have constructed a fanciful example for the purposes of this section of my post. This index has been designed to contain all the examples of links discussed above. The example begins in Oxford style, set-out:

animals 234–98
   cats 267–86, 273, 281
   cobras 110–14
   dogs 236, 239–47, 240

What follows is broken down into tiny steps for the purposes of clarity. This is going to be a complex process and you need to be methodical. I apologise if this is too prissily itemised, but I hope the attention to detail is welcomed.

The information required to build the links:

Each hyperlink will jump to a place in the main text, identified by a label, and so for each link you will need the following information:

  • a sequential number for each page reference (use this to build the label in the main text)
  • the chapter in the main text the page reference refers to (and in which the label will be)
  • the text to use for the link (this will replace the page reference)
To gather this information, you will need to be systematic. I would suggest using the print index file as a starting point.

This will have been created in MS Word and will be in a special section of your file. You will need to select the index and copy and paste it as formatted text into a new section in your MS Word file. It can now be edited, but will have lost its columns. This isn’t a problem: you won’t be able to have columns in your e-book anyway. Most likely this step will already have been done if you have a print book and, ironically, all you would then need to do would be to remove the columns which you probably would have put back in when the print index was finalised (see this place in an earlier post for more information).

Restructuring your index:

So begin by restructuring your index along the lines outlined above, keeping the page references for the moment (and note that any commas separating page references have also been deleted):

animals 234–98
   cats:
         267–86
         273
         281
   cobras 110–14
   dogs:
         236
         239–47
         240

Styling the restructured Index:

I would recommend keeping the MS Word styles at this stage. Use the built-in styles: ‘Index 1’, ‘Index 2’, ‘Index 3’ and ‘Index 4’. Make sure your index is consistent with these styles, main entries in ‘Index 1’, sub-entries in ‘Index 2’, sub-sub-entries in ‘Index 3’ and sub-sub-sub-entries in ‘Index 4’. This will help you remember the level of each index entry as you go along. Resist the temptation to use a series of spaces or tabs to indent the index: you will have to delete all the spaces and tabs before importing the file into your e-book anyway!!!

The easiest way to set the style to ‘Index 1’ etc. is to open the ‘styles pane’: click the tiny button next to ‘Styles’ at the bottom of the ‘styles’ group in the ‘Home’ tab of the ribbon.


The styles pane helpfully stays open as you edit. If you cannot see ‘Index 1’ etc. in the dialog, click ‘options...’ at the bottom right and set it to ‘show all styles’. Once at least ONE paragraph is styled as ‘Index 1’ etc., switching back to ‘show styles in use’ will remove any styles you do not need from the styles pane.

Gathering the information and annotating the file with markup:

Now go through the entire book, gathering the information you need and annotating the index as shown below enclosing your annotation in curly brackets (the hilite colours are informative for this post only, there is no need to colour them in in your own file!):

Only annotate index entries with a page reference at the end. Leave entries which are just plain text (ending in a colon) alone just now (i.e. lines two and seven in the example).

When there is text before a page reference in your modified index (i.e. lines one and six in the example), use that as the link text in the curly brackets, otherwise, where the line is just a page reference, decide on what you are going to use as the link and enter that in the curly brackets instead (you will have to decide what text to use from the context). In both cases, put the curly brackets at the beginning of the line:

{Ch4#1_animals} 234–98
   cats:
         {Ch4#2_breeding}267–86
         {Ch4#3_care of}273
         {Ch4#4_the musical}281
   {Ch2#5_cobras} 110–14
   dogs:
         {Ch4#6_training}236
         {Ch4#7_loyalty}239–47
         {Ch4#8_antipathy towards cats}240

The format to use inside the curly brackets is: ‘{Ch’, then the chapter number, then ‘#’, then the link number, then ‘_’, then the link text, then ‘}’. For example: ‘{Ch4#2_breeding}’.

The reason I am suggesting marking up/annotating the file in this way is to make the index more readable when marked up. Note that lines with markup should all have the curly brackets as the first item in the line (at this stage).
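If you are comfortable with a little scripting, the annotation format can be checked mechanically before you go any further. The following is only a minimal sketch in Python (not part of the Sigil workflow described in this post); the function names `parse_annotations` and `duplicate_link_numbers` are my own and purely illustrative. It reads each annotation and reports any link number that has been used twice.

```python
import re

# Pattern for the annotation format described above:
# {Ch<chapter number>#<link number>_<link text>}
ANNOTATION = re.compile(r"\{Ch(\d+)#(\d+)_([^}]*)\}")

def parse_annotations(line):
    """Return a (chapter, link_number, link_text) tuple for each annotation in a line."""
    return [(int(ch), int(num), text) for ch, num, text in ANNOTATION.findall(line)]

def duplicate_link_numbers(lines):
    """Return any link numbers that appear more than once (illustrative sanity check)."""
    seen, duplicates = set(), []
    for line in lines:
        for _, num, _ in parse_annotations(line):
            if num in seen:
                duplicates.append(num)
            seen.add(num)
    return duplicates

example = [
    "{Ch4#1_animals} 234–98",
    "cats:",
    "{Ch4#2_breeding}267–86",
    "{Ch2#5_cobras} 110–14",
]
for line in example:
    print(parse_annotations(line))
print("duplicates:", duplicate_link_numbers(example))
```

Lines without an annotation (such as ‘cats:’) simply produce an empty list, which matches the rule that plain-text headings are left alone at this stage.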

If for some reason you already have curly brackets in your index, you will have to replace them in your markup with something else. The simplest solution would be to use double curly brackets instead. Otherwise substitute something different which is not already in the file. A vertical line (|) or broken vertical bar (¦) or, if you are using a Mac, perhaps a section mark (§) might be suitable alternatives. But DO use a character you can easily type from your keyboard!

I have also used the hash character: ‘#’ as part of the markup. This is somewhat risky, as there might also need to be other hash characters in your e-book as part of hyperlinks to other labels, for example for links in the html table of contents to sub-headings in your main text. The underscore character: ‘_’ might very well occur elsewhere also. Web addresses, which could be in the header of the index, need checking too: a hash within the data of an address is percent-encoded as ‘%23’, but the hash that introduces a fragment is written literally, and an underscore is normally left unencoded (its encoded form would be ‘%5F’). If you are confining yourself (as I recommend) to filenames consisting of single alphanumeric strings with no special characters, there will be no hash or underscore characters in these.

I think it unlikely that either hash or underscore would form part of the text of the index. So the way out is to do the replacements for the index first, making sure that Sigil only makes the replacements in that one file. Should your index already contain either hash or underscore you will just have to use something different or else double them up.

IMPORTANT NOTE: I strongly recommend that you mark up the index before you have added any other markup to your main text. Marking up the main text will cause it to get longer and the page boundaries will creep. This will make using the page numbers to locate the indexed items in the main text that much more difficult. I would suggest opening the document in a split view in MS Word, with the main text in one part and the index in another. (Or else use two documents, index in one and text in the other.) That way you can use the page numbers in the index to find the indexed items in the main text whilst seeing and being able to edit both parts of the document at the same time.

Another very strong recommendation is to bookmark each indexed item in the main text as you go along using bookmark names such as: ‘index1’, ‘index2’, etc, with the link number for that entry at the end. This will save you enormous amounts of time later on (see below).

Deleting the page references:

Once you have finished, go back and delete all the page references (and note that any associated spaces, enspaces or commas have also been deleted):

{Ch4#1_animals}
   cats:
         {Ch4#2_breeding}
         {Ch4#3_care of}
         {Ch4#4_the musical}
   {Ch2#5_cobras}
   dogs:
         {Ch4#6_training}
         {Ch4#7_loyalty}
         {Ch4#8_antipathy towards cats}
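Deleting the page references by hand is perfectly feasible, but if the annotated index is available as plain text the same step could be sketched with a regular expression. This is a hedged illustration in Python, not something the method requires; `strip_page_reference` is a hypothetical helper name, and it assumes every page reference sits at the very end of an annotated line, as in the example above.

```python
import re

# Optional spaces (including en spaces), then a page number or page range
# at the end of the line, e.g. " 234–98" or "267–86".
PAGE_REF = re.compile(r"\s*\d+(?:[–-]\d+)?\s*$")

def strip_page_reference(line):
    """Remove a trailing page reference from an annotated line; leave other lines alone."""
    if "}" in line:
        return PAGE_REF.sub("", line)
    return line

for line in ["{Ch4#1_animals} 234–98", "   cats:", "         {Ch4#2_breeding}267–86"]:
    print(repr(strip_page_reference(line)))
```

Headings that end in a colon carry no annotation, so they pass through unchanged.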

Correcting the Alphabetisation:

Eagle-eyed readers will have noticed that the link text (in orange) is NOT all in alphabetical order! This is a consequence of replacing the page references with your informative link text, which will be in the order of the original page references, and so very unlikely to be in alphabetical order, unless completely by accident! Your next task, therefore, is to resequence your index so it will come out in the correct alphabetical order. In my example, only the last three entries needed attention. You will need to read my earlier discussion on alphabetising your link text to inform your sequencing of these links. Here, the relevant items are alphabetised on ‘musical’ and ‘cats’.

When you are finished, your index should be in this order:

{Ch4#1_animals}
   cats:
         {Ch4#2_breeding}
         {Ch4#3_care of}
         {Ch4#4_the musical}
   {Ch2#5_cobras}
   dogs:
         {Ch4#8_antipathy towards cats}
         {Ch4#7_loyalty}
         {Ch4#6_training}
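The re-ordering can be pictured as a sort on the alphabetised part of each link text only. The sketch below is an illustration in Python under my own assumptions: each entry is written out by hand as a link number, an introductory linking phrase (possibly empty) and the alphabetised part, because deciding where the linking phrase ends is an editorial judgement that a script cannot make for you. The name `order_links` is hypothetical.

```python
def order_links(entries):
    """Sort (link_number, intro_phrase, alphabetised_part) tuples on the alphabetised part."""
    return sorted(entries, key=lambda entry: entry[2].lower())

# The three links under 'dogs' from the example, in page-reference order.
dogs = [
    (6, "", "training"),
    (7, "", "loyalty"),
    (8, "antipathy towards ", "cats"),
]

for number, intro, rest in order_links(dogs):
    # Chapter 4 is hard-coded here because all three example links are in that chapter.
    print("{Ch4#" + str(number) + "_" + intro + rest + "}")
```

Run on the example, this gives the same order as the listing above: ‘antipathy towards cats’, then ‘loyalty’, then ‘training’.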
 
Marking it up for the styles to use in your e-book:

I would now add markup for the style to use for each line. Use /x/ for a main index entry, /xx/ for a sub-entry, /xxx/ for a sub-sub-entry and /xxxx/ for a sub-sub-sub-entry (or whatever other markup works for you):

/x/{Ch4#1_animals}
   /xx/cats:
         /xxxx/{Ch4#2_breeding}
         /xxxx/{Ch4#3_care of}
         /xxxx/{Ch4#4_the musical}
   /xx/{Ch2#5_cobras}
   /xx/dogs:
         /xxxx/{Ch4#8_antipathy towards cats}
         /xxxx/{Ch4#7_loyalty}
         /xxxx/{Ch4#6_training}

Adjusting the links:

As a final step you will need to adjust the markup to begin the link at the start of the alphabetised part of your text. For instance: ‘antipathy towards cats’. To do this, just move the first part of the markup (from the: ‘{’ up to and including the: ‘_’) to come immediately before the first alphabetised word [space before: ‘{’ and no space after: ‘_’], like this (lines five and eight):

/x/{Ch4#1_animals}
   /xx/cats:
         /xxxx/{Ch4#2_breeding}
         /xxxx/{Ch4#3_care of}
         /xxxx/the {Ch4#4_musical}
   /xx/{Ch2#5_cobras}
   /xx/dogs:
         /xxxx/antipathy towards {Ch4#8_cats}
         /xxxx/{Ch4#7_loyalty}
         /xxxx/{Ch4#6_training}
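The adjustment shown above moves the opening part of the markup past the introductory linking phrase. As a hedged sketch only, here is how that move might look in Python; `adjust_link` is a hypothetical helper, and it assumes you pass in the exact introductory phrase (including its trailing space) for each line that needs adjusting.

```python
import re

# Splits an annotation into its opening markup ("{Ch4#8_") and its link text.
PREFIX = re.compile(r"^(\{Ch\d+#\d+_)(.*)\}$")

def adjust_link(annotation, intro):
    """Move the opening markup so the link begins after the introductory phrase."""
    match = PREFIX.match(annotation)
    if match is None or not match.group(2).startswith(intro):
        return annotation  # leave anything unexpected untouched
    prefix, text = match.groups()
    return intro + prefix + text[len(intro):] + "}"

print(adjust_link("{Ch4#8_antipathy towards cats}", "antipathy towards "))
print(adjust_link("{Ch4#4_the musical}", "the "))
```

An annotation with no introductory phrase is returned as it was, so only the lines you name are changed.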

IF, in an inverted main entry, the link needs to end before the end of the line, then move the ‘}’ to the end of the link, like this (taken from the extended example above; note the markup for italic, which has been kept as a single unit [as discussed below]):

/x/{Ch7#13_/i/-able/ii/}, words ending in

Other markup:

Obviously, as with the whole book text, you will need to add markup for italic, etc. I have mostly chosen examples with no extra markup needed to stop this all getting more complicated than it already is.

Using the markup to create your index:

Now you are ready to create the index using the markup you have added.

SAVE the MS Word file and preferably print out a copy: you may want to refer to it later!

Importing the file into your e-book:

Import your index into your e-book as unformatted text. It should look like this (note that the styling in MS Word has gone):

<p>/x/{Ch4#1_animals}</p>
<p>/xx/cats:</p>
<p>/xxxx/{Ch4#2_breeding}</p>
<p>/xxxx/{Ch4#3_care of}</p>
<p>/xxxx/the {Ch4#4_musical}</p>
<p>/xx/{Ch2#5_cobras}</p>
<p>/xx/dogs:</p>
<p>/xxxx/antipathy towards {Ch4#8_cats}</p>
<p>/xxxx/{Ch4#7_loyalty}</p>
<p>/xxxx/{Ch4#6_training}</p>


Replacing the markup using Sigil:

Now make the following replacements:

replace <p>/x/ with <p class="indexEntry">
and replace <p>/xx/ with <p class="indexSubEntry">
and replace <p>/xxx/ with <p class="indexSubSubEntry">
and replace <p>/xxxx/ with <p class="indexSubSubSubEntry">
and replace {Ch with <a href="Chapter
and replace # with .xhtml#x
and replace _ with ">
and finally replace } with </a>

(If you replace ‘{’ and ‘}’ with the same character, like ‘|’, you must make the replacement described for ‘|Ch’ before the replacement for ‘|’.)
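To see why the order of these replacements works, it can help to try them outside Sigil on a single line. The following is a minimal sketch in Python using plain string replacement; it is an illustration only, assuming the eight find-and-replace pairs exactly as listed above and a hypothetical helper name `apply_replacements`.

```python
# Ordered (find, replace) pairs copied from the list above.
REPLACEMENTS = [
    ("<p>/x/", '<p class="indexEntry">'),
    ("<p>/xx/", '<p class="indexSubEntry">'),
    ("<p>/xxx/", '<p class="indexSubSubEntry">'),
    ("<p>/xxxx/", '<p class="indexSubSubSubEntry">'),
    ("{Ch", '<a href="Chapter'),
    ("#", ".xhtml#x"),
    ("_", '">'),
    ("}", "</a>"),
]

def apply_replacements(text):
    """Apply each replacement in turn to the whole text, as Find and Replace would."""
    for find, replace in REPLACEMENTS:
        text = text.replace(find, replace)
    return text

marked_up = "<p>/xxxx/antipathy towards {Ch4#8_cats}</p>"
print(apply_replacements(marked_up))
```

Because every style code is closed by a slash, ‘<p>/x/’ cannot accidentally match inside ‘<p>/xx/’, which is why the shorter codes can safely be replaced first.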

The file will now look like this:

<p class="indexEntry"><a href="Chapter4.xhtml#x1">animals</a></p>
<p class="indexSubEntry">cats:</p>
<p class="indexSubSubSubEntry"><a href="Chapter4.xhtml#x2">breeding</a></p>
<p class="indexSubSubSubEntry"><a href="Chapter4.xhtml#x3">care of</a></p>
<p class="indexSubSubSubEntry">the <a href="Chapter4.xhtml#x4">musical</a></p>
<p class="indexSubEntry"><a href="Chapter2.xhtml#x5">cobras</a></p>
<p class="indexSubEntry">dogs:</p>
<p class="indexSubSubSubEntry">antipathy towards <a href="Chapter4.xhtml#x8">cats</a></p>
<p class="indexSubSubSubEntry"><a href="Chapter4.xhtml#x7">loyalty</a></p>
<p class="indexSubSubSubEntry"><a href="Chapter4.xhtml#x6">training</a></p>


(The ‘x’ after the hash character is there to make the labels unique. It also keeps them valid: in the XHTML used by EPUB 2, an id must not begin with a digit.)

Styling your index:

You will also need corresponding styles in your CSS stylesheet, such as:

.indexEntry {font-size: 0.9em; text-align: left; margin-left: 3em; text-indent: -3em;}
.indexSubEntry {font-size: 0.9em; text-align: left; margin-left: 4em; text-indent: -3em;}
.indexSubSubEntry {font-size: 0.9em; text-align: left; margin-left: 5em; text-indent: -3em;}
.indexSubSubSubEntry {font-size: 0.9em; text-align: left; margin-left: 6em; text-indent: -3em;}

(These styles are left-aligned, 90% of the size of the running text and have a hanging indent of 3em. Each sub-level is indented a further 1em, so the continuation line (if any) will be 2em to the right of the next sub-level.)
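The arithmetic behind the hanging indent can be checked quickly: the first line of a paragraph starts at margin-left plus text-indent, and any continuation line starts at margin-left. The short Python sketch below simply works this out for the four styles above; the values are copied from the stylesheet and the function name `line_positions` is my own.

```python
# (margin-left, text-indent) in em, copied from the stylesheet above.
STYLES = {
    "indexEntry": (3, -3),
    "indexSubEntry": (4, -3),
    "indexSubSubEntry": (5, -3),
    "indexSubSubSubEntry": (6, -3),
}

def line_positions(margin_left, text_indent):
    """Return (first line, continuation line) offsets from the left edge, in em."""
    return margin_left + text_indent, margin_left

for name, (margin, indent) in STYLES.items():
    first, continuation = line_positions(margin, indent)
    print(f"{name}: first line at {first}em, continuation at {continuation}em")
```

So a main entry begins at 0em and wraps to 3em, while a sub-entry begins at 1em: the wrapped line sits 2em to the right of the next level down, as described.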

The final index will look like this:

animals
   cats:
         breeding
         care of
         the musical
   cobras
   dogs:
         antipathy towards cats
         loyalty
         training

Creating the labels:

You are now ready to make the labels in the chapters for the links in the index to jump to. EACH page reference in the index should match a markup code created by MS Word. Use the bookmarks you created earlier to find the place in the chapter where these are and mark each one up like this in the MS Word file:

… and although a dog’s innate /xl/8/xxl/antipathy towards cats/xxxl/{·XE·"animals:dogs"·} is well …

The 8 hilited in yellow should match the number you gave to the index entry when you marked it up, and will also be stored in the bookmark name: ‘index8’, etc. Leave the invisible markup code: {·XE·"animals:dogs"·} which Word has created intact. It will generally follow the text you hilited when it was created, so the most logical place to markup your file is immediately before the hidden code, as shown above.

Place the markup: /xl/8/xxl//xxxl/ around a short bit of the text in the chapter just before the MS Word markup.

You will be using a <span> tag to do the label, so: /xl/8/xxl/ and /xxxl/ will be replaced by: <span id="x8"> and </span>. DO NOT break up any preexisting markup which lies within. So keep /i//ii/ for italic as a single unit within your new markup: … /xl/1/xxl//i//ii//xxxl/

You might want to leave this step until after you have marked up the main text for formatting etc., as the markup for the labels will be a distraction.

After Importing the text:

When your chapter has been imported into your e-book as unformatted text, it will look like this:

<p>… /xl/8/xxl/antipathy towards cats/xxxl/ …</p>

Notice that the hidden index code which was created by MS Word has gone. I have also left out the surrounding text, subsuming it in the ellipses: (…).

Replacing the markup for the labels:

You can now make the labels by using Find and Replace in Sigil to:

replace /xl/ with <span id="x
and replace /xxl/ with ">
and replace /xxxl/ with </span>

This will produce:

<p>… <span id="x8">antipathy towards cats</span> …</p>

Clicking the link you created in the index will now jump to Chapter4.xhtml and place the <span> with id="x8" at the top of the e-reader screen.
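With this many links it is easy to miss a label, so a quick cross-check of the index against the chapters may be worthwhile. The sketch below is an assumption-laden illustration in Python rather than a feature of Sigil: `make_labels` repeats the three label replacements above, and `missing_targets` (a hypothetical helper) lists every link in the index whose label cannot be found in the chapter file it points to. It relies on simple pattern matching, not a full XHTML parser.

```python
import re

# Ordered (find, replace) pairs for the labels, copied from the list above.
LABEL_REPLACEMENTS = [
    ("/xl/", '<span id="x'),
    ("/xxl/", '">'),
    ("/xxxl/", "</span>"),
]

HREF = re.compile(r'href="([^"#]+)#([^"]+)"')
ID = re.compile(r'id="([^"]+)"')

def make_labels(text):
    """Turn the label markup in a chapter into <span> tags."""
    for find, replace in LABEL_REPLACEMENTS:
        text = text.replace(find, replace)
    return text

def missing_targets(index_html, chapters):
    """Return (filename, label) pairs linked from the index but absent from the chapters.

    chapters is a dict mapping each chapter filename to its XHTML text.
    """
    missing = []
    for filename, label in HREF.findall(index_html):
        labels = set(ID.findall(chapters.get(filename, "")))
        if label not in labels:
            missing.append((filename, label))
    return missing

chapter4 = make_labels("<p>… /xl/8/xxl/antipathy towards cats/xxxl/ …</p>")
index_html = (
    '<a href="Chapter4.xhtml#x8">cats</a>'
    '<a href="Chapter4.xhtml#x7">loyalty</a>'
)
print(chapter4)
print(missing_targets(index_html, {"Chapter4.xhtml": chapter4}))
```

In this tiny example only label 8 has been created, so the link to label 7 is reported as missing.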

IMPORTANT: Obviously, all of the foregoing assumes you have chapters named ‘Chapter1.xhtml’, ‘Chapter2.xhtml’ etc. Trust me, this really IS the easiest way to do it. It makes creating the html table of contents vastly easier than having custom filenames. The only alternative, if you have an index and also have custom filenames for the chapters, is to replace the chapter number (hilited in green above) with the entire custom chapter filename (minus the .extension) and make suitable changes to the other replacements to match. (Although at a pinch I suppose you might create the links using Chapter1.xhtml etc. for the coding and then subsequently replace all instances of ‘Chapter1.xhtml’ and so on with the actual filenames, but that seems an incredibly convoluted way of going about it.) The scope for error creeping into the process is enormous! Given the complexity of the task of creating and then linking the index, I think the method I have outlined is the least likely to go wrong. The actual filename used for each chapter doesn’t matter: the reader never sees it. What the reader sees is the chapter title at the top of the page and the text used for the link in the html table of contents. The actual filename is immaterial. So go with a simple scheme as I have and things will be easier.
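If you really are stuck with custom filenames, one way to picture the alternative is a lookup table from chapter number to filename, applied while the links are built. This is only a sketch in Python under invented assumptions: the filenames in `CHAPTER_FILES` are made up for illustration, and `convert` is a hypothetical helper, not part of Sigil.

```python
import re

# Hypothetical mapping from chapter number to custom filename (invented for illustration).
CHAPTER_FILES = {2: "Reptiles.xhtml", 4: "Mammals.xhtml"}

ANNOTATION = re.compile(r"\{Ch(\d+)#(\d+)_([^}]*)\}")

def link_for(match):
    """Build the hyperlink for one annotation, looking up the chapter's filename."""
    chapter, number, text = match.groups()
    filename = CHAPTER_FILES[int(chapter)]
    return f'<a href="{filename}#x{number}">{text}</a>'

def convert(line):
    """Replace every annotation in a line with its hyperlink."""
    return ANNOTATION.sub(link_for, line)

print(convert("antipathy towards {Ch4#8_cats}"))
print(convert("{Ch2#5_cobras}"))
```

Even with a helper like this, every filename has to be typed correctly into the table, which is exactly the kind of extra step where errors creep in.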

The mechanics of making a RUN-ON index:

The run-on style is somewhat simpler than set-out, but paradoxically is more complex to convert. The explanation below builds on how to create the set-out index, so I’m afraid you'll need to read that FIRST.

There is no need to restructure your index, so gather your information and enter it into the index directly. It will be harder to follow because it is run-on, but that’s unavoidable. Place the curly brackets in an analogous place, BEFORE any heading or page reference and proceed as above. The outcome will be:

Oxford style:

{Ch4#1_animals}  234–98
cats  {Ch4#2_breeding}267–86, {Ch4#3_care of}273, {Ch4#4_the musical}281
{Ch2#5_cobras} 110–14
dogs  {Ch4#6_training}236, {Ch4#7_loyalty}239–47, {Ch4#8_antipathy towards cats}240

Chicago style:

{Ch4#1_animals}, 234–98
cats, {Ch4#2_breeding}267–86, {Ch4#3_care of}273, {Ch4#4_the musical}281
{Ch2#5_cobras}, 110–14
dogs, {Ch4#6_training}236, {Ch4#7_loyalty}239–47, {Ch4#8_antipathy towards cats}240

After deleting the page numbers, correcting the alphabetisation and adding the markup, the outcome will be:

Oxford style:

/x/{Ch4#1_animals}
/xx/cats  {Ch4#2_breeding}, {Ch4#3_care of}, {Ch4#4_the musical}
/xx/{Ch2#5_cobras}
/xx/dogs  {Ch4#8_antipathy towards cats}, {Ch4#7_loyalty}, {Ch4#6_training}

Chicago style:

/x/{Ch4#1_animals}
/xx/cats, {Ch4#2_breeding}, {Ch4#3_care of}, {Ch4#4_the musical}
/xx/{Ch2#5_cobras}
/xx/dogs, {Ch4#8_antipathy towards cats}, {Ch4#7_loyalty}, {Ch4#6_training}

DO NOT adjust the beginnings or endings of the links. Remember that in a RUN-ON e-book index I recommend that the WHOLE of the link text is hyperlinked, so that the items in the index remain distinct from one another.

You would then proceed to make the same replacements and you would also construct the links in the main text in exactly the same way as described above.

And that covers, in exhaustive detail, how to adapt your print index to suit an e-book. It looks like a lot of work, and it is. But making an index always was going to be difficult and exacting work. If you want one, that’s how you would go about it.

Next Steps: My next post will complete the sequence with a discussion of how to adapt the cross-references for e-book. These are stylistically more complex and needed a separate post. Not least because I have managed to pull off the same trick and the cross-references in a set-out index will also be the same whether your index began in Oxford or Chicago style. I stress that this was NOT an intended outcome: all adaptations to the original style are actually necessary to accommodate the set-out index style using links rather than page references. Later posts will cover completing the metadata in your e-book and testing it using the tools in Sigil and epubcheck. It will then be finished and ready to convert to Kindle.

Index to ‘how to …’ posts:

How to ‘unpack’ an epub file to edit the contents and see what’s inside.
How to understand what is inside an epub
How to link the html table of Contents in a Kindle e-book
How to restructure the html table of contents for a Kindle
How to delete the html cover for a Kindle ebook
How to link the cover IMAGE in a Kindle e-book
How to clean up your MS Word file before your get started
How to markup an MS Word file to identify the formats before importing it into an epub
How to create a new blank e-pub using Sigil
How to import your marked-up MS Word file into your ebook using Sigil
How to create and link a CSS stylesheet in an e-book using Sigil
How to replace the markup with CSS styles in your ebook using Sigil
How to style an e-book so it works with the limited CSS styling available to Kindle e-readers
How to understand the syntax of CSS
How to style Small Caps in an e-book
How to split your ebook up into chapters using Sigil
How to sequence your e-book
How to phrase the copyright declarations etc. in an e-book
How to generate the logical table of contents using Sigil
How to understand toc.ncx in an e-book
How to generate the html table of contents in an e-pub
How to style the html table of contents using CSS
How to create an html cover for your epub using Sigil
How to present references and notes in a book
How to use Mark Up to link notes in your e-book
How to present a bibliography in a book
How to use markup to link entries in a bibliography with the notes section
How to index an e-book
How to use the tools in MS Word to create an index
How to alphabetise an index or bibliography
How to adapt the print index in your MS Word file for an e-book using markup
How to adapt cross-references in your print index for e-book and how to use markup to make the links
How to understand content.opf
How to understand and edit the Metadata of an ebook using Sigil
How to understand the manifest in content.opf
How to understand the spine and guide in content.opf
How to test your e-pub using flightCrew in Sigil

TinyURL for this post: http://tinyurl.com/qjamqns
