Re: E-readers
Reply #74 –
The problem lies in the chapter markers. For some reason the "odalszámok" entry in the TOC (presumably Hungarian "oldalszámok", i.e. page numbers) includes a reference to every single page in the book. That means every page gets a TOC marker on the progress bar, so the entire bar ends up colored in.
Here's a quick fix, but note that you should be able to do the same thing with pdfmod in a GUI:
# dump the metadata to a file called info
pdftk ActaOrientalia_41.pdf dump_data output info
# delete lines 221 through 2359 and save the result in a file called info.tmp
sed -e '221,2359d' info > info.tmp
# create a new PDF with the updated PDF metadata from info.tmp
pdftk ActaOrientalia_41.pdf update_info info.tmp output ActaOrientalia_41_fixed_toc.pdf
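The line numbers 221 through 2359 only apply to this particular dump. As a rough sketch of a more general approach, the filter below drops every bookmark nested under a given TOC entry and keeps the entry itself. The function name strip_nested_bookmarks is made up for illustration, and it assumes the usual dump_data layout (BookmarkBegin followed by BookmarkTitle, BookmarkLevel and BookmarkPageNumber lines) with children listed directly after their parent. The title has to be given exactly as it appears in the dump, where non-ASCII characters may show up as numeric entities.

```shell
# Hypothetical helper: reads pdftk dump_data output on stdin and removes all
# bookmarks nested below the bookmark whose title equals $1.
strip_nested_bookmarks() {
  awk -v title="$1" '
    # Decide what to do with the bookmark record buffered so far.
    function flush(   i, drop) {
      if (n == 0) return
      drop = 0
      if (parent > 0 && level > parent) {
        drop = 1                              # still below the matched entry
      } else {
        parent = 0                            # left the matched subtree
        if (rtitle == title) parent = level   # found the entry to prune
      }
      if (!drop) for (i = 1; i <= n; i++) print buf[i]
      n = 0
    }
    /^BookmarkBegin$/ { flush(); buf[++n] = $0; rtitle = ""; level = 0; next }
    n > 0 && /^Bookmark/ {
      buf[++n] = $0
      if ($0 ~ /^BookmarkTitle: /) rtitle = substr($0, 16)
      if ($0 ~ /^BookmarkLevel: /) level = substr($0, 16) + 0
      next
    }
    { flush(); print }                        # non-bookmark lines pass through
    END { flush() }
  '
}

# Possible usage in place of the sed step (copy the title from the info file):
#   strip_nested_bookmarks 'TITLE AS SHOWN IN info' < info > info.tmp
```

It's worth comparing info and info.tmp (for example with diff) before running update_info, since this has only been thought through for a simple two-level TOC.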
It's possible that something should be fixed on KOReader's end as well to prevent this situation from occurring, but in this case that would mean displaying only the top-level TOC markers, thereby significantly reducing their utility.
After you've done that, you could also consider:
# -j0 automatically selects the number of threads
pdf2djvu -j0 ActaOrientalia_41_fixed_toc.pdf > ActaOrientalia_41_fixed_toc.djvu
It'll take a while and it could introduce a certain degree of quality degradation, but the result will generally handle much better for reading. [Edit: in this case the result is somewhat disappointing, at least with the default settings, except that the resulting file is a mere 58.3 MB.]
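If the default settings look too rough, one knob that may be worth trying is the rendering resolution. This is only a sketch: to_djvu is a made-up helper name, 600 is an arbitrary example value rather than a recommendation, and I haven't checked whether it actually improves this particular file. A higher resolution will also make the conversion slower and the output larger.

```shell
# Hypothetical wrapper around pdf2djvu with an explicit resolution.
# $1: input PDF, $2: resolution in dpi (optional, defaults to 600 here)
to_djvu() {
  djvu_in=$1
  djvu_dpi=${2:-600}
  djvu_out=${djvu_in%.pdf}.djvu
  # -d sets the resolution, -j0 selects the number of threads automatically,
  # -o writes to the named file instead of standard output
  pdf2djvu -d "$djvu_dpi" -j0 -o "$djvu_out" "$djvu_in"
}

# Possible usage:
#   to_djvu ActaOrientalia_41_fixed_toc.pdf 600
```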
PS: Also see https://github.com/edouard-lopez/pdf2djvu-ocr, which could be interesting if you want to run some OCR to boot. Of course, just running pdfsandwich and pdf2djvu manually isn't that big of a deal, but the less you have to do the better.