Convert web pages to one file for ebook

I want to download HTML pages (example: http://www.brpreiss.com/books/opus6/) and join them into one HTML file or some other format that I can use on an ebook reader. Sites with free books don't have standard paging, and they're not blogs or forums, so I don't know how to do automatic crawling and merging.

asked Mar 2, 2011 at 8:30 Hrvoje Hudo

5 Answers

You can use Calibre for your ebook conversion needs. You can get it to make a single ebook from multiple HTML files by linking to them from a single HTML file that you set up as a table of contents.
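As a rough illustration of such a table-of-contents file, here is a minimal Python sketch that writes out a page linking to each downloaded chapter in order. The function name `build_toc` and the file names `page1.html` and `page2.html` are placeholders invented for this example, not anything Calibre requires.

```python
from html import escape


def build_toc(chapters, title="Table of Contents"):
    """Return an HTML page that links to each (label, href) pair in order."""
    items = "\n".join(
        f'    <li><a href="{escape(href, quote=True)}">{escape(label)}</a></li>'
        for label, href in chapters
    )
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        f'<head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        "<body>\n"
        f"  <h1>{escape(title)}</h1>\n"
        "  <ul>\n"
        f"{items}\n"
        "  </ul>\n"
        "</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # Hypothetical file names; replace them with the pages you downloaded.
    chapters = [("Preface", "page1.html"), ("Chapter 1", "page2.html")]
    print(build_toc(chapters))
```

Saving the output as, say, toc.html next to the downloaded pages and adding that file to Calibre should give it one entry point from which to follow the links.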

answered Mar 2, 2011 at 9:37

I'm using Sigil for conversion to EPUB, but didn't know that Calibre can make one ebook from a bunch of linked HTML files. I'll try it, thanks!

Commented Mar 2, 2011 at 10:30

You can use httrack.com to download the webpage(s), then use Calibre to convert them all to an ePub format.

Commented Mar 21, 2011 at 18:47

My process (using Chrome) is to use the Instapaper Text bookmarklet to clean things up a bit, then right-click -> Save As and choose to save as a single web page, HTML only. I then open this in Calibre, convert it to EPUB, and use the Edit Book functionality to tidy up any messy bits of markup that get pulled in.

Commented Jan 30, 2015 at 11:24

I used to do this with Calibre.

That became too much of a pain, though, so I built a Chrome extension to make it easier.

It allows you to build an ebook from your Chrome tabs.

Hope that helps!

answered Apr 30, 2016 at 22:55

The website in your link suggests that the packaging occurs on a third-party server, so privacy is NOT guaranteed with this method.

Commented May 1, 2016 at 1:57

Do you have suggestions for changes that would make you feel more secure? I have done my best to only require the bare minimum information for creating a book, but I'm open to further feedback. If you look at any comparable service, you will find that any content you want to save is sent to a server. The difference is that those services also require an account and have all content associated to your name. They also don't provide source code for their websites to allow you to see what they collect. The extension is open source and I'm happy to answer any questions about that code.

Commented May 6, 2016 at 20:03

What a great tool! Thank you very much for providing it to the community for free!

Commented Apr 14, 2018 at 6:13

It doesn't include images :(

Commented Jan 4, 2021 at 12:31

Putting aside that OP didn't seem to query for any particular privacy/licensing requirement, EpubPress got its backend open-sourced in 2017. And (barring bugs) it should also be able to include pictures. The only caveat if any is that they should be publicly accessible since the conversion process happens independently of your browser credentials.

Commented Feb 26, 2023 at 22:59

Pandoc can take a link to a page (or an HTML file) and convert it to PDF or EPUB.

I'm not sure if it'd crawl. If it doesn't, you could crawl the pages first with wget or something (or just collect the links) and give them to pandoc.
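The "just collect links" route could look roughly like the Python sketch below. It assumes the book has an index page whose links point at the chapters, and that all chapters live under the same base URL. The helper names (`LinkCollector`, `chapter_urls`, `pandoc_command`) and the example.com address are made up for illustration; the pandoc options used are only `-f` (input format) and `-o` (output file).

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def chapter_urls(index_html, base_url):
    """Resolve links to absolute URLs under base_url, dropping duplicates."""
    parser = LinkCollector()
    parser.feed(index_html)
    seen, urls = set(), []
    for href in parser.hrefs:
        # Strip "#fragment" so the same page is not listed twice.
        url, _fragment = urldefrag(urljoin(base_url, href))
        if url.startswith(base_url) and url != base_url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def pandoc_command(urls, output="book.epub"):
    """Build a pandoc invocation that reads the pages, in order, as HTML."""
    return ["pandoc", "-f", "html", "-o", output, *urls]


if __name__ == "__main__":
    # Stand-in for a downloaded index page; the URL is a placeholder.
    sample = '<a href="page1.html">One</a> <a href="page2.html#top">Two</a>'
    print(pandoc_command(chapter_urls(sample, "http://example.com/book/")))
```

The resulting list can be handed to `subprocess.run(..., check=True)` once pandoc is installed. Whether the merged output reads well depends on how much navigation markup each page carries, so some cleanup may still be needed.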

answered Mar 21, 2011 at 17:55 Ananth Pattabiraman

According to the man page, pandoc accepts a URL directly: "Instead of a file, an absolute URI may be given. In this case pandoc will fetch the content using HTTP"

Commented May 18, 2017 at 12:04

It could not crawl crosstocrown.org/books/fourseeds

Commented Jun 2 at 14:46

You can use https://getpocket.com and the pocket recipe in calibre accessible via the "Fetch news" menu.

answered Aug 20, 2015 at 10:20

HTTrack is a good option: it downloads a whole website to a local copy that you can then turn into an ebook. It is available for download from https://www.httrack.com/.

HTTrack "allows you to download a World Wide Web site from the Internet to a local directory, building recursively all directories, getting HTML, images, and other files from the server to your computer. HTTrack arranges the original site's relative link-structure."

You can then convert the HTML into an EPUB, AZW3 or PDF using Calibre, or any other HTML-to-EPUB conversion software.
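If you prefer to script the mirror-then-convert steps, a minimal Python sketch might look like this. It assumes both the `httrack` and `ebook-convert` (shipped with Calibre) command-line tools are on your PATH. The function names and the `entry_page` parameter are invented for this example, and which file inside the mirror is the right starting page is something you would need to check after the download.

```python
import shutil
import subprocess
from pathlib import Path


def mirror_command(url, mirror_dir):
    """httrack invocation that mirrors url into mirror_dir (-O sets the output path)."""
    return ["httrack", url, "-O", str(mirror_dir)]


def convert_command(entry_page, output):
    """ebook-convert picks the output format from the output file's extension."""
    return ["ebook-convert", str(entry_page), str(output)]


def site_to_ebook(url, mirror_dir, entry_page, output):
    """Run both steps; raises if a tool is missing or exits with an error."""
    for tool in ("httrack", "ebook-convert"):
        if shutil.which(tool) is None:
            raise FileNotFoundError(f"{tool} not found on PATH")
    subprocess.run(mirror_command(url, mirror_dir), check=True)
    # entry_page is relative to mirror_dir; its location is an assumption
    # you should verify by looking at the mirrored files.
    subprocess.run(convert_command(Path(mirror_dir) / entry_page, output), check=True)
```

Changing the output name from book.epub to book.azw3 or book.pdf would ask `ebook-convert` for one of the other formats mentioned above.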

A second option, which converts directly to EPUB, is EpubPress. It has extensions for Firefox (v44.0+ only) and Chrome. To use it, open a browser window in which each tab is essentially a 'chapter' of your ebook. Arrange the tabs in the desired order, then activate EpubPress: it will download the tabs and arrange them in that order in .epub format. Hope this helps!

However, note that EpubPress downloads discrete webpages, not a whole 'website' as HTTrack does. To download a website with EpubPress you must open each link on the website as a separate tab, then use EpubPress to collect these tabs into .epub format.