DownloadAllContent

Lightweight web scraping script. Fetches and downloads the main text content of the current page, with special support for novels.


Review: Good - script works

Posted: 06-01-2024

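https://www.twbook.cc Could I ask why novels scraped from this site come out full of rare (incorrect) characters, when the text shows no wrong characters on the original site? Note that you need an accelerator (proxy) to reach this site.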

hoothin (Author)
Posted: 07-01-2024

That's because this site uses a CSS anti-scraping font. The font path is at.alicdn.com/t/c/font_2048323_8o8i7rg3j9w.woff2

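You have to replace each character, one by one, with the glyph it actually renders as. I'd suggest asking ChatGPT to write a script that uses OCR to recognize the disguised glyphs and then replaces them in bulk.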
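For illustration only, here is a rough Python sketch of the batch-replacement step. It assumes the OCR pass has already produced a table from each disguised character to the character its glyph really shows. The name `GLYPH_MAP`, the function `restore_text`, and the two pairs in the table are all made up for this example; they are not taken from the site's font, and the real table would have to be built from that font.

```python
# Hypothetical mapping: disguised character -> character its glyph actually shows.
# These pairs are invented for illustration; the real table has to be built
# by inspecting (or OCR-ing) the glyphs in the site's anti-scraping font.
GLYPH_MAP = {
    "龘": "的",
    "靐": "一",
}


def restore_text(text, glyph_map):
    """Replace every disguised character with the character it renders as."""
    # str.maketrans accepts a dict whose keys are single characters.
    table = str.maketrans(glyph_map)
    # Characters not present in the table pass through unchanged.
    return text.translate(table)


if __name__ == "__main__":
    scraped = "这是靐本龘书"
    print(restore_text(scraped, GLYPH_MAP))
```

Doing the substitution with a single translation table means the whole text is processed in one pass, so one replacement can never be re-replaced by another entry in the table.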
