Calibre Plugin Development

2021-12-18 15:24

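I have been managing my e-books with calibre-web lately, but I still often need the Calibre desktop application for things such as batch management and editing e-books. In calibre-web I already use calibre-web-douban-api to search Douban for metadata, but there was no way to use it in the desktop version of Calibre. Calibre does support plugins, though, and they are written in Python, so calibre-web-douban-api can be adapted and packaged as a Calibre plugin. A simple metadata plugin is fairly easy to write.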

Reference documentation

https://manual.calibre-ebook.com/creating_plugins.html#pluginstutorial

This assumes Calibre is already installed locally.

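Basic plugin development

Developing a HelloWorld plugin

Create a new Python project folder.

Create __init__.py; the file must have exactly this name.

Following the HelloWorld example, this is a file type plugin, so it extends FileTypePlugin: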

import os  # needed by run() for os.path.splitext

from calibre.customize import FileTypePlugin

class HelloWorld(FileTypePlugin):

    name                = 'Hello World Plugin' # Name of the plugin
    description         = 'Set the publisher to Hello World for all new conversions'
    supported_platforms = ['windows', 'osx', 'linux'] # Platforms this plugin will run on
    author              = 'Acme Inc.' # The author of this plugin
    version             = (1, 0, 0)   # The version number of this plugin
    file_types          = set(['epub', 'mobi']) # The file types that this plugin will be applied to
    on_postprocess      = True # Run this plugin after conversion is complete
    minimum_calibre_version = (0, 7, 53)

    def run(self, path_to_ebook):
        from calibre.ebooks.metadata.meta import get_metadata, set_metadata
        with open(path_to_ebook, 'r+b') as file:
            ext  = os.path.splitext(path_to_ebook)[-1][1:].lower()
            mi = get_metadata(file, ext)
            mi.publisher = 'Hello World'
            set_metadata(file, mi, ext)
        return path_to_ebook

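Installing the plugin from the command line

Run the following command in the project directory created above:

calibre-customize -b .
# Output: Plugin updated: Hello World Plugin (1, 0, 0)

Then check in Calibre:

The plugin is now installed in Calibre.

This plugin runs when a book is converted to another format.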

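Developing a metadata plugin

Now we can start on the metadata plugin.

You can download the Calibre source code first and use its other metadata plugins as a reference:

Calibre source code: https://github.com/kovidgoyal/calibre

You can also refer to a Douban plugin written by someone else: https://github.com/jnozsc/calibre-douban

Converting to a metadata plugin

Now the HelloWorld plugin can be reworked. A metadata plugin must extend calibre.ebooks.metadata.sources.base.Source and mainly needs to implement the identify and download_cover methods.

A basic skeleton looks like this: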

import datetime

from calibre.ebooks.metadata.book.base import Metadata
from calibre.ebooks.metadata.sources.base import Source, Option


class NewDouban(Source):
    name = 'New Test Books'  # Name of the plugin
    description = 'Downloads metadata and covers from Douban Books web site.'
    supported_platforms = ['windows', 'osx', 'linux']  # Platforms this plugin will run on
    author = 'Gary Fu'  # The author of this plugin
    version = (1, 0, 0)  # The version number of this plugin
    minimum_calibre_version = (5, 0, 0)
    capabilities = frozenset(['identify', 'cover'])
    touched_fields = frozenset([
        'title', 'authors', 'tags', 'pubdate', 'comments', 'publisher',
        'identifier:isbn', 'rating', 'identifier:new_douban'
    ])  # language currently disabled
    options = ()

    def identify(
            self,
            log,
            result_queue,
            abort,
            title=None,
            authors=None,
            identifiers={},
            timeout=30):
        # Placeholder: always return one hard-coded book
        result_queue.put(get_test_book())


def get_test_book():
    book = Metadata('深入理解计算机系统（原书第3版）', ['龚奕利', '贺莲'])
    book.identifiers = {'new_douban': '26912767'}
    book.pubdate = datetime.date(2016, 11, 1)
    book.publisher = '机械工业出版社'
    return book

if __name__ == "__main__":
    # To run these tests use: calibre-debug -e ./__init__.py
    from calibre.ebooks.metadata.sources.test import (
        test_identify_plugin, title_test, authors_test
    )
    test_identify_plugin(
        NewDouban.name, [
            ({
                 'identifiers': {
                     'isbn': '9787111544937'
                 },
                 'title': '深入理解计算机系统（原书第3版）'
             }, [title_test('深入理解计算机系统（原书第3版）', exact=True),
                 authors_test(['龚奕利', '贺莲'])])
        ]
    )

Testing the plugin locally:

# Build and install the plugin
calibre-customize -b .
# If the file contains test code, this runs it
calibre-debug -e __init__.py
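
The contract that identify follows, putting each found book onto result_queue and stopping when abort is set, can also be tried outside Calibre. The sketch below is only an illustration using the standard library: FakeBook and fake_search are hypothetical stand-ins (not Calibre classes or functions), and it assumes Calibre passes a queue-like object and an event-like abort flag, which is how the skeleton above uses them.

```python
import queue
import threading
from dataclasses import dataclass, field


@dataclass
class FakeBook:
    """Hypothetical stand-in for calibre's Metadata; not a calibre class."""
    title: str
    authors: list
    identifiers: dict = field(default_factory=dict)


def fake_search(title):
    """Hypothetical search backend returning raw records for a title."""
    return [
        {'id': '1', 'title': title, 'authors': ['Author A']},
        {'id': '2', 'title': title, 'authors': ['Author B']},
    ]


def identify(result_queue, abort, title=None):
    """Put one result per match on the queue, stopping early if aborted."""
    for record in fake_search(title):
        if abort.is_set():
            break
        book = FakeBook(record['title'], record['authors'],
                        {'new_douban': record['id']})
        result_queue.put(book)


def drain(result_queue):
    """Collect everything currently on the queue into a list."""
    results = []
    while True:
        try:
            results.append(result_queue.get_nowait())
        except queue.Empty:
            return results


result_queue = queue.Queue()
abort = threading.Event()
identify(result_queue, abort, title='Example Title')
books = drain(result_queue)
print(len(books), [b.identifiers['new_douban'] for b in books])
```

Each match is put on the queue individually inside the loop, so a plugin that returns several candidates yields several queue entries rather than one.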

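Filling in the actual plugin code

Using the calibre-web-douban-api code as a reference, I reworked it into a Calibre plugin. The full source is open on GitHub:

Repository: https://github.com/fugary/calibre-douban

Packaging and release

A Calibre plugin is just a zip file, so you can simply zip up __init__.py. A small build.py tool can do this: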

import os
import shutil
import zipfile


def zip_dir(input_path, output_file):
    """Zip everything under input_path, storing paths relative to it."""
    with zipfile.ZipFile(output_file, "w", zipfile.ZIP_DEFLATED) as output_zip:
        for path, dir_names, file_names in os.walk(input_path):
            for filename in file_names:
                full_path = os.path.join(path, filename)
                print('zip adding file %s' % full_path)
                # Archive name relative to input_path: src/test/a.py -> test/a.py
                arc_name = os.path.relpath(full_path, input_path)
                output_zip.write(full_path, arc_name)


if __name__ == "__main__":
    input_path = "src"
    out_path = "out"
    output_file = out_path + "/NewDouban.zip"
    if os.path.exists(out_path):
        print('clean path %s' % out_path)
        shutil.rmtree(out_path)
    os.mkdir(out_path)
    zip_dir(input_path, output_file)
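
Before publishing, it can help to confirm that __init__.py ended up at the top level of the archive rather than under a subfolder such as src/. The following is a minimal sketch of such a check; plugin_zip_problems is a hypothetical helper name, and the demo builds throwaway archives in a temporary directory instead of reading out/NewDouban.zip.

```python
import os
import tempfile
import zipfile


def plugin_zip_problems(zip_path):
    """Return a list of human-readable problems found in a plugin zip."""
    problems = []
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()
    if '__init__.py' not in names:
        problems.append('__init__.py is not at the top level of the archive')
    for name in names:
        # Absolute paths or backslashes usually mean the archive names were built wrongly
        if name.startswith('/') or '\\' in name:
            problems.append('suspicious archive path: %r' % name)
    return problems


# Self-contained demo with two temporary archives (hypothetical contents)
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, 'good.zip')
    with zipfile.ZipFile(good, 'w', zipfile.ZIP_DEFLATED) as zf:
        zf.writestr('__init__.py', '# plugin code\n')
    bad = os.path.join(tmp, 'bad.zip')
    with zipfile.ZipFile(bad, 'w', zipfile.ZIP_DEFLATED) as zf:
        zf.writestr('src/__init__.py', '# plugin code\n')
    good_problems = plugin_zip_problems(good)
    bad_problems = plugin_zip_problems(bad)

print(good_problems)
print(bad_problems)
```

To check a real build, call plugin_zip_problems on the generated zip file and treat an empty list as a pass.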

Releases: https://github.com/fugary/calibre-douban/releases

calibre douban

图麻骨

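2 years ago
2023-11-13 12:53:29

A question, if I may: my test data contains several records, but only one shows up in Calibre even though the log still lists several. What is going on? Am I handling the fetched data incorrectly?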
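- gary
  Blog author
  图麻骨
  
  Edited
  
  2 years ago
  2023-11-13 13:57:57
  
  Check whether the result for every book is being added to result_queue. You need to loop over the results and put each one in.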
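- - 图麻骨
    
    gary
    
    2 years ago
    2023-11-13 14:50:21
    
    Thanks for the reply! Below is the log from the metadata download. I do have the loop, and there are clearly two records, yet the list shows only one.
    Running identify query with parameters:
    {'title': '明日的文学', 'authors': ['柳无忌'], 'identifiers': {}, 'timeout': 30}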
    Using plugins: Local Host Books (1, 1, 0)
    The log from individual plugins is below
    ** Local Host Books (1, 1, 0) **
    Found 2 results
    Downloading from Local Host Books took 2.060363531112671
    
    Title : 明日的文学
    Author(s) : 柳无忌
    Publisher : 建文书店
    Languages : zh_CN
    Published : 1943-04-30T16:00:00+00:00
    Comments : 收《明日的文学》、《国家文学的建设及其理论》、《海洋文学论》、《为新文学辩护》、《学术独立与思想自由》、《我们写新诗的态度》、《戏剧与批评》、《文字的力量》、《语言与文学》、《论翻译》、《西洋文学的研究》、《西洋文学与东方头脑》等12篇论文。
    
    Title : 明日的文学
    Author(s) : 张子三
    Publisher : 现代书局
    Languages : zh_CN
    Published : 1929-12-31T16:00:00+00:00
    Comments : 收《文艺与实社会》、《中国文学与社会文学》、《文学的社会阶级观》、《革命文学》等9篇文章。
    
    The identify phase took 2.26 seconds
    The longest time (2.060364) was taken by: Local Host Books
    Merging results from different sources
    We have 1 merged results, merging took: 0.00 seconds
    
  - - gary
      Blog author
      图麻骨
      
      2 years ago
      2023-11-13 14:56:07
      
      Then check whether some unique identifier is missing.
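    - - 图麻骨
        
        gary
        
        Edited
        
        2 years ago
        2023-11-13 17:09:50
        
        Thanks for the pointer. Looking at your source code, isbn and douban_id serve as the unique identifiers. My books are all from the Republic of China era and have no ISBN, so I put the database id into the API as the unique identifier. Even so, only the first returned result is displayed. The log and part of the code are below.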
        Found 2 results
        Downloading from Local Host Books took 2.0510895252227783
        
        Title : 明日的文学
        Author(s) : 柳无忌
        Publisher : 建文书店
        Languages : zh_CN
        Published : 1943-04-30T16:00:00+00:00
        Identifiers : roc_id:1100068
        Comments : 收《明日的文学》、《国家文学的建设及其理论》、《海洋文学论》、《为新文学辩护》、《学术独立与思想自由》、《我们写新诗的态度》、《戏剧与批评》、《文字的力量》、《语言与文学》、《论翻译》、《西洋文学的研究》、《西洋文学与东方头脑》等12篇论文。
        
        Title : 明日的文学
        Author(s) : 张子三
        Publisher : 现代书局
        Languages : zh_CN
        Published : 1929-12-31T16:00:00+00:00
        Identifiers : roc_id:1100041
        Comments : 收《文艺与实社会》、《中国文学与社会文学》、《文学的社会阶级观》、《革命文学》等9篇文章。
        Code that adds the results:
        def identify(self, log, result_queue, abort, title, authors, identifiers={}, timeout=30):
            data = search_meta(title)
            for book_data in data:
                ans = to_meta(book_data)
                if isinstance(ans, Metadata):
                    db = ans.identifiers[ROC_ID]
                    if ans.isbn:
                        self.cache_isbn_to_identifier(ans.isbn, db)
                    if ans.cover:
                        self.cache_identifier_to_cover_url(db, ans.cover)
                    self.clean_downloaded_metadata(ans)
                    result_queue.put(ans)
        