Both comicvine.py and douban.py return None instead of [] when an HTTP
error occurs during a metadata search. The consumer in search_metadata.py
iterates the result directly, which raises TypeError: 'NoneType' object
is not iterable and crashes the server on single-threaded deployments.
Fixes #3606
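
A minimal sketch of the provider-side behaviour described above. The
`fetch` callable and the use of OSError as the failure type are
hypothetical stand-ins for the real HTTP request and the error the
providers catch; only the "return [] instead of None" part comes from
the description.

```python
from typing import Callable, Dict, List


def search(query: str, fetch: Callable[[str], List[Dict]]) -> List[Dict]:
    """Return search results, or an empty list when the request fails."""
    try:
        return fetch(query)
    except OSError:
        # Stand-in for the HTTP error caught by the real providers.
        # Returning [] keeps the caller's for-loop safe; None would not.
        return []


def failing_fetch(query: str) -> List[Dict]:
    """Hypothetical fetcher that always fails, to exercise the error path."""
    raise OSError("simulated HTTP failure")


# The consumer can iterate the result without a None check.
titles = [record["title"] for record in search("dune", failing_fetch)]
```
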
Books.atom_timestamp returned Books.timestamp (date added), which is
set at import and never changes. OPDS clients use atom:updated to
decide whether a book has changed on the server, so cover swaps,
metadata edits, and any other post-import change were invisible to
sync clients; they would keep serving the stale cover and title
until a manual refresh.
Atom RFC 4287 defines updated as "the most recent instant in time
when an entry or feed was modified", and Calibre already tracks that
field as last_modified, bumping it on every metadata and cover edit.
Switching the property to return last_modified (with a fallback to
timestamp when last_modified is NULL) aligns Calibre-Web's behaviour
with the Atom contract.
This change only affects the OPDS feed's atom:updated element. Kobo
sync uses its own last_modified comparison path, so it is unaffected.
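
The fallback can be sketched as follows. `Book` here is a plain
stand-in class rather than the real ORM model, and the strftime format
is an assumption about how the property renders the value.

```python
from datetime import datetime, timezone
from typing import Optional


class Book:
    """Plain stand-in for the Books model; not the real ORM class."""

    def __init__(self, timestamp: datetime, last_modified: Optional[datetime]):
        self.timestamp = timestamp          # date added, set at import
        self.last_modified = last_modified  # bumped on metadata/cover edits

    @property
    def atom_timestamp(self) -> str:
        # Prefer last_modified; fall back to timestamp when it is NULL/None.
        moment = self.last_modified or self.timestamp
        return moment.strftime("%Y-%m-%dT%H:%M:%S+00:00")


added = datetime(2020, 1, 1, tzinfo=timezone.utc)
edited = datetime(2024, 6, 1, 12, 30, tzinfo=timezone.utc)
```
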
calibredb export writes <uuid>.jpg next to <uuid>.<format> by
default, leaving an unused cover image in /tmp/calibre_web for
every download that goes through do_calibre_export. The cover
file is never read by calibre-web (cover serving uses a separate
path), so it is a pure waste of disk space and I/O.
Pass --dont-save-cover, paired with the existing --dont-write-opf,
to skip the unwanted side-effect at the source. Verified that
none of the three do_calibre_export call sites (download via
do_download_file, email via tasks/mail.py, kepubify via
tasks/convert.py) read the cover sibling.
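
A rough sketch of the resulting command line. `build_export_args` is a
hypothetical helper, and the argument order and example values are
assumptions; only the two --dont-* flags are taken from the description
above.

```python
from typing import List


def build_export_args(calibredb: str, library: str, out_dir: str,
                      book_uuid: str, book_id: int) -> List[str]:
    """Assemble a calibredb export command that writes only the book file."""
    return [
        calibredb, "export",
        "--dont-write-opf",    # existing flag: skip the OPF metadata file
        "--dont-save-cover",   # added flag: skip the <uuid>.jpg sibling
        "--with-library", library,
        "--to-dir", out_dir,
        "--template", book_uuid,
        str(book_id),
    ]


args = build_export_args("calibredb", "/books", "/tmp/calibre_web",
                         "abc-123", 42)
```
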
When config_embed_metadata is enabled with config_binariesdir or
config_kepubifypath set, do_download_file stages a copy of every
downloaded book under get_temp_dir() but never removes it. A bulk
OPDS or Kobo sync can fill the host filesystem in minutes; the
effect is amplified for comic formats (CBZ/CBR), where each
staged file can run to hundreds of MB or several GB. Once the
disk fills, downloads silently 404 and a container restart does
not recover the space.
Add an after_this_request hook that removes the staged copy after
the response is sent, gated on filename == get_temp_dir() so the
non-embed and gdrive-direct paths are untouched.
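
The gated cleanup might look roughly like this. `remove_staged_copy`
and its parameters are hypothetical; the Flask hook itself is left out
so the sketch needs only the standard library.

```python
import os


def remove_staged_copy(directory: str, file_name: str, temp_dir: str) -> bool:
    """Delete a staged download, but only when it lives in the temp dir."""
    # Gate: files served from any other directory are left alone.
    if os.path.realpath(directory) != os.path.realpath(temp_dir):
        return False
    try:
        os.remove(os.path.join(directory, file_name))
    except FileNotFoundError:
        return False  # already gone; nothing to clean up
    return True
```

In a Flask view this would be called from a function registered with
flask.after_this_request, which receives the response object and must
return it.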
With the typical Linux umask of 0022, the encrypted file is created
world-readable (-rw-r--r--). Any OS-level user on the same system can read the
key and decrypt the encrypted credentials from app.db.
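
One possible mitigation, not stated above and shown only as a sketch
for POSIX systems: create the file with an explicit 0o600 mode so the
umask cannot leave it readable by other users. `write_private_file` and
`permission_bits` are hypothetical helper names.

```python
import os
import stat


def write_private_file(path: str, data: bytes) -> None:
    """Create (or overwrite) a file readable and writable by its owner only."""
    # Mode 0o600 survives a 0022 umask, unlike open()'s default of 0o666.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    # Tighten the mode explicitly in case the file already existed.
    os.chmod(path, 0o600)


def permission_bits(path: str) -> int:
    """Return only the permission bits of the file's mode."""
    return stat.S_IMODE(os.stat(path).st_mode)
```
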
The `serve_book` function uses `get_book()` which performs no access filtering:
it simply fetches by ID. Compare with `read_book` at web.py:1562 which
correctly uses `get_filtered_book()`. The `common_filters()` function enforces
per-user tag restrictions, language restrictions, and hidden-book rules.
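
The difference between the two lookups can be illustrated with plain
data. Everything below is a hypothetical stand-in (the real functions
query the database and take different arguments), and only a denied-tag
rule is modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Book:
    id: int
    tags: Set[str] = field(default_factory=set)


@dataclass
class User:
    denied_tags: Set[str] = field(default_factory=set)


def get_book(books: Dict[int, Book], book_id: int) -> Optional[Book]:
    # Unfiltered: any caller who knows the ID gets the book.
    return books.get(book_id)


def get_filtered_book(books: Dict[int, Book], book_id: int,
                      user: User) -> Optional[Book]:
    # Filtered: a book carrying a tag denied to this user is treated as absent.
    book = books.get(book_id)
    if book is None or book.tags & user.denied_tags:
        return None
    return book


library = {1: Book(1, {"public"}), 2: Book(2, {"restricted"})}
reader = User(denied_tags={"restricted"})
```
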
hashlib.md5(dbpath) returns a hash object, not a hex string. Comparing a string
(md5Checksum) to a hash object with != always returns True. This means the
DB-replacement code path is always entered, allowing an attacker who sends a
forged notification (with the known static token) to trigger an arbitrary
metadata.db download from GDrive, replacing the live database.
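
A short demonstration of the comparison bug and one likely correction
(comparing the hex digest instead of the hash object). The helper name
`checksum_differs` and the sample payload are illustrative only.

```python
import hashlib


def checksum_differs(data: bytes, expected_hex: str) -> bool:
    """Compare the hex digest, not the hash object, with the expected string."""
    return hashlib.md5(data).hexdigest() != expected_hex


payload = b"metadata.db contents"
expected = hashlib.md5(payload).hexdigest()

# Buggy form: a hash object never equals a str, so this is always True.
buggy_result = hashlib.md5(payload) != expected
# Corrected form: identical content yields False.
fixed_result = checksum_differs(payload, expected)
```
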
When an OAuth provider_user_id is already linked to User A, and User B
authenticates with the same OAuth identity, User B is silently logged in as
User A. This is by design for single-user OAuth, but in a multi-user
environment it means: if an attacker gains access to the same OAuth provider
account (e.g., a shared GitHub org account, or by compromising the OAuth
provider), they can log in as the linked Calibre-Web user with no password
needed.
The lxml.etree.fromstring() function uses the default XML parser, which resolves
external entities because XML handling defaults in Python suck. There is no
need for such dangerous misfeatures in calibre-web, so let's disable them.
A user able to upload epub/fb2 could add something like this to the file:
```xml
<?xml version="1.0"?>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<container><rootfiles><rootfile full-path="&xxe;"/></rootfiles></container>
```
and obtain the content of the `/etc/passwd` file, which is bad™.
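
A hardened parser could be set up roughly as below. This is a sketch,
not necessarily the exact change made: `parse_untrusted_xml` is a
hypothetical helper, and the sample payload places the entity in
element content for simplicity.

```python
from lxml import etree


def parse_untrusted_xml(data: bytes):
    """Parse XML without expanding entities or touching the network."""
    parser = etree.XMLParser(resolve_entities=False, no_network=True)
    return etree.fromstring(data, parser=parser)


payload = (
    b'<?xml version="1.0"?>'
    b'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
    b"<container>&xxe;</container>"
)
root = parse_untrusted_xml(payload)
# The entity is left unexpanded, so no file content ends up in the tree.
serialized = etree.tostring(root)
```
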
In series_list(), the SQLite query correctly orders results by
Series.sort, but a subsequent Python sorted() call (needed to
re-order after appending the "None" category entry) was using
Series.name as the sort key instead of Series.sort.
This caused series titles with leading articles (A, An, The) to
sort strictly alphabetically by the article rather than by the
meaningful word, e.g. "A Collins-Burke Mystery" appeared under
"A" instead of "C".
Fix by using Series.sort (with a fallback to Series.name if sort
is NULL) as the key in the Python re-sort, consistent with the
intent of the existing DB query.
Fixes #3583
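
The re-sort key can be sketched like this. SimpleNamespace objects
stand in for Series rows, and the sample titles other than the one
quoted above are made up for illustration.

```python
from types import SimpleNamespace


def series_sort_key(series) -> str:
    # Prefer the sort column; fall back to name when sort is NULL/None.
    return series.sort or series.name


entries = [
    SimpleNamespace(name="The Expanse", sort="Expanse, The"),
    SimpleNamespace(name="Dune", sort=None),
    SimpleNamespace(name="A Collins-Burke Mystery",
                    sort="Collins-Burke Mystery, A"),
]
ordered = [s.name for s in sorted(entries, key=series_sort_key)]
```
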
request_username() is used as flask-limiter's key_func for the OPDS
blueprint. The limiter evaluates key_func in a before_request handler,
before the route's auth decorator runs. When no Authorization header is
present, request.authorization is None, causing an AttributeError and
a 500 response instead of the expected 401.
Guard against None so unauthenticated requests fall back to an empty
string key, allowing the auth decorator to handle the 401 correctly.
Fixes #3592
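
A sketch of the guard. The real key_func takes no arguments and reads
request.authorization; here the authorization object is passed in as a
parameter (an assumption made so the example runs without Flask).

```python
from types import SimpleNamespace
from typing import Optional


def request_username(authorization: Optional[object]) -> str:
    """Rate-limit key: the auth username, or "" when no header was sent."""
    if authorization is None:
        # No Authorization header: use an empty key and let the
        # route's auth decorator produce the 401.
        return ""
    return authorization.username or ""


anonymous_key = request_username(None)
named_key = request_username(SimpleNamespace(username="alice"))
```
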
Disclaimer: AI assisted—humans supervised.