# Administration
## Making backups {#backup}
Multiple options exist for making backups of your paperless instance,
depending on how you installed paperless.
Before making a backup, it's probably best to make sure that paperless is not actively
consuming documents at that time.
Options available to any installation of paperless:
- Use the [document exporter](#exporter). The document exporter exports all your documents,
thumbnails, metadata, and database contents to a specific folder. You may import your
documents and settings into a fresh instance of paperless again or store your
documents in another DMS with this export.
The document exporter is also able to update an already existing
export. Therefore, incremental backups with `rsync` are entirely
possible.
!!! caution
You cannot import the export generated with one version of paperless in
a different version of paperless. The export contains an exact image of
the database, and migrations may change the database layout.
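The incremental approach described above could be scripted roughly as follows. This is only a sketch: the install path `/path/to/paperless`, the destination `/mnt/backup/paperless`, and the `PAPERLESS_DIR`, `BACKUP_DEST` and `DRY_RUN` variables are assumptions made for illustration. By default it prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: refresh the export, then mirror it with rsync.
# PAPERLESS_DIR, BACKUP_DEST and DRY_RUN are hypothetical names for this example.
PAPERLESS_DIR="${PAPERLESS_DIR:-/path/to/paperless}"
BACKUP_DEST="${BACKUP_DEST:-/mnt/backup/paperless}"

# Print the command when DRY_RUN=1 (the default), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

backup_export() {
    run cd "$PAPERLESS_DIR" &&
    # -T avoids "The input device is not a TTY" errors when run from cron.
    run docker compose exec -T webserver document_exporter ../export &&
    # rsync only transfers files that changed since the previous run.
    run rsync -a "$PAPERLESS_DIR/export/" "$BACKUP_DEST/"
}

backup_export
```

Set `DRY_RUN=0` once the printed commands look right for your setup.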
Options available to docker installations:
- Back up the docker volumes. These usually reside within
`/var/lib/docker/volumes` on the host and you need to be root in
order to access them.
Paperless uses 4 volumes:
- `paperless_media`: This is where your documents are stored.
- `paperless_data`: This is where auxiliary data is stored. This
folder also contains the SQLite database, if you use it.
- `paperless_pgdata`: Exists only if you use PostgreSQL and
contains the database.
- `paperless_dbdata`: Exists only if you use MariaDB and contains
the database.
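One common way to archive a named volume without root access to `/var/lib/docker/volumes` is to mount it read-only into a throwaway container. The sketch below only prints such commands. The `alpine` image, the `/data` and `/backup` mount points, and the `volume_backup_cmd` helper are assumptions for illustration, and the containers should be stopped first so the database files are not copied mid-write.

```shell
#!/bin/sh
# Sketch: print one archive command per volume (nothing is executed).
# The alpine image and the /data and /backup mount points are assumptions.
volume_backup_cmd() {
    # $1 = volume name, $2 = host directory that receives the archive
    echo "docker run --rm -v $1:/data:ro -v $2:/backup alpine tar czf /backup/$1.tar.gz -C /data ."
}

# paperless_pgdata (PostgreSQL) or paperless_dbdata (MariaDB) only exist
# when that database is in use; adjust the list to match your installation.
for volume in paperless_media paperless_data paperless_pgdata; do
    volume_backup_cmd "$volume" "$PWD"
done
```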
Options available to bare-metal and non-docker installations:
- Back up the entire paperless folder. This ensures that if your
paperless instance crashes at some point or your disk fails, you can
simply copy the folder back into place and it works.
When using PostgreSQL or MariaDB, you'll also have to back up the
database.
### Restoring {#migrating-restoring}
If you've backed up Paperless-ngx using the [document exporter](#exporter),
restoring can simply be done with the [document importer](#importer).
Of course, other backup strategies require restoring any volumes, folders and database
copies you created in the steps above.
## Updating Paperless {#updating}
### Docker Route {#docker-updating}
If a new release of paperless-ngx is available, upgrading depends on how
you installed paperless-ngx in the first place. The releases are
available at the [release
page](https://github.com/paperless-ngx/paperless-ngx/releases).
First of all, ensure that paperless is stopped.
```shell-session
$ cd /path/to/paperless
$ docker compose down
```
After that, [make a backup](#backup).
1. If you pull the image from the docker hub, all you need to do is:
```shell-session
$ docker compose pull
$ docker compose up
```
The Docker Compose files refer to the `latest` version, which is
always the latest stable release.
1. If you built the image yourself, do the following:
```shell-session
$ git pull
$ docker compose build
$ docker compose up
```
Running `docker compose up` will also apply any new database migrations.
If you see everything working, press CTRL+C once to gracefully stop
paperless. Then you can start paperless-ngx with `-d` to have it run in
the background.
!!! note
In version 0.9.14, the update process was changed. In 0.9.13 and
earlier, the Docker Compose files specified exact versions, so `docker compose pull`
will not automatically update to newer versions. In order to enable
updates as described above, either get the new `docker-compose.yml`
file from
[here](https://github.com/paperless-ngx/paperless-ngx/tree/main/docker/compose)
or edit the `docker-compose.yml` file, find the line that says
```
image: ghcr.io/paperless-ngx/paperless-ngx:0.9.x
```
and replace the version with `latest`:
```
image: ghcr.io/paperless-ngx/paperless-ngx:latest
```
!!! note
In version 1.7.1 and onwards, the Docker image can now be pinned to a
release series. This is often combined with automatic updaters such as
Watchtower to allow safer unattended upgrading to new bugfix releases
only. It is still recommended to always review release notes before
upgrading. To pin your install to a release series, edit the
`docker-compose.yml` file and find the line that says
```
image: ghcr.io/paperless-ngx/paperless-ngx:latest
```
and replace the version with the series you want to track, for
example:
```
image: ghcr.io/paperless-ngx/paperless-ngx:1.7
```
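The edit described in the notes above can also be made non-interactively. The following is a minimal sketch; `pin_image_tag` is a hypothetical helper, not something shipped with paperless-ngx, and it assumes the `image:` line looks exactly like the ones shown above.

```shell
#!/bin/sh
# Sketch: rewrite the paperless-ngx image tag in a compose file.
# pin_image_tag is a hypothetical helper, not part of paperless-ngx.
pin_image_tag() {
    # $1 = compose file, $2 = tag to track (for example 1.7 or latest)
    sed "s|\(image: ghcr.io/paperless-ngx/paperless-ngx:\).*|\1$2|" "$1" > "$1.tmp" &&
    mv "$1.tmp" "$1"
}
```

For example, `pin_image_tag docker-compose.yml 1.7` would switch the file to the 1.7 series. Keep a copy of the original file before trying it.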
### Bare Metal Route {#bare-metal-updating}
After grabbing the new release and unpacking the contents, do the
following:
1. Update dependencies. New paperless versions may require additional
dependencies. The dependencies required are listed in the section
about
[bare metal installations](setup.md#bare_metal).
2. Update python requirements. Remember to activate your virtual
environment first, if you use one.
```shell-session
$ pip install -r requirements.txt
```
!!! note
At times, some dependencies will be removed from requirements.txt.
Comparing the versions and removing no longer needed dependencies
will keep your system or virtual environment clean and prevent
possible conflicts.
3. Migrate the database.
```shell-session
$ cd src
$ python3 manage.py migrate # (1)
```
1. Including `sudo -Hu <paperless_user>` may be required
This might not actually do anything. Not every new paperless version
comes with new database migrations.
### Database Upgrades
In general, paperless does not require a specific version of PostgreSQL or MariaDB and it is
safe to update them to newer versions. However, you should always take a backup and follow
the instructions from your database's documentation for how to upgrade between major versions.
For PostgreSQL, refer to [Upgrading a PostgreSQL Cluster](https://www.postgresql.org/docs/current/upgrading.html).
For MariaDB, refer to [Upgrading MariaDB](https://mariadb.com/kb/en/upgrading/).
## Downgrading Paperless {#downgrade-paperless}
Downgrades are possible. However, some updates also contain database
migrations (these change the layout of the database and may move data).
In order to move back from a version that applied database migrations,
you'll have to revert the database migration _before_ downgrading, and
then downgrade paperless.
This table lists the compatible versions for each database migration
number.
| Migration number | Version range |
| ---------------- | --------------- |
| 1011 | 1.0.0 |
| 1012 | 1.1.0 - 1.2.1 |
| 1014 | 1.3.0 - 1.3.1 |
| 1016 | 1.3.2 - current |
Execute the following management command to migrate your database:
```shell-session
$ python3 manage.py migrate documents <migration number>
```
!!! note
Some migrations cannot be undone. The command will issue errors if that
happens.
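As a small aid, the table above can be turned into a lookup. This is a sketch that encodes only the ranges listed in the table; `migration_for_version` is a hypothetical helper, and versions outside the table are rejected rather than guessed.

```shell
#!/bin/sh
# Sketch: look up the migration number for a target version,
# using only the ranges listed in the table above.
migration_for_version() {
    case "$1" in
        1.0.0)              echo 1011 ;;
        1.1.*|1.2.0|1.2.1)  echo 1012 ;;
        1.3.0|1.3.1)        echo 1014 ;;
        1.3.*)              echo 1016 ;;  # 1.3.2 and later 1.3.x releases
        *)                  echo "not covered by this sketch: $1" >&2; return 1 ;;
    esac
}

# Print the command that would be run before downgrading to 1.2.1.
number=$(migration_for_version 1.2.1) &&
    echo "python3 manage.py migrate documents $number"
```

Always check the result against the table and the release notes before running the printed command.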
## Management utilities {#management-commands}
Paperless comes with some management commands that perform various
maintenance tasks on your paperless instance. You can invoke these
commands in the following way:
With Docker Compose, while paperless is running:
```shell-session
$ cd /path/to/paperless
$ docker compose exec webserver <command> <arguments>
```
With docker, while paperless is running:
```shell-session
$ docker exec -it <container-name> <command> <arguments>
```
Bare metal:
```shell-session
$ cd /path/to/paperless/src
$ python3 manage.py <command> <arguments> # (1)
```
1. Including `sudo -Hu <paperless_user>` may be required
All commands have built-in help, which can be accessed by executing them
with the argument `--help`.
### Document exporter {#exporter}
The document exporter exports all your data (including your settings
and database contents) from paperless into a folder for backup or
migration to another DMS.
If you run the document exporter from a cron job to back up your data,
pass the `-T` flag after `exec` to suppress "The input device
is not a TTY" errors. For example:
`docker compose exec -T webserver document_exporter ../export`
```
document_exporter target [-c] [-d] [-f] [-na] [-nt] [-p] [-sm] [-z] [-zn]
optional arguments:
-c, --compare-checksums
-d, --delete
-f, --use-filename-format
-na, --no-archive
-nt, --no-thumbnail
-p, --use-folder-prefix
-sm, --split-manifest
-z, --zip
-zn, --zip-name
```
`target` is a folder to which the data gets written. This includes
documents, thumbnails and a `manifest.json` file. The manifest contains
all metadata from the database (correspondents, tags, etc).
When you use the provided docker compose script, specify `../export` as
the target. This path inside the container is automatically mounted on
your host on the folder `export`.
If the target directory already exists and contains files, paperless
will assume that the contents of the export directory are a previous
export and will attempt to update the previous export. Paperless will
only export changed and added files. Paperless determines whether a file
has changed by inspecting the file attributes "date/time modified" and
"size". If that does not work out for you, specify `-c` or
`--compare-checksums` and paperless will attempt to compare file
checksums instead. This is slower.
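The difference between the two strategies can be illustrated outside of paperless. The sketch below is not paperless code and its function names are made up for this example: it creates two files with the same size and modification time but different content, so a metadata comparison treats them as identical while a checksum comparison does not.

```shell
#!/bin/sh
# Illustration of the two comparison strategies; this is not paperless code.
same_size_and_mtime() {
    [ "$(wc -c < "$1" | tr -d ' ')" = "$(wc -c < "$2" | tr -d ' ')" ] &&
    [ -z "$(find "$1" -newer "$2")" ] &&
    [ -z "$(find "$2" -newer "$1")" ]
}

same_checksum() {
    [ "$(cksum < "$1")" = "$(cksum < "$2")" ]
}

workdir=$(mktemp -d)
printf 'invoice 2023-01\n' > "$workdir/exported.txt"
printf 'invoice 2023-02\n' > "$workdir/current.txt"
# Give both files the same modification time; they already have the same size.
touch -r "$workdir/exported.txt" "$workdir/current.txt"

if same_size_and_mtime "$workdir/exported.txt" "$workdir/current.txt"; then
    echo "size/mtime: looks unchanged"
fi
if ! same_checksum "$workdir/exported.txt" "$workdir/current.txt"; then
    echo "checksum: changed"
fi
rm -r "$workdir"
```

Cases like this are rare in practice, which is why the faster metadata comparison is a reasonable default and `-c` is available when it is not sufficient.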
Paperless will not remove any existing files in the export directory. If
you want paperless to also remove files that do not belong to the
current export such as files from deleted documents, specify `-d` or `--delete`.
Be careful when pointing paperless to a directory that already contains
other files.
The filenames generated by this command follow the format
`[date created] [correspondent] [title].[extension]`. If you want
paperless to use [`PAPERLESS_FILENAME_FORMAT`](configuration.md#PAPERLESS_FILENAME_FORMAT) for exported filenames
instead, specify `-f` or `--use-filename-format`.
If `-na` or `--no-archive` is provided, no archive files will be exported,
only the original files.
If `-nt` or `--no-thumbnail` is provided, thumbnail files will not be exported.
!!! note
When using the `-na`/`--no-archive` or `-nt`/`--no-thumbnail` options
the exporter will not output these files for backup. After importing,
the [sanity checker](#sanity-checker) will warn about missing thumbnails and archive files
until they are regenerated with `document_thumbnails` or [`document_archiver`](#archiver).
It can make sense to omit these files from backup as their content and checksum
can change (new archiver algorithm) and may then cause additional used space in
a deduplicated backup.
If `-p` or `--use-folder-prefix` is provided, files will be exported
in dedicated folders according to their nature: `archive`, `originals`,
`thumbnails` or `json`.
If `-sm` or `--split-manifest` is provided, information about each document
will be placed in individual JSON files, instead of a single JSON file. The main
`manifest.json` will still contain application-wide information (e.g. tags, correspondents,
document types, etc.).
If `-z` or `--zip` is provided, the export will be a zip file
in the target directory, named according to the current local date or the
value set in `-zn` or `--zip-name`.
!!! warning
If exporting with the file name format, there may be errors due to
your operating system's maximum path lengths. Try adjusting the export
target or consider not using the filename format.
### Document importer {#importer}
The document importer takes the export produced by the [Document
exporter](#exporter) and imports it into paperless.
The importer works just like the exporter. You point it at a directory,
and the script does the rest of the work:
```
document_importer source
```
When you use the provided docker compose script, put the export inside
the `export` folder in your paperless source directory. Specify
`../export` as the `source`.
!!! note
Importing from a previous version of Paperless may work, but for best
results it is suggested to match the versions.
### Document retagger {#retagger}
Say you've imported a few hundred documents and now want to introduce a
tag or set up a new correspondent, and apply its matching to all of the
currently-imported docs. This problem is common enough that there are
tools for it.
```
document_retagger [-h] [-c] [-T] [-t] [-s] [-i] [--id-range] [--use-first] [-f]
optional arguments:
-c, --correspondent
-T, --tags
-t, --document_type
-s, --storage_path
-i, --inbox-only
--id-range
--use-first
-f, --overwrite
```
Run this after changing or adding matching rules. It'll loop over all
of the documents in your database and attempt to match documents
according to the new rules.
Specify any combination of `-c`, `-T`, `-t` and `-s` to have the
retagger perform matching of the specified metadata type. If you don't
specify any of these options, the document retagger won't do anything.
Specify `-i` to have the document retagger work on documents tagged with
inbox tags only. This is useful when you don't want to mess with your
already processed documents.
Specify `--id-range 1 100` to have the document retagger work only on a
specific range of document IDs. This can be useful if you have a lot of
documents and want to test the matching rules only on a subset of
documents.
When multiple document types or correspondents match a single document,
the retagger won't assign these to the document. Specify `--use-first`
to override this behavior and just use the first correspondent or type
it finds. This option does not apply to tags, since any number of tags
can be applied to a document.
Finally, `-f` specifies that you wish to overwrite already assigned
correspondents, types and/or tags. The default behavior is to not assign
correspondents and types to documents that have this data already
assigned. `-f` works differently for tags: By default, only additional
tags get added to documents, no tags will be removed. With `-f`, tags
that don't match a document anymore get removed as well.
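
The options above can be combined into a cautious sequence: test on a small range, then handle the inbox, then overwrite. The sketch below only prints the commands. The `PREFIX` variable and the `retag_plan` function are assumptions for illustration; use whichever invocation style from [Management utilities](#management-commands) matches your installation.

```shell
#!/bin/sh
# Sketch: print a cautious retagging sequence (nothing is executed).
# PREFIX is an assumption; adapt it to your installation.
PREFIX="${PREFIX:-docker compose exec -T webserver}"

retag_plan() {
    # 1. Try correspondent and type matching on documents 1 to 100 only.
    echo "$PREFIX document_retagger -c -t --id-range 1 100"
    # 2. Add matching tags, but only to documents still in the inbox.
    echo "$PREFIX document_retagger -T -i"
    # 3. Re-evaluate all metadata types and overwrite existing assignments.
    echo "$PREFIX document_retagger -c -T -t -s -f"
}

retag_plan
```

The last step removes tags that no longer match, so it is worth making a backup before running it.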
### Managing the Automatic matching algorithm
The _Auto_ matching algorithm requires a trained neural network to work.
This network needs to be updated whenever something in your data
changes. The docker image takes care of that automatically with the task
scheduler. You can manually renew the classifier by invoking the
following management command:
```
document_create_classifier
```
This command takes no arguments.
### Document thumbnails {#thumbnails}
Use this command to re-create document thumbnails. Optionally include the `--document {id}` option to generate thumbnails for a specific document only.
You may also specify `--processes` to control the number of processes used to generate new thumbnails. The default is to utilize
a quarter of the available processors.
```
document_thumbnails
```
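
To see what a quarter of the available processors means on a given machine, the value can be computed by hand. This is a sketch under assumptions: rounding down and a minimum of one process are guesses for illustration, and `quarter_of_cores` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: a quarter of the available processors, never fewer than one.
# Rounding down and the minimum of one are assumptions for this example.
quarter_of_cores() {
    procs=$(( $1 / 4 ))
    [ "$procs" -ge 1 ] || procs=1
    echo "$procs"
}

cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
echo "document_thumbnails --processes $(quarter_of_cores "$cores")"
```

Passing an explicit `--processes` value like this is mainly useful when you want to use more or fewer processes than the default.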
### Managing the document search index {#index}
The document search index is responsible for delivering search results
for the website. The document index is automatically updated whenever
documents get added to, changed, or removed from paperless. However, if
the search yields non-existing documents or won't find anything, you
may need to recreate the index manually.
```
document_index {reindex,optimize}
```
Specify `reindex` to have the index created from scratch. This may take
some time.
Specify `optimize` to optimize the index. This updates certain aspects
of the index and usually makes queries faster and also ensures that the
autocompletion works properly. This command is regularly invoked by the
task scheduler.
### Managing filenames {#renamer}
If you use paperless' feature to
[assign custom filenames to your documents](advanced_usage.md#file-name-handling), you can use this command to move all your files after
changing the naming scheme.
!!! warning
Since this command moves your documents, it is advised to do a backup
beforehand. The renaming logic is robust and will never overwrite or
delete a file, but you can never be too careful.
```
document_renamer
```
The command takes no arguments and processes all your documents at once.
Learn how to use
[Management Utilities](#management-commands).
### Sanity checker {#sanity-checker}
Paperless has a built-in sanity checker that inspects your document
collection for issues.
The issues detected by the sanity checker are as follows:
- Missing original files.
- Missing archive files.
- Inaccessible original files due to improper permissions.
- Inaccessible archive files due to improper permissions.
- Corrupted original documents by comparing their checksum against
what is stored in the database.
- Corrupted archive documents by comparing their checksum against what
is stored in the database.
- Missing thumbnails.
- Inaccessible thumbnails due to improper permissions.
- Documents without any content (warning).
- Orphaned files in the media directory (warning). These are files
that are not referenced by any document in paperless.
```
document_sanity_checker
```
The command takes no arguments. Depending on the size of your document
archive, this may take some time.
### Fetching e-mail
Paperless automatically fetches your e-mail every 10 minutes by default.
If you want to invoke the email consumer manually, call the following
management command:
```
mail_fetcher
```
The command takes no arguments and processes all your mail accounts and
rules.
!!! tip
To use OAuth access tokens for mail fetching,
select the box to indicate the password is actually
a token when creating or editing a mail account. The
details for creating a token depend on your email
provider.
### Creating archived documents {#archiver}
Paperless stores archived PDF/A documents alongside your original
documents. These archived documents will also contain selectable text
for image-only originals. These documents are derived from the
originals, which are always stored unmodified. If coming from an earlier
version of paperless, your documents won't have archived versions.
This command creates PDF/A documents for your documents.
```
document_archiver --overwrite --document <id>
```
This command will only attempt to create archived documents when no
archived document exists yet, unless `--overwrite` is specified. If
`--document <id>` is specified, the archiver will only process that
document.
!!! note
This command essentially performs OCR on all your documents again,
according to your settings. If you run this with
`PAPERLESS_OCR_MODE=redo`, it will potentially run for a very long time.
You can cancel the command at any time, since this command will skip
already archived versions the next time it is run.
!!! note
Some documents will cause errors and cannot be converted into PDF/A
documents, such as encrypted PDF documents. The archiver will skip over
these documents each time it sees them.
### Managing encryption {#encryption}
Documents can be stored in Paperless using GnuPG encryption.
!!! warning
Encryption is deprecated since [paperless-ng 0.9](changelog.md#paperless-ng-090) and doesn't really
provide any additional security, since you have to store the passphrase
in a configuration file on the same system as the encrypted documents
for paperless to work. Furthermore, the entire text content of the
documents is stored plain in the database, even if your documents are
encrypted. Filenames are not encrypted either.
Also, the web server provides transparent access to your encrypted
documents.
Consider running paperless on an encrypted filesystem instead, which
will then at least provide security against physical hardware theft.
#### Enabling encryption
Enabling encryption is no longer supported.
#### Disabling encryption
Basic usage to disable encryption of your document store:
(Note: If [`PAPERLESS_PASSPHRASE`](configuration.md#PAPERLESS_PASSPHRASE) isn't set already, you need to specify
it here)
```
decrypt_documents [--passphrase SECR3TP4SSPHRA$E]
```
### Detecting duplicates {#fuzzy_duplicate}
Paperless already catches and prevents the upload of exactly matching documents.
However, a new scan of an existing document may not produce an exact bit-for-bit
duplicate, even though its content should be identical or close enough to allow detection.
This tool does a fuzzy match over document content, looking for
those which look close according to a given ratio.
At this time, other metadata (such as correspondent or type) is not
taken into account by the detection.
```
document_fuzzy_match [--ratio] [--processes N]
```
| Option | Required | Default | Description |
| ----------- | -------- | ------------------- | ------------------------------------------------------------------------------------------------------------------------------ |
| --ratio | No | 85.0 | A number between 0 and 100, setting how similar a document must be for it to be reported. Higher numbers mean more similarity. |
| --processes | No | 1/4 of system cores | Number of processes to use for matching. Setting 1 disables multiple processes. |