A little more verbosity

parent 946dbf4ea0
commit be87e8c1dc

README.md (54 lines changed)

@@ -12,15 +12,57 @@ of not having the right document around.  Sometimes I recycled a document I
needed (who keeps water bills for two years?) and other times I just lost
it... because paper.  I wrote this to make my life easier.

-Here's how it works:
+## How it Works:

1. Buy a document scanner like [this one](http://welcome.brother.com/sg-en/products-services/scanners/ads-1100w.html).
-2. Set it up to "scan to FTP".  This means you can use it without being
-   connected to a running computer. It will just scan the document and save it
-   as a PDF on a server in your house.
-3. Setup a cronjob on that server to use *paperless* to OCR the PDF and index
-   it into a local database.
+2. Set it up to "scan to FTP" or something similar. It should be able to push
+   scanned images to a server without you having to do anything.
+3. Have the target server run the *paperless* consumption script to OCR the PDF
+   and index it into a local database.
4. Use the web frontend to sift through the database and find what you want.
5. Download the PDF you need/want via the web interface and do whatever you
   like with it.  You can even print it and send it as if it's the original.
   In most cases, no one will care or notice.
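
As a rough illustration of the "index it into a local database, then sift through it" idea in the steps above, here is a minimal sketch using Python's built-in `sqlite3` module. The table layout and the `open_index`, `index_document` and `search` helpers are assumptions made for this example; they are not *paperless*'s actual schema or API.

```python
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) a tiny document index. The schema is hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        "id INTEGER PRIMARY KEY, filename TEXT, content TEXT)"
    )
    return db


def index_document(db, filename, content):
    """Store the text recognised from one scanned PDF."""
    db.execute(
        "INSERT INTO documents (filename, content) VALUES (?, ?)",
        (filename, content),
    )
    db.commit()


def search(db, term):
    """Return the filenames whose recognised text contains `term`."""
    rows = db.execute(
        "SELECT filename FROM documents WHERE content LIKE ? ORDER BY id",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = open_index()
    index_document(db, "water-bill.pdf", "Water bill for March")
    print(search(db, "water"))
```

A plain `LIKE` match keeps the sketch short; anything beyond a handful of documents would probably want a proper full-text index instead.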

## Requirements

This is all really quite simple: a shiny, user-friendly wrapper around some very
powerful tools.

* [ImageMagick](http://imagemagick.org/) converts the images between colour and
  greyscale.
* [Tesseract](https://github.com/tesseract-ocr) does the character recognition.
* [Python 3](https://python.org/) is the language of the project.
    * [Pillow](https://pypi.python.org/pypi/pillowfight/) converts the PDFs to
      images.
    * [PyOCR](https://github.com/jflesch/pyocr) is a slick programmatic wrapper
      around Tesseract.
    * [Django](https://djangoproject.org/) is the framework this project is
      written against.
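
Before going further, it can help to confirm that the external tools listed above are actually installed. The sketch below is one way to do that with the standard library; the `find_tools` and `missing_tools` helpers are hypothetical, and it assumes the binaries are named `convert` (ImageMagick) and `tesseract` on your system.

```python
import shutil


def find_tools(names=("convert", "tesseract")):
    """Map each required command name to its full path, or None if missing."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names=("convert", "tesseract")):
    """Return the command names that cannot be found on PATH."""
    return [name for name, path in find_tools(names).items() if path is None]


if __name__ == "__main__":
    for name in missing_tools():
        print(f"{name} is not installed or not on PATH")
```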

## Instructions

1. Check out this repo to somewhere convenient and install the requirements
   listed above into your environment.

2. Configure `settings.py` and make sure that `CONVERT_BINARY`, `SCRATCH_DIR`,
   and `CONSUMPTION_DIR` are set to values you'd expect:

    * `CONVERT_BINARY`: The path to `convert`, installed as part of ImageMagick.
    * `SCRATCH_DIR`: A place for files to be created and destroyed.  The default
      is as good a place as any.
    * `CONSUMPTION_DIR`: The directory your scanner will be depositing files in.
      Note that the consumption script will import files from here **and then
      delete them**.

3. Run `python manage.py migrate`.  This will create your local database.

4. Run `python manage.py consume`.  You may want to do this in a background
   process like a systemd service or rc script because it runs in an infinite
   loop.

5. Start the webserver with `python manage.py runserver`.

6. Log into your new toy by visiting `http://localhost:8000/`.
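
To make the "import and then delete" behaviour of `CONSUMPTION_DIR` and the infinite loop behind `consume` more concrete, here is a simplified sketch of what such a loop could look like. It is not the project's actual consumption script: `consume_once`, `consume_forever`, the `handle` callback and the 10-second delay are all assumptions made for illustration.

```python
import os
import time


def consume_once(consumption_dir, handle):
    """Import every file in consumption_dir, then delete it.

    `handle` is a caller-supplied function (an assumption of this sketch)
    that receives the file's path and does the OCR and indexing.
    Returns the names of the files that were consumed.
    """
    consumed = []
    for name in sorted(os.listdir(consumption_dir)):
        path = os.path.join(consumption_dir, name)
        if not os.path.isfile(path):
            continue  # skip sub-directories
        handle(path)
        os.remove(path)  # the file is gone once it has been imported
        consumed.append(name)
    return consumed


def consume_forever(consumption_dir, handle, delay=10):
    """Poll the directory in an infinite loop; the delay is a made-up default."""
    while True:
        consume_once(consumption_dir, handle)
        time.sleep(delay)
```

Because files are removed after import, point the scanner at a directory that holds nothing else you want to keep.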