Pit Kleyersburg 
							
						 
					 
					
						
						
						
						
							
						
						
							aeab9a0e81 
							
						 
					 
					
						
						
							
							Detect language only on one page of PDF  
						
						
						
						
						Currently, the entire document is processed to detect its language. If
the detected language differs from the default one, the entire document
is processed again for the new language.
This PR analyzes only the middle page for its language, then either
processes the remaining pages with the default language if it did not
differ, or processes all pages with the newly guessed language.
The number of processed pages drops from a worst case of `2n` to a
worst case of `n+1`. 
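
A minimal sketch of the middle-page probe described in this commit message, not the code from the PR itself. The function `ocr_document` and its `ocr_page` and `guess_language` callables are hypothetical names introduced for illustration; the real OCR and language-guessing calls are assumed to be supplied by the caller.

```python
from typing import Callable, List, Sequence


def ocr_document(
    pages: Sequence[str],
    default_lang: str,
    ocr_page: Callable[[str, str], str],        # hypothetical: (page, lang) -> text
    guess_language: Callable[[str], str],       # hypothetical: text -> language code
) -> List[str]:
    """OCR every page, probing only the middle page for the language."""
    if not pages:
        return []

    middle = len(pages) // 2
    # Probe: OCR just the middle page with the default language.
    probe_text = ocr_page(pages[middle], default_lang)
    guessed = guess_language(probe_text)

    if guessed == default_lang:
        # Reuse the probe result; only the other n - 1 pages still need OCR.
        return [
            probe_text if i == middle else ocr_page(page, default_lang)
            for i, page in enumerate(pages)
        ]

    # Language differs: the probe is discarded and all n pages are processed
    # with the guessed language, giving n + 1 OCR calls in total.
    return [ocr_page(page, guessed) for page in pages]
```

With `n` pages this makes `n` OCR calls when the guess matches the default and `n+1` calls when it does not, which matches the worst case stated above.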
						
						
					 
					
						2016-02-14 17:55:13 +01:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							7843ea5037 
							
						 
					 
					
						
						
							
							Added and implemented a rudimentary logger  
						
						
						
						
					 
					
						2016-02-14 16:09:52 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							9162e41507 
							
						 
					 
					
						
						
							
							Merge pull request  #33  from pitkley/fix/parallelism  
						
						
						
						
						Ensure `OCR_THREADS` is integer, add documentation 
						
						
					 
					
						2016-02-14 15:40:20 +00:00 
						 
				 
			
				
					
						
							
							
								Pit Kleyersburg 
							
						 
					 
					
						
						
						
						
							
						
						
							20b2408dbb 
							
						 
					 
					
						
						
							
							Ensure OCR_THREADS is integer, add documentation  
						
						
						
						
					 
					
						2016-02-14 16:37:38 +01:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							88acf50fe0 
							
						 
					 
					
						
						
							
							Merge pull request  #31  from pitkley/feature/paralellism  
						
						
						
						
						This is great.  It seriously sped up the OCR time. 
						
						
					 
					
						2016-02-14 15:29:05 +00:00 
						 
				 
			
				
					
						
							
							
								Pit Kleyersburg 
							
						 
					 
					
						
						
						
						
							
						
						
							f5beda9c56 
							
						 
					 
					
						
						
							
							Enable parallel OCR processing  
						
						
						
						
						At the moment, every page in a PDF is processed one by one using
tesseract. Since the processing of a single page is independent of every
other page, one can make use of multi-core machines.
This PR introduces a multiprocessing pool to process multiple pages
simultaneously. The number of threads to use can be specified in the
environment variable `PAPERLESS_OCR_THREADS`. This will default to the
number of cores/hyperthreads Python detects for your system. 
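
A minimal sketch of the pool-based approach described in this commit message, under stated assumptions rather than the PR's actual code. The environment variable name comes from the message above; `thread_count`, `ocr_page`, and `ocr_pages` are hypothetical names, and `ocr_page` is a placeholder standing in for the real tesseract call.

```python
import multiprocessing
import os
from typing import List, Mapping, Optional, Sequence


def thread_count(env: Optional[Mapping[str, str]] = None) -> int:
    """Return the pool size from PAPERLESS_OCR_THREADS, else the CPU count."""
    env = os.environ if env is None else env
    raw = env.get("PAPERLESS_OCR_THREADS")
    if raw is None:
        # Default: the number of cores/hyperthreads Python detects.
        return multiprocessing.cpu_count()
    # Environment values are strings; the pool needs an integer.
    return int(raw)


def ocr_page(page: str) -> str:
    """Hypothetical placeholder for running tesseract on one page."""
    return page.upper()


def ocr_pages(pages: Sequence[str]) -> List[str]:
    """OCR pages in parallel; results come back in page order."""
    with multiprocessing.Pool(processes=thread_count()) as pool:
        return pool.map(ocr_page, pages)
```

Because each page is independent, `Pool.map` can hand pages to worker processes in any order while still returning results in the original page order. On platforms that spawn rather than fork worker processes, `ocr_pages` should be called from under an `if __name__ == "__main__":` guard.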
						
						
					 
					
						2016-02-14 15:57:42 +01:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							6b0a537bff 
							
						 
					 
					
						
						
							
							Added support for a shared secret in email  
						
						
						
						
					 
					
						2016-02-14 03:01:24 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							3b5d4cdd39 
							
						 
					 
					
						
						
							
							Added some error handling  
						
						
						
						
					 
					
						2016-02-14 01:32:25 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							fc5d89c6fc 
							
						 
					 
					
						
						
							
							Added a default algorithm  
						
						
						
						
					 
					
						2016-02-14 01:30:41 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							d9b7851de9 
							
						 
					 
					
						
						
							
							Added a default algorithm  
						
						
						
						
					 
					
						2016-02-14 01:30:18 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							cec9968cdb 
							
						 
					 
					
						
						
							
							Documented consumption  
						
						
						
						
					 
					
						2016-02-14 00:10:49 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							330dfa544b 
							
						 
					 
					
						
						
							
							Fixed a typo in the description. There's no need for a new migration here.  
						
						
						
						
					 
					
						2016-02-14 00:10:37 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							294f104474 
							
						 
					 
					
						
						
							
							Merge branch 'master' into feature/images-as-docs  
						
						
						
						
					 
					
						2016-02-13 01:01:10 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							68fa7d68fa 
							
						 
					 
					
						
						
							
							Merge branch 'master' of github.com:danielquinn/paperless  
						
						
						
						
					 
					
						2016-02-13 00:59:36 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							2ed2d641b5 
							
						 
					 
					
						
						
							
							Added a note about the plight of Apple users.  
						
						
						
						
					 
					
						2016-02-13 00:59:19 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							a846b3f7b8 
							
						 
					 
					
						
						
							
							Adding some more debugging  
						
						
						
						
					 
					
						2016-02-13 00:57:05 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							b7859a0ff3 
							
						 
					 
					
						
						
							
							Merge pull request  #26  from wttw/master  
						
						
						
						
						Document cloning from public URL rather than ssh 
						
						
					 
					
						2016-02-12 20:30:07 +00:00 
						 
				 
			
				
					
						
							
							
								Steve Atkins 
							
						 
					 
					
						
						
						
						
							
						
						
							a4903049a3 
							
						 
					 
					
						
						
							
							Document cloning from public URL rather than ssh  
						
						
						
						
					 
					
						2016-02-12 11:36:07 -08:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							9ed8a2b2d7 
							
						 
					 
					
						
						
							
							Merge branch 'master' into feature/images-as-docs  
						
						
						
						
					 
					
						2016-02-12 09:03:46 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							1d4b87ee46 
							
						 
					 
					
						
						
							
							Update for  #22  
						
						
						
						
					 
					
						2016-02-12 08:54:04 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							840472071c 
							
						 
					 
					
						
						
							
							Added the required verbosity reference  
						
						
						
						
					 
					
						2016-02-12 08:27:28 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							2421f559be 
							
						 
					 
					
						
						
							
							Simpler regex  
						
						
						
						
					 
					
						2016-02-12 08:27:09 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							a022fcb8f1 
							
						 
					 
					
						
						
							
							Fixed the auto-naming regexes  
						
						
						
						
					 
					
						2016-02-11 22:05:55 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							7aadab23cc 
							
						 
					 
					
						
						
							
							Added the Renderable mixin because DRY  
						
						
						
						
					 
					
						2016-02-11 22:05:38 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							ef1639208c 
							
						 
					 
					
						
						
							
							Tests for the consumer  
						
						
						
						
					 
					
						2016-02-11 12:25:23 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							cef4abc01d 
							
						 
					 
					
						
						
							
							version bump  
						
						
						
						
					 
					
						2016-02-11 12:25:12 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							78ee138ad7 
							
						 
					 
					
						
						
							
							Added migration and changelog updates  
						
						
						
						
					 
					
						2016-02-11 12:25:00 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							c423a13f85 
							
						 
					 
					
						
						
							
							Added a simple re-tagger  
						
						
						
						
					 
					
						2016-02-11 12:24:18 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							39134b517e 
							
						 
					 
					
						
						
							
							Cleaned up file_name()  
						
						
						
						
					 
					
						2016-02-10 23:53:48 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							a892abc701 
							
						 
					 
					
						
						
							
							Added dateutil  
						
						
						
						
					 
					
						2016-02-10 23:50:58 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							4a078dcfbc 
							
						 
					 
					
						
						
							
							Merge branch 'master' into feature/images-as-docs  
						
						
						
						
					 
					
						2016-02-09 17:20:45 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							642b2f7ee3 
							
						 
					 
					
						
						
							
							Merge pull request  #18  from mrwacky42/master  
						
						
						
						
						Add other prerequisites for Vagrant 
						
						
					 
					
						2016-02-09 09:41:53 +00:00 
						 
				 
			
				
					
						
							
							
								Sharif Nassar 
							
						 
					 
					
						
						
						
						
							
						
						
							6115b2f03d 
							
						 
					 
					
						
						
							
							Add other prerequisites  
						
						
						
						
						Vagrant setup didn't work for me unless I manually installed tesseract and ImageMagick. 
						
						
					 
					
						2016-02-09 01:07:48 -08:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							0eaed36420 
							
						 
					 
					
						
						
							
							The 'API' is written but untested  
						
						
						
						
					 
					
						2016-02-08 23:46:16 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							212752f46e 
							
						 
					 
					
						
						
							
							Fixt the tags to be optional  
						
						
						
						
					 
					
						2016-02-08 17:28:59 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							0c729e5675 
							
						 
					 
					
						
						
							
							Changed the name, forgot to change the check.  
						
						
						
						
						Closes  #17  
					
						2016-02-08 11:14:57 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							e5e4ee0350 
							
						 
					 
					
						
						
							
							Added file magic  
						
						
						
						
					 
					
						2016-02-08 11:12:14 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							c4311af263 
							
						 
					 
					
						
						
							
							Cleaned up the tests  
						
						
						
						
					 
					
						2016-02-06 17:41:11 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							febb45af81 
							
						 
					 
					
						
						
							
							Prettied up the interface a little  
						
						
						
						
					 
					
						2016-02-06 17:27:17 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							ce69e37256 
							
						 
					 
					
						
						
							
							Linked tag labels  
						
						
						
						
					 
					
						2016-02-06 17:14:44 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							48761911b3 
							
						 
					 
					
						
						
							
							Image imports and consumption by mail work  
						
						
						
						
					 
					
						2016-02-06 17:05:36 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							71075a691a 
							
						 
					 
					
						
						
							
							The mailconsumer isn't a consumer at all.  Best fixt that  
						
						
						
						
					 
					
						2016-02-05 20:15:08 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							d8ad6b589b 
							
						 
					 
					
						
						
							
							Added pytest and broke up the consumer into file and mail  
						
						
						
						
					 
					
						2016-02-05 00:23:36 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							3bc89d23c8 
							
						 
					 
					
						
						
							
							Sorting the filters  
						
						
						
						
					 
					
						2016-02-03 17:20:12 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							a70b40f618 
							
						 
					 
					
						
						
							
							Broke the consumer script into separate files and started on a mail consumer  
						
						
						
						
					 
					
						2016-01-30 01:18:52 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							84d5f8cc5d 
							
						 
					 
					
						
						
							
							Merge branch 'master' into feature/images-as-docs  
						
						
						
						
					 
					
						2016-01-29 23:41:13 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							cf4c437eca 
							
						 
					 
					
						
						
							
							Be a little more verbose about the passphrase  
						
						
						
						
					 
					
						2016-01-29 23:40:57 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							8701007a7a 
							
						 
					 
					
						
						
							
							Merge pull request  #15  from jat255/DOC_setup_enh  
						
						
						
						
						Clarify how to start server on a different port/ip 
						
						
					 
					
						2016-01-29 23:32:15 +00:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
						
						
							
						
						
							889fd93c5e 
							
						 
					 
					
						
						
							
							Merge pull request  #14  from gitter-badger/gitter-badge  
						
						
						
						
						Add a Gitter chat badge to README.rst 
						
						
					 
					
						2016-01-29 23:31:25 +00:00 
						 
				 
			
				
					
						
							
							
								The Gitter Badger 
							
						 
					 
					
						
						
						
						
							
						
						
							77a2a5bb8e 
							
						 
					 
					
						
						
							
							Add Gitter badge  
						
						
						
						
					 
					
						2016-01-29 23:27:37 +00:00