Discussion:
[Mayan EDMS: 2115] Mayan's document_cache size on disk?
Hans Fritz
2017-09-17 02:17:13 UTC
I'm very surprised, as I uploaded only about 2GB of PDF files into Mayan. These
PDFs are scanned images (grayscale 300dpi) cleaned with textcleaner to make
OCR work better. It comes down to about 1MB per page.

Mayan has been processing the uploads for the last 11 hours now (my server
is a bit old), but what worries me is how much space it uses. The
mayan/media/document_storage directory is 3.8GB (it's possible I have
duplicates, I'd have to check once it's done processing), and the
mayan/media/document_cache is 16GB and growing.
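
To watch the two directories while processing continues, something like the following could be used. It is only a minimal sketch: the relative paths are the ones mentioned above, and the helper names `directory_size_bytes` and `format_gib` are made up for illustration; nothing here is part of Mayan itself.

```python
import os


def directory_size_bytes(root):
    """Sum the sizes of all regular files below root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # The file may disappear while Mayan is still processing.
                pass
    return total


def format_gib(num_bytes):
    """Render a byte count in GiB with one decimal place."""
    return f"{num_bytes / 1024 ** 3:.1f} GiB"


if __name__ == "__main__":
    # Paths assumed from the message above; adjust to the real install location.
    for sub in ("document_storage", "document_cache"):
        path = os.path.join("mayan", "media", sub)
        print(path, format_gib(directory_size_bytes(path)))
```

Run from the directory that contains `mayan/`, it prints one line per directory. A missing directory is reported as 0.0 GiB rather than raising an error.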

What's in the cache that's taking so much space? Is it the JPEG
thumbnails/previews of pages? If so, is there a setting somewhere for the
JPEG quality? I don't need amazing quality for the previews, I'd rather
save the space and processing time.

Thanks,
--
---
You received this message because you are subscribed to the Google Groups "Mayan EDMS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mayan-edms+***@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Jonathon Exley
2017-09-19 10:35:34 UTC
It sounds like a race condition in the watch folder process. If the files in
the watch folder have not finished being processed by the next time the watch
folder is checked, then the same files will be processed again, leading to
multiple copies in the document storage.
You will need to increase the watch folder check interval so that all the
files can be processed before the next check cycle starts. I have mine set
to once a day, since it's a low-powered Raspberry Pi.
You should check your recent documents and remove any duplicates, after
allowing some time for the documents to finish being processed. Usually
checking the CPU utilisation is a good way to tell when the queue has
emptied.
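
One way to look for duplicates on disk is to group the stored files by content hash. The sketch below assumes that document_storage keeps each uploaded file as an unmodified copy, so identical uploads have identical bytes; that is an assumption about the storage layout, not something confirmed in this thread. The function names are hypothetical.

```python
import hashlib
import os
from collections import defaultdict


def file_sha256(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large scans are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return {sha256: [paths]} for every hash shared by more than one file."""
    by_hash = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_hash[file_sha256(path)].append(path)
    return {h: sorted(paths) for h, paths in by_hash.items() if len(paths) > 1}


if __name__ == "__main__":
    # Path assumed from the original message; adjust to the real install location.
    root = os.path.join("mayan", "media", "document_storage")
    for digest, paths in find_duplicates(root).items():
        print(digest[:12], len(paths), "copies")
        for path in paths:
            print("   ", path)
```

This only reports duplicates. Removing them is better done from within Mayan, as described above, so that the database and the files on disk stay in sync.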

Jonathon.