Discussion:
[Mayan EDMS: 2372] Document date and title
Raul
2018-03-20 13:13:22 UTC
Permalink
Hi guys,

I am pretty new to Mayan EDMS.
So far I have only been testing how to install it and what hardware
would be a good match for it.

I am now at a state where I have installed EDMS 3.0 and am wondering about
the following:

1.) How can I create an index by the year and month of the document date
itself?
2.) How can I set the document title so that it is built from, for
example, the invoice number, date and company?
3.) How can I create a trigger that automatically assigns tags to documents
and adds them to a special cabinet, all based on keywords that match
the OCR result?

Thanks for your help :)
--
---
You received this message because you are subscribed to the Google Groups "Mayan EDMS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mayan-edms+***@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Michael Price
2018-03-23 05:25:11 UTC
Permalink
Hi,

How are you liking version 3.0?

Here are the answers to your questions.
1. The file creation date is lost during the upload. These are operating
system fields and do not persist during upload via the web. It could be
possible to retain these values if the document is uploaded via a watch
folder or staging folder. Since these methods open the file to be loaded
into Mayan directly from the operating system, the file creation date could
be read.

2. It is not possible using the normal installation. However, I found this
online: https://gitlab.com/mayan-edms/document_renaming. It seems to do what
you want. I haven't tried it and it looks outdated. If there is enough
interest we could add something like this in our fork of Mayan.

3. Not yet possible at the level you want right now, but it is getting
there. There is a workflow feature called triggers and another called
actions. These allow you to create a workflow that will respond (trigger)
to an event (OCR finished) and perform an action (tag the document,
move it to the cabinet). The problem is that the triggers and actions are
static. You can't program any kind of intelligence into them. There is no
method to add a decision (which folder, based on which OCR content). We have
been talking about solving this with what we called workflow filters. The
specs are still in the design phase as we don't want to create a whole
separate programming language for this. Eric is particularly interested in
this (he wants to auto tag documents based on OCR content), so this will
get done as soon as we figure out the design.
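
As a rough illustration of point 1, this is how a file's timestamp could be read from the operating system when the file is picked up from a watch or staging folder. It is only a sketch using the Python standard library, not Mayan code: the function names `file_timestamp` and `index_path_for` are made up for this example, and it uses the modification time as a stand-in because a true creation time is not exposed portably by `os.stat`.

```python
import os
from datetime import datetime, timezone


def file_timestamp(path):
    """Return the file's last-modification time as an aware UTC datetime.

    Assumption: modification time is used because a real creation time
    is not available portably through os.stat.
    """
    stat_result = os.stat(path)
    return datetime.fromtimestamp(stat_result.st_mtime, tz=timezone.utc)


def index_path_for(path):
    """Build a hypothetical 'YYYY/MM' index key from the file timestamp."""
    timestamp = file_timestamp(path)
    return f"{timestamp:%Y}/{timestamp:%m}"
```

The timestamp has to be read before the file is copied or re-saved, since those operations usually reset it.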
Raul
2018-04-01 17:55:44 UTC
Permalink
Hi Michael,

thanks for your response.
I finally got my server in place and am ready to rock now.

Regarding my questions:

1) I was thinking more of something like what is done in paperless
(https://github.com/danielquinn/paperless).
There the date is extracted from the OCR content and used as the
document date, which is a very nice feature.

2) You are right, the addon is outdated.

3) That would definitely be nice. The way it is right now, the OCR content
is only used for searching. However, there is so much more you could
do with it, like searching for keywords to trigger the automatic assignment
to a cabinet, or getting the document date, etc.
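
To make the idea in point 3 concrete, here is a minimal sketch of keyword rules applied to OCR text. It is plain Python and does not use any Mayan API: the `Rule` class, the `classify` function and the tag and cabinet names are all hypothetical, and actually attaching tags or cabinets to a document would still have to be done by whatever calls this.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Rule:
    """Hypothetical rule: if any keyword occurs, suggest tags and a cabinet."""
    keywords: list
    tags: list = field(default_factory=list)
    cabinet: str = ""

    def matches(self, text):
        # Whole-word, case-insensitive match so "invoice" does not hit "invoiced".
        return any(
            re.search(rf"\b{re.escape(keyword)}\b", text, re.IGNORECASE)
            for keyword in self.keywords
        )


def classify(ocr_text, rules):
    """Return (tags, cabinets) suggested by every rule that matches the text."""
    tags, cabinets = [], []
    for rule in rules:
        if not rule.matches(ocr_text):
            continue
        for tag in rule.tags:
            if tag not in tags:
                tags.append(tag)
        if rule.cabinet and rule.cabinet not in cabinets:
            cabinets.append(rule.cabinet)
    return tags, cabinets
```

For example, a rule such as `Rule(["invoice", "rechnung"], ["invoice"], "Accounting")` would suggest the tag "invoice" and the cabinet "Accounting" for any document whose OCR text contains one of those words.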
Michael Price
2018-04-01 20:51:10 UTC
Permalink
Hello,

1) They must be using a regular expression feature to extract the data. I
must warn that it is never a good idea to rely on OCR output for data. OCR
is one of those features that will never work 100%. If you do, have some
fallback logic to avoid adding garbage data. We could add a post-OCR
processing step to add a feature like this. The best place for that would
be the workflow engine. I think there are post-OCR triggers. We would need
a regular expression workflow action.

2) A pity. It looks very interesting. I have a lot on my plate but will take
a look to see how difficult it would be to add this as a standard app.

3) We added Filters to Paperattor, which work a bit like SmartLinks. We are
looking into reusing this method to add workflow trigger filters. This
means that you can make a workflow trigger the transition and add tags
or extract OCR data for metadata only if a certain condition programmed in
the workflow filter is met.
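
The regular expression approach from point 1, including the fallback logic, might look roughly like the sketch below. It is not how paperless or Mayan does it. The two date formats, the `DATE_PATTERNS` table and the `extract_document_date` function are assumptions chosen for illustration, and a real setup would need patterns for the formats its documents actually use.

```python
import re
from datetime import date

# Assumed formats: ISO (2018-03-20) and dotted day-first (20.03.2018).
DATE_PATTERNS = [
    (re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b"), ("year", "month", "day")),
    (re.compile(r"\b(\d{1,2})\.(\d{1,2})\.(\d{4})\b"), ("day", "month", "year")),
]


def extract_document_date(ocr_text, fallback=None):
    """Return the first plausible date in the OCR text, else the fallback.

    Patterns are tried in the order listed above, so an ISO date wins over
    a dotted date even if the dotted one appears earlier in the text.
    """
    for pattern, order in DATE_PATTERNS:
        for match in pattern.finditer(ocr_text):
            parts = dict(zip(order, map(int, match.groups())))
            try:
                return date(parts["year"], parts["month"], parts["day"])
            except ValueError:
                # Impossible date such as 31.02.2018, likely an OCR misread.
                continue
    return fallback
```

Passing something like the upload date as `fallback` keeps a misread date from ending up in the metadata as garbage.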

Keep the ideas and use cases coming, they give us a good roadmap to
develop the next set of features.
Matthias Löblich
2018-04-10 07:25:26 UTC
Permalink
Hello,

1) Please take a look at my mayan-extension:
https://gitlab.com/startmat/document_analyzer


br
Matthias
Raul
2018-04-09 09:30:21 UTC
Permalink
Hi Robert,

what do you think about my question 1?
Would it be possible to implement such a feature - with a few lines - into
mayan?
This information could for example be used for sorting the documents.
Sorting them by the document name doesn't make a lot of sense if you ask me.

Looking forward to your feedback.