Analytics Email Threading - Why should I not map the fields?


Randi King

In the "Eclipse Analytics - Email Threading.pdf" document available under Eclipse Best Practices on My Ipro, it states that it is a best practice to not map the metadata fields in the "Metadata to send to Analytics" section. Why is this the best practice if you have already used eCapture to extract this information and have it available in the database?


  • Moderators

Hi Randi,

 

While many of your email documents may come from eCapture, almost every case will also include documents that were processed in a different tool, as well as paper documents, emails printed to PDF with extractable or OCR text, and many other scenarios. When running email threading, Content Analyst can extract metadata itself from the text it reads, and can normalize and format that metadata across all of those document types to produce the most accurate and fully inclusive email threading results. This is my primary reason for not mapping metadata fields for CA Email Threading.
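To illustrate the general idea of reading header metadata out of the text itself, here is a minimal Python sketch. It is not Content Analyst's actual logic, which is not described in this thread; the function name `extract_header_fields` and the list of recognized labels are assumptions made for illustration only.

```python
import re

# Hypothetical illustration only: this is NOT how Content Analyst parses text.
# The set of labels below is an assumption, not a documented list.
HEADER_LINE = re.compile(
    r"^(from|to|cc|bcc|sent|date|subject)\s*:\s*(.*)$", re.IGNORECASE
)

def extract_header_fields(text):
    """Collect 'Label: value' lines from the top of a document's text."""
    fields = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            if fields:
                break  # a blank line after at least one header ends the block
            continue   # skip leading blank lines
        match = HEADER_LINE.match(stripped)
        if not match:
            break  # first non-header line: the body has started
        name = match.group(1).lower()
        value = match.group(2).strip()
        fields.setdefault(name, value)  # keep the first value seen per label
    return fields
```

Because a sketch like this works on text alone, it treats an eCapture email, an email printed to PDF, and OCR text the same way, which is the consistency argument made above. It returns an empty dictionary when the text does not begin with a recognizable header block.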


  • 6 months later...

On the subject of CAAT and extracted text, it is most important to make sure the option to include or remove whitespace is set the same way for scanned images, native edocs, and emails. The format of the email header fields is also paramount. Avoid processing email to HTML: doing so changes the header format in the extracted text, and CAAT will have trouble combining threads.

As an example, a thread like the following could be split into separate threads:

 

from: jeff
to: john
sent: Jan 1 2017
subject: hello

Hi

from: john
to: jeff Sent: on Jan 2 2017
Subject: re hello

Right back at ya

Original message from jeff

hi

 

 

 

This issue is more apparent for scanned emails that contain a header with the user's name from Outlook, or GIF images from an online mail service such as Gmail or Hotmail.

The email header has to be normalized for CAAT email threading to work properly.
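As a rough sketch of what normalizing a header could look like before the text is loaded, the Python below puts each header label on its own line with consistent capitalization. This is only an illustration under stated assumptions: the helper names `normalize_header_line` and `normalize_headers` and the label list are hypothetical, and nothing here reflects how CAAT itself reads headers.

```python
import re

# Hypothetical helpers for illustration; the label list is an assumption.
LABELS = ("From", "To", "Cc", "Bcc", "Sent", "Date", "Subject")
_LABEL_RE = re.compile(r"\b(" + "|".join(LABELS) + r")\s*:\s*", re.IGNORECASE)

def normalize_header_line(line):
    """Split a line of fused 'Label: value' pairs into one line per pair."""
    matches = list(_LABEL_RE.finditer(line))
    if not matches or matches[0].start() != 0:
        return [line]  # does not start with a known label; leave untouched
    out = []
    for i, m in enumerate(matches):
        label = m.group(1).capitalize()
        if label == "Subject":
            # A subject may itself contain label-like words, so take the rest.
            out.append(f"{label}: {line[m.end():].strip()}")
            break
        end = matches[i + 1].start() if i + 1 < len(matches) else len(line)
        out.append(f"{label}: {line[m.end():end].strip()}")
    return out

def normalize_headers(text):
    """Apply normalize_header_line to every line that starts with a label."""
    lines = []
    for line in text.splitlines():
        stripped = line.strip()
        if _LABEL_RE.match(stripped):
            lines.extend(normalize_header_line(stripped))
        else:
            lines.append(line)  # body text is passed through unchanged
    return "\n".join(lines)
```

With the example above, `normalize_header_line("to: jeff Sent: on Jan 2 2017")` yields `To: jeff` and `Sent: on Jan 2 2017` as two separate lines. Note that the sketch also rewrites any body line that happens to begin with one of the labels, so a real cleanup would need to restrict itself to the header block.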

 


  • 2 weeks later...
  • S.W.A.T. Engineer

Thanks Jeff, this is great input. Any time the email header information is not directly extracted from an email item (for instance, if the text is extracted from a PDF or MHT representation of an email, or if it is obtained from OCR), problems may be encountered during email threading. It is important to note, though, that the contents of the ExtractedText field in Eclipse are what is sent to CAAT for processing, so if you have populated the ExtractedText field with the original extracted text for the document, you should typically expect good results, even if the document natives were exported as MHT/HTML.
