
Eclipse \ E-mail threading job completes without error but no data is written to case


Guest Joshua DeLapp


An e-mail threading job run on a case may complete without any apparent errors, yet on reviewing the case, none of the analytics information has been written to the relevant CAAT fields.

 

This can occur when the data set in question contains a large volume of extracted text and/or a number of documents with an extracted text size greater than 4 megabytes. Content Analyst's e-mail threading processes have an upper limit of 30 megabytes per document, but in practice they frequently fail on documents well below that limit when the overall volume of extracted text is large.

 

To work around this issue, identify the documents in question via SQL and update them so the system views them as having already been processed. The following SQL queries will do this:

/* Identify the BegDoc, extracted text size, and e-mail threading status of any documents with more than 4 MB of extracted text (4096, assuming ExtractedTextSize is stored in KB) */
SELECT VDF.BegDoc, VDF.ExtractedTextSize, D.EmailThreadingProcessed
FROM Documents D
JOIN vDocumentFields VDF
ON D.DocId = VDF.DocId
WHERE VDF.ExtractedTextSize > 4096

/* Flag these documents as having already been processed by e-mail threading */
UPDATE D
SET EmailThreadingProcessed = 1
FROM Documents D
JOIN vDocumentFields VDF
ON D.DocId = VDF.DocId
WHERE VDF.ExtractedTextSize > 4096

 

Once these items are flagged, re-run the e-mail threading job; it should complete and write the data back to the case. Note: After the majority of the data has been processed, the documents flagged above can be marked as "not processed" again and then run through e-mail threading; with a smaller volume of text to work with, Content Analyst may succeed in processing these larger items. When doing so, however, ensure that the option to group the new data with the existing data is selected.
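
The reset described in the note can be sketched as a mirror image of the flagging query. This is only a sketch: it assumes that a value of 0 in EmailThreadingProcessed means "not processed", which the original queries do not confirm, so check the existing values in the column (for example with the SELECT query above) before running it.

```sql
/* Assumption: EmailThreadingProcessed = 0 means "not processed"; verify against your case database first */
/* Mark the large documents as not yet processed so the next e-mail threading job picks them up */
UPDATE D
SET EmailThreadingProcessed = 0
FROM Documents D
JOIN vDocumentFields VDF
ON D.DocId = VDF.DocId
WHERE VDF.ExtractedTextSize > 4096
```

Running the SELECT query again afterwards should show the updated status for each affected BegDoc.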

 


Archived

This topic is now archived and is closed to further replies.
