Krista Schmidt
Posted June 19, 2017

While exploring some different options and some questions that have come up recently, I wanted to put together a Q&A covering common processing questions that we often receive.

Q: With standard eCapture Discovery and Data Extract jobs, there is a Task Retry option for Data Extraction which can be configured to retry files multiple times before failing them as an exception. Is this the case with Streaming Discovery jobs as well, and does the same value apply?

A: The Task Retry option applies to standard eCapture processing. In a Streaming Discovery job, a file that fails processing automatically falls back to standard eCapture Data Extraction, which attempts the file again. Once the file has fallen back to standard eCapture Data Extraction, the number of retries configured for standard Data Extraction applies.

Q: During the Streaming Discovery job, what types of errors are held back from being exported or pushed to Review during initial processing?

A: During Streaming Discovery, node-level exceptions cause the exception file and any corresponding family to be held back from being pushed to review. These node-level exceptions could be partially discovered containers, or containers that could not be discovered at all. Item-level exceptions are pushed to review automatically, with the exception of ‘Detect Container’ errors and ‘Read Email Fields’ exceptions (as well as their families). As a general rule, any error that results in files not being discovered, or being discovered improperly, will hold the files back.

Examples:

- A PST file is extracted and results in many emails, but contains a folder that could not be extracted. The PST is identified as a node-level exception and all files extracted from the PST are held back during initial processing. If a user is able to resolve the folder issue and re-queue the container successfully, all files are pushed to review automatically. If nothing is done to the container, then when the user Publishes Errors, the successfully extracted documents and their families are pushed to review.
- A corrupt PST file is encountered and Streaming Discovery is unable to extract anything from it. The PST is identified as a node-level exception and no files are pushed to review. If nothing is done to the container, then when the user Publishes Errors, nothing is pushed to review.
- A ZIP file is processed and some loose files are successfully processed, but the ZIP file also contains an encrypted PST. The ZIP file is successfully extracted and the loose files are pushed to review. The PST is marked as a node-level exception.

Q: What does the ‘Detect Container’ exception mean?

A: The Detect Container error generally indicates that the processing engine believes the file is a container of some kind, but was not able to extract any children from it. Re-queuing these errors may resolve the exception.

Q: What does the ‘Read Email Fields’ exception mean?

A: This exception generally means the processing engine was unable to extract one or more of the values it uses for hash generation, so the document fails hash generation. This could result in extra files being delivered if the missing fields caused the file not to be de-duplicated. Re-queuing these errors may resolve the exception.

Q: What is the best way to review exception messages for my job?

A: The eCapture Controller UI provides an error message for the selected error, but the Detailed Error Report is the best way to get the exact error message resulting from the exception.
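For anyone who finds the hold-back rules easier to read as logic, here is a minimal Python sketch of the second answer above. It is an illustration only and not eCapture code: the class, field, and function names are hypothetical, and the "node" and "item" labels are stand-ins for the exception levels described in the post.

```python
from dataclasses import dataclass
from typing import Iterable

# Item-level exception types that the post says are held back (with their families).
HELD_ITEM_EXCEPTIONS = {"Detect Container", "Read Email Fields"}


@dataclass(frozen=True)
class ProcessingException:
    """Hypothetical record of one exception raised during Streaming Discovery."""
    level: str  # "node" or "item" (illustrative labels, not product values)
    kind: str   # e.g. "Detect Container"


def holds_back(exc: ProcessingException) -> bool:
    """True if this exception keeps its file and family out of review."""
    if exc.level == "node":
        # Partially discovered or undiscoverable containers.
        return True
    # Item-level exceptions go to review unless they are one of the two listed types.
    return exc.kind in HELD_ITEM_EXCEPTIONS


def family_held_back(exceptions: Iterable[ProcessingException]) -> bool:
    """A family is held back if any exception in it triggers a hold."""
    return any(holds_back(e) for e in exceptions)


if __name__ == "__main__":
    partial_pst = [ProcessingException("node", "Folder could not be extracted")]
    other_item_error = [ProcessingException("item", "Some other item error")]
    print(family_held_back(partial_pst))       # True
    print(family_held_back(other_item_error))  # False
    print(family_held_back([]))                # False
```

The sketch only covers initial processing. It does not model what happens when a user re-queues a container or Publishes Errors.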
Archived
This topic is now archived and is closed to further replies.