Streaming VS Normal Discovery - MD5 Comparison


Avinash Yerramsetty

Hi All,

I see different MD5 values, and also different item counts, between a normal discovery job and a streaming discovery job, even though both jobs were given the same input.

Can anyone please tell me which values eCapture uses to generate MD5 hash values for normal discovery and for streaming discovery?

Details:

Version: 2017.3.5

Application Version: 17.4.10010.1809

Number of workers used: 4 (Add enabled)

Input for Discovery Job: PST file (94 MB)

Thanks,

Avi

  • S.W.A.T. Engineer

Avi, thank you for the details. For emails, the values available to calculate the hash are the same for both job types, but the hashing algorithm differs slightly between the two because of a technology difference. You can see the fields used for emails in the discovery options. For loose files or non-email file types, the entire file is hashed. Because of this difference, streaming jobs and standard jobs cannot deduplicate against each other.
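
To make the distinction concrete, here is a minimal Python sketch of the two hashing styles described above: hashing the whole file for loose files, and hashing a chosen set of fields for emails. It is not eCapture's actual algorithm. The function names, the field names, the normalization, and the separator are all assumptions made for illustration. It only shows why two engines that hash the same fields can still produce different MD5 values if they combine those fields differently.

```python
import hashlib


def file_md5(data: bytes) -> str:
    """Hash the entire file content (the loose-file case)."""
    return hashlib.md5(data).hexdigest()


def email_md5(fields: dict, selected: list, sep: str = "|") -> str:
    """Hash only the selected email fields, joined with a separator.

    The normalization (strip + lowercase) and the separator are
    hypothetical; a real product may do something different.
    """
    parts = [fields.get(name, "").strip().lower() for name in selected]
    return hashlib.md5(sep.join(parts).encode("utf-8")).hexdigest()


# Hypothetical email and field selection.
email = {
    "from": "a@example.com",
    "to": "b@example.com",
    "subject": "Status",
    "body": "Hello",
}
selected = ["from", "to", "subject", "body"]

# Same fields, same values, but a different way of combining them.
hash_a = email_md5(email, selected, sep="|")
hash_b = email_md5(email, selected, sep="\n")

print(hash_a == hash_b)  # the two digests differ
```

Because the digests differ even though the inputs are identical, hashes produced by one scheme cannot be matched against hashes produced by the other, which is consistent with the point that the two job types cannot deduplicate against each other.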

  • 2 weeks later...

Hi Micheal,

Thank you for your reply.

I also found a difference in item counts between normal and streaming discovery when I gave the same input data to both jobs.

Counts:

Normal Discovery: 14056

Streaming Discovery: 14052

Input for Discovery/Streaming Job: PST file (94 MB)

From your reply, it sounds like eCapture uses two different methods to extract data. If so, could you please explain the differences?

Thanks,

Avi

  • 2 weeks later...

Avi,

 

One other item that can contribute to count differences is how the two engines extract embedded items. For example, standard discovery extracts more embedded file types at the top level, while streaming goes deeper and extracts within several layers of embedded files. The two methods are mostly similar in file type support but differ in certain extraction methods.
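
As a rough illustration of how that can change item counts, the Python sketch below counts the items two hypothetical engines would extract from the same nested email. One supports more embedded types but stops at the first level. The other supports fewer types but descends several levels. The `Item` class, the type names, and the depth limits are assumptions for the example and do not describe eCapture's real engines.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Item:
    kind: str
    children: List["Item"] = field(default_factory=list)


def count_items(item: Item, supported: Set[str], max_depth: int, depth: int = 0) -> int:
    """Count the item plus every embedded child the engine would extract.

    A child is extracted only if its type is supported and the engine
    is still allowed to descend (depth < max_depth).
    """
    total = 1
    if depth < max_depth:
        for child in item.children:
            if child.kind in supported:
                total += count_items(child, supported, max_depth, depth + 1)
    return total


# Hypothetical message: a zip holding a doc holding an image, plus an OLE object.
message = Item("email", [
    Item("zip", [Item("doc", [Item("image")])]),
    Item("ole"),
])

# Broad type support, top level only (hypothetical settings).
shallow = count_items(message, {"zip", "doc", "image", "ole"}, max_depth=1)

# Narrower type support, several levels deep (hypothetical settings).
deep = count_items(message, {"zip", "doc", "image"}, max_depth=3)

print(shallow, deep)  # 3 4
```

The same input yields 3 items under the first settings and 4 under the second, so a small gap such as 14056 versus 14052 on a large PST is plausible under this kind of difference.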

Archived

This topic is now archived and is closed to further replies.
