Searching for duplicate Office documents on SharePoint

Question / Problem

I want to search for duplicate files on a SharePoint site. I know there are duplicate Office files, but TreeSize doesn't show them.

Other file types are being found as expected.

Answer / Solution

When an Office file is uploaded to SharePoint, SharePoint itself alters the file. As a result, the stored copies differ at the binary level and are no longer considered duplicates when their MD5 checksums are compared.
You can verify this by uploading a file twice, downloading both copies again, and comparing their binary contents and/or checksums.
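
The verification step can be scripted. The following is a minimal sketch in Python, not part of TreeSize; the helper names `md5_of` and `same_content` and the file names in the usage note are hypothetical and chosen only for illustration. It computes the MD5 checksum of each downloaded copy and reports whether they match.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(first, second):
    """Return True if both files have the same size and the same MD5 checksum."""
    first, second = Path(first), Path(second)
    # Files of different size cannot be identical, so skip hashing in that case.
    if first.stat().st_size != second.stat().st_size:
        return False
    return md5_of(first) == md5_of(second)
```

For example, after downloading both uploaded copies as `copy1.docx` and `copy2.docx` (placeholder names), calling `same_content("copy1.docx", "copy2.docx")` would be expected to return `False` if SharePoint altered the files as described above.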

Other file formats (e.g. PDF, PNG) are not altered by SharePoint by default, so duplicates of these files are found as expected.