- Occasionally a WorldCat Cataloging Partners (WCP) vendor sends an invoice more than once, which results in two or more copies of the same records in a file delivered for download by the user. OCLC’s system does not dedupe the files received from vendors; it is working as designed, and the duplicates originate with the provider. The user will need to de-dupe the files before uploading them into their ILS.
- WorldCat Collection Manager
Many OCLC member libraries use MarcEdit*, a free utility that they can download and use for de-duping and many other purposes. OCLC staff do not support MarcEdit, but the steps below may help a user through this process.
Deduping MARC files
MarcEdit Record DeDuplication with a MARC File for Windows
1. Open MarcEdit.
2. Open the Tools drop-down menu in the Toolbar at the top of the Window.
3. Select Find Duplicate Records.
4. Specify the file you need to deduplicate.
5. Specify the path to the file where you want to save the duplicates. In this example, the file is named Duplicates.
6. Click on the Save button.
7. Click on the Next button.
8. Select the OCLC Number for the Control Field.
9. Leave Dedup Keeping set to the default of "First Record".
10. Click on the Process button.
The saved file then contains the deduplicated records.
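For users who prefer a script, the same "keep the first record" logic can be sketched in Python. This is only an illustration and not part of MarcEdit or any OCLC tool. It assumes the file is well-formed binary MARC 21 (ISO 2709) and that the OCLC number is in field 001; the function names `split_records`, `control_number`, and `dedupe` are hypothetical.

```python
from typing import List, Optional, Tuple

FIELD_TERMINATOR = b"\x1e"


def split_records(data: bytes) -> List[bytes]:
    """Split binary MARC data into records using the length in each leader."""
    records = []
    pos = 0
    while pos < len(data):
        head = data[pos:pos + 5]
        # Stop at trailing bytes that do not look like a record leader.
        if len(head) < 5 or not head.isdigit():
            break
        length = int(head)  # leader bytes 0-4: total record length
        if length < 24:
            break
        records.append(data[pos:pos + length])
        pos += length
    return records


def control_number(record: bytes) -> Optional[str]:
    """Return the value of field 001, or None if the record has no 001."""
    base = int(record[12:17])        # leader bytes 12-16: base address of data
    directory = record[24:base - 1]  # 12-byte entries: tag, length, start
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        if entry[0:3] == b"001":
            length = int(entry[3:7])
            start = int(entry[7:12])
            raw = record[base + start:base + start + length]
            return raw.rstrip(FIELD_TERMINATOR).decode("utf-8").strip()
    return None


def dedupe(data: bytes) -> Tuple[bytes, bytes]:
    """Keep the first record for each 001 value; return (kept, duplicates)."""
    seen = set()
    kept, dupes = [], []
    for rec in split_records(data):
        key = control_number(rec)
        if key is not None and key in seen:
            dupes.append(rec)
        else:
            kept.append(rec)
            if key is not None:
                seen.add(key)
    return b"".join(kept), b"".join(dupes)
```

To use it, read the vendor file as bytes, pass the bytes to `dedupe`, and write the two results to separate files. Records without an 001 are always kept, since there is nothing to compare them on.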
This workflow will help libraries de-dupe their WorldCat Cataloging Partners record files on a Mac.
Use MarcEdit to export OCLC numbers (MARC field 001) into a tab-delimited text file that can be opened in Excel. You can optionally include the title field (245 $a) so the title appears in the list.
Use MARCsplit to separate the records into individual files
Have the records split into their own folder on the local computer
Set “Records per file” to 1
Leave the “Number of files” box unchecked
Review the list of OCNs; duplicates should be grouped together. If they are not grouped, use conditional formatting in Excel to highlight duplicate values.
Move one of the copies of each record into a new folder
Use MARCjoin to merge the single copies into one file for easy loading, or load the records one at a time.
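As an alternative to reviewing the OCN list by eye in Excel, the duplicates in the tab-delimited export can be listed with a short Python sketch. It assumes the OCLC number is in the first column and the optional title in the second; the function name `duplicate_ocns` is hypothetical, and the exact layout of the export may differ from what is assumed here.

```python
import csv
import io
from collections import Counter


def duplicate_ocns(tsv_text, ocn_column=0):
    """Return {OCN: count} for every OCN that appears more than once."""
    counts = Counter()
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        # Skip blank lines and rows that are too short to hold an OCN.
        if len(row) > ocn_column and row[ocn_column].strip():
            counts[row[ocn_column].strip()] += 1
    return {ocn: n for ocn, n in counts.items() if n > 1}


# Small made-up sample in the assumed layout: OCN, tab, title.
sample = "ocm1\tTitle A\nocm2\tTitle B\nocm1\tTitle A\n"
print(duplicate_ocns(sample))
```

A header row, if the export includes one, appears only once and so is not reported as a duplicate.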
*MarcEdit is not owned by OCLC; it is a free utility used by many libraries.