OCLC Expert Cataloging Community Sharing Session minutes, January 2019

Minutes of the OCLC Expert Cataloging Community Sharing Session
ALA Midwinter Meeting
Friday, 2019 January 25
10:30 a.m.-12:00 noon
Seattle, Washington

The ALA Midwinter 2019 Edition of Breaking Through: What’s New and Next from OCLC and the compilation of News From OCLC were distributed. The following items were highlighted:

The records of the National Library of Cuba are being added to WorldCat: 133,000 so far, of which 97,000 are unique records. The University of Florida is facilitating this project.
OCLC and Atlas Systems continue the rapid development of ILLiad and Tipasa, the first cloud-based ILL management system. There have already been many updates to the ILLiad software.
Five librarians have been selected as the 2019 IFLA/OCLC Fellows. They are from Nigeria, Lebanon, Mongolia, Jamaica and Bolivia. The fellowships not only provide the recipients with continuing education and exposure to a broad range of issues, but also allow OCLC to learn more about what is happening in libraries around the world. So far, 95 librarians from 42 countries have participated in this program.
Bart Murphy has been named OCLC Chief Technology and Information Officer. He has extensive experience in this work across many organizations and industries and is well prepared for the OCLC position.
Thomas Padilla is the new Practitioner Researcher-in-Residence at OCLC Research. This is a six-month appointment during which he will facilitate OCLC’s engagement with the library data science community.
The Linked Data Prototype Project of 2017-2018, in which several research, academic, public and federal libraries participated, has wrapped up. The participants used the high-quality sets of name entities available from FAST, VIAF and Wikidata to reconcile names of all sorts of entities, to create language-tagged headings, persistent identifiers and entity descriptions, and to describe relationships between entities.

Cynthia Whitacre (Manager, Metadata Policy) spoke to address this pre-submitted question:

“A topic I would like to hear more about is what the standards are for vendor-loaded records. For example, are there minimal requirements for completeness/correctness? Are they required to follow any cataloging rules like AACR2, RDA, ISBD, etc.? Libraries increasingly rely on vendor records as cataloging budgets shrink, but catalogers often complain about the quality of vendor-loaded records -- is OCLC aware of these complaints and are there any plans to improve these records?”: Vendor records run the gamut from being excellent to horrid. Most vendors fall somewhere in between those extremes. OCLC loaded a first file from a new vendor partner just last week where the records were EXCELLENT. Please don’t lump all vendor records in one bucket in terms of quality. For vendors that work directly with OCLC to load records, we ask that they load at least minimal level records, meeting the MARC minimal level standards, which translates to OCLC level K. A few vendors may contribute at Encoding Level 3. We load the vast majority of vendors via Data Sync, so you’ll see encoding level M assigned to their records. We do not allow sparse records from a vendor. We must have title, publisher or place of publication, and date. We also strongly suggest a numeric identifier, like ISBN, as well. We do not require AACR2, RDA, or ISBD, but if they code records as such, we expect them to follow those standards. We work with eBook vendors to assure they follow provider-neutral guidelines. A main thing we evaluate is whether records pass validation. Our Data Sync service has 3 levels of errors. We let minor errors be added to WorldCat, but we do not load records that have error levels 2 or 3 (serious and egregious). We also look for accuracy in content. We have evaluated numerous vendors that use MARC coding incorrectly or put names in direct order instead of indirect order (100 Firstname Lastname instead of 100 Lastname, Firstname). We don’t accept those for loading until they fix the errors. We’ve gotten quite strict about that in the past few years for newer vendors who wish to load.
What percentage of records get upgraded from Encoding Level M?: Upgrading Encoding Level M records requires a library to manually upgrade it or for one of the PCC libraries, including LC, to load a record to overlay it with a higher-level record. OCLC has no statistics on how often that happens.
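
The minimum content that Cynthia Whitacre described for vendor records (title; publisher or place of publication; date; a strongly suggested numeric identifier) can be pictured as a simple pre-load check. The following Python is only an illustrative sketch and is not OCLC's Data Sync validation. The record shape (a dict of MARC tags), the choice of fields 245, 260/264, 020 and 100, and the function names are all assumptions made for this example. The name-order test is a crude heuristic, so it only raises a warning for human review.

```python
from typing import Dict, List, Tuple

# Hypothetical record shape for illustration only:
# MARC tag -> list of fields, each field a dict of subfield code -> value.
Record = Dict[str, List[Dict[str, str]]]


def has_subfield(record: Record, tags: List[str], codes: List[str]) -> bool:
    """Return True if any listed tag has a non-empty value in any listed subfield."""
    return any(
        field.get(code, "").strip()
        for tag in tags
        for field in record.get(tag, [])
        for code in codes
    )


def looks_direct_order(name: str) -> bool:
    """Heuristic (an assumption): several words and no comma suggests
    'Firstname Lastname'. Single-word names are left alone."""
    return "," not in name and len(name.split()) > 1


def check_vendor_record(record: Record) -> Tuple[List[str], List[str]]:
    """Return (errors, warnings) for the requirements named in the minutes."""
    errors: List[str] = []
    warnings: List[str] = []
    if not has_subfield(record, ["245"], ["a"]):
        errors.append("missing title")
    if not has_subfield(record, ["260", "264"], ["a", "b"]):
        errors.append("missing publisher or place of publication")
    if not has_subfield(record, ["260", "264"], ["c"]):
        errors.append("missing date")
    if not has_subfield(record, ["020"], ["a"]):
        warnings.append("no numeric identifier such as an ISBN")
    for field in record.get("100", []):
        if looks_direct_order(field.get("a", "")):
            warnings.append("personal name may be in direct order")
    return errors, warnings
```

A record that carries only a 245 $a and a 100 $a of "Firstname Lastname" would come back with two errors (publisher or place, and date) and two warnings (identifier and name order).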

Charlene Morrison (Database Specialist II, Metadata Quality) gave an update on the OCLC Member Merge Project. It is open to all PCC members, who are encouraged to apply by contacting OCLC at AskQC@oclc.org. The training is modeled after the PCC training process. The libraries in training start off with three training webinars: an introduction to the process, a group review of some sample records, and an explanation of how the process actually works and the tools needed to get started. It is usually a three- to six-month process, starting with monographs, including both print and electronic, and then moving on to other formats. Archival materials, sound recordings, and maps are to be added soon. All training documents were recently consolidated in one place, which will facilitate the process. More than 26,000 records have already been merged. Ten libraries have achieved independence: all eight in the first two cohorts and two from the third cohort.

How can there be duplicates for archival materials?: Although duplicates may be less likely for archival materials, imperfections in the algorithms or different archival cataloging schemes at the same institution can result in duplicates. Loads of digitized archival materials also allow occasional duplicates to creep in.

Nathan Putnam (Director, Metadata Quality) reported on metadata remediation at OCLC. Staff receive about 2,500 merge requests monthly, which helps explain why getting to them may take a while. In December 2018 the metadata staff did batch updates of 8.7 million records. Since July 2018 the Duplicate Detection and Resolution (DDR) program has looked at more than 70 million records and removed 7 million.

The floor was opened for questions.

What is going on with Encoding Levels at OCLC?

In recent years, OCLC Metadata Quality has been working toward bringing OCLC-MARC into closer alignment with MARC 21 proper. This helps to facilitate the understanding and exchange of data globally. One of the major areas in which OCLC-MARC remains different from MARC 21 is the OCLC-defined alphabetic Encoding Levels (Leader/17; OCLC mnemonic “ELvl”) in bibliographic records. In the early days of MARC 21, only the Library of Congress was authorized to use the defined numeric Encoding Levels, so OCLC defined our own alphabetic codes for use by members of the cooperative. Over the decades, LC loosened its control over some of the numeric Encoding Levels. It’s now time to begin planning to convert the OCLC-defined Encoding Levels I, K, and M to the numeric codes in MARC 21. In December 2018, we held four focus groups with volunteer institutions representing academic, ARL, public, and special libraries, to discuss how Encoding Levels play into their workflows, what impact these envisioned changes might have, and how OCLC can make the changes as easy as possible. Since then, we have begun to discuss the criteria for converting Encoding Levels, how to retain some indication that a record was added to WorldCat via a batch process rather than manually, and other issues. There is currently no timeline for any of these changes and we will give you plenty of advance notice.
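
For a library that wants to gauge how many of its own records carry the alphabetic codes under discussion, a tally by Leader/17 is one possible starting point. The Python below is a minimal sketch that assumes leaders are available as plain 24-character strings. The function names are hypothetical, and no conversion to numeric codes is attempted, because the conversion criteria were still under discussion at the time of these minutes.

```python
from collections import Counter
from typing import Iterable

# The three OCLC-defined alphabetic Encoding Levels named in the minutes.
OCLC_ALPHA_ELVL = {"I", "K", "M"}


def encoding_level(leader: str) -> str:
    """Return Leader/17 (the Encoding Level) from a 24-character MARC leader."""
    if len(leader) != 24:
        raise ValueError("a MARC leader is exactly 24 characters long")
    return leader[17]


def tally_alpha_levels(leaders: Iterable[str]) -> Counter:
    """Count how many leaders carry each of the alphabetic codes I, K, and M."""
    return Counter(
        level
        for level in map(encoding_level, leaders)
        if level in OCLC_ALPHA_ELVL
    )
```

Leaders with a blank or numeric Encoding Level are simply left out of the tally.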

How is OCLC dealing with the PCC request for changes in punctuation?

Beginning on 2019 March 31, PCC participants will be allowed to omit terminal punctuation in all descriptive fields of new or authenticated PCC records. Any further PCC decisions will be announced later in 2019, but no earlier than May 2019. The notion of removing ISBD punctuation from bibliographic records has been around long enough that Descriptive Cataloging Form (Leader/18; OCLC mnemonic “Desc”) code values already exist for:

c - ISBD punctuation omitted - Descriptive portion of the record contains the punctuation provisions of ISBD, except ISBD punctuation is not present at the end of a subfield.
i - ISBD punctuation included - Descriptive portion of the record contains the punctuation provisions of ISBD.
n - Non-ISBD punctuation omitted - Descriptive portion of the record does not follow International Standard Bibliographic Description (ISBD) cataloging and punctuation provisions, and punctuation is not present at the end of a subfield.
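
As an illustration of where these three codes live, the short Python sketch below reads Leader/18 from a 24-character leader string and returns the label given in the list above. The function name and the sample leader are made up for the example, and any value other than c, i, or n is reported without interpretation because the minutes do not discuss other values.

```python
# Labels for the three Leader/18 (Desc) values listed in the minutes.
DESC_LABELS = {
    "c": "ISBD punctuation omitted",
    "i": "ISBD punctuation included",
    "n": "Non-ISBD punctuation omitted",
}


def describe_desc(leader: str) -> str:
    """Return the label for Leader/18, the Descriptive Cataloging Form."""
    if len(leader) != 24:
        raise ValueError("a MARC leader is exactly 24 characters long")
    code = leader[18]
    # Values not covered in the minutes are passed through unlabeled.
    return DESC_LABELS.get(code, f"code {code!r} not covered here")


# Hypothetical leader with 'c' in position 18.
print(describe_desc("00000nam a2200000 c 4500"))
```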

Many WorldCat records, particularly those coded for German language-of-cataloging (field 040 subfield $b coded “ger”) already lack ISBD punctuation. And of course, pre-AACR2 records have always had punctuation now considered not standard. OCLC is conferring with PCC and with Terry Reese/MarcEdit regarding the development of means by which to remove punctuation in WorldCat and to replace punctuation upon output when requested.

There was a question about overlaying vendor records.

If a vendor record duplicates a cataloger’s record, don’t use it for another purpose. Report it to OCLC, and staff will merge the two records. When updating a vendor record is harder than starting from scratch, you can use a macro to get rid of the junk in the vendor record.

What can be done about junk fields, including inappropriate subject headings and local 856 fields, that have transferred from incoming records and proliferated?

For too long, Data Sync was allowing the transfer of more data, including 6XX and 856 fields, than we would have preferred. Because of important revisions we have recently installed to rein in the excessive transfer of such fields, that problem has been alleviated. We have been working and will continue to work to clean up as much as we can. But we urge you to use the powers you have as members of the cooperative and the Expert Community to manually clean up those records that you encounter, as well.

Are members adding subfields $1?

Control subfield $1 was validated as part of the OCLC-MARC Update 2018, and some catalogers are using it. OCLC would much prefer that catalogers wait until there are clear guidelines about the use of subfield $1 and how its use differs from that of subfield $0.

When a heading is controlled, is subfield $1 deleted as subfield $0 is?

No, the subfield $1 stays in the field.

What are subfields $0 and $1 used for?

Control subfield $0 “contains the system control number of the related authority or classification record, or a standard identifier such as an International Standard Name Identifier (ISNI). These identifiers may be in the form of text or a URI.” Control subfield $1 “contains a Uniform Resource Identifier (URI) that identifies an entity, sometimes referred to as a Thing or Real World Object (RWO), whether actual or conceptual.”

Do subfields $0 get wiped out in vendor records when they are loaded into WorldCat?

If the access points to which they are attached are controlled to the authority file, yes, the explicit subfield $0 is removed. If the access point is not controlled, the associated subfield $0 remains in the field.
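
The behavior described in the last few answers (subfield $0 is dropped when an access point is controlled, subfield $1 stays, and both stay when the access point is not controlled) can be summarized in a few lines of Python. This is a sketch of the stated outcome only, not of how WorldCat implements controlling. The list-of-tuples field shape and the function name are assumptions, and the identifier values used with it below are placeholders.

```python
from typing import List, Tuple

# Hypothetical field shape: an ordered list of (subfield code, value) pairs.
Subfield = Tuple[str, str]


def subfields_after_control(
    subfields: List[Subfield], controlled: bool
) -> List[Subfield]:
    """Mirror the outcome described in the minutes.

    Controlled access point: explicit $0 is removed, $1 is kept.
    Uncontrolled access point: every subfield is kept as supplied.
    """
    if not controlled:
        return list(subfields)
    return [(code, value) for code, value in subfields if code != "0"]
```

For a field holding $a, $0 and $1, the controlled case returns $a and $1, and the uncontrolled case returns all three subfields unchanged.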

Wouldn't ISNI and ORCID identifiers be in the authority record?

Will additions be made to the dropdown menus in Connexion?

This answer has evolved since ALA Midwinter. OCLC is in the very early stages of rethinking the “Connexion client end-of-life.” Our OCLC colleague David Whitehair has recently been telling people both within and outside of OCLC that the Connexion client end-of-life would be somewhere between – and he was exact about this – next week and 43 years from now. As you probably know, NACO capabilities were added to Record Manager late in 2018, which was a big step forward. But the developments of Connexion and of Record Manager are being rethought substantially. Let me repeat that this is early and still vague, but Record Manager itself may not embody everything that the Connexion client has been able to do. Rather, our developers are thinking about a browser-based suite of applications and functions, some of which already exist (such as the WorldCat Metadata API) and some of which are yet to be developed. In other words, we may need to stop thinking about an “end-of-life date” for the Connexion client, but instead to think about redefining the Connexion client end-of-life entirely. That means, among many other things, that Connexion client dropdown menus may eventually change. Please consider becoming part of either of two groups that are helping to shape the future of both Record Manager and Connexion. The Record Manager Advisory Group focuses on new features for Record Manager proper. A second and newer group, the Connexion Advisory Group, will focus on Connexion users, their use cases, and how those needs can be met. You are urged to send your concerns and interest in either of those groups to RM-Product@oclc.org. And don’t forget about the OCLC Community Center (https://www.oclc.org/community/home.en.html), which has a whole area devoted to Metadata Services, including Record Manager.

Can OCLC create new authority records on behalf of non-NACO institutions?

Send a message to authfile@oclc.org and Metadata Quality staff will take care of that for you. Supply a copy of the title page and other relevant data. You can also access an online form for such requests via Bibliographic Formats and Standards Chapter 5, Quality Assurance (https://www.oclc.org/bibformats/en/quality.html).

Please remember to join OCLC Metadata Quality staff to discuss WorldCat quality issues and cataloging questions on the second Wednesday of most months for Virtual AskQC Office Hours (https://help.oclc.org/WorldCat/Metadata_Quality/AskQC).

Respectfully submitted by
Doris Seely
University of Minnesota
2019 February 1

With contributions and edits from Bryan Baldus, Luanne Goodson, Charlene Morrison, Nathan Putnam, Cynthia Whitacre, and Jay Weitz.

OCLC
2019 March 26