OCLC Support

Data extract

Find information about creating and delivering a scoped and effective catalog extract to be loaded into GreenGlass.

 Note: If your library uses WMS to manage local collections, OCLC staff will extract WMS data on your behalf for use in GreenGlass.  

If your library does not use WMS, you need to pull and deliver a well-scoped, effective catalog extract. The extract should include bibliographic, item, and transaction data for all relevant resources. Once GreenGlass has received your extract, it will cleanse, normalize, filter, and parse your records into GreenGlass databases. Your titles are then matched against the WorldCat, HathiTrust, and CHOICE databases before being loaded into GreenGlass.


As you prepare to pull data from your local system, the first consideration is the scope. 

Before extracting records, decide which collections, locations, and formats are to be included in the analysis. Err on the side of inclusion, as GreenGlass will filter out any records inadvertently included. Also consider whether to include or exclude items that are non-circulating, lost, missing, or damaged. If you do not specify your filtering, GreenGlass will apply additional filters after receipt of your records.

 Note: Some GreenGlass projects are designed to include a comparison of the library’s print book collection and the library’s owned ebooks.  If this is an attribute of your library’s analysis, please be sure to exclude your ebook records from the primary extract and deliver a separate file of ebook records, clearly labeled as such.  

In most catalogs, ebooks owned in perpetuity are represented by discrete MARC records but have no corresponding items.  Discuss any situations that do not match this expectation with your liaison.

MARC records for print monographs

Bibliographic data is typically delivered in the form of MARC records. Records for print monographs have:

  • Record type a (print resource) in MARC leader byte 06
  • Bib level m or a (monograph or monograph part) in MARC leader byte 07

MARC XML files are acceptable, but standard MARC is preferred. 
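Before exporting, it can help to spot-check that your records carry the expected leader values. The following is a minimal sketch that assumes you already have each record's 24-character leader as a string. The function name and the sample leaders are illustrative and are not part of any GreenGlass tool.

```python
def is_print_monograph(leader: str) -> bool:
    """Return True when a 24-character MARC leader describes a print monograph."""
    if len(leader) != 24:
        return False
    record_type = leader[6]  # leader byte 06: "a" expected
    bib_level = leader[7]    # leader byte 07: "m" or "a" expected
    return record_type == "a" and bib_level in ("m", "a")


# Illustrative leaders only; real leaders come from your own catalog export.
sample_leaders = {
    "print book": "00714cam a2200205 a 4500",
    "serial": "00714cas a2200205 a 4500",
    "map": "00714cem a2200205 a 4500",
}
results = {label: is_print_monograph(leader) for label, leader in sample_leaders.items()}
print(results)
```

Because GreenGlass filters out-of-scope records after receipt, a check like this is best used to confirm the counts you expect rather than to strip records aggressively.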

Item and usage data

In addition to your in-scope bibliographic records, include all the associated item and usage data. Item data can be embedded in the MARC bibliographic record (often in the subfields of a 9xx field) or delivered in a separate tab-delimited file. When item or usage data is delivered in stand-alone files, include the bibliographic record number with each item record so that each item can be linked back to the appropriate bibliographic record.
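If you deliver items in a stand-alone file, a quick check that every item row carries a usable bibliographic record number can catch linking problems early. This is a minimal sketch only: the column names, record numbers, and barcodes are hypothetical, and your system's export will label them differently.

```python
import csv
import io

# Hypothetical tab-delimited item file; column names are illustrative only.
ITEM_FILE = (
    "bib_record_number\titem_record_number\tbarcode\n"
    "b1001\ti2001\t31234000000011\n"
    "b1001\ti2002\t31234000000029\n"
    "\ti2003\t31234000000037\n"
    "b9999\ti2004\t31234000000045\n"
)

# Record numbers of the bibliographic records in the extract (also illustrative).
BIB_NUMBERS = {"b1001", "b1002"}


def unlinked_items(item_file, bib_numbers):
    """Return item record numbers that cannot be linked to a bib record."""
    reader = csv.DictReader(item_file, delimiter="\t")
    return [
        row["item_record_number"]
        for row in reader
        if row["bib_record_number"].strip() not in bib_numbers
    ]


problems = unlinked_items(io.StringIO(ITEM_FILE), BIB_NUMBERS)
print(problems)
```

In this sample, one item has no bibliographic record number and another points to a record that is not in the extract, so both are reported.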

Below is a list of the most critical item-level details that will be integrated into GreenGlass: 

Item-level detail, followed by its description:

Item record number

The item record number and the barcode are standard identification numbers that are included in item lists exported from GreenGlass.

Item-level call number

In GreenGlass, the call number that appears on the spine of the book is displayed, so an item-level call number is best. If there is no item-specific call number, we will pull one from the bibliographic record based on a call number hierarchy established via your cataloging and data questionnaire.

Enumeration (volume number)
Copy number

Enumeration data or volume numbers are used in combination with copy numbers to help distinguish between multi-volume sets and multiple copies of a title. If volume numbers have been appended to your call numbers, we will attempt to identify and harvest them from there.

Collection code and name
Location code and name

GreenGlass supports two levels of location-related data elements, referred to as Collection and Location Codes. It is possible that they are labeled differently in your system, but be sure to include both if needed to locate items in your library.

Item create date

The item create date is used as the acquisition or add date, but if it is not available, we can use a bib-level add or cataloging date instead.

Item type code
Item status code

Item type and status codes are included in lists exported from GreenGlass. GreenGlass uses them to ensure appropriate filtering of your dataset.

Total charges

  • In-house uses or reshelving counts
  • Year-to-date charges
  • Reserve charges
  • Historical charges

With regard to usage data, we want to collect as much as possible. For most libraries, total charges will contain the complete tally. However, if there are other usage tallies that are not already included in total charges, we encourage you to send them separately. They will be combined in GreenGlass.

Additional charges in your library may include in-house browses or re-shelving counts, year-to-date charges, reserve charges, or historical charges that were not migrated into your current system. These secondary tallies should only be sent if they have not already been included in the total charges.

Last charge date

In addition to the number of charges, also include the last charge date to the extent that it is available. GreenGlass will make use of data from multiple sources if necessary.

For GreenGlass to take advantage of secondary or supplemental files of transaction information, an item record number or barcode must be associated with each line of data.  
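To illustrate why each line of a supplemental file needs an item record number or barcode, the sketch below combines a primary usage tally with a supplemental one, keyed by barcode. It is a simplified illustration under stated assumptions, not a description of how GreenGlass itself merges data: the field names and sample values are hypothetical, and it assumes the supplemental charges are not already counted in total charges.

```python
from datetime import date


def combine_usage(primary, supplemental):
    """Add supplemental charges to each item's total and keep the latest charge date."""
    combined = {key: dict(values) for key, values in primary.items()}
    unmatched = []
    for row in supplemental:
        item = combined.get(row.get("barcode"))
        if item is None:
            # No usable key: the row cannot be attached to any item.
            unmatched.append(row)
            continue
        item["total_charges"] += row.get("charges", 0)
        dates = [d for d in (item["last_charge_date"], row.get("last_charge_date")) if d]
        item["last_charge_date"] = max(dates) if dates else None
    return combined, unmatched


# Hypothetical sample data.
primary = {
    "31234000000011": {"total_charges": 4, "last_charge_date": date(2015, 3, 2)},
    "31234000000029": {"total_charges": 0, "last_charge_date": None},
}
supplemental = [
    {"barcode": "31234000000011", "charges": 7, "last_charge_date": date(2009, 11, 20)},
    {"barcode": "31234000000029", "charges": 2, "last_charge_date": date(2018, 1, 5)},
    {"barcode": None, "charges": 3, "last_charge_date": None},
]

combined, unmatched = combine_usage(primary, supplemental)
print(combined)
print(len(unmatched))
```

The third supplemental row has no barcode, so its charges cannot be credited to any item. This is the situation the requirement above is meant to prevent.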

Review and deliver extraction data

Once you have compiled your extraction of bibliographic, item, and circulation data, open and review the files.  Multiple files are acceptable.

Check that the numbers of records match your intent and that all the relevant item details have been included. If you have delivered item and/or transaction data in .csv or Excel files, be sure that each line represents a single item and that each column contains a header that will be easy to interpret.
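Parts of this review can be scripted. The following is a minimal sketch, assuming a comma-delimited item file with a column that identifies each item. The column name `item_record_number` and the sample rows are hypothetical. It counts the rows, flags columns with blank headers, and lists item identifiers that appear on more than one line.

```python
import csv
import io
from collections import Counter


def review_item_file(item_file, id_column):
    """Summarize row count, blank column headers, and repeated item identifiers."""
    reader = csv.DictReader(item_file)
    headers = reader.fieldnames or []
    blank_headers = [
        position
        for position, name in enumerate(headers, start=1)
        if not (name or "").strip()
    ]
    ids = [row[id_column] for row in reader]
    repeated = sorted(key for key, count in Counter(ids).items() if count > 1)
    return {
        "row_count": len(ids),
        "blank_header_columns": blank_headers,
        "repeated_item_ids": repeated,
    }


# Hypothetical sample: the third column has no header and item i2001 appears twice.
SAMPLE = (
    "item_record_number,barcode,,total_charges\n"
    "i2001,31234000000011,x,4\n"
    "i2002,31234000000029,y,0\n"
    "i2001,31234000000037,z,1\n"
)

report = review_item_file(io.StringIO(SAMPLE), "item_record_number")
print(report)
```

The row count from a report like this can also be quoted in the file description you send to your liaison.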

Deliver extraction data via FTP

Once reviewed, deliver the data via FTP. Email your GreenGlass liaison a description of the files, including the numbers of bibliographic and item records.

Your GreenGlass liaison will send your FTP credentials once they have received a completed Cataloging and Data Questionnaire and Code Keys for collection, location, status, and item type codes.

GreenGlass tasks

Once GreenGlass has received your extract, it will:

  • Filter out-of-scope bibliographic records (eBooks, maps, DVDs, Gov Docs)
  • Eliminate duplicate bibliographic records
  • Choose and normalize call numbers
  • Eliminate trailing spaces in control numbers
  • Validate OCLC numbers
  • Perform LCCN/ISBN/title-string lookups for records lacking an OCLC number
  • Identify and accommodate unusual implementations of MARC
  • Identify bibliographic records without items and items with multiple bibliographic records
  • Map item-level data and interpret codes
  • Assign LC and/or Dewey Classes to records
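If you would like to pre-check control numbers in your own data before delivery, the sketch below shows one way to trim stray whitespace and reduce an OCLC number to its bare digits. It is an illustration only and does not describe GreenGlass's actual process. It assumes the commonly seen `(OCoLC)`, `ocm`, `ocn`, and `on` prefixes, and the function name is hypothetical.

```python
import re

# Optional "(OCoLC)" prefix, then an optional ocm/ocn/on prefix, ignoring case.
_OCLC_PREFIX = re.compile(r"^(?:\(OCoLC\))?\s*(?:ocm|ocn|on)?\s*", re.IGNORECASE)


def normalize_oclc_number(raw):
    """Return the bare OCLC number as digits, or None if the value does not look valid."""
    value = _OCLC_PREFIX.sub("", raw.strip())
    if not re.fullmatch(r"[0-9]+", value):
        return None
    # Drop leading zeros; a value made only of zeros is treated as invalid.
    return value.lstrip("0") or None


samples = ["(OCoLC)ocm00012345 ", "ocn123456789", "on1234567890  ", "12345abc", ""]
normalized = [normalize_oclc_number(sample) for sample in samples]
print(normalized)
```

Values that come back as `None` are candidates for a closer look, since records without a usable OCLC number depend on the LCCN, ISBN, or title-string lookups listed above.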

At various points in this process, your liaison may be in touch with questions.