Collecting data on open access publications

Author: Amy Devenney (Research and Business Intelligence Strategic Lead, Jisc)

An article by the Knowledge Exchange (KE) highlighted the difficulty of efficiently collating consistent article-level metadata to enable the monitoring and evaluation of Transitional Agreements (TAs), and the burden this placed on our members. Over the last eighteen months, we have therefore been working with publishers to collect the metadata elements recommended by KE and the Efficiency and Standards for Article Charges (ESAC). We are now working with 31 publishers who provide us with regular metadata reports on outputs under Transitional and Native open access agreements.

How do we get the data? 

Using a template created by the KE Monitoring OA Group (that repurposes the article-level metadata checklist from the article mentioned above), we have worked with these 31 publishers to implement the template as a standard reporting tool. This template identifies the key metadata that institutions and consortia need in order to monitor the volume and compliance of output, and to undertake more detailed evaluation, such as the value and equity of use of the agreements across the consortium and mission groups.

As the license terms for an agreement are being arranged, we work with the publisher to discuss the metadata that they can provide and where this comes from, so that we can understand the publisher’s internal metadata workflow. We also agree a reporting schedule with the publisher that fits both Jisc’s reporting needs and the publisher’s ability to provide data.

What do we do with the data?

Once we have received the data we clean, verify, enhance and standardise it to ensure the dataset is complete and will allow holistic and comparable analysis across all publishers. The following work is carried out: 

  • DOIs: these are cleaned and de-duplicated to avoid the double counting of articles and to enable the use of external DOIs to enhance the data, 
  • Institution name: these are standardised to the legal name, and persistent identifiers (Ringgold and Jisc ID) are added to verify that the institution is subscribed to the agreement and to facilitate detailed analysis, 
  • Currency: where the APC list price is included this is standardised to GBP to enable comparable analysis in one standard currency, 
  • Article type: these are mapped to the COAR 3.0 standard following discussion with the publisher to ensure we have a coherent article type across all publishers, 
  • License type: these are mapped to the CC BY standard to enable us to monitor compliance across publishers. 
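The DOI step above can be sketched in a few lines of Python. This is a minimal illustration and not a description of our actual pipeline: the record layout (a dictionary with a "doi" key), the function names and the list of prefixes being stripped are all assumptions made for the example.

```python
import re

# Prefixes that sometimes appear in front of a DOI in publisher reports.
# This list is an assumption for the example and is not exhaustive.
_DOI_PREFIX = re.compile(r"^(https?://(dx\.)?doi\.org/|doi:\s*)", re.IGNORECASE)


def normalise_doi(raw: str) -> str:
    """Strip whitespace and resolver prefixes, then lowercase the DOI.

    DOIs are case-insensitive, so lowercasing lets two spellings of the
    same DOI compare as equal.
    """
    doi = raw.strip()
    doi = _DOI_PREFIX.sub("", doi)
    return doi.lower()


def deduplicate(records):
    """Keep the first record seen for each normalised DOI.

    Records with no DOI are kept as they are, because they cannot be
    matched against anything.
    """
    seen = set()
    unique = []
    for record in records:
        doi = normalise_doi(record.get("doi", ""))
        if not doi:
            unique.append(record)
            continue
        if doi not in seen:
            seen.add(doi)
            unique.append({**record, "doi": doi})
    return unique
```

With this approach, "https://doi.org/10.1000/ABC" and "doi:10.1000/abc" are treated as the same article and counted once.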

We also verify that all the institutions listed in the publisher’s report are subscribed to the agreement so that we can exclude any records that have been sent in error. We then enhance the data by using Crossref to add the funders of an article and Unpaywall to show the open access (OA) status of the article. This provides us with a cleaned, verified and standardised dataset of articles accepted and published under a Jisc agreement. In 2021 we received data from 97% of publishers with a TA and 100% of publishers with a Native Open Access agreement.
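As a rough illustration of the verification and enhancement steps, the sketch below separates out records from non-subscribing institutions and reads funder names from a Crossref work record that has already been retrieved and parsed. The field name "institution_id" and both function names are hypothetical, and the code assumes the Crossref REST API response shape, in which funders are listed under "message" and "funder".

```python
def split_by_subscription(records, subscriber_ids):
    """Separate records into those from subscribing institutions and the rest.

    `subscriber_ids` is a set of institution identifiers for one agreement.
    The "institution_id" key is a hypothetical field name.
    """
    kept, excluded = [], []
    for record in records:
        if record.get("institution_id") in subscriber_ids:
            kept.append(record)
        else:
            excluded.append(record)
    return kept, excluded


def funder_names(crossref_work):
    """Return the funder names from a parsed Crossref work response.

    Assumes the response is a dict with an optional list at
    message -> funder, where each entry may carry a "name".
    """
    message = crossref_work.get("message", {})
    return [f["name"] for f in message.get("funder", []) if "name" in f]
```

Keeping the excluded records in their own list, rather than discarding them, makes it easier to go back to the publisher about entries that look as though they were sent in error.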

How the data supports the transition to open access 

We are supporting higher education with the transition to open access (OA) through the negotiation of a range of agreements that open up publishing opportunities – wherever the author is based, whatever their funding situation and whichever venue they choose to publish in. 

The collection, verification and analysis of the article level metadata alongside publisher and sector data enables us to monitor the effectiveness and administrative implications of these OA agreements, and we use this evidence to inform our negotiation objectives. 

Next steps

We are exploring how we can work with the cleaned, verified and standardised dataset of articles accepted and published under a Jisc agreement to benefit the community, and currently have several strands of activity: 

  • Analysing the data internally to look at year-on-year patterns and trends at the sector, band and mission group level, 
  • Comparing the data with pre-TA data to understand and evaluate the impact, value and costs of these agreements, 
  • Working with the Transitional Agreements Oversight Group (TAOG) to develop a series of dashboards that will enable them to collectively evaluate the impact of TAs and inform prospective post-TA business models, 
  • Working with a number of institutions on a data verification project to identify any inaccuracies and missing entries, and to evaluate the metadata fields available. 

Contact us

We are also exploring ways in which this dataset could be made available to you, our members, to help your internal processes and procedures. Please get in touch if you have any thoughts or ideas.

You can also follow us on Twitter and LinkedIn to keep up to date with Jisc data analytics.
