The Scholars Portal Dataverse team has been hard at work on the new Dataverse Data Curation Tool as part of our CANARIE RDM grant project.
Development on this project is being led by Victoria Lubitch, Programmer/Analyst at Scholars Portal.

The Data Curation Tool (DCT) allows data owners and curators to create and edit variable-level metadata for any tabular file in a dataset.
Users can access this tool as a modular application once they’ve uploaded a
tabular file (e.g., SPSS, R, Excel, CSV) to a dataset in Dataverse.

The Data Curation Tool

Similar to tools like SPSS, the DCT allows users to view summary statistics about their data, add variable information like 'Interviewer Instructions'
or 'Notes', create variable groups, and indicate weighting variables.
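
To illustrate why indicating a weighting variable matters, here is a minimal Python sketch of summary statistics for one variable, with and without case weights. It is not the DCT's code: the `summarize` function, the sample values, and the weights are all hypothetical and chosen only for illustration.

```python
def summarize(values, weights=None):
    """Summarize one numeric variable; None marks a missing value."""
    if weights is None:
        # Hypothetical default: every case counts equally.
        weights = [1.0] * len(values)
    # Keep only the cases that have a recorded value.
    pairs = [(v, w) for v, w in zip(values, weights) if v is not None]
    total_weight = sum(w for _, w in pairs)
    return {
        "valid": len(pairs),
        "missing": len(values) - len(pairs),
        "min": min(v for v, _ in pairs),
        "max": max(v for v, _ in pairs),
        "mean": sum(v * w for v, w in pairs) / total_weight,
    }


# Illustrative data only: four cases, one with a missing age.
ages = [20, 30, None, 40]
case_weights = [1.0, 1.0, 1.0, 2.0]

unweighted = summarize(ages)
weighted = summarize(ages, case_weights)
```

In this sketch the last case counts twice, so the weighted mean is pulled toward its value while the valid and missing counts stay the same.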

Summary Statistics in the DCT

Variable editor in the DCT

Once edits have been completed and saved back to Dataverse, these changes can then be downloaded as an XML file or exported to a codebook.
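
As a rough idea of what variable-level metadata can look like in XML, the Python sketch below builds a small fragment with the standard library. The element and attribute names (`dataDscr`, `var`, `labl`, `notes`, `wgt`) are loosely modeled on DDI Codebook conventions as an assumption; this is not claimed to match the exact file Dataverse produces, and the `variable_to_xml` helper and sample variables are hypothetical.

```python
import xml.etree.ElementTree as ET


def variable_to_xml(name, label, notes=None, is_weight=False):
    """Build one illustrative <var> element (names are assumptions)."""
    var = ET.Element("var", {"name": name})
    if is_weight:
        # Flag this variable as a weighting variable.
        var.set("wgt", "wgt")
    ET.SubElement(var, "labl").text = label
    if notes:
        ET.SubElement(var, "notes").text = notes
    return var


# Hypothetical variables, for illustration only.
data_dscr = ET.Element("dataDscr")
data_dscr.append(variable_to_xml("age", "Age of respondent", notes="Top-coded at 90"))
data_dscr.append(variable_to_xml("wt", "Survey weight", is_weight=True))

# Serialize the fragment to a string.
xml_text = ET.tostring(data_dscr, encoding="unicode")
```

Parsing `xml_text` back with `ET.fromstring` recovers the same two variables, which is a quick way to check that a downloaded file is well-formed.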

Example of a codebook in Dataverse

Usability testing sessions were recently completed with five participants, who worked through a series of tasks and helped us identify areas where
the tool’s user experience could be improved. We’re now working on translating the tool into French, with translations provided by
the University of Ottawa.

A demo of this tool is available online, and the code can be accessed on GitHub. The Data Curation Tool will be launched with the next
Scholars Portal Dataverse upgrade, currently scheduled for the end of October, and will be available for community testing soon.

If you have any comments or suggestions, please contact us. If you would like to see all the updates and
have a SpotDocs account, click the "Watch this blog" button in the top right corner of the page to receive notifications.


Welcome to our Scholars Portal Dataverse blog, where we will be sharing news and updates about the Dataverse platform and service, including development work. Our first blog post provides an update about the development project "Dataverse for the Canadian Research Community"! This project is funded by CANARIE's RDM grant program and led by Scholars Portal and University of Toronto Libraries, with support from CARL and Portage. 

We're currently about halfway through our 18 months of development work (October 2018 to March 2020).

The aim of the grant is to enhance Dataverse to address the needs of a broad range of researchers in Canada through improved scalability, improved integrations with Canadian cloud storage and authentication providers, and better support for data curation workflows. These three areas of development are described further below and will be discussed in more detail in future blog posts. 


The goals of the first leg of the project include:

  • Optimize system architecture for scalable use
  • Connect to existing Canadian cloud data storage environments
  • Support large files in upload/download contexts

Planned deliverables:

  • Develop and test connections to Swift object storage, such as the Ontario Library Research Cloud (OLRC)
  • Support Globus endpoints with file access mediated outside of the Dataverse application
  • Develop a large-file upload utility to support deposit of larger files (2 GB+) into Dataverse


The goals of the second leg of the project include:

  • Integrate with Canadian authentication infrastructure
  • Streamline login workflows

Planned deliverables:

  • Integrate Dataverse with CAF Shibboleth Login for single sign-on
  • Investigate further integration with ORCID to support linking research outputs

Data Curation

The goals of the third leg of the project include:

  • Enhance multi-disciplinary support for data curation
  • Enable users to adopt metadata standards and best practices

Planned deliverables:

  • Data Curation Tool, a modular application integrated within Dataverse that allows users to create and edit variable-level metadata for tabular data files to aid data reusability

Status update

Our Project Timeline & Deliverables roadmap is included below. We have completed our first two deliverables and are currently working on the third.

For our first deliverable, connecting Dataverse with Swift as the primary storage service, we stood up a test instance of Dataverse connected to the OLRC. The SP team tested upload and download functionality, as well as the integrity of stored files, with a variety of file types and sizes, along with other functionality core to Dataverse. This type of configuration would allow us to scale the system more easily, add storage resources, and run the platform more efficiently.
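
One common way to check the integrity of a stored file is to compare a checksum of the original with a checksum of the copy downloaded from storage. The Python sketch below shows that general idea only; it is not the team's actual test procedure. The `checksum` helper, the choice of SHA-256, and the in-memory sample data are assumptions made for illustration.

```python
import hashlib
import io


def checksum(stream, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a binary stream in chunks so large files never sit fully in memory."""
    digest = hashlib.new(algorithm)
    # Read until the stream returns an empty bytes object (end of file).
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


# Stand-ins for the uploaded file and the copy downloaded from storage.
original = io.BytesIO(b"id,age\n1,20\n2,30\n")
downloaded = io.BytesIO(b"id,age\n1,20\n2,30\n")

files_match = checksum(original) == checksum(downloaded)
```

With real files, the streams would come from `open(path, "rb")` instead of `io.BytesIO`; reading in chunks keeps memory use flat regardless of file size.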

We have also successfully configured Dataverse to work with Shibboleth for single sign-on using the University of Toronto as the test case. We are now initiating a pilot project with interested institutions to test out new sign-up and login workflows. More details to come in another blog post.

Currently, we are working on our third deliverable: developing the Data Curation Tool. We presented the DCT prototype at NADDI (link to slides) and at the Dataverse Community Meeting (link to slides). Feel free to test out the Data Curation Tool prototype, and stay tuned for a future blog post describing the development of this tool.

In the fall, we will start to focus on the large-file support and storage connection pieces of the project.

We will be sharing more details about these deliverables and the development work in upcoming blog posts! If you have any comments or suggestions, please feel free to contact us. If you would like to see all the updates and have a SpotDocs account, click the "Watch this blog" button in the top right corner of the page to receive notifications.

Project Timeline & Deliverables

The Scholars Portal Dataverse is a repository for research data collected by individuals and organizations affiliated with Ontario universities. It is open to deposits from anyone in the world, and has over 78 Dataverses, with over 500 studies deposited to date.

To update our existing infrastructure, we are excited to announce that we will be upgrading to Dataverse 4 over the summer and fall of 2016. The upgrade will improve the overall look and feel of the Scholars Portal Dataverse, and offer new features for data management.

Fall: Release

On September 30th, 2016, Scholars Portal will release a new version of Dataverse. The service will be down from September 26th to 30th so that we can upgrade and test the new system.

If you have questions about this upgrade, please contact us.