Finding and collecting data

Datasets can be found in many places, including healthcare providers, data repositories and registers.

Finding healthcare data in Region Stockholm

The Center for Health Data offers researchers a secure, uniform process for the delivery of health data. The use of the data should lead to better prevention, diagnostics and treatment.

The Center for Health Data simplifies access for researchers who have ethical approval and a research plan, since they do not need to turn to several different healthcare providers to gain access to health data.

Collecting healthcare data

If data is collected in healthcare (e.g. within the framework of clinical studies) and the collection involves recurring patient contact, the hospital or region is often also a principal (and data controller) for the study. In that case, it is important to list both KI and the healthcare principal(s) in the ethics application. If necessary, additional agreements can be drawn up to clarify the division of responsibilities between the principals.

If the collection does not involve recurring patient contact (e.g. if you want to use existing data and/or samples), it may be sufficient for KI alone to be the principal (and data controller) for the study.

To transfer data from the healthcare system to KI for research analyses, disclosure (utlämning) of data is usually used, and KI takes over responsibility for the data received. For more information on the disclosure (utlämning) of data from Karolinska University Hospital, see their webpage.

In exceptional cases, data is not disclosed; instead, a data processor agreement is drawn up that regulates what KI may and may not do with the data received.

Finding data in repositories


Datasets can be found in repositories.

There are controlled-access repositories for sensitive data, such as the European Genome-Phenome Archive, as well as discipline-specific repositories and general data repositories such as Zenodo and Figshare.

Re3data is a registry of data repositories that maps platforms for research data from all over the world. It lists general, discipline-specific and institutional repositories.

The Swedish National Data Service (SND) is a national platform for submitting research data. Data can be shared openly or, when a dataset cannot be shared openly for some reason, made searchable by describing only its metadata. Since 1 January 2018, Karolinska Institutet has been part of a consortium with SND together with several other major Swedish universities. SND has a guide on data management that offers support for handling data throughout the research process. A web form for describing and submitting research data is available on SND's website.

Finding register data

Register data can provide large amounts of data for analysis. In most cases the data has already been collected and is available for research.

Sweden is known for its quality registers, and in combination with the Swedish personal identification numbers (personnummer), these provide great possibilities for merging and adding value to the data.
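As a rough illustration of what merging on a shared identifier means in practice, the sketch below joins two small synthetic registers on a common key. The function name `link_registers`, the field names and all records are hypothetical and invented for this example; the identifiers are made up and are not real personal identification numbers. Actual register linkage is normally carried out by the register holders or under their conditions, so this only shows the principle.

```python
from collections import defaultdict


def link_registers(left, right, key="pid"):
    """Inner-join two lists of dict records on a shared identifier."""
    # Index the right-hand register by identifier; one person may have many rows.
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)

    linked = []
    for row in left:
        # Only identifiers present in both registers produce output rows.
        for match in index.get(row[key], []):
            linked.append({**row, **match})
    return linked


# Synthetic records (hypothetical identifiers, codes and drug names).
diagnoses = [
    {"pid": "A001", "diagnosis": "I10"},
    {"pid": "A002", "diagnosis": "E11"},
    {"pid": "A003", "diagnosis": "J45"},
]
prescriptions = [
    {"pid": "A001", "drug": "enalapril"},
    {"pid": "A002", "drug": "metformin"},
    {"pid": "A002", "drug": "insulin"},
]

linked = link_registers(diagnoses, prescriptions)
for row in linked:
    print(row)
```

Each linked row carries variables from both registers, which is where the added value comes from. A person who appears in only one register (here `A003`) is dropped by this inner join, so the choice of join type affects which individuals remain in the analysis.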

In order to use register data for research, ethical approval is needed as well as approval from the agency or organization that provides the source data.

Some common sources of register data are listed below:

The NIH database dbGaP

All requests for access to dbGaP have to go through the KI Grants Office, since the NIH requires that the university vouches for all its researchers.

The data access request form can be downloaded from the KI website.


Examples of tools/resources for data collection

KI Survey




Contact the Research Data Office

If you have questions about finding register data, please contact the Research Data Office.
