Our goal is to offer customers ready-made datasets on thematic subject areas. Currently we offer a ready-made dataset collection consisting of the registry data collected in the FinRegistry research project and a COVID-19-themed ready-made dataset.
Ready-made datasets are, as the name suggests, pre-compiled and pre-processed datasets that are available more quickly, without the need for cost estimates or extraction fees from controllers.
How to apply for a ready-made dataset?
Apply for the ready-made dataset by using the data permit application form in Findata’s e-service (asiointi.findata.fi).
The data to be handed over are tailored and pseudonymized separately for each customer and permit. Ready-made datasets are individual-level data that can only be analyzed in a secure processing environment that meets the requirements. The primary processing environment is Findata’s Kapseli.
How much does a ready-made dataset cost?
A data permit for one ready-made dataset costs 300 euros. In addition to the data permit fee, Findata’s extraction costs are charged based on the amount of work involved.
If you want to analyze the data in an environment other than Findata’s Kapseli, two working hours, i.e. 294,00 EUR (VAT +0%), will be charged for delivery. We will inform the client of the costs if the workload estimate exceeds two hours.
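The fee components described above can be combined into a rough estimate. The following is a minimal sketch, not Findata's pricing logic: the function name `estimate_cost` and its parameters are hypothetical, the hourly rate of 147 EUR is only implied by "two working hours, i.e. 294,00 EUR", and billing extraction work at that same hourly rate is an assumption.

```python
from decimal import Decimal

# Figures taken from the text above (VAT 0%).
PERMIT_FEE = Decimal("300.00")         # data permit for one ready-made dataset
TWO_HOUR_DELIVERY = Decimal("294.00")  # delivery to an environment other than Kapseli

# Hourly rate implied by "two working hours, i.e. 294,00 EUR".
HOURLY_RATE = TWO_HOUR_DELIVERY / 2


def estimate_cost(extraction_hours, other_environment=False, delivery_hours=2):
    """Rough cost estimate in EUR for one ready-made dataset.

    extraction_hours: hypothetical number of hours of extraction work.
        ASSUMPTION: extraction is billed at the same hourly rate as delivery.
    other_environment: True if the data is analyzed outside Kapseli.
    delivery_hours: hours charged for delivery (two by default, per the text).
    """
    total = PERMIT_FEE + HOURLY_RATE * Decimal(str(extraction_hours))
    if other_environment:
        total += HOURLY_RATE * Decimal(str(delivery_hours))
    return total.quantize(Decimal("0.01"))


if __name__ == "__main__":
    # Permit fee plus delivery to another environment, no extraction work counted.
    print(estimate_cost(0, other_environment=True))
```

The actual extraction cost depends on the amount of work involved, so `extraction_hours` has to come from Findata's own estimate.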
If you want to combine other data with the ready-made dataset, the price and processing time of a normal data permit will apply.
The price is based on the regulation of the Ministry of Social Affairs and Health. The current prices are valid until 31 December 2024.
See the Pricing page for more information.
Available datasets
FinRegistry dataset
More detailed description: Aineistokatalogi.fi
The FinRegistry dataset consists of the registry data collected in the FinRegistry research project and the research data generated from them. The material includes data from the Digital and Population Data Services Agency (DVV), the Finnish Cancer Registry, the Finnish Centre for Pensions (ETK), Kanta Services, Kela, THL and Statistics Finland. It contains over 20 datasets and covers data from several decades.
There are three different types of datasets in Findata’s ready-made dataset collection:
1. datasets created in the project, with a completely new file structure,
2. datasets modified in the project, with file structures similar to the original datasets and
3. datasets covering the original data collected for the project.
Type 3 datasets are included in Findata’s ready-made materials only when the corresponding type 2 datasets are not.
Findata’s FinRegistry ready-made material will be compiled gradually during spring 2024, starting with type 1 and type 2 datasets and progressing to type 3 datasets. The source data of type 3 datasets have already been described in the Data Resources Catalog by the original controller. However, the data collected for the FinRegistry project typically contain fewer variables than the original data.
Datasets per controller
Type 1:
- Minimal phenotype, Detailed longitudinal
Type 2:
- Digital and Population Data Services Agency: Pedigree, Relative pairs, Relatives, Marriages, Living history
- Finnish Centre for Pensions: Unpaid periods and benefit periods under VEKL, Pension-insured earnings, Earnings-related pensions
- Kanta Services: Patient Data Repository: Laboratory results
- The Finnish Institute for Health and Welfare: Children born, Vaccinations, Infectious diseases, Malformations, Social assistance, Social welfare
Type 3:
- Finnish Cancer Registry: Cancer
- Statistics Finland: Causes of death
- The Social Insurance Institution of Finland: Dispensed medicines reimbursable under the National Health Insurance scheme, Entitlements to reimbursement of pharmaceutical expenses
- Kanta Services: Kanta Prescription Centre: Prescriptions, Dispensed medicines
Contrary to previously given information, the following source datasets collected by the FinRegistry research project have not been included in Findata’s FinRegistry ready-made dataset due to their size and structure: Primary health care visits, Health care, Intensive care.
Code lists in English can be found in the corresponding dataset descriptions in the National Data Catalogue (aineistokatalogi.fi) produced by the FinRegistry research project. Links to these are included in Findata’s dataset descriptions.
COVID-19 ready-made dataset
More detailed description: Aineistokatalogi.fi
The COVID-19 dataset contains data from four controllers: the Finnish Institute for Health and Welfare (THL), Kela/Kanta, Fimea and Statistics Finland. The target group is formed based on THL’s Infectious Disease Register. The data include people who fell ill with COVID-19 in the HUS area in 2020–2021.
Data contents by controller
- Fimea: information on side effects of COVID-19 vaccinations
- THL:
  - primary healthcare and specialist healthcare information (Hilmo and Avohilmo registers) on COVID-19 related reception visits and ward treatment periods
  - various background information and more detailed information about COVID-19 from the Infectious Disease Register
- Kela/Kanta: comprehensive COVID-19 vaccination information
- Statistics Finland: cause of death data
Basic information
Findata’s ready-made dataset: COVID-19 | N | % |
---|---|---|
Cohort size | 138 396 | |
Male | 69 843 | 50,47 |
Female | 68 553 | 49,53 |
A diagnosis of COVID-19 in 2020 (a) | 20 755 | 15,00 |
A diagnosis of COVID-19 in 2021 (a) | 118 217 | 85,42 |
Those who received a positive diagnosis by age group in 2020 | | |
0–15 | 2 379 | 11,46 |
16+ | 18 376 | 88,54 |
Those who received a positive diagnosis by age group in 2021 | | |
0–15 | 27 040 | 22,87 |
16+ | 91 177 | 77,13 |
Those who died during the follow-up period (b) | 1 183 | 0,85 |

(a) Some of the persons included in the material were diagnosed with COVID-19 in both 2020 and 2021.
(b) All causes of death
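
The percentages in the table can be recomputed from the counts. The short sketch below does that. It also estimates how many persons were diagnosed in both years, under the assumption (not stated in the source) that every cohort member has a diagnosis in at least one of 2020 and 2021. The helper name `pct` is illustrative only.

```python
# Counts copied from the table above.
COHORT = 138_396
MALE, FEMALE = 69_843, 68_553
DIAG_2020, DIAG_2021 = 20_755, 118_217
DIED = 1_183


def pct(n, total):
    """Share of n in total as a percentage, rounded to two decimals."""
    return round(100 * n / total, 2)


# Shares of the whole cohort.
shares = {
    "Male": pct(MALE, COHORT),
    "Female": pct(FEMALE, COHORT),
    "Diagnosis in 2020": pct(DIAG_2020, COHORT),
    "Diagnosis in 2021": pct(DIAG_2021, COHORT),
    "Died during follow-up": pct(DIED, COHORT),
}

# Age-group shares use the yearly diagnosis count as the denominator.
age_0_15_2020 = pct(2_379, DIAG_2020)
age_0_15_2021 = pct(27_040, DIAG_2021)

# ASSUMPTION: everyone in the cohort was diagnosed in 2020, 2021 or both,
# so the overlap follows from inclusion-exclusion.
both_years = DIAG_2020 + DIAG_2021 - COHORT

if __name__ == "__main__":
    for label, value in shares.items():
        print(f"{label}: {value} %")
    print(f"Diagnosed in both years (under the assumption): {both_years}")
```

Under that assumption the overlap is 576 persons, which is why the 2020 and 2021 percentages sum to slightly more than 100.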