Advantages of Open Data: case COVID-19

Published: 03.11.2022 / Blog / Publication / Research

Access to data is crucial in today's society. Much of everyday life is centred around it, from finding out what the weather will be and what meetings will take place tomorrow to checking what heart rate and step count were recorded during the last workout.

In the context of everyday life, much of the understanding we are trying to achieve is underpinned by data and by access to data (Shadbolt et al., 2022). When it comes to healthcare, access to data is even more crucial, as innovation, quality improvement, and increased efficiency are highly dependent on access to high-quality data (Conway & VanLare, 2010). However, data within healthcare vary considerably in both quality and coverage, and in many critical areas the data may be difficult or impossible to access and therefore to use (Shadbolt et al., 2022). As data are crucial in modelling preparedness for, prevention of, detection of and response to different health-related issues, restricted access to data can hamper all of these tasks. Because of this, many governments and institutions around the world have created policies to increase data openness, or open data, especially for data relating to public health (D’Agostino et al., 2018).

Open data is defined by D’Agostino et al. (2018) as data that can be freely used and shared by anyone for any purpose, under strict principles of privacy and confidentiality when appropriate. It is a general umbrella term used to describe all forms of data made available to the public. The benefits of having appropriate data on which to base decisions strongly point to the need for open public health data, including demographic, socioeconomic and clinical characteristics of the population, data on key outcomes such as disease complications and mortality, and data on potentially mitigating or aggravating behaviours, including risk factors (D’Agostino et al., 2018; Soucie, 2012). Unlike personal health data, whose improper release might violate an individual’s privacy, open public health data has many benefits. Not only can the aggregation of open public health data help governments make better decisions regarding health issues, but it can also increase transparency, democratic control, societal participation and self-empowerment. Data and patterns in large data volumes also offer key building blocks for creating new scientific knowledge, including for researchers and areas with limited resources (D’Agostino et al., 2018; Record et al., 2022). Moreover, other types of value are also expected from open data, including innovation in and improvement of (public) services, businesses and health-related products (Gao & Janssen, 2022).

A current example of the benefits of open data is the COVID-19 pandemic. Infectious diseases are an area where the fact that our knowledge and understanding are underpinned by access to data is especially clear. Here, data are needed to parameterise and validate models and projections, as well as to build scenario analyses of epidemic trajectories used to inform public health policy decisions (Shadbolt et al., 2022). Organised open data sources, collected and assembled by the research community, were seen early in the COVID-19 pandemic, even if data were initially sparse and scraped from news outlets, press briefings updating daily case counts, search engines and social media (Shadbolt et al., 2022). When the outbreak entered the pandemic stage, many countries’ governments began to publish the number of positive infections and deaths every day. Large online platforms also started to provide data generated directly from interaction with their users. Facebook provided data on people’s movement, symptoms and vaccination status. Google provided a wide range of search and mobility data, and Apple provided mobility and travel insights (Shadbolt et al., 2022). This increase in data collection and sharing also made collation more systematic, resulting in various open data repositories, for instance the Coronavirus Resource Center and the COVID-19 dashboard at Johns Hopkins University, which collected data on infections, recoveries and deaths from every country and area and combined them into a map (Figure 1) to give a clear picture of the evolution of the pandemic (Gao & Janssen, 2022). Another example is the Our World in Data service, which collects statistics on the coronavirus pandemic for every country in the world and provides interactive visualizations, explanations of the presented metrics, and details on the sources of the data (Shadbolt et al., 2022).
Services like these, based on open and transparent data sources, not only gave the public an understanding of the pandemic situation and of the reach and consequences of the measures taken to limit its effects, but also provided a useful tool for governments, researchers and health departments to measure the effectiveness of the crisis response and to continuously adjust decisions, measures and policies relating to the pandemic (Gao & Janssen, 2022).

Figure 1. A screenshot of the COVID-19 dashboard by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University (JHU).

Although open data as a concept is nothing new, the pandemic has accentuated the need for open data policies globally. There is growing emphasis on making data in various sectors, not only public health, more available and accessible to the public for free, and in formats that allow a variety of uses (Conway & VanLare, 2010; D’Agostino et al., 2018). Here in Finland, the Declaration for Open Science and Research, published in January 2020 and signed by most Finnish higher education institutions, including Arcada UAS, presents a common vision for the Finnish research community. This vision states that open science and research should be integrated into researchers’ everyday work. Moreover, the mission for open science and research emphasises openness as a fundamental value throughout the research community and its activities, including open and accessible data, research methods and good data management (Secretariat for the National Open Science and Research Coordination, 2022).

Due to the advantages that open data offered with regard to some aspects of the COVID-19 pandemic, the drive for open and accessible data is gaining momentum in society and the research community. The benefits are not limited to pandemic response: with the help of open data, and consequently through early detection and control, enhanced preparedness for, or even prevention of, future pandemics is plausible (Shadbolt et al., 2022). The advantages of open data are of course not limited to pandemics and public health crises but extend to many other health-related issues. However, this kind of data-driven public health is only possible if the required data are available (D’Agostino et al., 2018). This is why open data and open science initiatives, like the Declaration for Open Science and Research, are important and should be prioritised within higher education institutions and the research community.

Yet it needs to be stated that initiatives and policies, as well as the concrete action of making data openly available, do not automatically create value. The value of open data becomes concrete when it is matched with the right researcher and research question. Having used open data for the majority of the research I have done, I urge researchers at all levels, from students to senior researchers, to explore the possibilities that open data has to offer.

Jonas Tana, Ph.D., senior lecturer,


Conway, P. H., & VanLare, J. M. (2010). Improving access to health care data: the Open Government strategy. JAMA, 304(9), 1007-1008.

D’Agostino, M., Samuel, N. O., Sarol, M. J., Cosio, F. G. D., Marti, M., Luo, T., … & Espinal, M. (2018). Open data and public health. Revista Panamericana de Salud Pública, 42, e66.

Gao, Y., & Janssen, M. (2022). The Open Data Canvas–Analyzing Value Creation from Open Data. Digital Government: Research and Practice, 3(1), 1-15.

Record, S., Jarzyna, M. A., Hardiman, B., & Richardson, A. D. (2022). Open data facilitate resilience in science during the COVID‐19 pandemic. Frontiers in Ecology and the Environment, 20(2), 76-77.

Shadbolt, N., Brett, A., Chen, M., Marion, G., McKendrick, I. J., Panovska-Griffiths, J., … & Swallow, B. (2022). The challenges of data in future pandemics. Epidemics, 100612.

Secretariat for the National Open Science and Research Coordination (2022). Declaration for Open Science and Research 2020-2025. Accessed: 4.10.2022

Soucie, J. M. (2012). Public health surveillance and data collection: general principles and impact on hemophilia care. Hematology, 17, 144-146.
