The Application of Association Rules to Detect the Effects of Vaccinations against Covid-19 in the EU-27. Preliminary Estimates



Introduction
For the third year in a row, the unprecedented economic and social upheavals caused by the COVID-19 pandemic pose a critical global threat. According to Johns Hopkins University, the number of confirmed COVID-19 cases worldwide exceeded 521.4 million, and the number of confirmed COVID-19 deaths exceeded 6.26 million (Donovan, 2022). Our World in Data reported that more than 11.65 billion doses of vaccines against COVID-19 have been administered worldwide (… Vaccinations, 2022). According to the World Health Organization, the number of COVID-19 deaths in the WHO European Region exceeded 2 million in May 2022 (COVID-19 deaths cross 2 million marks…, 2022).
Uneven vaccination poses risks to the successful elimination of the consequences of the coronavirus and to economic recovery, and it increases geopolitical tensions. Today, the economic problems related to COVID-19 are predicted to cause the global economy to fall by 2.3% compared to the pre-pandemic level and to create new risks, such as inflation, rising prices, and debts. Only fast vaccination progress and the use of the latest information technologies to analyse the situation can prompt a smooth return to pre-pandemic growth (The Global Risks Report 2022). Therefore, scientists have already conducted empirical research on the impact of the COVID-19 pandemic in various spheres of socio-economic life (Bieszk-Stolorz & Dmytrów, 2020; Bieszk-Stolorz & Dmytrów, 2021; Dmytrów et al., 2021) and on SARS-CoV-2 vaccine effectiveness (Results for Vaccines, 2022). Although the pandemic has not yet been fully overcome, governments need to make effective management decisions to address its consequences in a coordinated manner, and such decisions can only be supported by effective socio-economic research methods (Berezka & Kovalchuk, 2018; Grzeskowiak & Stanimir, 2007).
The aim of the paper was to search for useful association rules to identify the hidden dependencies and links between the dynamics of vaccinations against COVID-19 and the effects of the pandemic (disease cases, hospitalisations, and deaths).

Data selection and description
The information base of the study was the following official statistical dataset: the daily number of cases of COVID-19 disease, vaccinations against COVID-19, hospitalisations due to COVID-19, and deaths from COVID-19 in the EU countries from March 2020 to March 2022 (COVID-19 Data on the daily number…, 2022).
The empirical research was performed in Statistica software. Preliminary graphical analysis suggests that COVID-19 cases in most EU countries peaked in January 2022 (Figure 1). The countries with the highest average monthly COVID-19 cases in the study period were France, Germany, Italy, and Spain (Figure 2).

Research methodology
Association rules are a data mining method designed to identify interesting, research-relevant relations between variables in large datasets. The method involves the search for strong rules (abstract associations) identified in the dataset under study using certain measures of interest (Srikant & Agraval, 1996). Applying association rules, like other data mining methods, yields previously unknown, non-trivial information. The obtained knowledge describes new relations between properties and can be used to predict the values of some features based on others. This knowledge can be applied to new data with some degree of certainty. It is useful to the extent that applying it brings some benefit. Knowledge should be presented in a form understandable to non-mathematicians.
In data analysis, it is quite often necessary to determine sets of frequently occurring objects from a large set of objects. Before giving a generalised description of this problem, some notation and definitions are introduced.
Let I be the set of objects included in the studied sets: $I = \{i_1, i_2, \ldots, i_j, \ldots, i_n\}$, where $i_j$ is an object included in the studied sets and $n$ is the total number of objects.
Let D be a set of transactions, each of which is a set of objects from set I: $D = \{T_1, T_2, \ldots, T_j, \ldots, T_m\}$, where $T_j$ is a transaction that is a subset of set I ($T_j \subseteq I$) and $m$ is the total number of transactions available in the study.
The subset of transactions that include object $i_j$ is denoted by $D_{i_j} = \{T_r \in D \mid i_j \in T_r\}$. An arbitrary set of objects (item set) is denoted by F ($F \subseteq I$), and the set of transactions that include set F is denoted by $D_F = \{T_r \in D \mid F \subseteq T_r\}$ ($D_F \subseteq D$).
A set of objects consisting of k elements is called a k-element object set. The support of a set of objects is the ratio of the number of transactions that include set F to the total number of transactions: $\mathrm{Supp}(F) = \frac{n(D_F)}{n(D)}$, where $n(D_F)$ is the number of transactions that include set F and $n(D)$ is the total number of transactions available in the study, $n(D) = m$.
An association rule is an implication $F \rightarrow G$, where $F \subset I$, $G \subset I$, and $F \cap G = \emptyset$. Association rules are characterised by a confidence indicator, which indicates how often the rule is true. The confidence of rule $F \rightarrow G$ is the ratio of the number of transactions that include both set F and set G to the number of transactions that include set F: $\mathrm{Conf}(F \rightarrow G) = \frac{n(D_{F \cup G})}{n(D_F)}$. When conducting a study, the analyst determines the thresholds: the minimum value of support for the analysed sets, $\mathrm{Supp}_{\min}$, and the minimum value of confidence, $\mathrm{Conf}_{\min}$.
Set F is called frequent if its support value is greater than the minimum specified support value: $\mathrm{Supp}(F) > \mathrm{Supp}_{\min}$. Rule $F \rightarrow G$ is an association rule if it has support greater than $\mathrm{Supp}_{\min}$ and confidence greater than $\mathrm{Conf}_{\min}$.
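The definitions of support and confidence can be illustrated with a minimal Python sketch. The transactions and item names below (`high_vacc`, `low_cases`, `low_deaths`) are invented for illustration only and are not taken from the study data; the function names are likewise hypothetical.

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Share of transactions that contain every object of the item set."""
    f = frozenset(itemset)
    return sum(1 for t in transactions if f <= t) / len(transactions)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """Transactions containing both F and G, divided by those containing F."""
    f = frozenset(antecedent)
    g = frozenset(consequent)
    n_f = sum(1 for t in transactions if f <= t)
    n_fg = sum(1 for t in transactions if (f | g) <= t)
    return n_fg / n_f if n_f else 0.0


# Hypothetical toy transactions (one per country-month), not real data.
transactions = [
    frozenset({"high_vacc", "low_cases", "low_deaths"}),
    frozenset({"high_vacc", "low_cases"}),
    frozenset({"high_vacc", "low_deaths"}),
    frozenset({"low_cases"}),
    frozenset({"high_vacc", "low_cases", "low_deaths"}),
]

supp_f = support({"high_vacc"}, transactions)                   # 4 of 5 transactions
supp_fg = support({"high_vacc", "low_cases"}, transactions)     # 3 of 5 transactions
conf_rule = confidence({"high_vacc"}, {"low_cases"}, transactions)  # 3 of 4
print(supp_f, supp_fg, conf_rule)
```

With thresholds of, say, 0.5 for both support and confidence, the toy rule high_vacc → low_cases would qualify as an association rule under the definition above.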
With these definitions in place, the algorithm for generating association rules consists of two steps:
• search for frequent item sets;
• formation of association rules from the item sets found in step 1.
Generating association rules is not an easy task. The first step is especially algorithmically difficult because it requires searching through all possible combinations of objects. As the number of objects in set I grows, the number of possible sets of objects grows exponentially.
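The two steps can be sketched as follows. This is an Apriori-style level-wise search, chosen here only as a common way to limit the exponential number of candidates; the text does not state which algorithm the software used. The function names and the toy transactions are assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, supp_min):
    """Step 1: level-wise search for item sets with support > supp_min."""
    m = len(transactions)
    items = sorted({i for t in transactions for i in t})
    current = [frozenset({i}) for i in items]
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates whose support exceeds the threshold.
        level = {}
        for cand in current:
            supp = sum(1 for t in transactions if cand <= t) / m
            if supp > supp_min:
                level[cand] = supp
        frequent.update(level)
        # Build (k+1)-element candidates from the frequent k-element sets and
        # prune those with an infrequent k-element subset.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        current = [c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent


def association_rules(frequent, conf_min):
    """Step 2: form rules F -> G from every frequent item set."""
    rules = []
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for left in combinations(sorted(itemset), r):
                f = frozenset(left)
                conf = supp / frequent[f]  # subsets of frequent sets are frequent
                if conf > conf_min:
                    rules.append((f, itemset - f, supp, conf))
    return rules


# Hypothetical toy transactions, not real data.
transactions = [
    frozenset({"high_vacc", "low_cases", "low_deaths"}),
    frozenset({"high_vacc", "low_cases"}),
    frozenset({"high_vacc", "low_deaths"}),
    frozenset({"low_cases"}),
    frozenset({"high_vacc", "low_cases", "low_deaths"}),
]
frequent = frequent_itemsets(transactions, supp_min=0.5)
rules = association_rules(frequent, conf_min=0.7)
for f, g, supp, conf in rules:
    print(sorted(f), "->", sorted(g), round(supp, 2), round(conf, 2))
```

The pruning step relies on the fact that no item set can be frequent if one of its subsets is not, which avoids counting most of the possible combinations.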
The search for association rules is greatly facilitated by data mining.
The k-means clustering method was used for cluster analysis. This method partitions the input dataset into a predefined number (k) of groups (Anderberg, 1973).
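A minimal sketch of k-means clustering is given below in plain Python. It is not the Statistica implementation used in the study. For brevity it uses only two variables (the limit values of vaccinations and cases, in % of the population) for six of the countries reported in the results, whereas the study clustered all 27 countries on four variables. The initialisation from the first k points is an assumption made for reproducibility.

```python
import math


def k_means(points, k, max_iter=100):
    """Plain k-means: assign each point to its nearest centroid, then update."""
    # Assumption: the first k points serve as initial centroids.
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# (limit vaccinations, limit cases) in % of the population.
countries = ["Austria", "Bulgaria", "Denmark", "Poland", "Romania", "Slovenia"]
points = [(0.7, 1.9), (0.2, 0.6), (0.8, 2.2), (0.5, 0.6), (0.3, 0.5), (0.5, 2.0)]

labels, centroids = k_means(points, k=2)
for name, lab in zip(countries, labels):
    print(name, lab)
```

In this reduced example the two groups are separated mainly by the limit value of cases, while the vaccination limits differ much less.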

Empirical results and discussion
The study used the association rules method to identify the relation between the number of vaccinations against COVID-19 (taking into account the cumulative effect) and the number of new cases of COVID-19, hospitalisations due to COVID-19, and deaths caused by COVID-19 in the EU countries from 03.2020 to 03.2022.
The analysis revealed the following association rules:
1. If the number of vaccinations against COVID-19 in one of the EU countries in the current month is higher than the average monthly number of vaccinations against COVID-19 in this country from 03.2020 to 03.2022, then with a probability of 0.9 the number of COVID-19 cases in this country in the same month will be lower than the average monthly number of COVID-19 cases in the same country for the same period; and vice versa (Table 1).
The formal representation of the defined association rule with minimum support = 0.9, minimum confidence = 0.9, and minimum correlation = 0.9 is the following:
<IF Monthly number of vaccinations against COVID-19 > Limit vaccinations THEN Monthly number of cases of the disease COVID-19 is lower than Limit cases>
Limit vaccinations: a variable storing the calculated average monthly number of vaccinations against COVID-19 for the EU member states from 03.2020 to 03.2022;
Limit cases: a variable storing the calculated average monthly number of cases of COVID-19 for the EU member states from 03.2020 to 03.2022.
2. If the number of vaccinations against COVID-19 in one of the EU countries in the current month is higher than the average monthly number of COVID-19 vaccinations in this country from 03.2020 to 03.2022, then with a probability of 0.9 the monthly number of deaths from COVID-19 in this country in the same month will be lower than the average monthly number of deaths caused by COVID-19 (Table 2).
The formal representation of the defined association rule with minimum support = 0.9, minimum confidence = 0.9, and minimum correlation = 0.9 is the following:
<IF Monthly number of vaccinations against COVID-19 > Limit vaccinations THEN Monthly number of deaths from COVID-19 is lower than Limit death>
Limit death: a variable storing the calculated average monthly number of deaths caused by COVID-19 for the EU member states from 03.2020 to 03.2022.
3. If the number of vaccinations against COVID-19 in the current month in one of the EU countries is higher than the average number of monthly vaccinations against COVID-19 in this country from 03.2020 to 03.2022, then with a probability of 0.5 the monthly number of hospitalisations due to COVID-19 in this country in the same month will be lower than the average monthly number of COVID-19 hospitalisations in the same country for the same period, and vice versa (Table 3).
This association rule was defined with minimum support = 0.5, minimum confidence = 0.5, and minimum correlation = 0.5.
Support shows what percentage of transactions support the obtained rule. In this case, sufficient support has been achieved so that the identified association rules can be extended with high reliability to all EU countries.
For the individual EU countries, the rules take the following form:
1. If in Austria the number of vaccinations against COVID-19 in the current month was higher than 0.7% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 1.9% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.008% of the population; with a probability of 50%, the number of hospitalisations due to COVID-19 in the same month was lower than 0.46% of the population.
2. If in Belgium the number of vaccinations against COVID-19 in the current month was higher than 0.8% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 1.5% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.012% of the population; with a probability of 50%, the number of hospitalisations due to COVID-19 in the same month was lower than 0.57% of the population.
3. If in Bulgaria the number of vaccinations against COVID-19 in the current month was higher than 0.2% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 0.6% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.019% of the population; with a probability of 50%, the number of hospitalisations due to COVID-19 in the same month was lower than 1.37% of the population.
4. If in Croatia the number of vaccinations against COVID-19 in the current month was higher than 0.4% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 1.0% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.015% of the population; with a probability of 50%, the number of hospitalisations due to COVID-19 in the same month was lower than 0.69% of the population.
5. If in Cyprus the number of vaccinations against COVID-19 in the current month was higher than 0.7% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 2.1% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.004% of the population; with a probability of 50%, the number of hospitalisations due to COVID-19 in the same month was lower than 0.05% of the population.
6. If in the Czech Republic the number of vaccinations against COVID-19 in the current month was higher than 0.6% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 1.5% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.015% of the population; with a probability of 50%, the number of hospitalisations due to COVID-19 in the same month was lower than 0.75% of the population.
7. If in Denmark the number of vaccinations against COVID-19 in the current month was higher than 0.8% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 2.2% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.004% of the population; with a probability of 50%, the number of hospitalisations due to COVID-19 in the same month was lower than 0.19% of the population.
8. If in Estonia the number of vaccinations against COVID-19 in the current month was higher than 0.5% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 1.6% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.007% of the population; with a probability of 50%, the number of hospitalisations due to COVID-19 in the same month was lower than 0.50% of the population.
9. If in Finland the number of vaccinations against COVID-19 in the current month was higher than 0.7% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 0.7% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.003% of the population; with a probability of 50%, the number of COVID-19 hospitalisations in the same month was lower than 0.1% of the population.
10. If in France the number of vaccinations against COVID-19 in the current month was higher than 0.7% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 1.6% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.009% of the population; with a probability of 50%, the number of COVID-19 hospitalisations in the same month was lower than 0.81% of the population.
11. If in Germany the number of vaccinations against COVID-19 in the current month was higher than 0.7% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 1.0% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.006% of the population; with a probability of 50%, the number of COVID-19 hospitalisations in the same month was lower than 0.11% of the population.
12. If in Greece the number of vaccinations against COVID-19 in the current month was higher than 0.6% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 1.1% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.010% of the population; with a probability of 50%, the number of COVID-19 hospitalisations in the same month was lower than 0.12% of the population.
13. If in Hungary the number of vaccinations against COVID-19 in the current month was higher than 0.6% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 0.7% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.018% of the population; with a probability of 50%, the number of COVID-19 hospitalisations in the same month was lower than 0.86% of the population.
14. If in Ireland the number of vaccinations against COVID-19 in the current month was higher than 0.8% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 1.4% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.006% of the population; with a probability of 50%, the number of COVID-19 hospitalisations in the same month was lower than 0.29% of the population.
15. If in Italy the number of vaccinations against COVID-19 in the current month was higher than 0.7% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 1.0% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.011% of the population; with a probability of 50%, the number of COVID-19 hospitalisations in the same month was lower than 0.67% of the population.
16. If in Latvia the number of vaccinations against COVID-19 in the current month was higher than 0.4% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 1.4% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.009% of the population; with a probability of 50%, the number of COVID-19 hospitalisations in the same month was lower than 0.54% of the population.
17. If in Lithuania the number of vaccinations against COVID-19 in the current month was higher than 0.4% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 1.1% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.010% of the population; with a probability of 50%, the number of COVID-19 hospitalisations in the same month was lower than 0.40% of the population.
18. If in Luxembourg the number of vaccinations against COVID-19 in the current month was higher than 0.9% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 1.9% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.009% of the population; with a probability of 50%, the number of COVID-19 hospitalisations in the same month was lower than 0.46% of the population.
19. If in Malta the number of vaccinations against COVID-19 in the current month was higher than 1.1% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 0.8% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.006% of the population; with a probability of 50%, the number of COVID-19 hospitalisations in the same month was lower than 0.24% of the population.
20. If in the Netherlands the number of vaccinations against COVID-19 in the current month was higher than 0.7% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 1.9% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.005% of the population; with a probability of 50%, the number of COVID-19 hospitalisations in the same month was lower than 0.20% of the population.
21. If in Poland the number of vaccinations against COVID-19 in the current month was higher than 0.5% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 0.6% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.012% of the population; with a probability of 50%, the number of COVID-19 hospitalisations in the same month was lower than 0.73% of the population.
22. If in Portugal the number of vaccinations against COVID-19 in the current month was higher than 0.7% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 1.4% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.009% of the population; with a probability of 50%, the number of COVID-19 hospitalisations in the same month was lower than 0.37% of the population.
23. If in Romania the number of vaccinations against COVID-19 in the current month was higher than 0.3% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 0.5% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.012% of the population; with a probability of 50%, the number of COVID-19 hospitalisations in the same month was lower than 0.54% of the population.
24. If in Slovakia the number of vaccinations against COVID-19 in the current month was higher than 0.5% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 1.8% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.014% of the population; with a probability of 50%, the number of COVID-19 hospitalisations in the same month was lower than 0.69% of the population.
25. If in Slovenia the number of vaccinations against COVID-19 in the current month was higher than 0.5% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 2.0% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.013% of the population; with a probability of 50%, the number of COVID-19 hospitalisations in the same month was lower than 0.68% of the population.
26. If in Spain the number of vaccinations against COVID-19 in the current month was higher than 0.7% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 1.0% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.009% of the population; with a probability of 50%, the number of COVID-19 hospitalisations in the same month was lower than 0.43% of the population.
27. If in Sweden the number of vaccinations against COVID-19 in the current month was higher than 0.7% of the population, then with a probability of 90% the number of COVID-19 cases in the same month was lower than 1.1% of the population, the number of deaths caused by COVID-19 in the same month was lower than 0.008% of the population; with a probability of 50%, the number of COVID-19 hospitalisations in the same month was lower than 0.34% of the population (Table 4).
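The country-level rules above share one form: IF monthly vaccinations exceed a limit THEN monthly cases stay below a limit. A minimal sketch of how the confidence of such a threshold rule could be estimated from two monthly series follows. It assumes, in line with the definitions of the limit variables, that each limit is the mean of the series over the study period. The monthly values are invented for illustration and do not come from the study dataset, and `rule_confidence` is a hypothetical helper name.

```python
from statistics import mean


def rule_confidence(vaccinations, cases):
    """Confidence of: IF vaccinations > limit_vacc THEN cases < limit_cases.

    Both limits are taken as the mean of the respective monthly series
    (an assumption of this sketch).
    """
    limit_vacc = mean(vaccinations)
    limit_cases = mean(cases)
    # Months in which the antecedent of the rule holds.
    fired = [c for v, c in zip(vaccinations, cases) if v > limit_vacc]
    if not fired:
        return limit_vacc, limit_cases, 0.0
    held = sum(1 for c in fired if c < limit_cases)
    return limit_vacc, limit_cases, held / len(fired)


# Hypothetical monthly values in % of the population (not real data).
vaccinations = [0.0, 0.0, 0.5, 1.2, 1.5, 1.0, 0.9, 0.3]
cases = [1.0, 2.5, 2.0, 0.8, 0.5, 0.6, 1.9, 2.2]

limit_vacc, limit_cases, conf = rule_confidence(vaccinations, cases)
print(limit_vacc, limit_cases, conf)
```

In this toy series the antecedent holds in four months and the consequent holds in three of them, so the estimated confidence is 0.75.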
The association rules helped to determine the hidden dependencies and connections between the studied variables at the stage of exploratory analysis. The main criterion in the distribution of countries between the selected clusters was the limit value of COVID-19 cases (Figure 5). While the difference between the average threshold values of the other studied indicators for the countries of the first and second clusters was no more than 0.1, the threshold value of COVID-19 cases for the countries of the first cluster was more than twice that for the countries of the second cluster (Table 5). Thus, it can be argued that in the countries of the first cluster, increasing the level of vaccinations against COVID-19 significantly reduced the risks of COVID-19 cases.

Conclusions
On the basis of empirical data on the first two years of the pandemic, a preliminary analysis of the effects of vaccinations in the EU-27 on the health consequences of COVID-19 was conducted. Non-obvious useful association rules were identified between the dynamics of vaccinations against COVID-19 and the separate consequences of the pandemic (COVID-19 cases, hospitalisations, and deaths). It was found that the dynamics of vaccination significantly affected the number of cases.
When detecting hidden dependencies of the number of COVID-19 cases and the number of deaths caused by COVID-19 on the number of vaccinations against COVID-19, a probability of 90% was used, which indicates the high usefulness of the obtained rules. They contain reliable information that was not previously known but has a logical explanation. Such rules can be used to make beneficial decisions, e.g. to regulate vaccination policies in individual EU countries. The use of association rules makes it possible to find hidden dependencies and connections at the stage of exploratory analysis. The results obtained are only preliminary estimates and require further in-depth research, but they can already provide important information for vaccination policy in the EU-27 and can be used to build regression and forecasting models. The k-means clustering results show that increasing the level of vaccinations against COVID-19 significantly reduces the risks of COVID-19 cases. The COVID-19 pandemic has become a global challenge with far-reaching devastating consequences for the world, and effective management decisions, including vaccination, are needed to support economic recovery and protect the health and well-being of citizens.


Fig. 2. Average monthly cases of COVID-19 in the EU countries from 03.2020 to 03.2022. Source: own elaboration using Statistica software.

Fig. 4. Average monthly vaccinations against COVID-19 in the EU countries from 03.2020 to 03.2022. Source: own elaboration using Statistica software.

Formal representation of the association rule for hospitalisations due to COVID-19:
<IF Monthly number of vaccinations against COVID-19 > Limit vaccinations THEN Monthly number of hospitalisations due to COVID-19 is lower than Limit hospitalisations>
Limit hospitalisations: a variable storing the calculated average monthly number of COVID-19 hospitalisations for the EU member states from 03.2020 to 03.2022.
The obtained results were used to cluster the EU countries according to the level of vaccinations against COVID-19, cases of COVID-19, deaths caused by COVID-19, and COVID-19 hospitalisations. Cluster analysis was conducted to divide the EU-27 countries into groups according to the thresholds of the average values of COVID-19 cases, vaccinations against COVID-19, hospitalisations due to COVID-19, and deaths caused by COVID-19 in each EU country from 03.2020 to 03.2022. As a result of the application of the k-means clustering method, two clusters were determined. The first cluster included economically developed industrial countries, such as Austria, Belgium, Cyprus, Czechia, Denmark, Estonia, France, Ireland, Latvia, Luxembourg, the Netherlands, Portugal, Slovakia, and Slovenia. The second cluster consisted of Bulgaria, Croatia, Finland, Germany, Greece, Hungary, Italy, Lithuania, Malta, Poland, Romania, Spain, and Sweden.

Fig. 5. Plot of means for each cluster. Source: own elaboration using Statistica software.

Table 3. Summary of association rules (vaccinations and hospitalisations)

Table 4. Limit values of the monthly number of vaccinations against COVID-19, cases of the disease COVID-19, deaths caused by COVID-19, and hospitalisations due to COVID-19 for the EU member states

Table 5. Cluster means