The estimation of the costs of children based on random equivalence scales

This paper measures the cost of children using the random equivalence scale (RES). From the general population perspective, the deterministic microeconomic equivalence scales appear as continuous random variables. The study derived the distribution of RES, assuming the lognormal distribution of income. The truncated distribution of RES can account for possible economies of scale in income or expenditure. The positional measures of the truncated distribution of RES may serve as single equivalence scales. A society's attitude towards inequality may help choose such scales. RES for Poland 2015 was estimated using microdata from the Household Budget Survey. Polish households exhibited a remarkable level of economies of scale in that year and the equivalence scales declined with expenditure. It was observed that the cost of bringing up a child is not constant; generally, it decreases with increasing household size.


Introduction
The aim of this paper was to measure the costs of bringing up children using the random equivalence scale (RES) developed herein. RES is a stochastic counterpart of the deterministic equivalence scales offered by the microeconomic theory of consumer behaviour.
Equivalence scales are used to enable assessments of inequality, welfare and poverty in the distribution of income or expenditure (hereafter treated interchangeably). Such appraisals require homogeneous populations of comparable units. However, data on income come from surveys where households are the statistical units. The populations of households are heterogeneous since households may differ in many aspects other than income, e.g. the size, the composition, the age of the adults, the age of the children, the disabilities of the household members, etc. Adjusting household incomes by an equivalence scale gives equivalent incomes in a homogeneous population of equivalent units enjoying the same standard of living (Buhmann, Rainwater, Schmaus, and Smeeding, 1988; Jones and O'Donnell, 1995; Ebert and Moyes, 2003).
There are well-known shortcomings of microeconomic equivalence scales. The scales require strong assumptions concerning the relationship between income and needs, and serious identification problems arise in their estimation (see, e.g. Wales, 1979, 1992; Blundell and Lewbel, 1991; Blackorby and Donaldson, 1993; Slesnick, 1998; Chiappori, 2016).
A macroeconomic perspective on microeconomic equivalence scales reveals that in the general population of households, income and expenditure, and their admissible transformations, become random variables. As countries' household populations are large, these random variables become continuous, according to the limit theorems.
The study derived the distribution of RES from the distributions of incomes in compared populations of households. A positional measure of RES may serve as a single equivalence scale. The author shows how society's attitude to inequality may help choose such single equivalence scales.
In this paper RESs for Poland in 2015 were estimated assuming the lognormal distribution of expenditure, using statistical microdata from the Polish Household Budget Survey, followed by applying RES to assess the costs of having children.
The rest of this paper has the following structure. Section 2 formulates the theoretical frameworks of RES. In Section 3, the distribution of RES is derived when income obeys the lognormal distribution. Section 4 presents the relationship between society's inequality aversion and the positional measures of RES distribution. Section 5 shows the empirical results of estimating RES. Section 6 concludes.

The standard microeconomic equivalence scales
Let h = 1, …, m index household populations that differ in a particular attribute, e.g. size or demographic composition. The index h = 1 is reserved for the reference households consisting of single childless persons, although other specifications are possible. Hereafter, the term 'h-household' denotes a household that has the characteristic h, while xh and y represent the actual incomes or expenditures of the h-household and the reference household, respectively.
The standard microeconomic equivalence scale for the h-household, given the reference household h = 1, is defined as the ratio
$$\frac{c(u_0, p, h)}{c(u_0, p, 1)}, \quad h = 2, \ldots, m, \tag{1}$$
where c(•) is the household expenditure (cost) function, i.e. the minimum cost of attaining utility level u0 for a given demographic attribute h and prices p (Deaton and Muellbauer, 1980). Hereafter, it is assumed that all households face the same prices, so the argument p may be omitted in Eq. (1). This does not neglect the impact of prices on equivalence scales; it only assumes a particular price vector, p0, faced by all households.
It is useful to express the equivalence scale (1) in terms of expenditure x. Inverting the cost function with respect to u results in the indirect utility function u = vh(x), h = 1, …, m, following the usual assumption that the utility functions are continuous and increasing (see e.g. Donaldson and Pendakur, 2003). The expenditure functions and indirect utility functions are related by the identities y = c(u0, 1) ↔ v1(y) = u0 and xh = c(u0, h) ↔ vh(xh) = u0, h = 2, …, m (Blackorby and Donaldson, 1993). Then, the equivalence scale (1) is the ratio of actual expenditures
$$\frac{x_h}{y}, \quad h = 2, \ldots, m, \tag{2}$$
if and only if the households attain the same welfare level u0, namely if and only if vh(xh) = v1(y) = u0.
Note that equations (1) and (2) do not define a single equivalence scale but a family of equivalence scales indexed by u. However, demand data do not provide information about u. This is the main reason for the nonidentification of a single equivalence scale from the demand data alone (Lewbel, 1999, p. 193).
Economists have made various attempts to overcome the nonidentification problem. One can identify a single equivalence scale by assuming that (1) or (2) is independent of u (the assumption called, alternatively, independence of base (IB) (Lewbel, 1999) or equivalence-

Equivalence scales in the general population of households
Let random variables Xh and Y describe income distributions in the general populations of h-households and reference households, respectively. We assume that Xh and Y are continuous random variables with density functions fh(x) and fy(y), and the distribution functions Fh(x) and Fy(y), respectively. In the sequel, we use shorthand Xh ~ fh(x), Y ~ fy(y), and Xh ~ Fh(x), Y ~ Fy(y). As countries' household populations are usually large, the limit theorems justify continuity of Xh and Y.
Let indirect utility functions v1(y) and vh(x) be continuous and increasing in their arguments (Donaldson and Pendakur, 2003). By applying the well-known formula of transforming continuous random variables (see e.g. Fisz, 1963, p. 39), the following results are obtained.
Result 1. The indirect utility functions are continuous random variables in the general population, namely Vh = vh(Xh) and V1 = v1(Y).
Result 1 implies the necessity of a reformulation of 'the if and only if condition' of Equation (2). An essential property of a continuous random variable is zero probability that it takes any specific number. Mathematicians refer to such a number as an almost impossible event. Thus P(V1 = u) = P(Vh = u) = 0, for all real u. In other words, the fulfilment of the condition in question is almost impossible in the general population, therefore utility intervals should be used instead of a single utility level.
Assume a finite number k of disjoint non-empty intervals, which cover the whole domain of V1 and Vh. Let the ith interval $(u_i, u_{i+1})$, $u_i < u_{i+1}$, be such that $P(u_i < V_h < u_{i+1}) > 0$ and $P(u_i < V_1 < u_{i+1}) > 0$, i = 1, …, k − 1. Then the condition in question is reformulated as follows.
Definition 1. Two households are equally well-off if and only if the values of their indirect utility functions belong to the same interval of welfare.
Assuming h-households and reference households are equally well-off, in the sense of Definition 1, the following result is obtained.
Result 2. There will be a multitude of ratios (2), which are the realisations of the following positive-valued continuous random variable:
$$Z_h = \frac{X_h}{Y}, \tag{3}$$
with the density function fz(z) and the distribution function Fz(z), z > 0. As Xh and Y are independent, the density function fz(z) takes the form (Fisz, 1963, p. 62)
$$f_z(z) = \int_0^{\infty} y\, f_h(zy)\, f_y(y)\, dy. \tag{4}$$
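As a numerical illustration of Result 2, the sketch below approximates the density of a ratio of independent positive random variables, f_z(z) = ∫ y f_h(zy) f_y(y) dy, by the trapezoid rule. The unit exponential densities are purely an illustrative assumption (they are not the income model used in this paper); they are convenient because the ratio of two independent unit exponentials has the known density 1/(1 + z)². The function names, the integration limit and the grid size are hypothetical choices.

```python
import math


def ratio_density(f_x, f_y, z, upper=60.0, steps=60000):
    """Trapezoid-rule approximation of the integral of y*f_x(z*y)*f_y(y) over [0, upper]."""
    width = upper / steps
    total = 0.0
    for i in range(steps + 1):
        y = i * width
        weight = 0.5 if i in (0, steps) else 1.0  # end points get half weight
        total += weight * y * f_x(z * y) * f_y(y)
    return total * width


def unit_exponential_pdf(t):
    """Density of Exp(1); an illustrative stand-in for an income density."""
    return math.exp(-t) if t >= 0 else 0.0


if __name__ == "__main__":
    for z in (0.5, 1.0, 2.0):
        approx = ratio_density(unit_exponential_pdf, unit_exponential_pdf, z)
        print(z, approx, 1.0 / (1.0 + z) ** 2)  # numerical value next to the closed form
```

Any other pair of densities on the positive half-line can be passed in the same way, provided the upper integration limit covers practically all of the mass of Y.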

Definition 2. Zh (3) is the random equivalence scale (RES) for h-households.
The results presented above show how the deterministic microeconomic concept of equivalence scales becomes stochastic if one adopts the 'general population perspective'. To make RES operational and consistent with economic merit, the study introduced some additional notions.
Definition 3. A single random equivalence scale is a positional parameter of RES.
When the RES distribution is known, an analyst might use the mean, the geometric mean, the median or the mode as a single equivalence scale.
Stanisław Maciej Kot
Section 4 shows how society's inequality aversion can help choose a particular equivalence scale.
Zh (3) may violate the economies of scale in incomes or expenditures since its domain (z-values) comprises all positive numbers. The concept of economies of scale is "(…) the mechanism that explains why the cost of living of a family is less than the sum of costs of living of its members taken independently (…)" (Chiappori, 2016). Definition 2 of RES accounts for economies of scale as well as for diseconomies of scale. Diseconomies of scale may arise when the cost of a member of an h-household is greater than the cost of a single reference person. For instance, for a two-member household consisting of disabled and non-disabled adults, the equivalence scale might be greater than 2. However, z for two-member families should not exceed 2 when accounting for economies of scale.
The following modification of RES (3) accounts for economies of scale.
Definition 4. RES for h-households respects economies of scale in income if and only if it has the following density function ft(z):
$$f_t(z) = \begin{cases} \dfrac{f_z(z)}{F_z(h) - F_z(1)}, & 1 \le z \le h, \\ 0, & \text{otherwise}, \end{cases} \tag{5}$$
where h is the household size, and fz(z) and Fz(z) are, respectively, the density function and the distribution function of Zh (3).
Thus RES respecting economies of scale has the truncated distribution (5) in the [1, h] interval. The upper truncation point h, yielding income per capita, specifies the lack of economies of scale. The lower truncation point 1, yielding income per household, specifies the maximum economies of scale. It is worth adding that this way of accounting for economies of scale is still valid for other specifications of household attributes, e.g. the numbers of adults and children, whose sum equals h. According to Definition 3, the positional measures of the truncated distribution may serve as single equivalence scales.
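The truncation in Definition 4 can be sketched in a few lines: the density of RES is restricted to the [1, h] interval and rescaled by the probability mass Fz(h) − Fz(1). The helper names below are hypothetical, and the lognormal RES with parameters 0.4 and 0.6 is an assumed example used only to check that the truncated density integrates to one.

```python
import math
from statistics import NormalDist


def truncate_to_scale_range(pdf, cdf, h):
    """Return the density of Z conditioned on 1 <= Z <= h."""
    mass = cdf(h) - cdf(1.0)
    if mass <= 0.0:
        raise ValueError("no probability mass between 1 and h")

    def truncated_pdf(z):
        return pdf(z) / mass if 1.0 <= z <= h else 0.0

    return truncated_pdf


# Assumed (hypothetical) parameters of ln Z, for illustration only.
LOG_Z = NormalDist(0.4, 0.6)


def lognormal_pdf(z):
    """Density of Z when ln Z is normal: normal density of ln z divided by z."""
    return LOG_Z.pdf(math.log(z)) / z


def lognormal_cdf(z):
    return LOG_Z.cdf(math.log(z))


def integrate(func, lower, upper, steps=20000):
    """Midpoint-rule integral, adequate for a smooth density on a short interval."""
    width = (upper - lower) / steps
    return sum(func(lower + (i + 0.5) * width) for i in range(steps)) * width
```

For a two-person household, `truncate_to_scale_range(lognormal_pdf, lognormal_cdf, 2.0)` returns a density that is zero outside [1, 2] and whose integral over that interval is one up to numerical error.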
RES is a formalisation of real circumstances which practitioners have inevitably experienced when estimating standard equivalence scales from sizable sample data. Although one can discern the continuous character of income or expenditure distributions and indirect utility functions, it was either ignored or overcome by some arbitrary ad hoc assumptions.
For instance, Jackson (1968) calculated equivalence scales for the USA using the food budget shares (FBS) as the household welfare function and presented the following result: "A typical adult living alone requires 36% of the income of a typical family of four to attain the same standard of living or welfare level as the family [our emphases]." Note that Jackson's expression 'equal welfare level' is misleading, as she used FBS intervals in her paper. Moreover, Jackson obtained many expenditures (realisations of Xh and Y, in our terms) within welfare intervals. In other words, she received expenditure distributions within welfare intervals. Thus Jackson's equivalence scale of 2.78 is not the definitional number xh/y (2) of expenditure of two individual households, but the ratio of average expenditure referred to as the typical value.
RES is not free from the non-identification problem. According to Definition 1, welfare intervals are necessary for the practical use of RES. It seems reasonable to assume that the subjective ordinal assessments of a household's welfare status ('good', 'rather good', 'average' ('neither good nor bad'), 'rather bad', and 'bad') are the indicators of underlying welfare intervals. Data on the ordinal measurement of household welfare are available from typical household budget surveys.
To illustrate a possible link between subjective ordinal measurements of welfare and underlying intervals, consider food budget shares (the Engel index) as a welfare measure. Figure 1 shows the kernel estimate of the Engel index density function. By cutting off consecutive portions of probability mass, equal to the observed fractions of the subjective welfare assessments, the corresponding welfare intervals were obtained.
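The cutting procedure described above can be sketched as follows. This is a simplified variant, not the computation behind Figure 1: it uses empirical quantiles of the Engel index instead of a kernel density estimate, and it assumes that a lower food share indicates higher welfare, so the category shares are listed from the best to the worst category. The function name and the sample data are hypothetical.

```python
def welfare_cut_points(engel_values, category_shares):
    """Cut the empirical Engel-index distribution into consecutive welfare intervals.

    category_shares lists the observed fractions of the subjective categories,
    ordered from best to worst welfare (lowest to highest food share).
    Returns the upper bounds of all intervals except the last one.
    """
    if abs(sum(category_shares) - 1.0) > 1e-9:
        raise ValueError("category shares must sum to one")
    ordered = sorted(engel_values)
    n = len(ordered)
    cut_points = []
    cumulative = 0.0
    for share in category_shares[:-1]:
        cumulative += share
        # Number of households at or below the cut, kept within 1..n.
        count = min(n, max(1, round(cumulative * n)))
        cut_points.append(ordered[count - 1])
    return cut_points


if __name__ == "__main__":
    # Hypothetical food shares 0.01, 0.02, ..., 1.00 and three welfare categories.
    sample = [i / 100 for i in range(1, 101)]
    print(welfare_cut_points(sample, [0.25, 0.5, 0.25]))
```

Households whose Engel index falls between two consecutive cut points are then treated as belonging to the same welfare interval.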
The use of subjective assessments of households' welfare has the general restriction that the qualifications: 'good',…, 'bad', have the same meaning to every respondent. Tinbergen (1991) holds that: "this restriction can be accepted since in discussion on the policy resulting from the use of welfare measurements the same words we also used either to accept or to reject the policy."

RES in the lognormal distribution of income
Let income in the general population of households obey the lognormal distribution with the density function
$$f(x) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\left\{ -\frac{(\ln x - \mu)^2}{2\sigma^2} \right\}, \quad x > 0, \tag{6}$$
where µ and σ² are the mean and the variance of the logarithms of x, respectively (Aitchison and Brown, 1957). The common shorthand X ~ Λ(µ, σ²) is used for a positive random variable X that has the lognormal distribution with parameters µ and σ² (Kleiber and Kotz, 2003, p. 107).
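The following theorem shows that the lognormal distribution provides a closed form of the density function fz(z) (4).
Theorem 1. Let Xh ~ Λ(µh, σh²) and Y ~ Λ(µy, σy²) be independent. Then the random equivalence scale Z = Xh/Y has the lognormal distribution
$$Z \sim \Lambda(\mu_z, \sigma_z^2), \quad \mu_z = \mu_h - \mu_y, \quad \sigma_z^2 = \sigma_h^2 + \sigma_y^2. \tag{7}$$
The positional measures of Z are: the mean
$$M_z = \exp\{\mu_z + \sigma_z^2 / 2\},$$
the median
$$Me_z = \exp\{\mu_z\},$$
and the mode
$$Mo_z = \exp\{\mu_z - \sigma_z^2\} \tag{11}$$
(Aitchison and Brown, 1957, pp. 8-9). The geometric mean equals the median in the distribution in question.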
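A quick way to see how the lognormal assumption works is to compute the parameters of the ratio and check them by simulation. The sketch below assumes independent lognormal incomes, so that ln Z is normal with mean µh − µy and variance σh² + σy² (the form used later in the empirical section). All function names and parameter values are hypothetical, and the Monte Carlo check is only a sanity test, not part of the estimation procedure.

```python
import math
import random


def res_parameters(mu_h, sigma_h, mu_y, sigma_y):
    """Parameters (mu_z, sigma_z) of Z = X_h / Y for independent lognormal X_h and Y."""
    return mu_h - mu_y, math.sqrt(sigma_h ** 2 + sigma_y ** 2)


def positional_measures(mu_z, sigma_z):
    """Mean, median and mode of the non-truncated lognormal RES."""
    return {
        "mean": math.exp(mu_z + sigma_z ** 2 / 2.0),
        "median": math.exp(mu_z),  # also the geometric mean
        "mode": math.exp(mu_z - sigma_z ** 2),
    }


def simulated_log_ratio_moments(mu_h, sigma_h, mu_y, sigma_y, draws=50000, seed=1):
    """Monte Carlo mean and standard deviation of ln(X_h / Y)."""
    rng = random.Random(seed)
    logs = [
        math.log(rng.lognormvariate(mu_h, sigma_h) / rng.lognormvariate(mu_y, sigma_y))
        for _ in range(draws)
    ]
    mean = sum(logs) / draws
    variance = sum((value - mean) ** 2 for value in logs) / (draws - 1)
    return mean, math.sqrt(variance)
```

With the assumed values µh = 7.9, σh = 0.4, µy = 7.5 and σy = 0.3, `res_parameters` returns approximately (0.4, 0.5), and the simulated moments of ln Z should be close to these values.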
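To derive the truncated distribution (5) of Z ~ Λ(µz, σz²), where 1 and h are the lower and upper truncation points, these symbols were introduced:
$$\alpha = \frac{\ln 1 - \mu_z}{\sigma_z} = -\frac{\mu_z}{\sigma_z}, \qquad \beta = \frac{\ln h - \mu_z}{\sigma_z},$$
with Φ(•) and φ(•) denoting the distribution function and the density function of the standard normal distribution. The rth moment of the truncated lognormal distribution has the form:
$$E[Z_t^r] = E[Z^r]\, \frac{\Phi(\beta - r\sigma_z) - \Phi(\alpha - r\sigma_z)}{\Phi(\beta) - \Phi(\alpha)}, \tag{14}$$
where $E[Z^r] = \exp\{r\mu_z + r^2\sigma_z^2/2\}$ is the rth moment of the non-truncated lognormal distribution Z ~ Λ(µz, σz²) (Jawitz, 2004; Bebu and Mathews, 2009). The second factor on the right-hand side of (14) accounts for truncation. It is worth adding that Eq. (14) is valid for all real r.
Theorem 2. Let Mz, Mez, and Moz denote the mean, the median, and the mode of the non-truncated distribution of Z ~ Λ(µz, σz²). Then, the positional measures of the truncated lognormal distribution (10) are: the mean
$$M_t = M_z\, \frac{\Phi(\beta - \sigma_z) - \Phi(\alpha - \sigma_z)}{\Phi(\beta) - \Phi(\alpha)}, \tag{15}$$
the median
$$Me_t = Me_z \exp\left\{ \sigma_z\, \Phi^{-1}\!\left( \frac{\Phi(\alpha) + \Phi(\beta)}{2} \right) \right\}, \tag{16}$$
the geometric mean
$$G_t = Me_z \exp\{\sigma_z E[W]\}, \qquad E[W] = \frac{\varphi(\alpha) - \varphi(\beta)}{\Phi(\beta) - \Phi(\alpha)}, \tag{17}$$
and the mode
$$Mo_t = \begin{cases} 1, & Mo_z \le 1, \\ Mo_z, & 1 < Mo_z < h, \\ h, & Mo_z \ge h. \end{cases} \tag{18}$$
In Equation (17), W is the standard normal variable truncated to the [α, β] interval with the density function g(w), and the expectation is calculated with respect to g(w) (Johnson, Kotz, and Balakrishnan, 1994, p. 10 […]).
The position of the mode Mot (18) can be illustrated for h = 3. When Moz ≤ 1, the density function ft(z) is decreasing in the [1, 3] interval and reaches its maximum at z = 1 (Figure 2a). When Moz is inside the (1, 3) interval, the density function ft(z) reaches its maximum at Moz (Figure 2b). When Moz ≥ 3, the density function ft(z) is increasing in the [1, 3] interval and reaches its maximum at z = 3 (Figure 2c).

Normative recommendations for the choice of a single equivalence scale
"[…] welfare (average utility) as the existing distribution" (Lambert, 2001, p. 95). Formally, EDEI in the distribution of a positive-valued random variable D is the solution, say dε, to the following equation
$$u(d_\varepsilon) = E[u(D)], \tag{19}$$
where u(d) is a social planner's utility function. One obtains a single equivalence scale, by construction, when applying (19) directly to the truncated distribution of RES.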
Let a social planner assess the distribution of the random variable D with the constant inequality aversion utility function:
$$u(d) = \begin{cases} \dfrac{d^{1-\varepsilon}}{1-\varepsilon}, & \varepsilon \ne 1, \\ \ln d, & \varepsilon = 1, \end{cases}$$
where ε is the measure of inequality aversion (Atkinson, 1970). Then the solution to Equation (19) takes the form:
$$d_\varepsilon = \begin{cases} \left( E[D^{1-\varepsilon}] \right)^{1/(1-\varepsilon)}, & \varepsilon \ne 1, \\ \exp\{E[\ln D]\}, & \varepsilon = 1. \end{cases}$$
The distribution of RES may also be the object of ethical judgements, as are the distributions of income or expenditure. Note that adjustments of household incomes by distinct equivalence scales yield different income distributions, hence there are various assessments of inequality, poverty, and other distributional issues. Therefore, any ethical evaluation of income distributions embodies an indirect appraisal of the underlying equivalence scales.
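By choosing a particular level of inequality aversion ε, one obtains the following specifications of the normative equivalence scales:
$$z_\varepsilon = \begin{cases} M_t, & \varepsilon = 0, \\ G_t, & \varepsilon = 1, \\ \left( E[Z_t^{-2}] \right)^{-1/2}, & \varepsilon = 3. \end{cases} \tag{23}$$
In (23), Mt (15) and Gt (17) are the mean and the geometric mean of the truncated distribution of RES. The third variant of (23), with ε = 3, requires explanation. Using (22), i.e. $z_\varepsilon = (E[Z_t^{1-\varepsilon}])^{1/(1-\varepsilon)}$ for ε ≠ 1, together with the moment formula (14) for r = −2, the following expression for zε is obtained:
$$z_3 = \exp\{\mu_z - \sigma_z^2\} \left[ \frac{\Phi(\beta) - \Phi(\alpha)}{\Phi(\beta + 2\sigma_z) - \Phi(\alpha + 2\sigma_z)} \right]^{1/2},$$
where Φ(•) is the standard normal distribution function, and α = −µz/σz and β = (ln h − µz)/σz are the standardised truncation points. The first factor on the right-hand side is the mode Moz (11) of the non-truncated distribution of Z (7), which coincides with the mode Mot (18) of the truncated distribution of Zt when Moz lies inside the (1, h) interval. The second factor plays the role of the coefficient of proportionality. When the shifted interval [α + 2σz, β + 2σz] carries more standard normal probability mass than [α, β], this factor is less than one, and the normative equivalence scale zε for ε = 3 is then less than the mode Mot.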
The result (23) can be interpreted as follows. If society were inequality neutral (ε = 0), it would prefer the equivalence scale based on the mean Mt. A society with a 'moderate' aversion to inequality (ε = 1) would prefer an equivalence scale equal to the geometric mean Gt. If society had a 'strong' aversion to inequality (ε = 3), the equivalence scale proportional to the mode Mot would be preferable. In general, one may use an equivalence scale (22) for any level of inequality aversion ε ≠ 1.
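To make the role of ε concrete, the sketch below evaluates the normative scale for a lognormal RES truncated to [1, h]. It assumes the standard moment formula of the truncated lognormal distribution, E[Zt^r] = exp{rµz + r²σz²/2}·[Φ(β − rσz) − Φ(α − rσz)]/[Φ(β) − Φ(α)] with α = −µz/σz and β = (ln h − µz)/σz, and the constant inequality aversion utility of Atkinson (1970). The function names and the parameter values are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def truncation_points(mu_z, sigma_z, h):
    """Standardised truncation points of ln Z for the [1, h] interval."""
    return -mu_z / sigma_z, (math.log(h) - mu_z) / sigma_z


def truncated_moment(mu_z, sigma_z, h, r):
    """E[Z_t^r] for Z ~ lognormal(mu_z, sigma_z^2) truncated to [1, h]."""
    alpha, beta = truncation_points(mu_z, sigma_z, h)
    untruncated = math.exp(r * mu_z + 0.5 * (r * sigma_z) ** 2)
    shifted_mass = STD_NORMAL.cdf(beta - r * sigma_z) - STD_NORMAL.cdf(alpha - r * sigma_z)
    mass = STD_NORMAL.cdf(beta) - STD_NORMAL.cdf(alpha)
    return untruncated * shifted_mass / mass


def truncated_geometric_mean(mu_z, sigma_z, h):
    """exp(E[ln Z_t]), using the mean of a truncated standard normal variable."""
    alpha, beta = truncation_points(mu_z, sigma_z, h)
    mass = STD_NORMAL.cdf(beta) - STD_NORMAL.cdf(alpha)
    mean_w = (STD_NORMAL.pdf(alpha) - STD_NORMAL.pdf(beta)) / mass
    return math.exp(mu_z + sigma_z * mean_w)


def normative_scale(mu_z, sigma_z, h, epsilon):
    """Equally distributed equivalent of the truncated RES for aversion epsilon."""
    if math.isclose(epsilon, 1.0):
        return truncated_geometric_mean(mu_z, sigma_z, h)
    power = 1.0 - epsilon
    return truncated_moment(mu_z, sigma_z, h, power) ** (1.0 / power)


if __name__ == "__main__":
    # Hypothetical RES parameters for a two-person household.
    for eps in (0.0, 1.0, 3.0):
        print(eps, normative_scale(0.4, 0.5, 2.0, eps))
```

Because the normative scale is a generalised mean of order 1 − ε, it cannot increase as ε grows, which is the pattern of equivalence scales falling with inequality aversion.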

Empirical results
The study used micro-data on households' monthly expenditures from the Polish Household Budget Survey (HBS) 2015, selecting seven types of households according to various combinations of adults (a) and children (k). The households of single childless persons were the reference type.
Three household welfare categories were specified: 'Good', 'Average' ('neither good, nor bad'), and 'Bad', using the HBS's subjective assessments of households' financial situation. The HBS sample comprised 26,956 households of selected types. The spread ratio test (Thomopoulos, 2017, p. 81) did not reject the null hypothesis that expenditure within each household type follows the lognormal distribution. Table 1 presents the estimates of single equivalence scales for Poland 2015 within each welfare category. This table contains random equivalence scales based on the mean Mt (15), the geometric mean Gt (17) and the mode Mot (18). For comparison, Table 1 also contains two parametric equivalence scales commonly used in practice. The first is the OECD scale, defined as:
$$1 + \theta_1 (a - 1) + \theta_2 k, \tag{24}$$
where a and k are the numbers of adults and children, and θ1, θ2 are the weights assigned to an additional adult and a child, respectively. In the original ('Old') OECD scale, θ1 = 0.7 and θ2 = 0.5. The augmented ('New') OECD scale assumes θ1 = 0.5 and θ2 = 0.3 (see Hagenaars, de Vos, and Zaidi, 1994; OECD, 1982, 2008).
The second parametric equivalence scale has the form
$$h^{\theta}, \tag{25}$$
where h is the household's size and θ ∈ [0, 1] is a parameter expressing economies of scale (Buhmann et al., 1988). The equivalence scale (25) with θ = 0.5, popularised by the Luxembourg Income Study Group, is known as the 'square root' or 'LIS' scale.
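The two parametric scales are simple enough to state as code. The sketch below assumes the usual form of the OECD scale, 1 + θ1(a − 1) + θ2·k for a adults and k children, and the power scale h^θ; the function names are hypothetical.

```python
def oecd_scale(adults, children, theta_adult, theta_child):
    """OECD-type scale: 1 for the first adult plus weights for other members."""
    return 1.0 + theta_adult * (adults - 1) + theta_child * children


def old_oecd_scale(adults, children):
    """Original ('Old') OECD weights: 0.7 per additional adult, 0.5 per child."""
    return oecd_scale(adults, children, 0.7, 0.5)


def new_oecd_scale(adults, children):
    """'New' OECD weights: 0.5 per additional adult, 0.3 per child."""
    return oecd_scale(adults, children, 0.5, 0.3)


def power_scale(household_size, theta=0.5):
    """Power scale h**theta; theta = 0.5 gives the 'square root' (LIS) scale."""
    return household_size ** theta


if __name__ == "__main__":
    # Two adults with two children under each parametric scale.
    print(old_oecd_scale(2, 2), new_oecd_scale(2, 2), power_scale(4))
```

Setting θ = 1 in the power scale reproduces the per capita adjustment (no economies of scale), while θ = 0 leaves household income unadjusted.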
The calculation of the random equivalence scales presented in Table 1 runs as follows. First, we choose a welfare group, e.g. "Good", then select the reference households of single childless persons and fit the lognormal distribution, say Λ(µa, σa²), to their expenditure distribution. Next, we choose a particular group of households, e.g. households of single persons with one child, and fit the lognormal distribution, say Λ(µak, σak²), to their expenditure distribution. Thus we obtain the distribution of the (non-truncated) RES, i.e. Zak ~ Λ(µak − µa, σa² + σak²), according to Theorem 1. Finally, the positional measures of the truncated distribution of RES within the [1, h] interval are calculated. In the presented example, h = 2.
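The steps above can be sketched as a small pipeline. This is an illustration under stated assumptions, not the code used for Table 1: the lognormal parameters are fitted by the sample mean and standard deviation of log-expenditure (one simple choice; survey weights are ignored), the truncated mean and geometric mean follow from the standard formulas for a lognormal distribution truncated to [1, h], and the truncated mode is the non-truncated mode clamped to that interval. The function names and the synthetic data are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, stdev

STD_NORMAL = NormalDist()


def fit_lognormal(expenditures):
    """Estimate (mu, sigma) as the mean and standard deviation of log-expenditure."""
    logs = [math.log(x) for x in expenditures]
    return fmean(logs), stdev(logs)


def truncated_res_measures(reference, compared, h):
    """Mean, geometric mean and mode of the RES truncated to [1, h]."""
    mu_a, sigma_a = fit_lognormal(reference)
    mu_ak, sigma_ak = fit_lognormal(compared)
    mu_z = mu_ak - mu_a                              # difference of log-means
    sigma_z = math.sqrt(sigma_a ** 2 + sigma_ak ** 2)  # variances add up
    alpha = -mu_z / sigma_z
    beta = (math.log(h) - mu_z) / sigma_z
    mass = STD_NORMAL.cdf(beta) - STD_NORMAL.cdf(alpha)
    shifted = STD_NORMAL.cdf(beta - sigma_z) - STD_NORMAL.cdf(alpha - sigma_z)
    mean_w = (STD_NORMAL.pdf(alpha) - STD_NORMAL.pdf(beta)) / mass
    return {
        "mean": math.exp(mu_z + sigma_z ** 2 / 2.0) * shifted / mass,
        "geometric_mean": math.exp(mu_z + sigma_z * mean_w),
        "mode": min(max(math.exp(mu_z - sigma_z ** 2), 1.0), h),
    }


def synthetic_expenditures(mu, sigma, size, seed):
    """Hypothetical lognormal expenditures standing in for survey microdata."""
    rng = random.Random(seed)
    return [rng.lognormvariate(mu, sigma) for _ in range(size)]


if __name__ == "__main__":
    singles = synthetic_expenditures(7.5, 0.40, 2000, seed=1)
    single_parents = synthetic_expenditures(7.9, 0.45, 2000, seed=2)
    print(truncated_res_measures(singles, single_parents, h=2.0))
```

In an application, the two lists would hold the expenditures of the reference households and of the compared household type within one welfare category.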
Table 1 reveals some interesting features of RES, which depends on welfare. As the welfare categories "Bad", "Average", and "Good" accompany increasing expenditure, equivalence scales decline with expenditure. This observation is consistent with other authors' findings (see e.g. Pendakur, 2003, 2006; Majumder and Chakrabarty, 2008). Thus, the IB/ESE assumption does not hold.
Polish households enjoyed significant economies of scale in 2015, since all the estimates were less than the household size h, the value that reflects the lack of economies of scale. The random equivalence scales lay between the OECD Old and OECD New scales. The LIS scale was close to the random equivalence scales based on the geometric mean. Thus the LIS scale seemed appropriate for a society with a moderate inequality aversion (ε = 1).
The study calculated the cost of having an additional child as the increment of the random equivalence scale, holding the number of adults constant. Table 2 presents the results. It shows that the cost of having an additional child is not constant, contrary to what the OECD scale assumes. Generally, the costs diminish with the increase in the number of children. However, in some cases, the cost of having a third child is greater than that of the second, perhaps due to the need to increase the size of the dwelling.
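The increment calculation can be sketched as follows. The scale values in the example are hypothetical and are not taken from Table 1 or Table 2; the function name is also an assumption. The cost of each additional child is the difference between the scales of two household types with the same number of adults and consecutive numbers of children.

```python
def child_cost_increments(scales):
    """Cost of each additional child as an increment of the equivalence scale.

    scales maps (adults, children) to a single equivalence scale; the result
    maps (adults, children) to the increment over (adults, children - 1).
    """
    increments = {}
    for (adults, children), value in sorted(scales.items()):
        previous = scales.get((adults, children - 1))
        if children >= 1 and previous is not None:
            increments[(adults, children)] = value - previous
    return increments


if __name__ == "__main__":
    # Hypothetical scales for two adults with zero to three children.
    example = {(2, 0): 1.60, (2, 1): 1.85, (2, 2): 2.05, (2, 3): 2.30}
    for household, cost in child_cost_increments(example).items():
        print(household, round(cost, 2))
```

Expressed this way, the cost is measured in units of the reference household's expenditure, so it can be compared directly across household types.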

Conclusions
The general population perspective reveals the stochastic nature of deterministic microeconomic equivalence scales: they become continuous random variables. RES is a formalisation of the real circumstances that practitioners have inevitably experienced when dealing with household populations.
The truncated distribution of RES can account for possible economies of scale in incomes or expenditures. Chiappori (2016) showed that the standard microeconomic equivalence scales fail in this regard.
In applications, the positional measures of the truncated distribution of RES may serve as single equivalence scales. Society's attitude toward inequality may help choose such scales. When income distribution is lognormal (a testable assumption), the calculation of single equivalence scales is easy.
The empirical analysis of RES reveals some interesting facts. Equivalence scales decline with expenditure, thus the IB/ESE assumption does not hold. Additionally, equivalence scales fall with inequality aversion, which implies increasing economies of scale with inequality aversion.
The presented version of RES has two apparent limitations concerning the measurement of welfare intervals and the lognormal distribution of RES. This paper uses subjective assessments of households' welfare status as indicators of underlying welfare intervals. One can apply the ordered logit or probit models to estimate the intervals (cut points) when using convincing welfare variables (Cameron and Trivedi, 2010). Szulc (2009) proposes several variables as indicators of well-being.
Although the lognormal distribution fits the income data quite well, this might not be true in general. However, for other theoretical models of the income distribution, the density function (4) might not have a compact expression. Pollastri and Zambruno (2010) analysed the ratio of the Dagum distributions numerically, and did not consider a truncated distribution of the ratio.