QALYs: Is the value of treatment proportional to the size of the health gain?

Health Economics, 2010, 19, 596-607.

 

Keywords: QALY, individual utility, cost-utility analysis, capacity to benefit, realisation of potential, subtraction method

 

Erik Nord, Anja Undrum Enge and Veronica Gundersen

 

Research conducted at the Norwegian Institute of Public Health and the University of Oslo.

 

Corresponding author: Erik Nord, Norwegian Institute of Public Health, P.O. Box 4404 Nydalen, 0403 Oslo, Norway.

Tel: 00 47 21078178, fax 00 47 21078101, e-mail erik.nord@fhi.no

 

No funding other than salary in permanent position.

No conflicts of interest.

 

 

 

 

 

 

 

Abstract

 

In societal priority setting between health programs for different patient groups, many people are reluctant to discriminate too strongly between those who can benefit much from treatment and those who can benefit moderately. We suggest that this view of distributive fairness has a counterpart in personal valuations of gains in health. Such valuations may be influenced by psychological reference points and diminishing marginal utility such that the individual utility of care in patient groups with different potentials may be more similar than what conventional QALY estimates suggest. In interviews in three convenience samples, there is some support for the hypothesis. Most respondents do not think that desire for treatment is significantly less in those who stand to gain only moderately compared to those who stand to gain much – even when the treatment is associated with a mortality risk. When stating insurance preferences,  a majority of subjects express a greater concern for avoiding the worst states in question than for maximising expected value for money in terms of treatment effects. The tendency applies to outcomes in terms of both quality and quantity of life. Choices between prefixed response options fit well with oral explanations of these choices.

 

 

 

 

 

1 Introduction

 

A widely held value premise in health economics is that health care resources should be directed to those patients and activities where they generate the greatest health gains (e.g. Williams, 1985). To help decision makers follow this principle, the QALY has been developed as a generic measure of health gains, with the accompanying recommendation that cost-per-QALY considerations should prevail in priority setting.

 

However, ethical theory and studies of public preferences show reluctance to prioritize among patients strictly according to how big the achievable benefit is for a given amount of resources  (Daniels, 1985, 1993; Harris, 1987; Nord, 1993a; Nord and Richardson et al, 1995; Dolan and Cookson, 1998; Abellan-Perpinan and Pinto-Prades, 1999; Ubel et al, 1999; Nord, 1999).  The underlying ethical argument is that it is unfair to hold against some people that they happen to have a lower potential for health than others and that everybody within reasonable limits has the same right to realize his or her potential for health. The argument particularly applies to chronically ill and disabled people. Followed to the extreme, cost-effectiveness considerations will for instance lead to life protecting programs for these groups being valued less than similar programs for people in otherwise normal health. To most people this is presumably unacceptable.

 

Economists tend to respond that failing to maximise health benefits when allocating scarce resources runs counter to people’s long-term self-interest behind a veil of ignorance (and is thus irrational). The purpose of this paper is to take issue with this view of individual rationality. We do so by (a) outlining a theory of how patients with different degrees of treatability and potentials for health may value the care they can get and (b) presenting some interview data suggesting that people’s self-interest behind a veil of ignorance may actually be oriented considerably less towards health benefit maximisation – and thereby fit better with societal concerns for fairness – than is assumed in much of the cost-effectiveness literature. If this is true, it has important implications for the role one should assign to cost-effectiveness analysis in decision making and perhaps for the way in which QALYs are constructed. While our data do not in any way justify firm conclusions, we hope their publication – together with the underlying theoretical reflections – will inspire other researchers to make further and more extensive inquiries into the issue.

 

 

2 Conventional QALYs

 

Conventional QALYs are meant to express the personal value of health interventions to the individuals concerned. Personal value can be either ‘decision utility’ – individuals’ judgements of the undesirability of different states of illness ex ante to these – or ‘experienced utility’, which is based on individuals’ feelings and judgements ex post to illness. Mainstream QALYs are based on decision utility (Gold et al, 1996). However, in decisions about resource allocation in health care, information in terms of both decision utility and experienced utility is deemed relevant (Drummond et al, 2009). In this paper we hypothesize that in situations where treatability is somewhat limited, but nevertheless significant to those concerned, QALYs underestimate individual utility in both the above meanings. The hypothesis is based on three premises.

 

One is that in the standard QALY procedure, there is no utility elicitation on gains – or changes – in health. For functional and symptomatic gains there is only indirect valuation through ex ante preference elicitation on health states and subsequent subtraction of health state values from one another. This ‘subtraction procedure’ is understandable on grounds of data collection feasibility: the number of possible changes is much higher than the number of possible states (n(n-1)/2 possible changes if there are n states). The subtraction procedure is nevertheless a proxy approach, which has not been discussed, let alone validated, in the health economics literature (Nord, Daniels and Kamlet, 2009; Drummond et al, 2009). For gained life time there is no valuation at all of different numbers (quantities) of years gained, only of their quality. This aspect of the QALY model also lacks validation.
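The feasibility argument can be illustrated with a minimal sketch. The function name and the example numbers of states are assumptions chosen for illustration; only the formula n(n-1)/2 for the number of pairwise changes is taken from the text.

```python
def count_pairwise_changes(n_states: int) -> int:
    """Number of distinct pairs of health states, n(n-1)/2.

    Each pair corresponds to one possible change that would need its
    own valuation if gains were elicited directly.
    """
    return n_states * (n_states - 1) // 2


# Hypothetical sizes of a health state classification (illustrative only).
for n in (5, 10, 50):
    print(f"{n} states -> {count_pairwise_changes(n)} possible changes")
```

The number of changes grows roughly with the square of the number of states, which is why direct elicitation on changes quickly becomes impractical.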

 

A second premise is that QALYs make no distinction between comparisons of different outcomes for the same individual and comparisons of different outcomes for different individuals with different potentials for health. Assume for example that wheelchair user A can be brought either to ‘dependent on crutches’ or ‘full health’, while wheelchair user B can at best be brought to ‘dependent on crutches’. An intervention that takes both to ‘dependent on crutches’ will score the same in terms of individual utility in the standard QALY model. This seems like a conflation of quite different valuation contexts, which may invoke quite different psychological issues and processes. For instance, for person A above, ‘the best (full health) will probably be the enemy of the good (crutches)’. This is not the case for person B.

 

The third premise, related to the second, is that the value of gains in health needs to be understood in the light of two well known aspects of human psychology. One is that value judgements depend on reference points (the individual's point of comparison or "status quo" against which alternative scenarios are contrasted) and aspirations (Simon, 1955; Kahneman and Tversky, 1979; Schoemaker, 1982). The other is that goods tend to have diminishing marginal utility.

 

In the following section we explain how failure to take the latter two psychological mechanisms into account may lead standard QALY procedure to relate the individual utility of health care too strongly to the ‘size’ of the health benefit that may be obtained. We then report from a set of small scale studies in which convenience samples of well-educated and reflective people made judgements and expressed preferences that may shed some further light on the plausibility of our thinking.

 

 

3  The theory

 

Consider two states of disability, A and B. Standard QALY procedure is to let a random sample of the general population judge the badness (and goodness) of the states relative to full health. Assume that, from the sample’s typical reference point of ‘normal health’, B is considered to be twice as undesirable as A, and that, on average, people are willing to sacrifice 40 % of their life expectancy to avoid B and 20 % to avoid A. So B scores 0.6 on the QALY utility scale, while A scores 0.8.

 

Now consider two groups of people G1 and G2 who happen to contract two different illnesses. Both groups find themselves in state B (0.6). People in G1 can be restored to full health. For people in G2 that is not possible. But the best available medical technology can significantly improve their condition and raise them to state A (0.8).

 

The interesting question for cost-utility analysis is how highly the two improvements – i.e. the two changes in health – are valued by those concerned. According to the QALY ‘subtraction method’, the intervention for people in group G1 scores twice as much as the intervention for people in group G2 (0.4 vs 0.2). But would people in group G2 necessarily value the help they could be given only half as much as people in group G1? From the position of being in full health, A is unattractive. But after onset of illness the reference point of group G2 has changed, and so have presumably their aspirations. The most important concern may now be to achieve something better than state B. Furthermore, from the position of being in B and having A as the best possible outcome, A may appear more attractive than it did earlier. Perhaps group G2’s valuation of help is not so much less than that of group G1, given the equality in the severity of the two groups’ condition and the reduced maximum health group G2 now has to accept. The method of subtracting one ex ante health state value from another misses this possibility.
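The subtraction arithmetic in this example can be written out as a minimal sketch. The state values 0.6, 0.8 and 1.0 are those given in the text; the function and variable names are hypothetical.

```python
# Health state values from the example (full health = 1.0).
FULL_HEALTH = 1.0
STATE_A = 0.8  # respondents would sacrifice 20 % of life expectancy to avoid A
STATE_B = 0.6  # respondents would sacrifice 40 % of life expectancy to avoid B


def subtraction_gain(value_before: float, value_after: float) -> float:
    """Value of a change under the subtraction method: after minus before."""
    return value_after - value_before


gain_g1 = subtraction_gain(STATE_B, FULL_HEALTH)  # G1: from B to full health
gain_g2 = subtraction_gain(STATE_B, STATE_A)      # G2: from B to A
ratio = gain_g1 / gain_g2

print(f"G1 gain: {gain_g1:.1f}, G2 gain: {gain_g2:.1f}, ratio: {ratio:.1f}")
```

The sketch only reproduces what the conventional model assumes. It says nothing about whether people in G2 actually value their gain half as much, which is the question raised here.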

 

The idea that the utility of health gains depends on reference points and health potentials seems relevant also with respect to gains in life years. Consider two groups G1 and G2 with life threatening conditions of different kind. Individuals in G1 can be restored to a normal life expectancy of for instance 20 years, while individuals in G2 can only be given a life expectancy of 10 years. Assume that all else is equal. Given their situation, i.e. given their having risk of death as a reference point, the primary concern in both groups may be to avoid death in the near future and to at least get ‘some years’. This possible effect of the reference point may be strengthened by the general phenomenon of diminishing marginal utility: Even if there were no time preference, a prospect of  20 years is arguably unlikely to be considered twice as valuable as a prospect of 10 years. In short, while the two groups may value their potential gains differently, the two groups’ valuations will perhaps not differ as much as the difference in life years to be gained might suggest.
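To make the point about diminishing marginal utility concrete, the following sketch uses a power function as a purely hypothetical utility of life years. Both the functional form and the values of the exponent are assumptions for illustration; they are not estimated from any data in this study.

```python
def power_utility(years: float, alpha: float) -> float:
    """Hypothetical utility of a prospect of life years: u(t) = t ** alpha.

    alpha = 1 means utility proportional to years (the linear QALY
    assumption); alpha < 1 means diminishing marginal utility.
    """
    return years ** alpha


# Illustrative exponents only; none of them is an empirical estimate.
for alpha in (1.0, 0.7, 0.5):
    ratio = power_utility(20, alpha) / power_utility(10, alpha)
    print(f"alpha = {alpha}: u(20 years) / u(10 years) = {ratio:.2f}")
```

With any exponent below one, the 20-year prospect is worth less than twice the 10-year prospect, which is the qualitative pattern the argument relies on.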

 

The above reasoning - with respect to both functional improvements and gains in life years – applies to utility assessments of health care ex post to illness. In principle, ex post feelings and valuations may be anticipated prior to illness. We thus hypothesize that non-proportionality between effect size and utility may be true also of individuals’ ex ante preferences for health care as expressed in preferences for health insurance (decision utility). In support of the latter, we note an earlier focus group study of preferences in a sample of health administrators in different counties in Norway (Nord, 1994). Subjects were asked to imagine two hospitals A and B. Hospital A gave equal priority to patients with equal severity of illness irrespective of the degree of treatability as long as the  treatment effect was substantial. Hospital B gave priority to those with a greater potential for improvement. The subjects were reminded that they could not know what kind of illnesses they themselves might contract in the future and then asked which hospital they  would prefer to ‘belong to’ if they were to think only of their self interests as potential future patients. Of  51 subjects, 31 preferred hospital A (equal priority for people with different treatment potentials), while 20 preferred hospital B (priority according to treatment potential).  

 

 

4 Materials and methods

 

We adopt the conventional assumption in welfare economics that in well informed consumers, an individual’s utility of a good is reflected in his/her strength of preference for the good and is proportional to the latter. Strength of preference is ideally measured in terms of willingness to sacrifice (WTS). To test our hypotheses, we should thus ideally observe personal WTS (e.g. willingness to pay) for health improvements in equally ill groups with different potentials for improvement and WTS for health insurances for illnesses with different degrees of treatability.  The data reported in this study are of a simpler kind and serve as preliminary indications. They partly consist in subjects’ reflective judgements of people’s desire for treatment in different circumstances, partly in the same subjects’ personal stated preferences for private and public health insurance.

 

 Interviews in convenience samples

 

Questions were administered in face-to-face one-to-one interviews in three consecutive convenience samples. Two of these consisted of staff at a Public Health Research Institute, the third of university postgraduate students in health economics, policy and management. For details, see table I.

 

Table I about here.

 

The theory underlying the study and the specific study questions were developed by the principal investigator. The interviews were conducted by the two co-authors of the present paper, who at the time were master students. They had no prior relation to the work of the principal investigator. They also had an independent position regarding the underlying theory and would use the data in independent writing of their master theses. Altogether this was thought to be valuable with respect to achieving unbiased reporting of subjective impressions and verbal comments from the interviews.

 

A brief introduction to the interview was given, in which concerns for quality of life in resource allocation were indicated at a general level, with no mention of the study hypothesis. With each question in the interview, subjects were encouraged to ask questions for clarification. They were allowed to reflect as long as they liked before answering and encouraged to explain their way of thinking when responding. The interviewers carefully noted all oral comments regarding difficulties in understanding and reasons for responses.

 

 Question 1: Judgements of patients’ desire for treatment

 

As noted above, we were interested in assessing strength of preference. We used ‘strength of desire’ as a proxy for strength of preference. A question was designed on assumed strength of desire in different situations. It used the functional scale shown in figure I, constructed by Sintonen (1981) and modified by Nord (1993b).

 

Figure I about here.

 

Subjects in sample 1 were asked to think of two illnesses A and B that affect the same number of people. Both illnesses put those affected at level 7 of the disability scale in figure I. Both illnesses can be treated with surgery. The mortality risk is 5 % in both cases. With illness A, the best possible treatment takes the patients to level 2. With illness B, the best possible treatment takes the patients to level 5. Given the (approximate) equal interval property of the scale, subjects were assumed to perceive the former gain (7 to 2) as very much larger than the latter (7 to 5).

 

Subjects were asked to think of themselves in the two patient groups’ situation and to judge the strength of  desire for surgery in the two groups. Three response options were possible:

1.        Patients with illness A will have a much stronger desire for surgery than patients with illness B.

2.        Patients with illness A will have somewhat stronger desire for surgery than patients with illness B.

3.        The desire for surgery will be much the same in the two groups.

 

The question was repeated in later interviews in samples 2 and 3, with the only difference that the two improvements to be compared were now ‘from level 6 to level 2’ vs ‘from level 6 to level 4’. 

 

 Question 2: Stated preferences for private insurance concerning quality of life

 

The above judgements concern preferences ex ante to treatment, but ex post to illness. Another question was designed to address the issue of preferences for insurance ex ante to illness. Unlike question 1 it was phrased in terms of own stated preferences rather than judgements of other people’s preferences. After piloting in sample 1, the final version in samples 2 and 3 went as follows:

 

‘Imagine that you live in a country where it is common to have some private health insurance for things not covered by the public health service. Typically these are fairly rare diseases for which treatment is very expensive. You cannot afford to include all such diseases and are forced to prioritise. Among the various candidates are two diseases that both will put you in a wheelchair for a year’s time if you are left untreated. Treatment potentials are as follows:

 

Disease A: Treatment restores you quickly to full health.

Disease B: Treatment will enable you to move a few meters without a wheelchair, for instance at home.

 

After a year or so the diseases will in any case end and you will be back in normal health. If you were to prioritise between the two diseases in your private insurance plan, you might think in one of the following ways:

 

(1)     You would primarily be concerned with the effect of treatment. Unless disease B was clearly more likely, you would therefore give priority to insurance for disease A.

(2)     You would primarily be interested in avoiding the serious condition that both diseases lead to, and not worry so much about whether treatment would make you a good deal better or restore you to full health. Therefore you would give priority to insurance for the disease with the highest probability.

 

With which of these ways of thinking do you agree most?’

 

Agreement with option 1 may be interpreted as showing a greater concern for the size of the benefit than does agreement with option 2.

 

For interpretation of results it was important to know how subjects perceived the treatment outcome in disease B (from ‘completely dependent on wheelchair’ to ‘able to move a few meters without a wheelchair, for instance at home’) relative to the treatment outcome in disease A (full health). Subjects were therefore asked to locate the state resulting from treatment in disease B on a rating scale from zero to 100 (full health). On this scale, ‘completely dependent on wheelchair’ was prefixed at 50. The latter value was not of interest in itself. The idea was to ask the subjects: ‘If we say that the improvement  in disease A (from ‘wheelchair’ to ‘full health’) is represented by the interval from 50 to 100, how much of that interval is covered by the improvement obtained in disease B?’

 

In a follow up question (in samples 2 and 3),  different probabilities for diseases A and B were specified. In sample 2, the first seventeen subjects to be interviewed were told that the probabilities were 10 % and 20 % respectively, while the next sixteen subjects were told that the probabilities were 10% and 25 % respectively. The change in risk-stimuli in sample 2 was triggered by observations of the results in the first seventeen interviews. Ideally there should have been randomisation of all 33 subjects to the two different sets of probabilities. However, there were no significant differences in personal characteristics between early and late interviewees in the sample. In sample 3, all were given 10 and 25 % as stimuli (on the basis of observations in sample 2, see results section below).

 

Question 3: Stated preferences for public hospital  treatment capacity concerning quality of life

 

In a country like that of the authors, with a dominant National Health Service (NHS), people are not used to thinking about interests in private health insurance. A question was therefore phrased in terms of preferences for treatment capacity in the NHS. For the purpose of the study question, we used a measure of health effect that clearly has ratio scale properties, namely ‘number of symptom free days’. The question was asked in samples 2 and 3 and went as follows:

 

‘Consider two muscular diseases A and B. Both result in pain that leads to severe limitations in mobility and to some problems with sleep. Both are chronic.  There are treatments for both illnesses, but they are very expensive and treatment capacity is insufficient at the moment.   A medical examination reveals that your risk of getting A is 15 %. For B your risk is 30 %.

   With both illnesses the effect of treatment is that symptoms will not occur daily, in other words that the patients have good days in between bad days. But the effects are different in size:

 

Treatment for illness A (likelihood 15 %): A couple of days per week will be OK.

Treatment for illness B (likelihood 30 %): Approximately one day per week will be OK.

 

If you were to express your preferences for treatment capacity for A and B in the National Health Service, you could have in mind one or more of the following:

 

You want to make sure that you receive treatment for the most likely illness.

You want to avoid having to live with the symptoms in question every day.

You want to make sure that you receive treatment for the illness for which treatment is most effective.

 

Thinking only about your self interest, for which of the two illnesses would you most strongly want the National Health Service to increase treatment capacity?’

 

Question 4: Stated preferences for private insurance concerning life years

 

Preferences for life extensions of different size have been studied earlier within the framework of expected utility theory (Verhoef et al, 1994; van Osch et al, 2006). The typical approach is to have subjects find a point of equivalence between a certain outcome of some length and a gamble in which one of the outcomes is of greater length. In our view, it is difficult in such approaches to disentangle the possibility of diminishing marginal utility of life years (as a pure quantity effect) from risk aversion to the gamble. To downplay the potentially confounding role of risk aversion, we let subjects compare two prospects that both were associated with uncertainty of the same order of magnitude. We thus designed a question similar to question 2 on private insurance, in which we addressed health benefits in terms of life years rather than quality of life. After piloting in sample 1 with 20-year and 10-year scenarios as comparators, the final question in samples 2 and 3 used 20 and 7 years and went as follows:

 

‘Imagine two diseases A and B that both in a short time lead to death if one is left untreated. For disease A there is a treatment that does not cure but statistically postpones death by approximately 20 years. For disease B there is also a treatment that does not cure but statistically postpones death by approximately 7 years. In both cases health related quality of life is good in the gained years of life. If you were to prioritise between the two diseases in your private insurance plan, you might think in one of the following ways:

 

1.        You would emphasise the number of years to be gained from treatment. Unless disease B was clearly more common, you would therefore give priority to insurance covering disease A.

2.        Given the severity of the diseases, you would give priority to insurance for the most likely disease, and not emphasise so much whether there would be 7 or 20 years that could be gained.

 

With which of these ways of thinking do you agree most?’

 

Again, a follow up question was administered specifying  different probabilities for diseases A and B. The probability given for A was 10 % and for B 20 %.

 

5 Results

 

Process observations

 

Interviews lasted from 10 to 40 minutes, with an average of 20 minutes. The general impression of both interviewers was that the subjects had a good understanding of the questions, although some (5-10%) found it difficult that the diseases in question were specified in terms of functional level only and not in terms of diagnosis. A few subjects (5-10%) found it necessary to ask questions for clarification. On the other hand, numerous explanatory comments were given regarding response choices, and these were noted in detail by the interviewers.  While some inconsistencies in responses seemed to occur, for instance in use of the rating scale in question 2, both interviewers felt that the respondents generally had reasonable and understandable arguments for their choices. But we also observed that some of the respondents (10-15%) seemed to make factual assumptions that were not included in the information provided in the questions themselves – assumptions that in some cases may have distracted from the aim of the questions.

 

Quantitative data

 

Distributions of responses to questions 1-4 are summarised in table II. Results from pilots in sample 1 are not included. Since the absolute numbers are small, and in order not to overload the table, we have chosen not to include distributions in terms of percentages.

 

Table II about here.

 

Regarding question 1, altogether 55 out of 83 subjects thought that, in patients at severe levels of disability, those with prospects of a modest gain would have the same ex ante desire for surgery with a mortality risk of 5 %  as those who stood to gain much more. Another 22 subjects thought that the latter group’s desire would only be ‘a little’ stronger.

 

In the first part of question 2, the inferior treatment result in disease B received mean rating scale scores of  63 – 65 in samples 2 and 3. This suggests that the subjects perceived the treatment effect in disease B as much smaller than the effect in disease A – perhaps as low as 25-30 %. Despite this, a slight majority in the two samples (32 vs 25) agreed with the statement that emphasised ‘alternative thinking’ (option 2) rather than effect size. The same tendency was observed in the pilot study in sample 1 (not reported in the table).
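The step from rating scale scores to relative effect size can be sketched as follows. The anchors (50 for ‘completely dependent on wheelchair’, 100 for full health) and the mean scores 63 and 65 are taken from the text; the function name is hypothetical, and treating the scale as an interval scale is an assumption of the calculation.

```python
WHEELCHAIR_ANCHOR = 50   # prefixed value for 'completely dependent on wheelchair'
FULL_HEALTH_ANCHOR = 100


def share_of_full_gain(rating_after_treatment: float) -> float:
    """Gain in disease B as a share of the gain in disease A (50 -> 100)."""
    full_gain = FULL_HEALTH_ANCHOR - WHEELCHAIR_ANCHOR
    return (rating_after_treatment - WHEELCHAIR_ANCHOR) / full_gain


for mean_rating in (63, 65):
    print(f"mean rating {mean_rating}: {share_of_full_gain(mean_rating):.0%} of the gain in A")
```

Mean scores of 63 and 65 correspond to 26 % and 30 % of the interval from 50 to 100, in line with the range of 25-30 % cited above.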

 

In sample 2, a clear majority of those given 10 and 20 % as probabilities preferred insurance for disease A (the one with the better treatment result). On the other hand, among the sixteen given 10 and 25 % as probabilities, an equally clear majority preferred insurance for disease B. A similar distribution to the latter one was found in sample 3. In other words, majority preference seemed to switch from A to B when the difference in probability increased from 10 to 15 percentage points (p < 0.01).

 

Regarding question 3, in both sample 2 and 3 a clear majority preferred to have treatment capacity increased for the illness that was twice as likely, even if the treatment effect was only half the size.
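For reference, the expected benefit of each option in question 3 can be computed as a minimal sketch. The risks of 15 % and 30 % are those stated in the question; reading ‘a couple of days per week’ as two days and ‘approximately one day per week’ as one day is an assumption, and the names are hypothetical.

```python
def expected_ok_days_per_week(risk: float, ok_days_if_treated: float) -> float:
    """Expected symptom-free days per week gained from assured treatment."""
    return risk * ok_days_if_treated


# Assumed reading of the question: 'a couple of days' = 2, 'one day' = 1.
expected_a = expected_ok_days_per_week(0.15, 2)
expected_b = expected_ok_days_per_week(0.30, 1)

print(f"Illness A: {expected_a:.2f} expected OK days per week")
print(f"Illness B: {expected_b:.2f} expected OK days per week")
```

Under this reading the two options offer the same expected benefit, so a respondent who only maximised expected symptom-free days would be indifferent between them.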

 

When the first part of question 4 was administered as a pilot in sample 1, where the comparators were 20 and 10 years, there was a clear majority against emphasising effect size (not reported in the table). However, as can be seen in the table,  samples 2 and 3 clearly emphasised effect size given that the comparators were 20 and 7 years. On the other hand, when in these two samples the probabilities were specified to be 10 % for illness A (20 years to be gained) and 20 % for illness B (7 years to be gained), the distributions were reversed in favour of insurance for illness B.
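The expected number of life years implied by the follow-up to question 4 can be sketched in the same way. The probabilities (10 % and 20 %) and the gains (20 and 7 years) come from the text; the names are hypothetical, and valuing every life year equally is the assumption being illustrated, not a claim about respondents.

```python
def expected_life_years(probability: float, years_gained: float) -> float:
    """Expected life years gained from insurance, with all years valued equally."""
    return probability * years_gained


expected_a = expected_life_years(0.10, 20)  # illness A
expected_b = expected_life_years(0.20, 7)   # illness B

# Probability ratio at which B would match A if value were proportional to years.
break_even_ratio = 20 / 7

print(f"A: {expected_a:.1f} expected years, B: {expected_b:.1f} expected years")
print(f"B matches A only if it is about {break_even_ratio:.2f} times as likely")
```

With B only twice as likely as A, insurance for A offers more expected life years under proportional valuation, yet the distributions in samples 2 and 3 favoured insurance for B.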

 

Verbal comments

 

Verbal comments, based on observations and extensive notes made during the interviews, are reported in detail in Enge (2007) and Gundersen (2008). For access to the theses, see the reference list.  Here we summarize that the ‘flavour’ of respondents’ oral reasoning generally fitted well with their concrete choices. For instance, depending on age respondents had different life goals, and this again led to different opinions in question 4 regarding  the importance of gaining many versus ‘some’ years. Similarly, some (5-10%) had personal experiences with illness or knowledge of illness in family members or others that accounted understandably for their preferences. Overall, we observed in most respondents’ explanatory comments little evidence to suggest that they were primarily concerned with the size of treatment effects or – in ex ante insurance choices – that they aimed to maximise expected (future) health benefits. Rather, there was much focus on minimising the likelihood of ending up in the most severe condition described in each of the pairwise scenarios. Compared to this, whether the treatment outcome was large or moderate seemed to be a somewhat secondary consideration to most of the respondents.

 

6 Discussion

 

We started this paper by hypothesising that personal valuations of gains in health may be heavily influenced by reference points, treatability and diminishing marginal utility, such that valuations of care across patient groups with different potentials may be more equal than what differences in treatment effects suggest.

 

The ideal data for examining our hypothesis would be real willingness-to-sacrifice-data in large samples of patient groups with different potentials and in representative samples of members of the public making choices about personal health insurance. Our data consist  only of the judgements of members of three convenience samples of what they think they themselves or others would have felt and preferred in given situations ex post and ex ante to illness. Given this hypothetical and judgemental nature of our data, our study can at most increase the plausibility of the study hypothesis and lead to interest in more thorough empirical testing of it.

 

A weakness with question 1 is that people can express desire for something without being willing to sacrifice anything to obtain it, so strength of desire does not necessarily correspond to strength of preference in an economic sense. On the other hand, willingness to accept surgery (which is always undesirable) and a significant mortality risk (5 %) were premises for question 1, so some willingness to sacrifice was involved. The strength of the data in question 1 is (a) the high degree of consensus on one of the offered views in all three samples, (b) that in one of these samples, competence in psychology was strongly present, and (c) that the choices between the three response options fitted well with the extensive explanatory oral comments provided by the respondents.  Altogether, we think the response distribution to question 1 is a noteworthy collective judgement that increases the plausibility of the study hypothesis.

 

Question 2 concerned ex ante insurance preferences for functional improvements. According to the rating scale scores, the health gain from treatment in illness A was perceived as roughly 50/14 times the gain in illness B, i.e. roughly 3.5 times as large (see comment below on this cardinal interpretation of the rating scale scores). Given this, thinking emphasising treatment effect was agreed to by almost half of the subjects in the first (general) part of the question. However, slightly more respondents were primarily interested in being sure to achieve something better than the serious condition in question. This fitted with oral explanations.

 

When probabilities were included in question 2, there was majority preference for illness A when the probabilities of A and B were 10 and 20 %. This shifted to majority preference for B when the likelihood of illness B increased from 20 to 25 %. A majority was in other words willing to accept a doubling of the risk (20 vs 10 %) of ending up in the worst scenario (completely dependent on wheel chair) in order to obtain a roughly 3.5 times larger effect of treatment in case of illness. At the same time, the majority was not willing to accept an increase in the risk of the worst scenario by a factor of 2.5 for that same 3.5-fold increase in effect. The shift in majority preference thus occurred at a point where – according to the valuation of the state resulting from treatment in B (rating scale value = 64) – the estimated expected utility of insurance for A was still clearly larger ((100-50) x 0.10 = 5.00 vs (64-50) x 0.25 = 3.50).
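The expected utility comparison in this paragraph can be reproduced with a few lines of Python. This is a minimal sketch and not part of the original analysis: the function name `expected_gain` is hypothetical, the rating scale values (50 for the untreated state, 64 for the treated state in illness B, 100 for full health) are taken from the formula quoted in the text, and treating rating scale scores as cardinal utilities is the same assumption the text itself flags.

```python
def expected_gain(p_illness, value_before, value_after):
    """Probability of illness times the gain in rating scale value."""
    return p_illness * (value_after - value_before)

# Rating scale values as used in the text (assumed cardinal)
UNTREATED = 50     # complete dependence on wheel chair
FULL_HEALTH = 100  # result of treatment in illness A
TREATED_B = 64     # mean rated result of treatment in illness B

gain_a = expected_gain(0.10, UNTREATED, FULL_HEALTH)
for p_b in (0.20, 0.25):
    gain_b = expected_gain(p_b, UNTREATED, TREATED_B)
    print(f"p(B) = {p_b:.2f}: insurance A = {gain_a:.2f}, insurance B = {gain_b:.2f}")

# Probability of illness B at which the two expected gains would be equal
break_even_p_b = gain_a / (TREATED_B - UNTREATED)
print(f"break-even p(B) = {break_even_p_b:.3f}")
```

Under these assumptions the expected gain for A (5.00) still exceeds that for B (3.50) at p(B) = 25 %, and the two would only be equal at a p(B) of about 36 %.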

 

The finding could be due to rating scale scores being poor measures of utility (Torrance et al, 2001). However, given (a) that the rating scale task was to locate the treatment result in illness B in a fairly narrow interval between ‘wheel chair’ and ‘full health’, and (b) that the respondents were explicitly instructed to value one improvement relative to another, we believe the lack of meaning normally suspected in rating scale valuations may be less of a concern in the present case. Two explanations then remain. One is that the subjects were strongly averse to a high risk of the worst case scenario. The other is that they thought the seemingly 3.5 times larger prospective health gain in illness A compared to illness B was not 3.5 times as valuable. Both interpretations are supported by the verbal comments, cf. the summary of these above. Both interpretations suggest that some improvement over ‘complete dependence on wheel chair’ was given special weight compared to the further improvement to full health. This is consistent with the study hypothesis, although the strength of the tendency in the data is moderate.

 

In question 3 – on preferences for treatment capacity in the NHS – the treatment effect in illness A was clearly twice that in illness B. Given the probabilities of illness, the expected effects of the two options were equal. Nevertheless, there was a strong majority preference for option B, where the risk of illness was higher, in both samples 2 and 3. Notes from the interviews indicate that ‘wanting to help as many as possible’ was a main reason for 15-20 % of those who chose option B (Gundersen, 2008). This means that in this question, a fair number of the respondents did not take the pure self interest perspective they were supposed to take. This reduces the relevance of the results. On the other hand, further examination of the verbal comments also indicates that 35-40 % of subjects who chose option B emphasised the risk of ending up in the severe state in question more than whether one or two symptom free days per week could be provided. This seems to imply a strong aversion to the ‘worst case scenario’ (no symptom free days), which again suggests that the utility of going from ‘no symptom free days’ to ‘one symptom free day’ was thought to be greater than the additional utility of going from ‘one symptom free day’ to ‘two symptom free days’.
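The reasoning about question 3 can be illustrated numerically. The sketch below is an illustration only: the probabilities (15 % and 30 %) and the effects (two days and one day per week) come from the question, whereas the utility values assigned to zero, one and two symptom free days are hypothetical numbers chosen to show diminishing marginal utility, not estimates from the study.

```python
import math

P_A, DAYS_A = 0.15, 2  # illness A: lower risk, two symptom free days per week
P_B, DAYS_B = 0.30, 1  # illness B: higher risk, one symptom free day per week

# Utility of 0, 1 and 2 symptom free days per week.
valuations = {
    "linear": {0: 0.0, 1: 1.0, 2: 2.0},
    # Hypothetical: the second day is worth half as much as the first
    "concave": {0: 0.0, 1: 1.0, 2: 1.5},
}

results = {}
for name, u in valuations.items():
    ev_a = P_A * (u[DAYS_A] - u[0])  # expected utility gain, capacity for A
    ev_b = P_B * (u[DAYS_B] - u[0])  # expected utility gain, capacity for B
    results[name] = (ev_a, ev_b)
    print(f"{name}: option A = {ev_a:.3f}, option B = {ev_b:.3f}")
```

With the linear valuation the two options have equal expected value, as stated in the text. With any valuation in which the second symptom free day adds less than the first, option B comes out ahead, which is the pattern the majority choice suggests.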

 

In question 4 – on ex ante insurance preferences for saving life years – there was a clear preference reversal from ‘alternative thinking’ to ‘emphasising treatment effect’ when the difference in outcome in illness A and B was increased from 20 vs 10 years (sample 1) to 20 vs 7 years (samples 2 and 3). But a reversal occurred again (in favour of illness B) when probabilities were specified as 10 and 20 %. Discounting for distance in time at a standard annual rate of 3-5 % does not fully account for this preference for option B (the rate would have to be 8-9 %). Possibly, the subjects in this study have stronger time preference than is normal. But in their oral comments, only one subject mentioned a higher valuation of early years compared to later years, i.e. a preference relating to distance in time. There is in other words no evidence of time preference being a particularly strong concern in these subjects. Instead, nearly all subjects who first chose option A referred to 20 years as being much more than 7 years and allowing much more achievement in life. This suggests that the size of the treatment effect was important to them. On the other hand, when probabilities were introduced, many reversed their preference on the grounds that avoiding death (the ‘worst case scenario’) was their primary concern. Compared to that, the magnitude of the potential health gain was said to be of less significance. This seems to be the same as saying that the most important thing for them was to get ‘some years’ and that additional years were of less importance. In those who first chose option B, this was said explicitly, in phrases like ‘seven or twenty years is not so important’ and ‘seven years is better than no years at all’. Viewed in the light of these verbal comments, the results in question 4 seem to lend some support to the study hypothesis.

 

Our overall impression is that a majority of the respondents seem to deviate considerably from being health benefit maximisers. If this is true of the population in general, then it has important implications for the use of QALYs as estimates of the individual utility of care. 

 

Some might also want to see the results from a different perspective. Even if the study shows a majority view, it reveals considerable disagreement. Disagreement is of interest in itself. The existence of significant minorities and persistent disagreement suggests that in any specific resource allocation decision many will feel uncomfortable with simply following the majority view and will call for a fair procedure that encourages further deliberation about the quantities and issues involved (Daniels and Sabin, 2002).

 

7 Conclusion

 

Priority setting according to costs-per-QALY has been criticised for being unfair to those with a lesser potential for health (capacity to benefit). Faced with this critique, economists tend to respond that not to maximise health benefits when allocating scarce resources runs counter to people’s long term self interests behind a veil of ignorance (and is thus irrational). We suggest that people’s self interests behind a veil of ignorance actually may be somewhat less in the direction of health benefit maximisation, and thereby fit more with societal concerns for fairness, than has hitherto been thought. We noted a study from 1994 that gives some support to the study hypothesis. In the present study, further support seems to emerge to some degree in responses to questions on four different themes, concerning outcomes in terms of both quality and quantity of life. The tendency was the same across samples of people of different age and training (e.g. middle aged psychologists vs young economists-to-be). Choices between prefixed response options fit well with oral explanations of these choices.

 

The consistencies across studies, themes and subjects suggest that we may in fact be observing a true and general phenomenon of considerable relevance for the use of conventional QALYs in economic evaluation of health care.  The purpose of QALYs is to express the value of health interventions (changes in health) to those concerned. But the QALY procedure does not address the value of interventions given illness. Instead it builds on (mostly) healthy people’s ex ante valuations of states of illness and disability and subsequent subtraction of such values of states from one another. Even if there are understandable feasibility reasons for this indirect ‘subtraction approach’ to valuing interventions, to our knowledge no evidence has ever been brought forward to show that it yields valid results. Indeed, one could say that the burden of proof lies with the proponents of the indirect subtraction approach. We hope our study, in spite of its obvious limitations, can lead to increased interest in the issue and encourage further research. This could partly consist in replicating some of our questions in larger and more representative population samples, partly in asking related questions from somewhat different angles. It should also be feasible to do comparative studies of interest in treatment and (hypothetical) willingness to sacrifice to obtain treatment in patients with equal severity of illness but different potentials for improvement.

 

 

Acknowledgement

We thank two anonymous reviewers and the editor for numerous valuable comments and suggestions. The usual disclaimer applies.

References

 

Abellan-Perpinan JM, Pinto-Prades JL. 1999. Health state after treatment: A reason for discrimination? Health Economics 8: 701-707.

Daniels N. 1985. Just Health Care. Cambridge, MA: Harvard University Press.

Daniels N. 1993. Rationing fairly: Programmatic considerations. Bioethics 7: 224-233.

Daniels N, Sabin J. 2002. Setting limits fairly: Can we learn to share medical resources? New York: Oxford University Press.

Dolan P, Cookson R. 1998. Measuring preferences over the distribution of health benefits. Mimeo. University of York, Centre for Health Economics.

Drummond M, Gold M, McGuire A, Nord E, Brixner D, Kind P. 2009. Towards a consensus on the QALY. Value in Health (forthcoming).

Enge A. 2007. A quantitative and qualitative study of individual preferences in relation to health care priorities. Master thesis. London School of Economics. Available through corresponding author.

Furlong WJ, Feeny DH, Torrance GW, Barr RD. 2001. The Health Utilities Index (HUI®) System for Assessing Health-Related Quality of Life in Clinical Studies. Annals of Medicine 33: 375-384.

Gold M et al. 1996. Cost-effectiveness in health and medicine. New York: Oxford University Press.

Gundersen V. 2008. Individuals’ valuation of a health improvement – is it linear? A quantitative and qualitative study of people’s preferences regarding healthcare priorities. Master thesis. University of Oslo. http://www.duo.uio.no/sok/work.html?WORKID=76293

Harris J. 1987. QALYfying the value of life. Journal of Medical Ethics 13: 117-123.

Kahneman D, Tversky A. 1979. Prospect theory: an analysis of decision under risk. Econometrica 47: 263-291.

Nord E. 1993a. The relevance of health state after treatment in prioritising between patients. Journal of Medical Ethics 19: 37-42.

Nord E. 1993b. The trade-off between severity of illness and treatment effect in cost-value analysis of health care. Health Policy 24: 227-238.

Nord E. 1994. Workshops on a value table for prioritising in health care. Working paper 1/1994. Oslo: Norwegian Institute of Public Health. (Text in Norwegian, summarised in Nord 1995.)

Nord E. 1995. The person trade-off approach to valuing health outcomes. Medical Decision Making 15: 201-208.

Nord E. 1999. Cost-value analysis in health care. New York: Cambridge University Press.

Nord E, Richardson J, Street A, Kuhse H, Singer P. 1995. Maximising health benefits versus egalitarianism. Social Science & Medicine 41: 1429-1437.

Nord E, Daniels N, Kamlet M. 2009. QALYs: Some challenges. Value in Health (forthcoming).

Schoemaker PJ. 1982. The expected utility model: Its variants, purposes, evidence and limitations. Journal of Economic Literature 20: 529-563.

Simon HA. 1955. A behavioural model of rational choice. Quarterly Journal of Economics 69: 99-118.

Sintonen H. 1981. An approach to measuring and valuing health states. Social Science & Medicine 15c: 55-65.

Torrance G, Feeny D, Furlong W. 2001. Visual analogue scales: Do they have a role in the measurement of preferences for health states? Medical Decision Making 21: 329-334.

Ubel P, Richardson J, Pinto-Prades JL. 1999. Life-saving treatments and disabilities. Are all QALYs created equal? International Journal of Technology Assessment in Health Care 15: 738-748.

Van Osch SMC, van den Hout WB, Stiggelbout AM. 2006. Exploring the reference point in prospect theory: Gambles for length of life. Medical Decision Making 26: 338-346.

Verhoef LCG, de Haan AFJ, van Daal WAJ. 1994. Risk attitudes in gambles with years of life. Medical Decision Making 14: 194-200.

Williams A. 1985. Economics of coronary artery bypass grafting. British Medical Journal 291: 326-329.

Table I. Characteristics of subjects.

| Sample (interviewer) | Type of respondents | N | Men | Mean age, range | Education level |
|---|---|---|---|---|---|
| 1 (Enge) | Staff NIPH, Depts of Epidemiology | 24 | 9 | 54, 28-70 | Master degrees in psychology, medicine, social science, statistics |
| 2 (Gundersen) | Staff NIPH, Dept of Mental Health | 33 | 11 | 46, 25-64 | Master degrees, mostly in psychology |
| 3 (Gundersen) | Students | 28 | 14 | 28, 21-52 | Bachelor degrees, mostly in economics/health administration |

Table II. Response distributions. Absolute numbers. A dash (-) marks a cell with no entry.

| Question and response option | Sample 1 (N = 24) | Sample 2 (N = 33) | Sample 3 (N = 28) |
|---|---|---|---|
| 1. Judgements of desire for treatment | A: 7=>2, B: 7=>5 | A: 6=>2, B: 6=>4 | A: 6=>2, B: 6=>4 |
| Much stronger desire in group A than in group B | 2 | 4 | 2 |
| Somewhat stronger desire in group A than in group B | 6 | 8 | 8 |
| Much the same desire in both groups | 16 | 21 | 18 |
| 2. Stated preferences for private insurance for illnesses with different potentials for functional improvements | - | A: Full health, B: Small gain | A: Full health, B: Small gain |
| Mean rating of treatment result in disease B (range) | - | 63 (55-80) | 65 (55-90) |
| Agree most with thinking that emphasizes effect size | - | 12 | 13 |
| Agree most with alternative thinking | - | 17 | 15 |
| In doubt | - | 4 | 0 |
| p(A) = 10 %, p(B) = 20 %: Priority to insurance for A | - | 12 | - |
| p(A) = 10 %, p(B) = 20 %: Priority to insurance for B | - | 5 | - |
| p(A) = 10 %, p(B) = 25 %: Priority to insurance for A | - | 4 | 10 |
| p(A) = 10 %, p(B) = 25 %: Priority to insurance for B | - | 12 | 18 |
| 3. Stated preferences for public treatment capacity for illnesses with different potentials for functional improvements | | | |
| Priority to illness A (p = 15 %, 2 symptom free days per week) | - | 10 | 8 |
| Priority to illness B (p = 30 %, 1 symptom free day per week) | - | 23 | 19 |
| 4. Stated preferences for private insurance for illnesses with different potentials for gaining life years | - | A: 20, B: 7 | A: 20, B: 7 |
| Agree most with emphasizing effect size | - | 18 | 20 |
| Agree most with alternative thinking | - | 8 | 5 |
| In doubt | - | 7 | 3 |
| p(A) = 10 %, p(B) = 20 %: Priority to insurance for A (20 years) | - | 13 | 11 |
| p(A) = 10 %, p(B) = 20 %: Priority to insurance for B (7 years) | - | 20 | 17 |

Figure I. A scale of severity of illness.

| Level of problem | Example of condition |
|---|---|
| 1 None | Healthy. |
| 2 Slight | Difficulties walking more than 2 km. |
| 3 Moderate | Difficulties in stairs and outdoors. |
| 4 Considerable | Difficulties with moving about at home. Needs assistance in stairs and outdoors. |
| 5 Severe | Able to sit. Needs help to move about, both at home and outdoors. |
| 6 Very severe | To some degree bedridden. Able to sit in a chair part of the day if helped up by others. |
| 7 Completely disabled | Permanently bedridden. |