Written by Sue Ziebland   


Sue Ziebland and Angie Rogers review outcome measures used in drug treatments.


The irony of the HIV/AIDS epidemic serving as fairy godmother to drugs services has been widely documented. Treatments for drug users have adopted a focus which, while still promoting abstinence as the clear goal, also includes harm minimisation as a valid target (ACMD, 1988; Glanz, 1988). This is evident at both the macro level, with central government funding for needle exchanges, and the micro level, through an emphasis on individual modifications of risk behaviour. Meanwhile, in Britain and elsewhere, changes in health service organisation have reinforced the necessity of demonstrating the outcome of clinical interventions.

Outcomes are measurable changes attributable to a treatment. An outcome measure is an indicator of this change and may take a number of forms, depending on the precise nature of the intervention. It is vital that outcome measures are chosen with care. To be of practical use they should be quick and easy to complete, sensitive to relevant changes and provide scores that are simple for the clinician to interpret. If they do not satisfy these requirements, they are highly unlikely to be employed with sufficient care and regularity to be of use. A valid and reliable outcome measure has the potential to provide data for a number of related purposes, all of which may enhance the development of effective treatments. These would include:

  • Trials for establishing the effectiveness of a treatment.
  • Assessment of the suitability of treatments for subgroups of patients or clients.
  • Individual patient or client monitoring.
  • Aggregate level routine monitoring (e.g. for audit).
  • Analysis of attrition rates.
  • Assessment of the appropriateness of different combinations of treatment and services.

Simple tallies of abstainers have been recognised as an inadequate reflection of drug interventions (Gillam et al., 1992), yet there is little consensus about which indicators are most appropriate to the new goals. In the report of a study carried out between 1969 and 1973, Edwards and Goldie (1981) suggested that:

To a large extent, assessment of outcome is a value judgement that varies from individual to individual and from one professional group to another.

Although greater concord might be expected from the late 1980s, a study reported by Charuvastra et al. (1992) presented 'cessation of drug use and criminality' as the criteria representing successful outcome. Berglund et al. (1991) emphasise the fundamental nature of the changes which take place within the abuser's lifestyle as they leave drug abuse, with the consequent necessity of assessing 'the ex-client's total psychosocial situation'. In the same year White et al. (1991) wrote that, in the search for relevant outcome measures '. . . it is important to consider the utilisation of variables other than simply the alcohol or drug involvement of the client . . . suggested variables include: client and therapist ratings, family perceptions, client background variables, and clinical evaluation'.

The object of this paper is to review the use of outcome measures in drug treatment research and evaluation by conducting a systematic search of the literature. It is recognised that there are broader issues concerning the relationship between drug use, treatment and behaviour change, but the aim of this paper is to describe the ways in which the criteria of success and failure of treatment are operationalised in research design, and to consider the effect this has on interpretation of the results.


The literature search was carried out during the early months of 1993. Online searches were made through Medline using the key words 'drug/opiate' treatment, 'drug dependence' and 'harm minimisation' with 'follow-up', 'outcome' and 'evaluation'. This was supplemented by a librarian-assisted search at the excellent Institute for the Study of Drug Dependence (ISDD) in London, which used the key words 'success' and 'failure' to search the ISDD catalogue. The benefits of using the world's largest collection of drugs literature were considerable. References were located through the ISDD author catalogue and all studies which met the inclusion criteria were selected.

Inclusion criteria

Studies were included in this review if they were trials, follow-up studies or evaluations, and the use of one or more outcome measures was reported. Some designs, such as ethnographies and action research, are underrepresented because they are unlikely to involve the systematic measurement of outcomes. Although this in no way detracts from the rigour or appropriateness of these research methods, such studies were clearly not eligible for inclusion in this review.

Contact was made with drug service researchers to help identify unpublished developments in outcomes methodology. After 4 months new studies were rarely being referenced and review articles, for example, Darke et al. (1992) and Berg (1992), indicated that new measures were unlikely to be identified.

A three-page standard assessment schedule was developed and piloted. This was independently completed by two assessors for each of the studies. The schedule included the location of the study, the year in which it took place, the type of service, the aim of the treatment, the type of study, sample size and basic characteristics, inclusion/exclusion criteria, baseline and outcome measures used, reproducibility, statistical tests, external validity and results. Areas where the reviewers differed were reconsidered. The results reported here concern the identification of which outcome measures were being used. Disagreement between the reviewers was rare, although in a few cases considerable scrutiny was required to determine which measures were employed solely as baseline indicators and which as outcome measures.

Although guidelines for conducting a systematic literature review were adhered to (Sackett et al., 1991), it is possible that some studies may have escaped inclusion in this review. However, it seems likely that the inclusion of more studies would lead to the identification of an even wider range of outcome measures. It is certainly unlikely that our assertion that there is wide variation in approaches to measuring outcomes would be undermined.


Thirty-one studies published between 1980 and 1992 were included in the review. Although harm minimisation is clearly always a legitimate aim, 17 of the studies were of treatments which were specifically aimed at abstinence whereas 14 primarily focused on harm minimisation. It should be noted that, given that the aim of the treatment is not always specified, and follow-up studies may include subjects who have experienced a number of different programmes, this distinction is not entirely clear-cut. However, for clarity of presentation, Table 1 includes studies aimed primarily at abstinence and Table 2 those with harm minimisation or habit moderation as the more explicit aim.

Interventions aimed at abstinence

All of the studies with abstinence as the primary aim included more than one outcome measure (Table 1). Although the year(s) during which the research was conducted is not always reported, it is clear from the publication dates that most predate the worldwide recognition of the threat of HIV to injecting drug users. A wide variety of outcomes measures are reported, the most popular of which are abstention, mortality and 'illegal behaviour'.

Interventions aimed at harm minimisation

In the UK, the publication of the report of the Advisory Council on the Misuse of Drugs in 1988 was influential in shifting the aims and objectives of services from abstinence to harm minimisation and risk reduction. Most of the studies in this group measure several outcomes (Table 2). Woody et al. (1987) were particularly extensive in including the Addiction Severity Index (which was developed before HIV and thus does not address risk-taking), the Beck Depression Inventory, the Hopkins checklist and the Shipley Institute of Living Scale. The most commonly used outcome measures are aspects of drug use, including the quantity and type of drug, the method of use, and reported needle sharing and sexual behaviour. Non-reactive measures such as urine tests, HIV status and inspection of injecting sites are also reported in some studies, for example, Hart et al. (1989), Greenwood (1992), Gill et al. (1992) and Farley et al. (1992).


Tables 1 and 2 may give an exaggerated impression of the uniformity of the approach to outcomes measurement. In fact, there is such variation in the way in which these indicators are measured that results would probably differ widely. The variations in the measures of self-reported behaviour, non-reactive measures, objective indicators of social functioning and the timing of follow-up are discussed below.

Variations in the measurement of self-reported behaviour

Although self-reported behaviour has been treated with some suspicion, there is now evidence (Power et al., 1988; Darke et al., 1992) that drug use, needle sharing and criminality may all be reliably self-reported. However, there are a number of different ways of obtaining the assessments and these will inevitably produce different results. Thus, one study may ask about drug use during the last 3 days, another may invite a ranking of the frequency with which a drug is used and yet another record an average use by itemising the last few occasions of use and dividing this by a set timespan. Similarly, items about the sharing of injecting equipment may pick up apparently contradictory accounts according to how the questions are phrased. Imprecise terms, such as 'sharing works', may be interpreted as referring to any part of the equipment, including shared water. It should also be made clear if 'sharing' is supposed to include passing equipment on to others, as well as accepting second use oneself. Furthermore, should it include sharing with one's sexual partner (with whom one may also be having unprotected sex)? Where the sharing takes place, and with whom it takes place, require careful definition, especially if the responses are deemed to have policy implications.
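The point about question format can be illustrated with a minimal sketch. The diary entries, interview date, category cut-offs and function names below are all hypothetical and are not drawn from any of the instruments reviewed; the sketch simply applies three styles of question to the same invented record of use.

```python
from datetime import date

# Hypothetical self-report diary: (day of use, amount in arbitrary units).
# All values are invented purely for illustration.
DIARY = [
    (date(1993, 3, 1), 2.0),
    (date(1993, 3, 8), 1.0),
    (date(1993, 3, 15), 3.0),
    (date(1993, 3, 22), 2.0),
]
INTERVIEW_DAY = date(1993, 3, 29)


def used_in_last_days(diary, interview_day, window_days=3):
    """Recent-use question: was there any use in the last `window_days` days?"""
    return any(0 <= (interview_day - day).days <= window_days for day, _ in diary)


def frequency_rank(diary, interview_day, window_days=28):
    """Ranking question: map the number of use days in the window to a category.

    The cut-offs are assumptions, not taken from any published instrument.
    """
    n = sum(1 for day, _ in diary if 0 <= (interview_day - day).days <= window_days)
    if n == 0:
        return "never"
    if n <= 4:
        return "weekly or less"
    if n <= 20:
        return "several times a week"
    return "daily"


def average_use(diary, interview_day, occasions=3):
    """Itemised average: total over the last few occasions / days they span."""
    recent = sorted(diary)[-occasions:]
    if not recent:
        return 0.0
    span_days = (interview_day - recent[0][0]).days
    if span_days <= 0:
        return 0.0
    return sum(amount for _, amount in recent) / span_days


print(used_in_last_days(DIARY, INTERVIEW_DAY))  # recent-use question
print(frequency_rank(DIARY, INTERVIEW_DAY))     # ranking question
print(average_use(DIARY, INTERVIEW_DAY))        # itemised average
```

With this invented diary, the recent-use question records no use at all, the ranking question records regular weekly use, and the itemised average yields a continuous daily rate, so the same respondent would appear quite different depending on the format chosen.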

Illegal behaviour or criminality is also measured in different ways. Darke et al. (1992) found little correspondence between the self-reported assessment of criminal activity in the Opiate Treatment Index and the apparently corresponding scale on the Addiction Severity Index (ASI). On examining the item content it was clear that this discrepancy arose because of the ASI's focus on arrests. This raises the issue of how to interpret an 'illegal activities' variable which is based on arrests or incarcerations. If this more accurately reflects the effectiveness of the local constabulary (or, indeed, the tacit priorities of the local constabulary) it will serve as an inadequate surrogate for an individual's involvement in criminal activity.

Non-reactive measures

Stimson and Power (1992) suggest including non-reactive measures which are not affected by interviewer-interviewee interaction:

Non-reactive measures include: physical examinations (skin condition at injecting sites, urinalysis, weight); independent records (from hospital, criminal records); and physical traces (eg empty bleach bottles, syringe contents, discarded syringes) . . . (and) . . . examination of returned syringes for finger prints or multiple blood groups.

Analysis of hair cuttings has also been increasingly advocated in the USA (Mieczkowski, 1992). Analysis of urine and symptoms of drug use featured in several of the studies reviewed here. Such measures have been used as evidence of the validity of self-reported drug use, as well as standing alone as outcome measures. When, where and how the measurements or observations are made is still an important consideration.

Objective indicators of social functioning

Employment status, dependency on benefits and marital status are all used as outcome measures. Care must be exercised in identifying a simple dichotomous variable such as employment/non-employment as an outcome measure. Extraneous factors that will influence this variable include the availability of employment opportunities and any individual commitments and preferences which might prevent the person from seeking work. Marital status also seems a rather ambiguous indicator of success, because it is not inconceivable that securing a divorce could be among the actions embarked upon by a newly drug-free individual. Measures of social functioning are undoubtedly important, yet should be undertaken with care and assessed in a way that bears relevance to the treatment.


Even when very similar measures are used, timing is crucial because it has been shown that long- and short-term follow-up will produce different numbers of abstainers. Hence, some form of relapse may occur in the short term, yet be a 'one-off', and thus not ultimately detract from the effectiveness of the treatment (Bradley, 1989).
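The effect of follow-up timing on a simple tally of abstainers can be sketched as follows. The client records, the three-month look-back window and the function names are hypothetical assumptions made for illustration only; they do not reproduce the design of any study reviewed here.

```python
# Hypothetical follow-up records: for each client, the months after
# treatment in which any drug use was reported. Invented for illustration.
USE_MONTHS = {
    "client_a": [],                   # abstinent throughout
    "client_b": [2],                  # a single early lapse, then abstinent
    "client_c": list(range(1, 13)),   # sustained use in every month
    "client_d": [3],                  # another one-off early lapse
    "client_e": [11],                 # a late lapse
}


def abstinent_at(use_months, follow_up_month, lookback=3):
    """Abstinent if no use is reported in the `lookback` months up to follow-up."""
    window = range(follow_up_month - lookback + 1, follow_up_month + 1)
    return not any(month in window for month in use_months)


def count_abstainers(records, follow_up_month, lookback=3):
    """Tally of clients classed as abstinent at the chosen follow-up point."""
    return sum(
        abstinent_at(months, follow_up_month, lookback)
        for months in records.values()
    )


print(count_abstainers(USE_MONTHS, follow_up_month=3))   # short-term tally
print(count_abstainers(USE_MONTHS, follow_up_month=12))  # long-term tally
```

In this invented sample, clients with a single early lapse are counted as failures at three months but as abstainers at twelve months, so the tally changes with the follow-up point even though the measure itself is unchanged.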


It is clear from the review that not only are there many different concepts employed as measures of outcome in studies of drug services, but that these very concepts are operationalised in a dissimilar manner. Comparisons between studies or attempts to pool results should therefore be undertaken with caution.

Among the recommendations in the 1993 ACMD update are the monitoring, audit and evaluation of oral methadone programmes, including particular attention to 'reduction in injecting behaviour and illicit drug use, impact on sexual risk taking behaviour, and the durability of observed benefits'. It is apparent that there are different levels of harm minimisation of which abstention is, of course, the highest level objective. Most of the studies with harm minimisation as an aim are clearly recording data which relate to the stated priorities, yet the measures that are employed vary enormously and routine collection is rare.

Outcome measures which, by virtue of acceptability and brevity, lend themselves to routine clinical use as well as research, are of particular value. The Opiate Treatment Index (OTI), developed by Darke and colleagues, combines six independently measured outcome domains: drug use, HIV risk-taking, social functioning, criminality, health and psychological adjustment. It is recommended for research and clinical use (Darke et al., 1991). The index is based on self-reported behaviour with a multidimensional composition, continuous level measurement and comprehensive evidence of validity and reliability. Although developed in Australia the index may prove appropriate in other countries where harm minimisation and improved social functioning are the focus of treatment.

Most drugs services are strongly influenced by the microclimate, yet it is these very variations in case mix and treatment regime which underline the need to collect comparable data. Although the use of a standard outcomes measure, such as the OTI, could not be viewed as the panacea to all research, clinical and management needs, there is clearly much to be gained from using standard measures if they are well validated, reliable and sensitive. This is especially so if the alternative is a measure that is hastily constructed, lengthy and largely irrelevant to the treatment.

Angie Rogers, Research Fellow, Department of Epidemiology and Public Health, University College London, 68-72 Gower Street, London, UK.

Sue Ziebland, Senior Research Fellow, Department of Public Health, Camden and Islington Health Authority, 110 Hampstead Road, London NW1 2LJ, UK.


