Big Bias (video here) may be the most important lecture I have ever given. It’s not easy. The bottom line is that Clinical Trials don’t work. For everyone working in health (medical Catholics), this is a bit like saying the Pope is not infallible. We have a great problem with this, partly because our response is – If not the Pope, then who? The idea that no-one is infallible is beyond difficult.
The BB Slides are here and the text to go with these slides is below.
This talk appeared as an article in the BMJ six weeks ago – Clinical Judgment, not Algorithms are key to Patient Safety – written in a code called English. The talk strips away the code.
2). In this X-Ray you see my broken shoulder. I slipped in the shower on a Friday morning. It was fortunate I didn’t come to more grief. My shoulder was operated on later that day and I was back in work on Monday. This was good for me, for the hospital I work in, which didn’t have to pay for a locum, for the patients who didn’t have to see anyone else, and for my family and friends. Healthcare like this could be provided for free – it pays for itself.
3). A few weeks later I got a letter, which shows what is bankrupting healthcare. A letter like this is sent to everyone over a certain age who’s had a fracture treated in any UK hospital. It’s an invitation to have our bones screened and if there is a hint of some bone thinning we will be offered a bisphosphonate, a drug for osteoporosis.
4). This advert for a bisphosphonate, then an expensive new drug, conveys the impression these drugs work just like the plate put in my shoulder – good for the patient, her family and friends – a treatment that pays for itself.
Guidelines recommended these drugs. The companies making them provided free bone scanners for screening programmes.
But these drugs don’t work like the plate in my shoulder, leading to a need for auditors and managers to get this service working right. We pay for the screeners, auditors and managers to put drugs in people’s mouths – not Pharma, which leaves a grinning Pharma asking What do you mean our drug costs are high? The drugs budget remains a constant 10% of health expenditures.
5). This is before we start paying for the problems these drugs cause like the spiral fracture of the femur in this X-Ray. Before the bisphosphonates these were rare. Now faced with a fracture like this, we immediately know the patient has been on a bisphosphonate. Drug companies claim these fractures are anecdotal because no RCTs have ever shown this happening – you have to have been on the drugs for years for this to happen and trials don’t extend that long – but the RCT data which no-one gets to see shows there is no drop in fracture rates on bisphosphonates.
Abnormally thickening bones is a risky thing to do. Better to encourage people over 50 to remain fit and keep active. Telling them they have bone thinning encourages them not to cut the garden, not to be active, and take pills instead.
Companies and politicians have a get out clause for the increasing costs of healthcare – ageing. This seems to make sense – surely, we need more services as we age?
6). This obvious idea is wrong. Here you see Bart Simpson being told by Homer ‘pull yourself together kid’. These days Bart is more likely to get an antidepressant. There are two points here. One is that we have lost wisdom.
7). Second, the process you saw in osteoporosis is being replicated in children’s mental health. The government is pumping more money in – for screeners, auditors, managers and psychotherapists to put antidepressants in our children’s mouths – but things are clearly not working out, leaving Jeremy Hunt, a recent Minister for Health in the UK, saying children’s mental health services are the greatest point of failure in the NHS.
There have been 30 RCTs of antidepressants in depressed children – all negative – even the RCTs that led to Prozac being licensed for children. This is the greatest concentration of negative trials for anything ever – but despite this, antidepressants are now the second most common drug being taken by teenage girls. What’s going on?
Well, the literature on antidepressants is almost entirely ghostwritten and there is no access to the data from antidepressant trials.
8). Ghostwriters don’t look like this. They are mostly women with PhDs.
9). Here is one page of a Pfizer document showing the preparation of articles on their SSRI Zoloft. Two trials have been done on PTSD. The trials in fact show Zoloft didn’t work but both were written up as positive – one to appear in the New England Journal of Medicine and the other in JAMA – the premier journals in the field. Most of us would figure articles in journals like these must be true.
On the left you see ‘author – TBD’ – to be determined. The articles are written – the company will later pick authors based on who would suit the marketing profile of the drug.
For 30 years the greatest concentration of Fake News on earth has centred on the drugs a doctor gives you – whether psychotropic drugs, statins, or heart drugs.
10). Many good people are concerned about corruption in medicine – the free pens, lunches, conflicts of interest and lack of Transparency. I think this ghostwriting and lack of access to trial data – let’s call it Cisparency – is more important.
But there is an even deeper problem – your belief that Randomized Controlled Trials, if done by angels of course, are the answer to our problems. Your belief is our problem.
11). The first use of randomization in a trial was in the 1948 MRC study of streptomycin run by Tony Hill. This was British medicine’s 1966 World Cup moment – the academic media still dwell on it lovingly and lose sight of the fact that streptomycin had been evaluated in the Mayo Clinic 2 years earlier and this evaluation told us things about streptomycin the MRC trial missed – that some patients became tolerant to it and others had hearing loss.
12). Hill didn’t practice medicine – but here in 1965 even he is saying:
“Frequently with a new discovery… the pendulum at first swings too far… Given the right attitude of mind, there is more than one way we can study therapeutic efficacy.
Any belief the controlled trial is the only way would mean not that the pendulum had swung too far but that it had come off its hook”.
He at least was very aware that RCTs could be helpful but their place was limited – to a primary efficacy endpoint. They are not a good way to evaluate a drug overall.
13). In the late 1950s Louis Lasagna was the major advocate of RCTs but to little effect.
14). The current premium we put on RCTs stems from the thalidomide crisis. In the wake of this, politicians had to be seen to do something and in the USA in 1962 they amended the Food and Drugs Act to require companies to demonstrate efficacy in addition to safety. How would they demonstrate efficacy? Up popped Lasagna saying RCTs would do the trick.
15). At this point, there had been only one drug put through a placebo controlled RCT before being brought onto the US market in which it had been shown to be effective and safe. That drug? Thalidomide – in a trial run by Lasagna.
16). Lasagna like Hill soon began to appreciate the limitations of RCTs.
Here in 1983 he is responding to an article about adverse events that claimed that:
Spontaneous reporting is “the least sophisticated and scientifically rigorous . . . method of detecting new adverse drug reactions”.
“This may be true in the dictionary sense of sophisticated meaning ‘adulterated’ . . . but I submit spontaneous reporting is more ‘worldly-wise, knowing, subtle and intellectually appealing’ than grandiose, expensive RCTs”.
17). And a few years later he had the following to say:
“In contrast to my role in the 1950s which was trying to convince people to do controlled trials, now I find myself telling people that it’s not the only way to truth.
“Evidence Based Medicine has become synonymous with RCTs even though such trials invariably fail to tell the physician what he or she wants to know which is which drug is best for Mr Jones or Ms Smith – not what happens to a non-existent average person”.
18). The person responsible for randomization and the linked use of statistical significance testing in trials was Ronald Fisher, whom you see here smoking – he didn’t believe the evidence that smoking caused lung cancer.
19). Fisher never trialled anything. He ran a thought experiment about experiments. If someone knows what they are doing, the only things that can interfere are some confounder they don’t know about or chance. Randomisation takes care of the confounder and we can agree that 1 wrong result out of 20 is down to chance. If we know what we are doing then, Robin-Hood-like, we should get the same answer every time – as the image shows.
20). But no RCT for anything in medicine looks like that – they look like this. In the case of the antidepressants it’s sheer chance a patient put on one is going to be helped. These RCTs have no anchor in the real world. They do not offer gold standard knowledge – but landing an arrow anywhere on this target is the standard through which companies make gold – which is why you think RCTs rather than your own view offer gold standard clinical knowledge. Companies don’t have to let anyone see the arrows that totally miss the target.
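The point about unseen arrows can be put as rough arithmetic. The sketch below assumes something the talk does not state: a drug with no effect at all, tested in several independent trials, each judged against the 1-in-20 convention. It shows how the chance of at least one apparent hit grows with the number of trials run. The function name and the trial counts are illustrative only.

```python
# Chance that at least one of k independent trials of an inert drug
# crosses the conventional 1-in-20 threshold purely by chance.
# Assumptions (not from the talk): the drug does nothing, trials are independent.
ALPHA = 0.05  # the "1 wrong result out of 20" convention


def chance_of_at_least_one_hit(k, alpha=ALPHA):
    """Probability that one or more of k null trials looks 'significant'."""
    return 1 - (1 - alpha) ** k


for k in (1, 2, 5, 10, 20):
    print(f"{k:>2} trials: {chance_of_at_least_one_hit(k):.0%}")
```

If only the arrows that land on the target are shown, a reader has no way of knowing how many were fired.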
21). This is all blindingly obvious. But we’ve had a denial mechanism in place since before the Berlin Wall came down, with statisticians and journals like the BMJ saying “Hey no-one uses p-values anymore – we use confidence intervals”.
Here on a pre-Euro Deutschmark, you see Carl Friedrich Gauss, who in 1810 created the confidence interval and the idea of measurement error. The problem he solved was that 10 astronomers looking at a star – given the instruments then – came up with 10 different locations. Were they seeing just the one star or several? Gauss’s answer was that if the observation lay outside a normal distribution curve it likely came from a different star.
But if I take 10 or 20 of you here and give you a drug, it will sedate some of you or cause your heart rate to slow, while others will find themselves unable to sleep or their heart rate speeded up. This is not measurement error. Applying confidence intervals to these data risks concluding these drugs have no effect on heart rate or level of sedation.
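A toy calculation makes the heart rate point concrete. The numbers are invented for illustration: 20 people, half slowed by 10 beats per minute and half speeded up by 10. The interval uses the normal multiplier 1.96 as an approximation. This is a sketch of the averaging problem, not an analysis of any real trial.

```python
import math
import statistics

# Hypothetical heart-rate changes (beats per minute) for 20 people given a drug.
# Assumption for illustration only: half slow by 10 bpm, half speed up by 10 bpm.
changes = [-10.0] * 10 + [10.0] * 10

mean_change = statistics.mean(changes)
sd = statistics.stdev(changes)          # sample standard deviation
sem = sd / math.sqrt(len(changes))      # standard error of the mean

# Approximate 95% interval for the mean, using the normal multiplier 1.96
# (a t-multiplier would give a slightly wider interval for n = 20).
ci_low = mean_change - 1.96 * sem
ci_high = mean_change + 1.96 * sem

# Count the people whose heart rate actually moved.
affected = sum(1 for c in changes if c != 0)

print(f"mean change: {mean_change:.1f} bpm")
print(f"approx. 95% interval: ({ci_low:.1f}, {ci_high:.1f})")
print(f"people whose heart rate changed: {affected} of {len(changes)}")
```

The interval straddles zero, which would conventionally be read as no effect on heart rate, even though in this made-up group every single person was affected.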
Neither p-values, nor confidence intervals, as applied to clinical trials have an anchor in the real world. Let me show you.
22). RCTs are supposed to control even the confounders we don’t know about. But they routinely introduce confounders. If you print off this slide and superimpose the table tops you will end up confounded – they are the same size. Despite doing this, they will still look different here, and my problem may be that even after hearing what you hear next you will still think that RCTs give us the right answer about drugs.
23). Imipramine was the first antidepressant. It is more potent than any SSRI. It can cure melancholia – a severe depression that leads people to commit suicide. But imipramine itself can cause people to commit suicide. Nevertheless, putting this drug that can cause suicide into a trial of melancholic patients I’d expect that fewer people given imipramine would go on to suicide compared with people given placebo. We’d conclude this drug protects against suicide not that it causes suicide.
24). In contrast, the trials that brought SSRIs on the market show an increase of suicidal events on SSRIs compared to placebo. This is because the trials had to be done in mildly depressed people and so the drug risks showed up more clearly.
25). If imipramine was put into trials of the same patients it too would seem to cause suicide. We call these drug trials, but they aren’t – they are treatment trials. A drug trial would be done in healthy volunteers. Second, clinical RCTs tell us nothing about cause and effect.
26). And where both treatment and condition can cause superficially similar effects, which happens in most medical trials, you are completely unable to tell what is going on. I’m sure you can all add to the list here.
27). Now take this GlaxoSmithKline document. In these placebo-controlled trials you see 11 suicidal events on paroxetine versus 0 on placebo. This is a problem for the company – so much so they have omitted at least one event from these trials, and have omitted complete trials.
28). But they also did trials in intermittent brief depressive disorder – also called borderline personality disorder – where patients engage in suicidal events much more frequently than in classic depression. Now the data for paroxetine don’t look good here either – and the real data are worse – we could add 12 extra suicidal events on paroxetine here and still get the same magical result you see on the next slide.
29). When you add the two groups – Hey presto – Paroxetine protects against suicide. This is using a problem a drug causes to hide a problem a drug causes. You can get the same confounding effect every time the condition being investigated is heterogeneous – as is the case for depression, diabetes, Parkinson’s, breast cancer, back pain and pretty well every condition we treat.
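The arithmetic of this trick can be sketched with made-up numbers. Only the 11 versus 0 event counts in the first group come from the slide described above. Every denominator, and all the counts in the second group, are hypothetical and chosen purely to show how pooling two unlike groups can flip the picture.

```python
from fractions import Fraction

# (events, patients) per arm. Only the 11-versus-0 event counts come from the
# talk; every denominator and the second group's counts are hypothetical.
trials = {
    "depression": {"drug": (11, 3000), "placebo": (0, 1000)},
    "high-baseline-risk": {"drug": (32, 150), "placebo": (30, 150)},
}


def rate(events, patients):
    """Event rate as an exact fraction."""
    return Fraction(events, patients)


def pooled(arm):
    """Event rate for one arm after lumping both groups together."""
    events = sum(t[arm][0] for t in trials.values())
    patients = sum(t[arm][1] for t in trials.values())
    return rate(events, patients)


# Within each group, compare drug and placebo rates.
for name, arms in trials.items():
    d, p = rate(*arms["drug"]), rate(*arms["placebo"])
    print(f"{name}: drug {float(d):.2%} vs placebo {float(p):.2%}")

# Then compare the rates after pooling.
print(f"pooled: drug {float(pooled('drug')):.2%} "
      f"vs placebo {float(pooled('placebo')):.2%}")
```

In this invented example the drug looks worse than placebo in each group taken separately and better than placebo once the groups are added together. The flip comes from the second group having a far higher baseline rate and making up a much larger share of the placebo arm than of the drug arm.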
30). Now what you often hear about adverse events is that of course RCTs can miss idiosyncratic events or events too rare to be picked up in say 200 people. This is clever propaganda and wrong. Whatever sex, age, ethnicity you are, pretty well everyone here given an SSRI would have some genital numbing 30 minutes after a first dose. This is the commonest effect these drugs have.
31). The best example of how RCTs introduce confounders starts here with Frank Ayd, who before 1962 discovered amitriptyline – the best-selling of the first generation of antidepressants – a good treatment for melancholia which causes loss of libido. Unencumbered by RCTs, Ayd was able to say within a year of amitriptyline being in use that it caused sexual dysfunction.
32). A decade later, George Beaumont, working for Ciba-Geigy, had the job of marketing clomipramine, now recognised as our most potent antidepressant, but then another molecule in a crowded field. He placed articles in newspapers featuring a minor celebrity, thrilled that her boyfriend’s premature ejaculation problem could be managed by 10 mg of clomipramine taken 30 minutes before intercourse – the standard antidepressant dose is 150 mg.
33). The SSRI antidepressants, derived from clomipramine, were launched around 1990. SSRIs are ineffective for proper depression, but this is rare compared to the nervous problems for which doctors were giving benzodiazepines. The SSRI marketing need was to transform cases of Valium into cases of Prozac – transform people with no sexual dysfunction treated with pills that had no sexual effects into people who were depressed.
Doctors began to hear they could be sued for prescribing benzodiazepines which cause dependence. The real need they were told was to treat the underlying depression, with antidepressants, which do not cause dependence, rather than treat the superficial anxiety with dependence producing drugs.
Prior to marketing, companies ran phase 1 studies of SSRIs. In these, healthy volunteers became dependent after two weeks and were left anxious and depressed afterwards. Within 3 years of paroxetine being on the market, there were more reports in Britain about dependence on it than there had been in 20 years from all benzodiazepines combined.
The initial labels for all SSRIs stated that less than 5% of patients in clinical trials reported sexual dysfunction. But in some phase 1 trials, over 50% of healthy volunteers had severe sexual dysfunction that in some cases lasted after treatment stopped.
How does over 50% become less than 5%? Well, in RCTs investigators focus almost entirely on the primary endpoint – does the drug work – with minimal space and time to record adverse events. They are unlikely to record a problem, in particular one that can be passed off as a feature of the illness.
The bias here is total. The single commonest effect of these drugs – not a rare effect RCTs might miss – has vanished. Surely, it’s still there in the real world – what real world?
34). In 2006, PSSD – Post SSRI Sexual Dysfunction – emerged. My first encounter with this was when a lady in her 30s presented to me 20 years ago telling me she was unable to function – I said this will clear once you stop your SSRI. No, she said – she’d been off treatment for 3 months and she could take a hard-bristled brush and rub it up and down her genitals and feel nothing.
Turns out the first reports of PSSD had been made to MHRA in 1991. They have hundreds. The first academic publications came in 2006. This is a state where people become profoundly genitally numb and lose the ability to orgasm properly and lose libido. It may start on the drug but typically gets worse after treatment stops. It can endure for years or in some cases forever. Some kill themselves, others refer themselves to Dignitas.
Almost identical problems have been reported after finasteride and isotretinoin. This is a pharmacological and neurological mystery – whose solution is worthy of a Nobel Prize.
35). But despite PSSD, the sexual problems on antidepressants have vanished. In May this year BMJ featured an article on Declining Sex in Britain. It fingered depression as the cause for this – no mention of antidepressants, even though antidepressants almost universally cause this while the nervous problems for which they are given don’t cause sexual dysfunction and the benzodiazepines didn’t cause sexual dysfunction either. The media were completely uninterested in picking up the treatment issue – even though between 10% and 15% of us are on these drugs – because we can’t stop – which means 20% of us are not making love the way we might want.
36). The medical response to PSSD is – don’t be stupid, a drug can’t have an effect if it’s not in your body. There is no real evidence for this. If you google this, you’ll be ill forever. Maybe you have some past trauma that counselling or an antidepressant could put right. The ridicule and lack of any recognition contributes to the suicides.
So we figured on petitioning EMA to get this problem included in the label of drugs. Why would EMA budge in response to us given they are sitting on thousands of reports of just this problem since the early 1990s and have done nothing?
37). The unfortunate experience of Walter Raleigh suggested a way forward. In 1618 Raleigh was executed in London on the basis of hearsay – people saying things about what he had said or done without coming to court to be cross examined. After his execution legal systems worldwide changed so they don’t now admit hearsay evidence. People have to come into court to be cross examined.
Now if you report an adverse event to the regulator, the regulator will remove patient names – transforming the report into hearsay and making it impossible to establish causality. Examining or cross-examining the patient is crucial to establishing causality. Without this, regulators can accumulate tens of thousands of reports and do nothing.
Companies, in contrast, are legally obliged to chase a patient’s medical records. They do so to find out if you had an ingrown toenail at the age of 2 that might explain why you are unable to function sexually post SSRI – if they can’t find anything they often conclude their drug has caused the problem and include it in the drug label but with words that most doctors perceive as not conceding cause and effect when in fact they are conceding causality.
If we want to restore healthcare, we and our patients have to insist our names are left on the reports and as things stand we should report to companies not regulators. We have to make clear we are willing to come to court to be cross examined if need be. The legal system cannot dismiss reports from people willing to come into court to be cross examined. The question is – how many of you have the courage to do this?
38). The alternate lecture to this one ends with an Economics is from Mars, Medicine from Venus slide, aimed at persuading you that anyone who works in health knows more about neo-liberalism – its good and its bad – than any economist or politician.
39). Algorithms are from Mars and Magic from Venus pitches algorithms – tools that work in clearly defined ways that can be capitalized – as the basis for economics. RCTs are a great example of an algorithm and EBM a compelling slogan for an algorithmythology that is incompatible with medicine.
You are better at evaluating a drug than any RCT. RCTs can’t offer anything to the Magic central to medicine which involves a decision about whether to attempt to bring good out of the use of poison or a mutilation. But RCTs can be capitalized, and your discretion can’t, and this pits us against them. At the moment – they are winning.
When a patient consents to a poison or mutilation, key to the possibility of magic and their subsequent life is the relationship between them and you. If they can trust you, if they sense you would do the same for yourself or your child as you are offering them, you gain social capital. When you give a poison, if you betray them – if they sense you would not have taken that bisphosphonate – you lose that capital.
Taking a poison at our suggestion remains one of the best ways to discover new treatments. But in addition, PSSD for instance is a complete pharmacological and neurological mystery, the answer to which might lead to a Nobel Prize but even more important to most of us could shed a lot of light on who or what we are.
We need a Relationship Based Medicine rather than Evidence Based Medicine to grapple with mysteries like this.