This is the first of 3 linked lectures given to family doctors in Stockholm on September 9 2021. The meeting was organized by André Marx, a family doctor in Stockholm who also runs Sweden’s first and only public health care-funded withdrawal clinic for psychiatric drugs.
The alternate title of this talk is Girl with a Clinical Trial Tattoo. It will be followed by Girl who Catches Ghosts and Girl who Eats Salt. All will feature on the Politics of Care Forum.
I come from a very orthodox, conservative medical background. I believe in the medical model and the value of pharmacological and other medical treatments. Despite this you may be disturbed by what you hear. Disturbed by the distance we seem to have travelled from what They Used to Call Medicine to what is happening now.
I won’t be offering many answers. But I have two questions – What are RCTs and What is Data. My answer to Data is in this talk. My answer to RCTs is in the third talk.
Here is our patient. It’s not clear if it is a woman or a man. I can tell you s/he is smelly. You might initially feel s/he is not very bright. How do we Help here? What would Help look like? Help surely cannot be the same as telling someone what to do.
Our problems knowing what to do for our patient start here. Any discussion of randomized controlled trials (RCTs) starts with Ronald Fisher, a cantankerous character. Fisher thought the idea that smoking causes lung cancer was ridiculous – views the tobacco industry was happy to fund. Fisher wasn’t a doctor or a scientist. He never ran an RCT. He was a mathematician who was interested in mathematically characterising the views of experts.
Fisher assumed experts knew what they were doing. For him experiments were a way to demonstrate that – not a way to find out new things. Only two things could get in the way of an experiment turning out the way the expert said.
There might be some unknown factor, but Fisher thought this would be rare – experts knew nearly everything. Randomization could take care of unknown unknowns.
Fisher’s experts were like Robin Hood. Unless chance intervenes, like Robin Hood they would split the first arrow 19 times out of 20.
If you can find any medical condition where the RCTs turn out like Fisher’s scenario – let me know. Trials of antidepressants and most drugs look more like this – all over the place.
If medical experts were like Fisher thought, you wouldn’t need trials. Medical advice would be like telling people to wear a parachute when jumping out of a plane.
This old Deutschmark celebrates Carl Friedrich Gauss’ key scientific breakthrough. Around 1810 telescopes were unreliable and astronomers couldn’t be sure if they were looking at one or two stars. Gauss solved this with the confidence interval which you see beside him – if the measurements fell within the confidence interval they were from the same star and from different stars if one of them fell outside it.
Jerzy Neyman and Egon Pearson, who hated Fisher, said statistical significance was all wrong and experiments needed confidence intervals.
These guys also had nothing to do with science or medicine. They never did an RCT. But in the 1970s, medical journals told us to drop statistical significance and start using confidence intervals instead.
This is the only complicated slide in this talk. You see Gauss’s confidence interval on top. Confidence intervals work for a measurement error problem like stars and telescopes. If you give a beta blocker to one person, their heart rate will slow but with some variation every time you check. We can call this measurement error and use Confidence intervals – the data will all be to the Left of the heavy line running through 1.0.
But if you give a beta blocker to ten people, it will slow the heart rate of 9 but the next person may have a heart rate increase. We should but we don’t view her as a different star. We just say beta-blockers slow heart rate.
Heart rate increases on beta blockers are rare but many psychotropic drugs sedate some while leaving others wide awake. Half of the measurements will fall to the Left of 1.0 and half to the Right, leaving companies able to claim their drug has no effect on sleep. This is not just wrong, it’s psychotic.
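The arithmetic behind this can be sketched in a few lines of Python. Everything in the sketch is an assumption made for illustration: the trial size, the two-hour shifts in sleep and the noise level are invented, not taken from any company trial. The effect is measured as a difference in hours, so 0 plays the role that 1.0 plays on a ratio scale.

```python
import math
import random
import statistics

def simulate_sleep_trial(n_per_arm=200, seed=0):
    """Hypothetical trial: the drug sedates half of takers and activates the other half."""
    rng = random.Random(seed)
    # Change in hours slept on placebo: noise around zero.
    placebo = [rng.gauss(0.0, 0.5) for _ in range(n_per_arm)]
    # On the drug, alternate patients gain or lose about two hours of sleep.
    drug = [(2.0 if i % 2 == 0 else -2.0) + rng.gauss(0.0, 0.5)
            for i in range(n_per_arm)]
    return placebo, drug

placebo, drug = simulate_sleep_trial()

# Average effect and a normal-approximation 95% confidence interval.
mean_diff = statistics.mean(drug) - statistics.mean(placebo)
std_err = math.sqrt(statistics.variance(drug) / len(drug)
                    + statistics.variance(placebo) / len(placebo))
ci_low, ci_high = mean_diff - 1.96 * std_err, mean_diff + 1.96 * std_err

# Share of people on the drug whose sleep changed by more than an hour.
share_affected = sum(abs(x) > 1.0 for x in drug) / len(drug)

print(f"mean difference: {mean_diff:+.2f} h, 95% CI {ci_low:+.2f} to {ci_high:+.2f}")
print(f"share of drug takers affected: {share_affected:.0%}")
```

In this invented trial nearly everyone on the drug has their sleep changed by more than an hour, yet the interval for the average effect straddles zero, which is the result that lets a company say the drug has no effect on sleep.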
On the lower Left we have another example. Both the Red and Yellow Drug here can kill you. If forced to take one of them, all your statistical training will tell you avoid the Red one because the confidence interval says it can kill you for sure. The Yellow confidence interval crosses 1.0 so supposedly we don’t know if it can kill you.
Well – the likely risk from the Yellow Drug is nearly 10 times greater than the Red one. If forced to choose, you should take the Red one.
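A small Python sketch shows how this can come about. The event counts below are hypothetical, chosen only to reproduce the shape of the slide (they are not the numbers behind it), and the interval uses the standard log-scale approximation for a risk ratio.

```python
import math

def risk_ratio_ci(events_drug, n_drug, events_placebo, n_placebo, z=1.96):
    """Risk ratio with an approximate 95% confidence interval (log method)."""
    rr = (events_drug / n_drug) / (events_placebo / n_placebo)
    se_log = math.sqrt(1 / events_drug - 1 / n_drug
                       + 1 / events_placebo - 1 / n_placebo)
    low = math.exp(math.log(rr) - z * se_log)
    high = math.exp(math.log(rr) + z * se_log)
    return rr, low, high

# Hypothetical Red Drug: a small excess of deaths seen in a very large trial.
red_rr, red_low, red_high = risk_ratio_ci(1200, 100_000, 1000, 100_000)

# Hypothetical Yellow Drug: a large excess of deaths seen in a small trial.
yellow_rr, yellow_low, yellow_high = risk_ratio_ci(7, 100, 1, 100)

print(f"Red:    RR {red_rr:.1f}, 95% CI {red_low:.2f} to {red_high:.2f}")
print(f"Yellow: RR {yellow_rr:.1f}, 95% CI {yellow_low:.2f} to {yellow_high:.2f}")
```

With these invented counts the Red interval sits entirely above 1.0 while the Yellow interval crosses it, even though the Yellow point estimate is several times higher. Reading the interval as a verdict on whether a drug can kill turns a statement about sample size into a statement about safety.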
The next example shows the suicidal events from FDA’s database of antidepressant trials – the Red Curve shows events in those 25 and under and Yellow Curve people who are 45-55. But FDA and others claim only those 25 and under are at risk from antidepressants. There’s no problem if you are over that age.
The bottom line here is that neither statistical significance nor confidence intervals work in trials of a medicine. We have tens of thousands of RCTs now but no-one can really work out what they mean. Our use of statistics is based on convenience rather than reality. Confidence intervals can work if we know what we are doing in the first place but don’t if we don’t know what we are doing. See The Antidepressant Tale: Figures Signifying Nothing.
The two tables you see here are identical in size and shape. If you superimpose one on the other, they will match – confounding you. Randomization supposedly controls for unknown confounders. People talk about it in mystical terms. Put a drug through an RCT and even though a doctor has no idea what they are doing – the right answer will emerge.
The problem with this is neither Fisher nor Neyman ever thought an RCT could help someone who didn’t know what they were doing. You have to roughly know what you are doing before an RCT can be of any use.
The First Medical RCT was for streptomycin in tuberculosis in 1948. Tony Hill was the person who introduced randomization to a medical trial. Hill didn’t follow Fisher or Neyman. There was nothing mystical about randomization in his trial – it was just a method for fair allocation.
Two years previously clinicians at the Mayo Clinic had done an old-fashioned trial to evaluate streptomycin for tuberculosis – controlling for things like age and sex in both treatment and placebo groups. Both trials found streptomycin worked. The Mayo trial found patients became resistant quickly and some went deaf. Hill’s RCT missed all this.
The Mayo studies were the only reason Hill’s trial was run – the answer to did streptomycin work was already known. Hill didn’t discover it.
After Hill’s RCT, very few investigators could see much point in doing RCTs rather than the usual kind of clinical trial. Something else happened to change that – as you will see.
Here’s Tony Hill in 1965 reflecting on RCTs. In an early part of this article, he says he is surprised at how popular RCTs have become. He is also surprised by the fact it’s mostly industry people pushing them – not doctors.
He says RCTs are just one way to evaluate Therapeutic Efficacy – that is evaluating one of the 100 or more effects all drugs have. This essentially means RCTs are not a good way to evaluate a drug – they have a place but do not offer a good view of a drug overall.
In the 1950s the most enthusiastic advocate for RCTs was Louis Lasagna. Lasagna had put the placebo and Clinical Pharmacology on the map.
Drug Regulation then was about Drug Safety. Lasagna thought RCTs offered FDA a way to establish if a drug worked. If they didn’t work, they couldn’t be safe. Nobody paid any heed.
Events changed everything. The horrific birth defects caused by Thalidomide triggered a political crisis – something had to be done.
The 1938 FDA Act which focused on safety kept Thalidomide off the US market. A new 1962 Act gave FDA a brief to establish that drugs were effective in addition to safe. Demonstrating effectiveness would be done using RCTs. The idea was 2 positive RCTs would be the criterion. Everyone thought if there was one positive trial, all trials would be positive, and certainly with two positive trials – but we now know this is not the case. 50% of trials can be negative.
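To see what a two-positive-trials criterion means when half of all trials can come out negative, here is a minimal Python sketch. It assumes the trials are independent and that each has the same 50% chance of a positive result. Both assumptions are mine, made for the arithmetic, and the function name is invented.

```python
from math import comb

def prob_at_least_two_positive(n_trials, p_positive=0.5):
    """Chance of two or more positive trials out of n independent trials."""
    p_none = (1 - p_positive) ** n_trials
    p_one = comb(n_trials, 1) * p_positive * (1 - p_positive) ** (n_trials - 1)
    return 1 - p_none - p_one

for n in (2, 4, 6, 10):
    print(f"{n} trials run: {prob_at_least_two_positive(n):.1%} chance of two positives")
```

Under these assumptions two trials meet the criterion a quarter of the time, but six trials meet it about 89% of the time, so the criterion says less about a drug the more trials are run.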
Adding effectiveness can only be a good thing – don’t you think? Well thalidomide later got licensed – under the 1962 Effectiveness Act.
Prior to the 1962 Act, only one drug had been shown to be both safe and effective in a placebo controlled RCT – Thalidomide and the person who did the trial was Louis Lasagna. Article Here. The mechanism we have put in place to stop Thalidomide happening again was one that it sailed through without a problem.
Everyone thought, and most still think, RCTs put a brake on pharmaceutical companies. Instead, RCTs have become the standard through which companies make Gold.
Twenty years later you see Lasagna responding to Rossi et al who say RCTs are the most sophisticated way to work out what a drug is doing.
He says this is only true if sophisticated means adulterated. This is an older meaning of the English word sophisticated that most people today don’t realise. Sophisticating wine means adding ethylene glycol to it. Article Here.
He is saying essentially that clinical judgment is more accurate than RCTs at least for adverse events and more interesting to engage with.
Ten years later again you have him saying that his view of RCTs has changed completely since the 1950s. Interview Here.
And RCTs aren’t that useful – certainly not for the key question which is what am I going to do to help this person in our waiting room.
RCTs are supposed to control all confounders even the ones we don’t know about. In fact, it is just the opposite. RCTs introduce confounders and make trial results essentially meaningless.
Imipramine was the first antidepressant. It and other tricyclic antidepressants are stronger than SSRIs and SNRIs. It beats them in RCTs. It can treat melancholia – they can’t. They are useless for severe depression. Melancholia comes with a high risk of suicide.
Imipramine was launched in 1958. A year later at a meeting in England, Danish psychiatrists made it clear that while it was a wonderful treatment it made some people suicidal. Nobody there argued. This drug can cause suicide.
Let’s do a thought RCT of imipramine versus placebo in melancholia. Even though it can cause suicide, we would expect it to reduce the number of suicides in a trial like this because it treats the condition. This RCT would be great evidence antidepressants do not cause suicide.
Here is the data on the trials in mild depression that brought the SSRIs and SNRIs on the market – you see a doubling of suicidal events compared to placebo. Companies resorted to all sorts of illegal manoeuvres to hide this risk.
This is what the data for imipramine look like in the same mild depressions. Now it seems that it too causes suicides. So RCTs tell us nothing about cause and effect – they can give us diametrically opposite answers. This is because these aren’t drug trials. They are Treatment Trials and in any clinical Trial, the condition confounds the effects of the drugs – and these confounders hide drug effects.
People evaluating drugs in traditional trials, before RCTs, knew this. People doing RCTs don’t. When a patient becomes suicidal in a trial you have to use your judgment to work out what has happened but in RCTs clinicians are not supposed to use their judgment.
This is not just the case for depression – it’s true in every clinical situation where drugs and conditions cause superficially similar effects – diabetes and glitazones both cause heart failure, osteoporosis and bisphosphonates both cause fractures.
Here is what a drug trial looks like. Companies ran these studies in the 1980s and found that SSRIs made healthy volunteers suicidal and caused dependence and sexual dysfunction, but we heard nothing about these problems when the drugs launched. These Drug Trials enabled companies to engineer their Treatment Trials to hide these problems. I will show you how this is done in a moment but first look at this.
This slide shows some data straight from a 2006 GlaxoSmithKline paper. GSK’s SSRI paroxetine was in trouble – the RCT data for Major Depressive Disorder seemed to show paroxetine causes suicidal events. The real data I think are worse than GSK admit to here.
But never fear RCTs come to the rescue. GSK also did trials in people with Intermittent Brief Depressive Disorders – IBDD. These are borderline personality disorder to most people – patients who have suicidal events much more often than anyone else. But these patients can meet criteria for depression and could be entered into Depression RCTs.
Now Lilly had done a trial of Prozac in these patients – it didn’t work. GSK also did a trial of paroxetine which didn’t work and had a 3-fold higher suicidal act rate than placebo. GSK then did another trial in a similar group of patients. Why?
The answer is here. Here are IBDD data from the two GSK trials. I have seen other data for these two trials which make paroxetine look worse but let’s stick with GSK’s story. We could even add 16 more events to the paroxetine arm and still get the same magical outcome.
When you add the IBDD data to the MDD data – all of a sudden paroxetine doesn’t cause suicidal events, it protects against them.
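The mechanism can be sketched with a few lines of Python. The counts below are hypothetical, invented to show the shape of the problem and not taken from the GSK paper: a large low-risk group with unequal arms, and a small high-risk group with equal arms.

```python
def risk_ratio(events_drug, n_drug, events_placebo, n_placebo):
    """Crude risk ratio: event rate on drug divided by event rate on placebo."""
    return (events_drug / n_drug) / (events_placebo / n_placebo)

# Hypothetical counts: (events on drug, n on drug, events on placebo, n on placebo).
mdd = (12, 3000, 2, 2000)    # low-risk patients, more of them on the drug
ibdd = (33, 150, 30, 150)    # high-risk patients, equal arms

# Pool by simply adding the two sets of counts together.
pooled = tuple(a + b for a, b in zip(mdd, ibdd))

rr_mdd = risk_ratio(*mdd)
rr_ibdd = risk_ratio(*ibdd)
rr_pooled = risk_ratio(*pooled)

print(f"MDD alone:  RR {rr_mdd:.2f}")
print(f"IBDD alone: RR {rr_ibdd:.2f}")
print(f"Pooled:     RR {rr_pooled:.2f}")
```

In this invented example the drug looks worse than placebo in each group taken separately, yet the pooled ratio falls below 1.0. The high-risk patients make up a bigger share of the placebo arm than of the drug arm, and that alone is enough to reverse the answer.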
Something like this is going to happen in every treatment trial where the patients entered are heterogeneous – back pain, breast cancer, diabetes, hypertension, osteoporosis, Parkinson’s disease. We can use an effect a drug causes to hide an effect a drug causes.
RCTs are not a good way to work out what is going on. Results of a back pain trial will insist you use analgesics rather than antibiotics – which is all wrong for the 5-10% of back pain caused by infections.
Now, I mentioned that in Healthy Volunteer Drug Trials companies saw SSRIs give most people sexual problems. After only 2 weeks, you can be left unable to function ever again in your life – the condition is called Post SSRI Sexual Dysfunction (PSSD). It happens in young and old, female and male, all ethnic groups and in every country on earth.
But in company Treatment Trials, less than 5% of people seem affected in this way.
No laws need to be broken to achieve this. No skullduggery is needed. Just do an RCT. RCTs depend, as Tony Hill told you, on a primary endpoint. Everybody assumes this is the commonest thing a drug does, and devoting attention to it makes sense. All other effects will be less common or may only show up if you’ve been on treatment for months or years.
For SSRI RCTs, the primary endpoint is mood change. But mood change is not the most common effect. It is the effect of commercial interest. What happens if you focus all attention on this –
This is what happens… We are hypnotized and miss what is going on.
The sexual effects of SSRIs happen in close to 100% of takers within 30 minutes of the first dose. They should be unmissable. But a focus on a primary endpoint makes them vanish.
Companies also knew from healthy volunteer trials that people become dependent on SSRIs but this problem vanished in RCTs. The result is that 10-15% of the population of most Western countries are now on these drugs – primarily because they can’t get off.
The BMJ ran a lead article 2 years ago saying the British have stopped making love. They blamed depression but those of us in the mild states these drugs get given for often turn to love-making or eating to help. The Benzodiazepines we used for these problems before the SSRIs didn’t cause any worse dependence and we were at least able to make love.
It’s the drugs 15% of us are on that make love-making impossible for maybe 20% of the population if you take our partners into account and perhaps 30% of people in areas where the use of these drugs is particularly high.
RCTs greenwash drugs. They convert poisons out of which we can bring good, if we are not hypnotized, into sacraments. Sacraments are substances that can only benefit – that cannot harm.
Regulators tell us that drugs that kill us or wipe out our sex lives for ever have a positive benefit-risk balance. This claim is based on RCTs, which only look at one of a drug’s effects – that may not be the most common effect. The statement is totally incoherent – it was a drug company invented mantra that regulators have swallowed.
One of my key questions at the start of this talk was – What is Data?
In a trial of Pfizer’s antipsychotic, a man died from burns. You can’t get any sense of what happened to him from reading any of the 50 articles that come out of every single drug trial. Few mention any hazards. The figures and statistical outputs in the papers from this trial give no hint.
His death triggered an internal company adverse event report. This shows he poured petrol on himself and set fire to it in an attempt to kill himself. He died 5 days later – and was coded as death by burns. If you didn’t have the adverse event report – which you can’t easily get – you’d have no way to know what happened.
In GSK’s famous Study 329, which you will hear more about in the second lecture, a trial of paroxetine versus placebo in teenage depression, a 15-year-old boy was arrested by police because he was out on the street with a gun threatening to shoot people. The police took him to hospital. He was taking paroxetine. This should have led to a serious event report but it didn’t because GSK had discovered that if you say someone has an intercurrent illness – you don’t have to report on what happened. Four children dropped out of this trial with intercurrent illness – all on paroxetine.
An internal company email, which FDA never got to see, told the story of this boy. He brings out the meaning of data.
People are the data in clinical trials. You have to be able to interview them to find out, in this boy’s case, whether he had an adverse event or really had another illness – did it clear when the drug was stopped?
Working out whether a drug has caused a problem is judicial – you do it by examination and cross-examination not by counting out the figures an algorithm spews out. What Used to be Called Medicine was judicial – it was not algorithmic.
Imagine you break an arm and go to an Accident Department who say ‘Good news we are running an RCT of Plaster of Paris. We are going to put a POP randomly on one of your 4 limbs and compare this to no POP’.
Randomly putting a POP on some limb will do better than no POP but to start doing this on the basis of RCT results like this would be crazy. This however is exactly what we are doing.
RCTs effectively remove our brains and replace them with something that can be programmed.
RCTs are pitched against clinical judgment. Clinical expertise used to be at the heart of HealthCare – lived experience we could call it – but this is now a problem for health service companies.
Traditional doctors are like a gourmet Chef commenting about a Fast-Food meal and health has become a service industry like Fast Food – gourmet physicians aren’t wanted.
We are a problem for Guideline writers, regulators, and pharmaceutical companies, all of whom want to pitch objective knowledge in the form of RCTs against your and my expertise. The idea of bringing good out of the use of a poison doesn’t compute for insurers, managers or even the public who want religious sacraments.
What are these RCTs that are used to invalidate us? The answer comes later.
The Politics of this are that in some sense we need health/medical co-operatives rather than Corporate Operations in which staff and patients are Cogs rather than central.