Tuesday, 13 January 2015

"Science" in Court: Study 329, Paxil, and Depressed Adolescents

Science: Dead, but not entirely forgotten.
We hear a great deal about the problems facing modern psychopharmacology – allegations of bias, distortion of results, poor research design, and the underplaying of adverse effects.

These issues are all on display in the microcosm of a single study: the notorious “Study 329” undertaken by GlaxoSmithKline to investigate the usefulness of paroxetine (Paxil), on which it then held the patent, for depression in children and adolescents. The complete text of the published article can be found here.

The study was carried out between 1994 and 1997, and was ultimately published in the Journal of the American Academy of Child and Adolescent Psychiatry in 2001.

The subjects were 275 12- to 18-year-old patients suffering from major depressive disorder (winnowed from an initial screening of 425 potential subjects). These were randomly assigned to one of three conditions: paroxetine, imipramine (an older tricyclic medication not normally recommended for use with children), or placebo (an inert dummy pill). Both patients and providers were blind to the treatment being administered, as is standard practice.

Like most medication trials, the duration was limited – in this case, the treatment phase was just 8 weeks. Many of the concerns about antidepressant medication involve what happens over the longer term, but this study was not designed to examine longer-term treatment gains or adverse events.

So what happened?

Let’s jump to the conclusion in the study abstract: “Paroxetine is generally well tolerated and effective for major depression in adolescents.” Well, this sounds great. If you are a busy physician reading the study abstract and not looking any closer, the conclusion looks clear.

Internal letters (click to see a complete copy) were more muted: “As you will know, the results of the studies were disappointing in that we did not reach statistical significance on the primary end points and thus the data do not support a label claim for the treatment of Adolescent Depression.”

The aim of the memo in question is spelled out quite clearly:

“To effectively manage the dissemination of these data in order to minimize any potential negative commercial impact…. It would be commercially unacceptable to include a statement that efficacy had not been demonstrated, as this would undermine the profile of paroxetine.”

(Ensuring the safety of the tens of thousands of adolescents likely to be prescribed paroxetine is, unaccountably, not mentioned. Maybe I missed it.)

Contrast this assessment with subsequent marketing information sent to GSK’s sales representatives: “This ‘cutting edge,’ landmark study is the first to compare efficacy of an SSRI and a TCA with placebo in the treatment of major depression in adolescents. Paxil demonstrates REMARKABLE Efficacy and Safety in the treatment of adolescent depression.” (quotation taken from the charge of the US District Court in Massachusetts in USA v GlaxoSmithKline).

The Actual Results

Let’s go back and look at what actually happened in this study.

There were two primary outcomes defined at the outset of the trial.

  1. A treatment response in which participants would either achieve a Hamilton Rating Scale for Depression (HAM-D) score of 8 or less, OR a reduction from their initial HAM-D score of 50% or greater. Outcome: Neither medication achieved a significant difference from the placebo group.
  2. A change in the total HAM-D score from the pre-treatment to the end of the trial. Outcome: Neither medication demonstrated statistically significant superiority to placebo.

There were five secondary outcomes that were also defined at the outset of the trial. Statistical significance was achieved on none of these.

According to the Department of Justice, investigators defined an additional four secondary endpoints later in the study – but before the results were “unblinded.” (This means that they defined the endpoints before knowing the results – had they done so only once they had the results in hand, the breach would have been worse.) Statistical significance was achieved on several of these later-defined endpoints – though their relevance to actual improved quality of life for patients is open to question.

The eventual article defines the two primary endpoints and lists five secondary endpoints rather than nine – including three of the later-defined endpoints that achieved statistical significance. An accompanying table gives 8 endpoints (2 primary plus 6 secondary).

Aside: The deletion of nonsignificant endpoints is important in studies like this, because as the number of endpoints increases, the likelihood of achieving significance on some of them by random chance increases as well. Using a .05 threshold for significance, about one in every 20 comparisons can be expected to turn out significant by chance alone, even when the treatment does nothing. Most of the significant outcomes in this study (on only later-defined secondary endpoints, no less) are not impressively so. Performing a standard statistical correction for the number of comparisons would most likely have reduced the number of significant differences even further.
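
For the numerically inclined, here is a rough sketch of why the number of endpoints matters. It assumes every endpoint is an independent test at the .05 level, which is a simplification (depression measures are usually correlated with one another), and it uses a Bonferroni correction as one example of a standard correction; the study report does not say which, if any, would have been appropriate. The function names are my own, chosen for illustration.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across num_tests tests.

    Assumes the tests are independent, which is a simplification.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / num_tests


if __name__ == "__main__":
    # Endpoint counts taken from the post: 2 primary, 8 in the article's
    # table, and 11 defined in total (2 primary + 5 original secondary
    # + 4 later-defined secondary).
    endpoint_counts = {"primary only": 2, "article table": 8, "all defined": 11}
    for label, count in endpoint_counts.items():
        fwer = familywise_error_rate(count)
        threshold = bonferroni_threshold(count)
        print(f"{label:>14}: {count:2d} tests, "
              f"chance of a false positive = {fwer:.1%}, "
              f"corrected threshold = {threshold:.4f}")
```

Under these assumptions, a drug with no effect at all would still have a better than 40% chance of producing at least one "significant" result across 11 endpoints, and a result would need a p-value below roughly .005 to survive the correction.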

What about adverse events?

Consider one of the two sentences from the study’s short conclusion: “The findings of this study provide evidence of the efficacy and safety of the SSRI, paroxetine, in the treatment of adolescent depression.”

Well, that seems clear enough. But let’s go back to the actual outcomes.

The imipramine group had an alarming number of cardiovascular side effects – in keeping with past results indicating problems with tricyclic antidepressants for younger patients.

Many of the long list of side effects were similar between paroxetine and placebo, though somnolence affected 17.2% of the paroxetine group and 3.4% of the placebo group, and tremor affected 10.8% vs 2.3%. Not disasters, really.

But further down in the “Adverse Effects” section is what we’re really looking for: “serious adverse effects.” Based on what the study concludes, we might guess that these were minimal. But no. The rate of these events was 11.8% in the paroxetine group versus 2.3% in the placebo group. (Keep in mind that providers did not know which drug patients were taking – any medical event occurring during those 8 weeks might be due to the drug or to coincidence.)
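
To put those two percentages side by side, here is a back-of-the-envelope sketch using only the rates quoted above. The function name and the "number needed to harm" framing are mine, not the study's, and the calculation says nothing about statistical significance or causation; it simply restates the reported rates in other terms.

```python
def compare_risks(rate_treated: float, rate_control: float) -> dict:
    """Summarize two event rates (given as proportions, e.g. 0.118)."""
    risk_difference = rate_treated - rate_control
    return {
        # How many times more common the event was in the treated group.
        "risk_ratio": rate_treated / rate_control,
        # Extra events per patient treated.
        "risk_difference": risk_difference,
        # Patients treated per one additional event (hypothetical framing).
        "number_needed_to_harm": 1.0 / risk_difference,
    }


if __name__ == "__main__":
    # Serious adverse effect rates as reported: 11.8% paroxetine, 2.3% placebo.
    summary = compare_risks(0.118, 0.023)
    print(f"Risk ratio:            {summary['risk_ratio']:.1f}")
    print(f"Risk difference:       {summary['risk_difference']:.1%}")
    print(f"Number needed to harm: {summary['number_needed_to_harm']:.1f}")
```

On the reported figures, serious adverse effects were about five times as common on paroxetine as on placebo, or roughly one additional serious event for every ten or eleven adolescents treated over the 8 weeks.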

What were the paroxetine group’s adverse effects? Emotional lability (including suicidal ideation or gesture) – 5 patients, conduct problems or hostility – 2, symptoms suggestive of mania – 1, worsening depression – 2, and headache upon discontinuation – 1. It appears that no patient in the placebo group required hospitalization, but seven in the paroxetine group did.

Keep in mind that we often give antidepressants with two big goals in mind (in addition to the obvious one of reducing depression – which Paxil failed to do in any impressive way):

  1. Let’s keep this person’s problem within manageable levels so they don’t have to go into hospital.
  2. Let’s treat their mood problem to reduce the risk of suicide.

To see a higher rate of both suicidal ideation and hospitalization for a drug likely to be prescribed to do the opposite is a big warning sign.

Remarkably, the study report states (apparently not anticipating raised eyebrows), “Of the 11 patients [suffering serious adverse effects in the paroxetine group], only headache (1 patient) was considered by the treating investigator to be related to paroxetine treatment.” Later in the discussion of results comes a remarkable statement: “Because these serious adverse effects were judged by the investigator to be related to treatment in only 4 patients (paroxetine, 1; imipramine, 2; placebo, 1), causality cannot be determined conclusively.” Readers may be forgiven if they missed the bulletin in which investigators' opinions were elevated to the level of outcome data.

Even more remarkable is an allegation included in the Department of Justice charge: “An earlier draft of the article stated that of the 11 SAEs experienced by Paxil patients, ‘worsening depression, emotional lability, headache, and hostility WERE (emphasis added) considered related or possibly related to treatment.’” The change in the treating investigator’s mind apparently occurred during the editing process in the final preparation of the article. If true (and I'm not clear whether this was firmly established), this would be outright fraud.

Studies 377 and 701

Needless to say, not all of GSK’s eggs were placed in one basket. There were others.

Study 377 looked at Paxil versus placebo in 13- to 18-year-olds and, according to the Department of Justice charge, “failed to demonstrate efficacy on any of the primary or secondary endpoints.”

Study 701 looked at paroxetine versus placebo in seven- to seventeen-year-olds. It too failed to reach significance on any of the primary or secondary endpoints.

It would be fun to look at the spin placed on these results in the final publications. But of course there weren’t any. As the internal GSK memo tartly states, “There are no plans to publish data from Study 377.”

This is an illustration of a common problem in psychopharmacology. Physicians are advised to keep up to date on the science regarding the treatments they use, and they do so by reading journal publications. But the journals have not, in the past, published all of the data. Positive results tend to find their way into print; negative ones are hidden under the mat. The result is that almost any medication looks good based on the published literature.

Recently there has been a push to correct the problem. Teams launching a trial are required to declare the study before starting out, and must then publish regardless of the results. This is a positive step, though how it will play out in practice remains to be seen.

USA v GlaxoSmithKline

The United States attorney charged GSK with misbranding its medications (including Paxil, Wellbutrin, and Avandia), and with failure to report data to the Food and Drug Administration. The complete charge can be read here.

Discussion of the data on Paxil begins on page 5.

Discussion of the subsequent writing of the positive paper on Study 329 appears on pages 9-11.

The subsequent marketing of Paxil by its sales representatives to physicians is discussed on pages 11-12.

Discussion of the safety data for adolescents appears on pages 13-14, including the FDA decision to impose a black box warning on antidepressants regarding the risk of increased suicidality.

For those still imagining that the method used to encourage prescribing practices is the impartial reporting of science, the material on pages 14 to 19 should prove to be a bitter but necessary tonic. (If you like to fish, sail, travel, take balloon rides, and be paid an honorarium to do so, this shows how to go about it.) The only surprise to those of us in healthcare is that such practices should attract notice at all, because they are hardly limited to the drugs in the court case.

And the result? Here is the 2012 press release on the $3 billion settlement arising from the GSK guilty plea.

It would be nice to view Study 329 as an aberrant low point in the field of psychopharmacology and medication marketing. Regrettably, it does not appear to be.

