One of the most important things we scientists – and we journalists – can do is to admit bias. So here I go: one of my longest-standing biases is that when a new, exciting finding comes out I tend not to believe it. That is, I don’t believe it until I lay my eyes on the numbers; not just the numbers themselves, but where they came from, who’s asking the question, and why they’re asking.
It’s important to remember that numbers don’t really mean anything in and of themselves. If I tell you, “The answer is ‘12’,” you would very rightly ask me, “12 what?” If I reply to you, “12 mph,” you would want to know what was moving at that speed: a person, a train, or a bird? Until you know the answer, ‘12 mph’ has no meaning or purpose. If I then told you that 12 mph is the top speed of a healthy unladen male African swallow in excellent weather with no head or tail wind, you could then estimate how long it would take that bird to transport a coconut from Africa to England while gripping it by the husk.[1] Even after I answered to that level of detail, you should still ask two more questions to understand my findings: why did I want to know the answer to this question and what if I’m wrong? In other words, what were/are the motivations and the implications of what I’m studying?
It was with this conservative analytic approach that I put my mind to work on the recent study that claims to show that female internal medicine hospitalists are better doctors for sick patients 65 and older. Make no mistake, the article recently published in the Journal of the American Medical Association (JAMA) only looked at Medicare patients: not your average clinic patients or even your average hospitalized patients. The patients in the study were 65-90 years old.
For the record: the reason that the authors decided to look carefully at outcomes for 1.5 million hospitalized Medicare patients was not to prove that women are better doctors than men. What they did was ask the question, “Are there any differences in patient outcome (for example, greater risk of harm) if these Medicare patients are treated by a female doctor rather than a male doctor?”
That might lead you to ask, why should there be any difference? It turns out there are a few reasons why people might think that women are not as good at practicing medicine as men are. The most obvious is that women are paid an average of $15,000 to $50,000 less to do the same job (depending on age, geography, and specialty). Why would you pay a third of the physicians in this country less if they were just as good as the other two-thirds? I believe that’s the question beneath the question that was asked when the study was run. And it’s a darn good question.
Another reason why people might wonder if there’s any difference in the outcomes, or what happens to the patients of these doctors within 60 days of being hospitalized, is that female doctors tend to follow the treatment guidelines more closely than their male counterparts: at least, when it comes to treating diabetes and heart failure. So, another question beneath the question is: do patients do better when the guidelines are adhered to for the diseases that they have? These diseases, by the way, may not have been the reason the patients were hospitalized; though given the population that was studied, most patients have at least one of these two diseases on top of whatever else put them in the hospital.
So now we’ve reached the point where we really understand the question that’s being asked: when treated by this group of chronically underpaid professionals who habitually stick to the guidelines for two diseases endemic to this patient population, how do older, hospitalized patients do?
The answer turns out to be, as far as I can tell from studying the statistics, that the older hospitalized people with a higher incidence of diabetes and heart failure do better when treated by chronically underpaid physicians who tend to stick to the guidelines when it comes to treating diabetes and heart failure.
There are a few things to bear in mind about the findings. The first is that the female doctors tended to be treating fewer patients at a time. They were also more likely to be working in large nonprofit teaching hospitals in New England, and their patients were more likely to be female themselves.
While any of these findings could potentially impact the outcome, in general, the study seems to have been quite well-run. As far as I can tell the authors controlled for everything that one should control for. They made sure that doctors were compared within the same hospital instead of between hospitals, so that the quality of the hospital itself couldn’t interfere. They controlled for how many patients were taken care of by a single provider, to make sure that there weren’t one or two super-providers or under-performers throwing the reading off for the entire group. They controlled for the age of the individual providers, so that more experienced providers couldn’t throw the measurement off (the female providers tended to be younger anyway, so that would necessarily have helped them). They controlled for the age of the patients. They did their best to control for how ill everyone’s patients were. They controlled for just about everything that one could reasonably control for, and, though they did some strange math that involved comparing quartiles (in other words, how they derived significance wasn’t simple), everything generally seems to be aboveboard.
What I’m saying is that there’s nothing hugely wrong with the paper as far as I can tell – what’s awry is the conclusion that people are coming to because of the paper. The gender of the physicians is not what may have saved the lives of tens of thousands of people. It’s the way the people with that gender tended to practice – patient-centered and evidence-based – that reduced the risk of mortality in hospitalized Medicare patients over the age of 65. I’ll say it again: it wasn’t the double-X chromosome itself that saved the day, but rather the methods of medical practice favored by those wielding the double-X chromosome (or those who identify as doing so) treating older people, for whom evidence-based guidelines may well make the difference between life and death.
This paper focused extensively on gender, which is fine, but for me what’s far more interesting is drawing conclusions about the best way to practice medicine. In this case, organizing doctors by gender was something of a shorthand. It allowed the authors to single out the most patient-centered and evidence-based doctors so that we could look at their patients’ outcomes.[2]
If I’m right and what’s interesting is outcomes, not gender, we could run this study again with that in mind. If we did, what might we find? We might find that doctors who stick to evidence-based guidelines for treating their patients’ diseases (even if those diseases aren’t what put them in the hospital) and who treat their patients with the most compassion and empathy have the best outcomes, irrespective of gender. We would see a trend that pertains to treatment – one that guides our vision for the best way for all doctors to treat patients.[3]
This paper is a great lesson in the purpose of science: it’s about establishing trends, not metering truth. Even if we compared 100 compassionate female doctors practicing evidence-based medicine to 100 equally compassionate male doctors practicing evidence-based medicine, and found that the patients who saw female doctors were still doing better to a statistically significant degree,[4] that would not leave us free to conclude that the double-X chromosome wields some sort of magical healing power. At that point, we would have to look into other factors that separate the way that women and men practice medicine. Is it patterns of communication that separate out the two genders? How do we even begin to measure those patterns? Taking one step further back to see the big picture: how will we know when we’ve finally found all the contributing factors that make the way that one gender practices medicine superior, on average, to the way the other gender practices medicine?[5]
This is what drives science as a whole. We start out with a question that is motivated by a desire to understand some facet of the world in which we live. In coming to better understand that facet, we hope to improve the lives of people living today, and to inspire more and better questions to be asked in the future. We hope that by asking straightforward, important questions we can have an impact on the world. The impacts that we hope to have are not about truth – i.e. women are better doctors than men – but about trends and the possible sources of those trends: i.e. doctors practicing in teaching hospitals who are seeing fewer patients may be taking more time and treating those patients with more evidence-based care, resulting in better outcomes.
What is the best conclusion to come to based on this study? Maybe it’s this: we should all take a little more time, see slightly fewer patients, and treat them all with more compassion and more care based on the current guidelines. Also: why the heck are people with better outcomes making less money? That’s a question that no study will ever answer to my personal satisfaction.[6]
1. Depending on the weight of the coconut, how often the swallow needs to rest during the journey, how often it stops for food and how long, how many hours a day it can fly, and if you are completely insane.
2. Because we can’t run a double-blind study where we blind the patients to the gender of their doctor. I’m pretty sure that’s not a thing.
3. At least patients for whom best evidence guidelines have been established.
4. Understanding that “better” here means that fewer die or return to the hospital within 60 days of going home from the hospital.
5. Again, on average, with a certain group of patients, etc., etc.
6. Because it is legal to do so. No further study is required.