The therapist had been at it for nine years. Orthopedic caseload, outpatient clinic, steady volume. The kind of clinician who showed up early, stayed late, kept their documentation current, and treated every patient on the schedule with the same level of effort. A reliable pair of hands. Good outcomes on the straightforward cases. The complicated ones, the patients who did not respond on the expected timeline, the ones with four medications and a history that did not add up, tended to plateau. The therapist would adjust the protocol, add a modality, and extend the episode. Eventually, the patient would be discharged with partial improvement or referred out. It was not negligent care. It was a pattern.
Down the hall, another therapist (two fewer years of experience, no board certification) was seeing the same mix. Same referral sources. Same payer mix. Same scheduling template. But this clinician’s complex patients moved through the system differently. Shorter episodes. Fewer visits to achieve functional benchmarks. Fewer referrals back to the physician. The difference was not in what the clinician knew on paper. It was in what the clinician did in the room: the questions asked in the first visit, the interventions selected under pressure, the willingness to abandon a plan that was not working rather than double down on it.
Nobody in the clinic had a language for this difference. There was no metric that captured it, no dashboard that displayed it, no triage system that routed patients accordingly. Both clinicians had the same credentials. Both billed the same codes. Both received the same reimbursement. In the system’s eyes, they were interchangeable.
We have spent two decades improving our phenotyping of patients. The pain science revolution gave us frameworks for central sensitization, nociplastic pain, and psychosocial contribution. Classification systems emerged: treatment-based, mechanism-based, complexity-tiered. We learned to sort patients by their presentation, their prognosis, and their barriers to recovery. Intake questionnaires grew longer and more sophisticated. Outcome measures proliferated. We built entire research programs around the question of which patient characteristics predict response to which interventions.
This was important work, and it has improved how we think about the people sitting in our waiting rooms. The profession deserves credit for it.
But in the process, something went unmeasured. The research infrastructure that learned to sort patients by mechanism, by complexity tier, and by psychosocial profile never turned around to examine the other person in the room. The profession has not yet reckoned with a parallel truth: not all clinicians with the same credentials are the same clinician. The unmeasured variable in every clinical encounter is the person delivering the care.
The evidence on this point is now difficult to ignore. Resnik and Hart (2003) studied more than 24,000 patients across 930 therapists and found that expert-level performance, defined by superior patient outcomes, did not correlate with years of experience. Rodeghero and colleagues (2015), in a sample of over 25,000 patients, found that fellowship training was associated with greater functional gains and treatment efficiency, but residency training alone showed no outcome advantage compared to therapists with no post-professional education. Whitman, Fritz, and Childs (2004) demonstrated that when standardized protocols were applied, neither experience nor specialty certification predicted disability outcomes. The credentials that were supposed to signal expertise did not predict the thing that mattered most: what happened to the patient.
What did predict outcomes was something more granular, more behavioral, and far harder to credential. Lutz and colleagues (2021), studying 1,240 therapists and more than 39,000 patients, identified what they called treatment signatures: observable patterns of clinical behavior that distinguished high-performing from low-performing clinicians. The top 17% of therapists and the bottom 11% did not differ in their credentials, continuing education hours, or years on the job. They differed in what they did with their hands and their clinical judgment. High-performing therapists used significantly more active therapeutic activities, more manual therapy, and more skilled interventions. They minimized passive modalities. Every difference was statistically significant. The pattern held across the caseload.
This was not a study of what clinicians knew. It was a study of what clinicians did, and the gap between knowing and doing turned out to be the gap that mattered.
But the most revealing finding was not in the averages. It was in what happened as complexity increased. High-performing clinicians maintained their skilled intervention patterns regardless of patient complexity. Lower-performing clinicians reduced skilled care as complexity rose, retreating from precisely the interventions their most complex patients needed. The patients who demanded the most clinical reasoning received the least clinical skill.
This is not a story about bad clinicians. Every one of those therapists showed up, treated a full caseload, documented their care, and went home tired. The ones who reduced skilled care as complexity rose were not lazy. They were doing what the system trained them to do: managing volume, staying on schedule, defaulting to the interventions they could deliver efficiently when the clinical picture exceeded what they had been prepared to navigate. The system never told them there was another way to practice. It never showed them the data. It never created a feedback loop.
It is a story about an unmeasured variable operating inside a system that has no mechanism to see it. The scheduling software does not know which therapist maintains skilled care under complexity. The referral system does not route the most complex patients to the clinicians whose treatment signatures suggest they can handle them. The reimbursement structure, with Medicare’s conversion factor sitting at $32.36 after five consecutive years of reductions (MedPAC, 2023), does not reward the clinician who does more with fewer visits. The continuing education marketplace does not teach treatment signature awareness because there is no infrastructure to measure it.
The profession phenotyped the patient because the patient was visible. Outcome measures, intake questionnaires, and classification algorithms: all of these tools point at the person receiving care. The clinician remained in the background, serving as a constant. But the clinician was never a constant. The clinician was always the variable the system chose not to measure.
This matters for a reason beyond academic interest. If treatment signatures predict outcomes and credentials do not, then the profession’s entire investment in credentialing as a proxy for quality represents a structural misallocation. We have been measuring the wrong thing. We have been certifying knowledge and rewarding tenure while the behaviors that actually drive patient improvement go unobserved, unrewarded, and, in most clinical settings, unknown. The continuing education system requires hours, not behavioral change. The credentialing system requires examinations, not outcome data. The reimbursement system requires a license, not evidence that the license-holder practices in a way that serves the patient sitting across from them.
The therapist down the hall, the one with fewer years and no board certification who nonetheless moved complex patients through the system more efficiently, was never identified by the system as different. There was no feedback loop that captured the pattern. No development pathway that reinforced it. No triage mechanism that leveraged it. The system identified two clinicians with the same credentials and treated them as a single clinician.
They were not the same clinician. They never were.
Stay up to date on The Development Gap and subscribe to us on Substack!
