Behavioral Science in the Real World
Why Listening to Users Matters More Than Just Lifting Metrics
In behavioral science, we often celebrate the 3-point lift. A message increases response rates from 20% to 23%. A different framing moves the needle just enough to win a headline. These kinds of incremental wins are useful, and they add up. But if we only focus on the average, we can lose track of what those few percentage points actually mean for the people behind them.
That tension came into focus for me while working on a behavioral pilot program in Boston, where we partnered with the City to help small-business employees return to work using public transit. The program offered free CharlieCards and Bluebikes passes to workers in five Main Streets districts. It was designed as a randomized controlled trial to test how transit incentives might shift commuting behavior after COVID disrupted old routines.
Half the participants received a $60 CharlieCard. The other half got $5 up front and then received the remaining $55 a few weeks later. We tracked usage, compared ridership, and ran the numbers.
The data told a clear story. The group with the $60 incentive took more than four times as many trips on public transit as those who received only $5 up front. Even participants who had access to a car used the T more often. From an A/B testing standpoint, it was a win.
But the part that stuck with me didn’t come from the data. It came from a conversation.
One participant shared that she applied for the program because she was struggling to get to work—and by the time her CharlieCard arrived, she had just lost her housing. The card didn’t just change how she commuted. It helped her keep her job. It reduced stress. It gave her a way to stay afloat during an impossible time.
She wasn’t the only one. Others talked about how the passes made it easier to see family, buy groceries, or just feel like the city was more accessible. For them, this wasn’t about shifting marginal preferences. It was about basic mobility and opportunity.
These stories changed how I think about my work.
Behavioral science is often measured in terms of statistical significance. But real-world behavioral design also needs to account for emotional significance, financial stress, and the reality of everyday decisions.
Here’s how I try to keep that balance in my projects going forward:
- I still run the A/B tests, but I try to pair them with interviews and open-text feedback where I can (a rough sketch of what that pairing can look like follows this list).
- I design for what works on average, but I listen for what matters most to users.
- I ask myself not just what’s effective, but who it’s effective for, and why.
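To make that first point a little more concrete, here is a minimal sketch of what pairing the quantitative comparison with open-text feedback might look like in practice. Everything in it is hypothetical: the trip counts, the column names, and the theme keywords are invented for illustration, not drawn from the Boston pilot.

```python
# A minimal sketch of pairing an A/B comparison with open-text feedback.
# All data, column names, and theme keywords here are hypothetical.
import pandas as pd
from scipy import stats

# Hypothetical trial data: one row per participant.
df = pd.DataFrame({
    "arm": ["full_60"] * 4 + ["split_5_55"] * 4,
    "trips": [22, 31, 18, 27, 5, 8, 3, 6],
    "feedback": [
        "Kept my job after losing housing",
        "Easier to see family on weekends",
        "Groceries without borrowing a car",
        "Less stress getting to early shifts",
        "Used it a little once the rest arrived",
        "Helpful but the wait was confusing",
        "Mostly still drove",
        "Nice to have as a backup",
    ],
})

# Quantitative side: compare trip counts between arms.
# Trip counts are skewed, so a rank-based test is a reasonable default.
full = df.loc[df["arm"] == "full_60", "trips"]
split = df.loc[df["arm"] == "split_5_55", "trips"]
stat, p = stats.mannwhitneyu(full, split, alternative="two-sided")
print(f"Median trips: full={full.median()}, split={split.median()}, p={p:.3f}")

# Qualitative side: a crude first pass at tagging open-text themes,
# as a starting point before proper coding with a second reader.
themes = {
    "employment": ["job", "shift", "work"],
    "family_and_errands": ["family", "groceries"],
    "stress": ["stress", "confusing"],
}
for theme, keywords in themes.items():
    hits = df["feedback"].str.lower().str.contains("|".join(keywords)).sum()
    print(f"{theme}: {hits} of {len(df)} responses")
```

With toy data the test result means nothing; the point is simply that the between-group comparison and a first pass at the qualitative themes can live in the same workflow, before a real round of interview coding.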
The goal is not to abandon rigor. It’s to add depth. A clean graph might show us that the intervention “worked,” but the lived experience tells us how and why it mattered.
For behavioral scientists working in government, health care, financial services, or any applied setting, here’s the takeaway: don’t just optimize for outcomes. Optimize for understanding. The people you’re trying to help often know more than the data alone can tell you.
Want to go deeper?
I’m building a short field guide on integrating user voice into behavioral science experiments, with examples, interview prompts, and mixed-method templates.
Subscribe here to get early access when it drops.