Open to research & data science roles · Boston, MA

Sean W. Kelley
Human–AI interaction · behavioral science · mental health.

I measure how AI systems behave, and build tools that act on what I find.

Postdoctoral researcher at Northeastern's Network Science Institute and co-founder of Myndgard. My research measures how language models behave — things like sycophancy, epistemic independence, and causal reasoning. At Myndgard, I build mental-health tools that clinicians use in their work.

What I do

01 / about

I work across human–AI interaction and behavioral science. My PhD is in psychology, focused on computational psychiatry — building machine-learning and network models of mental health from real longitudinal data. That work led to first-author papers in Nature Communications, PNAS, and NPJ Digital Medicine.

Now my research turns vague claims about how language models behave into numbers you can compare across models. I also co-founded Myndgard, where I build AI safety and mental-health tools that universities use with their counseling services.

My work splits in two: research on how AI behaves, and products that put it to use where it matters most — mostly mental health.

Now: Postdoctoral Fellow · Northeastern · Network Science Institute
Founder: Co-founder & Technical Lead · Myndgard
Trained: PhD, Computational Psychiatry · Trinity College Dublin
Funding raised: $790K+ as PI across 4 competitive grants

Research & evaluation

02 / measuring model behavior
Evaluation harness

GoalPref-Bench

A benchmark that tests whether nine language models stick to what a user actually wants or just agree with them. It turns a common worry about sycophancy into one number you can compare across models.

LLM eval · sycophancy · 9 models
Framework + study

Causal-Coherence Probing

Asks a model to forecast something and explain its reasoning as a causal diagram. It then works out which factors matter most and challenges them one at a time: does the model change its mind more when you push on an important factor than a minor one? Run across seven models and hundreds of questions, with reliability and placebo checks.

causal reasoning · DAGs · 7 models
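
A minimal sketch of the comparison this kind of probe implies, under stated assumptions: each factor has an importance weight taken from the model's own causal diagram, and each challenge produces a measured change in the forecast. The names `coherence_score`, `importance`, and `shift` are hypothetical, and rank correlation is an illustrative choice of statistic, not necessarily the one used in the study.

```python
from statistics import mean

def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # no variation in one input, so no ordering to compare
    return cov / (var_x * var_y) ** 0.5

def coherence_score(importance, shift):
    """Compare stated factor importance with how far each challenge moved the forecast.

    importance: factor -> weight derived from the causal diagram (assumed input)
    shift:      factor -> change in the forecast after challenging that factor
    """
    factors = sorted(importance)
    return spearman([importance[f] for f in factors],
                    [abs(shift[f]) for f in factors])

# Made-up illustration data, not results from the study.
importance = {"interest_rates": 0.6, "supply": 0.3, "weather": 0.1}
shift = {"interest_rates": 0.20, "supply": 0.08, "weather": 0.02}
score = coherence_score(importance, shift)
print(score)  # 1.0: the forecast moved most where the diagram said it should
```

A score near 1 would mean the model revises its forecast most when an important factor is challenged; a score near 0 or below would mean its revisions are unrelated to, or run against, its own stated reasoning.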
Controlled experiments

Personalization & Epistemic Independence

Experiments on what happens when you personalize a model to a user. Personalizing makes it warmer and more emotionally in tune, but whether it also makes the model more agreeable depends on the role it's playing.

personalization · alignment · arXiv 2026
Human–AI collaboration

Personalized AI for Creative Work

A study with hundreds of people testing whether a personalized AI helps with creative work. It did: people and the AI built on each other's ideas across a back-and-forth, instead of the AI just doing the work. I ran it end to end, from the question to the analysis.

synergy · creativity · arXiv 2025

Building

03 / shipped & in flight
Myndgard · Deployed

Discover

A 12-day, 38-module mental-health course. An AI pipeline writes the content for a range of student backgrounds and checks its own quality. 90% of students improved on mental-health measures.

AI content pipeline · psychoeducation
Myndgard · Deployed

Share

Monitors each student's mental health during the wait for care and flags those who are deteriorating to the clinic manager — so students getting worse are seen sooner, not triaged by who booked first. It also gives the therapist a pre-session summary. In production with university counseling services; 15–20% shorter initial sessions and fewer total sessions per student.

waitlist monitoring · deterioration detection · clinician-in-the-loop · university counseling
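
A minimal sketch of the ordering idea behind this kind of monitoring, with everything in it assumed for illustration: the `Student` fields, the symptom scale (higher means worse), the `threshold` value, and the first-to-last difference used as the deterioration measure are hypothetical and do not describe the deployed system.

```python
from dataclasses import dataclass

@dataclass
class Student:
    name: str
    booked_day: int        # lower means booked earlier
    scores: list[float]    # hypothetical symptom scores over the wait, higher = worse

def deterioration(student):
    """Change from the first to the latest score; positive means getting worse."""
    if len(student.scores) < 2:
        return 0.0
    return student.scores[-1] - student.scores[0]

def triage(waitlist, threshold=5.0):
    """Put flagged students first (worst change first), then the rest by booking order."""
    flagged = [s for s in waitlist if deterioration(s) >= threshold]
    others = [s for s in waitlist if deterioration(s) < threshold]
    flagged.sort(key=lambda s: (-deterioration(s), s.booked_day))
    others.sort(key=lambda s: s.booked_day)
    return flagged + others

# Made-up example data.
waitlist = [
    Student("A", booked_day=1, scores=[10, 11]),
    Student("B", booked_day=2, scores=[10, 18]),
    Student("C", booked_day=3, scores=[12, 12]),
]
order = [s.name for s in triage(waitlist)]
print(order)  # ['B', 'A', 'C']
```

Student B booked second but moves to the front because their score rose past the threshold; the others keep their booking order. In practice the flag goes to a clinician for review rather than reordering anyone automatically.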
Myndgard · In development

Guardian

Watches a teenager's conversations with AI companions for signs of harm, and runs on the device itself so the conversations never leave it.

on-device · AI safety · adolescents
Myndgard · In development

Discover Family

A 10-module course that teaches teens (13+) to spot the ways AI companions can go wrong — flattery, growing attachment, blurred personas, being pulled away from real relationships, and bad responses in a crisis. Each module has scenarios to work through and an AI coach to talk them over with.

AI companion safety · adolescents · PWA

Selected publications

04 / peer-reviewed & preprint
side projects

Side projects — like testing what an AI intake agent can pick up that a patient never says out loud.


Background

05 / experience · education · grants

Experience

Postdoctoral Research Fellow, Human–AI Interaction
Northeastern University · Network Science Institute
Sep 2024 — Present
Co-founder & Technical Lead
Myndgard
Dec 2022 — Present
PhD Researcher, Computational Psychiatry
Trinity College Dublin
Oct 2018 — Dec 2022

Education

PhD, Psychology
Trinity College Dublin · 2023
MSc, Medical Neurosciences
Charité — Universitätsmedizin Berlin · 2018
BS, Neuroscience & Behavioral Biology (Highest Honors)
Emory University · 2016

Grants & awards

Enterprise Ireland Commercialisation Fund · $629K · PI
Enterprise Ireland Proof of Concept · $143K · PI
Enterprise Ireland Feasibility Study · $16K · PI
Schmidt Sciences, AI at Work · $10K · PI
Trinity College Dublin Provost's PhD Award · $132K
NeuroCure Cluster of Excellence Scholarship · $14K
EIT Health Wild Card · 1 of 8 EU startups
Dogpatch Labs Founders Programme · Resident