Data Scientist - Research Sovereign AI
Company: Mayo Clinic
Location: Rochester
Posted on: March 24, 2026
Job Description:
Mayo Clinic is top-ranked in more specialties than any other
care provider according to U.S. News & World Report. As we work
together to put the needs of the patient first, we are also
dedicated to our employees, investing in competitive compensation
and comprehensive benefit plans – to take care of you and your
family, now and in the future. And with continuing education and
advancement opportunities at every turn, you can build a long,
successful career with Mayo Clinic.

Benefits Highlights
• Medical: Multiple plan options.
• Dental: Delta Dental or reimbursement account for flexible
coverage.
• Vision: Affordable plan with national network.
• Pre-Tax Savings: HSA and FSAs for eligible expenses.
• Retirement: Competitive retirement package to secure your future.

Position Summary
The Data Scientist for Foundational
Model Science is the senior technical leader and lead scientist
responsible for designing, training, and governing Mayo’s
multimodal foundational model. This model forms the core
intelligence layer used by clinical departments, researchers,
agentic workflows, and sovereign AI collaborations. The individual
will work as a hands-on architect, model-builder, and researcher
while acting as a player–coach, guiding strategy and building a
future team.

Key Responsibilities

Scientific & Technical Leadership
• Design multimodal foundational model architectures integrating
signals from imaging, text, waveforms, structured data, graph
representations, and temporal embeddings.
• Develop fusion, alignment, and cross-modal reasoning mechanisms
(early fusion, late fusion, token-level fusion, hybrid models).
• Define and implement methods for grounded clinical reasoning,
retrieval-augmented inference, graph-augmented attention, and
chain-of-thought verification.
• Establish protocols for model lifecycle governance, safe update
cycles, drift-aware re-training, and provenance tracking.

Hands-On Modeling & Training
• Train large-scale deep
learning models, including multimodal architectures and
domain-specific transformer-based systems, on real clinical
datasets.
• Fine-tune and adapt large language models (LLMs) for clinical
reasoning, summarization, question answering, agentic behavior, and
instruction-following tasks.
• Build retrieval-augmented pipelines using embeddings, vector
stores, graph traversal, and clinically grounded context
construction.
• Develop evaluation methods for reasoning quality, temporal
prediction accuracy, multimodal synergy, ablation-based robustness,
and counterfactual behavior.
• Create reference-grounded training datasets, structured reasoning
tasks, and multimodal benchmarks to evaluate model performance.
• Conduct hands-on experimentation with optimization strategies,
large-scale distributed training, model quantization, and inference
acceleration.
• Implement uncertainty modeling, selective prediction, abstention
mechanisms, and clinically meaningful risk thresholds.
• Build interpretable reasoning pathways, cross-modal attribution
maps, and reference-grounded explanations.

Cross-functional Collaboration
• Work closely with the Representation team to ensure
representation-model alignment.
• Partner with clinical SMEs to encode domain reasoning into
reinforcement learning, preference optimization, or rule-guided
behaviors.

Team Leadership
• Serve as the future founding technical lead of the Foundational
Model Science Program.
• Mentor scientists and engineers and eventually build a specialty
modeling team.

Qualifications

Required
• PhD in
Machine Learning, Computer Science, Applied Mathematics, or a
related discipline, with at least four years of experience in
informatics, artificial intelligence, data science, and/or machine
learning.
• Experience with generative modeling, reasoning models, or
multimodal foundation models.
• Expertise in alignment methods (contrastive learning, RLHF/RLCS,
preference optimization).
• Experience with distributed training and large-scale compute.

Preferred
• Experience with clinical or EMR data across multiple modalities.
• 7 years of experience training deep learning models, including
transformers or multimodal architectures.
• Experience defining evaluation frameworks for reasoning,
multimodal synergy, reliability, or fairness.
• Publications in multimodal learning, foundation models, or
reasoning architectures.

Keywords: Mayo Clinic, Rochester, Data Scientist - Research Sovereign AI, Science, Research & Development, Rochester, Minnesota