HotpotBio | Hotpot.ai Research Group

Introduction

HotpotBio is the data lab and research group of Hotpot.ai dedicated to biomedicine.

Our lab philosophy is to address real-world use cases and advance AI models without Internet-scale datasets, employing a two-tier approach.

Labs like ours have long believed the future of AI will mirror the computing industry, where supercomputers tackle the most complex cases, but smartphones serve billions of people. While supermodels have dominated since GPT-3, the smart model paradigm is finally shifting from fringe to credible. Our first tier focuses on architecture and dataset research to meet the inference demands of pragmatic use cases.

This entails both creating base models and extending existing ones. In protein structure prediction, for example, we develop proprietary models to investigate sub-optimal dataset and architecture choices in current approaches. For radiology, we explore how models like Gemma 4 could enable a future where edge devices in rural and underserved communities act as AI radiologists, helping overworked general practitioners flag potential cases of lung cancer.

The second tier concentrates on systems research. Our hypothesis is that certain problems are now constrained by model judgment, not model intellect. This research investigates how to construct systems that inject the relevant priorities and facts for frontier models to reason over, then guide them toward solutions in an iterative cycle, much as PIs lead a lab or CEOs lead a company.

Our mindset is shaped by the observation that raw intellect and job performance are not perfectly correlated, much as raw physical abilities and athletic success are not perfectly correlated. Tom Brady, for example, holds many football records yet was dismissed early on due to ordinary physical abilities. His success stemmed in large part from understanding the opponent and situation, and executing the right play at the right time.

Office tasks exhibit comparable patterns. Chess grandmasters rarely become top investors, while the best CEOs and investors routinely hire people with higher IQs than themselves. What matters in business is the right information at the right time. The same principle applies broadly, reflecting how people specialize after college -- not to deepen raw cognitive abilities but to acquire domain knowledge for solving real-world problems. Specialization trains human experts on what to learn and when to apply this knowledge. In short, specialization yields judgment.

We have applied our systems research to develop a drug discovery platform for both novel therapies and drug repurposing strategies, and are currently testing the platform against a rare cancer type.

Unable to publish commercial research, we established HotpotBio to advance science in other ways. We draw inspiration from open source, where ephemeral teams innovate by attracting talent across organizational boundaries. Since biomedicine is characterized by sparse data and evolving facts, the field presents a high-impact opportunity for validating hypotheses aligned with our lab vision. Furthermore, ML research resembles biomedical research more than most realize.

Just as findings in one patient may not generalize to others due to genetic and lifestyle differences, ML findings -- even on core parameters like learning rate -- may not generalize due to architecture and training differences. In both, generalizability is far weaker than in physics or mathematics.

Both biomedical and ML research are constrained by data quality. Central to our effort is rethinking biomedical datasets and training approaches in clinical reasoning, oncology, neuroimmunology, drug development, and other specialty areas.

While data errors are tolerable for general ML models, uncommon variants in biomedicine may drive pathology. Training on imprecise medical information may cause misdiagnosis, clinical errors, or drug candidates with elevated toxicity. Complicating matters, evolving medical facts may invalidate training data and model knowledge. What was true last year may be false today. For instance, in April 2024 the U.S. Preventive Services Task Force revised its longstanding guidance and now urges biennial mammograms starting at age 40 -- down from the previous benchmark of 50 -- for average-risk women, citing rising breast-cancer incidence in younger patients.
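One way to picture the problem of evolving facts is to attach validity dates to each training fact so that superseded guidance can be filtered out. The Python sketch below is only an illustration under stated assumptions: the `GuidelineFact` schema, the `current_facts` helper, and the topic name are hypothetical, and the exact revision day is a placeholder because the text above gives only the month.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class GuidelineFact:
    """One dated version of a clinical fact (hypothetical schema)."""
    topic: str
    statement: str
    effective_from: date
    superseded_on: Optional[date] = None  # None means still current

    def is_current(self, as_of: date) -> bool:
        """True if this version of the fact is valid on the given date."""
        if as_of < self.effective_from:
            return False
        return self.superseded_on is None or as_of < self.superseded_on


# Assumption: the text gives only "April 2024", so the day is a placeholder.
REVISION = date(2024, 4, 1)

FACTS = [
    GuidelineFact(
        topic="mammogram_start_age",
        statement="Screening starting at age 50",
        effective_from=date.min,  # start date not given in the text; placeholder
        superseded_on=REVISION,
    ),
    GuidelineFact(
        topic="mammogram_start_age",
        statement="Biennial mammograms starting at age 40",
        effective_from=REVISION,
    ),
]


def current_facts(facts: List[GuidelineFact], as_of: date) -> List[GuidelineFact]:
    """Keep only the facts that are valid on the given date."""
    return [fact for fact in facts if fact.is_current(as_of)]
```

With this layout, `current_facts(FACTS, date(2023, 1, 1))` keeps only the age-50 entry, while `current_facts(FACTS, date(2025, 1, 1))` keeps only the age-40 entry. A real dataset would also need provenance and population fields such as "average-risk women".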

Ultimately, HotpotBio seeks to advance biomedical research by:

Developing open-source frameworks and tools to help patients, researchers, and clinicians leverage AI
Publishing papers to highlight where the current understanding in cancer and other diseases may be incomplete due to methodological gaps
Building benchmarks to measure AGI progress in healthcare and general reasoning
Developing healthcare and drug discovery models

Free Tools

These free tools are available in GPT, Claude, and Gemini. Open-source contributions are welcome.

Patient.md: simplifies case sharing for second opinions, promotes patient autonomy, and provides a foundation for clinical trial matching by organizing medical data for AI assistants.
Conclusion Checker: helps identify which conclusions may not generalize to human patients, real-world settings, or broader populations. While all studies are valuable and scientists produce remarkable work under challenging conditions, sometimes readers and the media overlook critical limitations and promote overstated narratives. For patients in particular, this has profound implications.

Papers

Public Datasets

Problem

Cancer is the second leading cause of death worldwide, claiming the lives of roughly 10 million people per year and devastating the lives of millions more [1].

With 8 billion people and only 12.7 million doctors, personalized healthcare is impossible today. Human doctors alone cannot bridge this gap and provide the attentive care everyone deserves.
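As a rough back-of-the-envelope check, the short Python sketch below works out the ratios implied by the figures in this section. The function and constant names are illustrative, and the inputs are the rounded numbers quoted here, not precise statistics.

```python
# Rounded figures quoted in this section (not precise statistics).
WORLD_POPULATION = 8_000_000_000
DOCTORS = 12_700_000
CANCER_DEATHS_PER_YEAR = 10_000_000


def people_per_doctor(population: int, doctors: int) -> float:
    """Average number of people each doctor would have to cover."""
    return population / doctors


def deaths_per_month(deaths_per_year: int) -> float:
    """Spread an annual death toll evenly over 12 months."""
    return deaths_per_year / 12


if __name__ == "__main__":
    ratio = people_per_doctor(WORLD_POPULATION, DOCTORS)
    monthly = deaths_per_month(CANCER_DEATHS_PER_YEAR)
    print(f"People per doctor: {ratio:.0f}")
    print(f"Cancer deaths per month: {monthly:,.0f}")
```

Under these rounded inputs, the ratio comes to roughly 630 people per doctor, and the annual cancer toll works out to about 833,000 deaths per month.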

Goal

Quality datasets and benchmarks can unlock rapid progress in machine learning (ML), but most technologists lack medical expertise while most doctors lack technical expertise.

Without accurate and comprehensive datasets and evaluations, it is hard to train models and improve AI -- not unlike teaching students with poor textbooks and exams.

Our objective is to package medical knowledge into a format that ML engineers and researchers can use to advance AI in biomedicine, regardless of medical background.

Concretely, this work involves investigating language model reasoning, developing datasets, and creating frameworks in collaboration with medical professionals from Stanford and other leading institutions.

Cancer Research

Our research investigates the association between Epstein-Barr virus (EBV) and cancer, concentrating on the topics below.

Viruses cause cervical cancer, Burkitt lymphoma, nasopharyngeal cancer (NPC), and several other cancer types, but the data is inconclusive for more common cancer types like breast cancer and lung cancer [3-8].

1K non-smoking lung cancer dataset: see below.
1K TNBC dataset: see below.
Joint Omics Adaptive Nosological (JOAN) detection framework: systematic computational-experimental framework for detecting viruses in cancer samples, starting with adenocarcinomas.
EBV association with breast cancer, starting with triple-negative breast cancer (TNBC).
EBV association with lung cancer, starting with non-smoking lung cancer.
EBV association with NPC, Burkitt lymphoma, and gastric cancer.
EBV association with MYC.
EBV sequence conservation.

How To Collaborate

Review Collaboration Areas.
To maintain the integrity of our network and the quality of data annotations, collaboration is by invitation only. We exclusively partner with MDs, PhDs, and postdocs from leading research institutions.
Candidates must provide the details below and may reach out with questions. See below for contact information.

Summary of experience and credentials in areas of interest
Resume
Publishing history

Collaboration Areas

We welcome contributors in the areas below.

Endocrinology
Genomics
Immunology
Infectious Disease
Medicinal Chemistry
Neurology
Oncology

Work can fit any schedule and take one of many forms:

Creating 100-200 multiple choice questions per specialty
Reviewing questions
Defining key clinical tasks and requirements
Conducting lit reviews
Reviewing paper drafts

Research Culture

HotpotBio focuses on science, deferring policy and ethics to other forums.

Although this position may not appeal to all, the benefit of clear values is cultivating an environment where everyone can concentrate on science. Organizational theory suggests that teams united by shared priorities and explicit expectations collaborate more productively.

I understand the anxiety around AI, but our culture is rooted in a deep study of technology history and societal progress. Throughout history, a consistent pattern has characterized the emergence of disruptive technology: fear dominates the discourse while concerned critics seek to curb capabilities and protect the masses. This cycle was observed with books, computers, and the web, and it is repeating with AI.

With hindsight, we know those noble intentions were misguided and failed to account for the transformative benefits spawned by innovation. General technology, by definition, is wieldable for good or bad, but the good vastly outweighs the bad. This propels the world to greater heights of prosperity and accessibility.

On ethics, most people aspire to be moral and responsible, but the challenge is: whose values dictate tradeoffs and resolve disputes? Officials from California, Texas, China, India, France, Japan, the UK, Saudi Arabia, or elsewhere? Whose risk profile lights the way forward? For instance, GPT-2, GPT-3, and GPT-4 were all considered too dangerous for the average person, but those worries proved exaggerated at best and unfounded at worst. Moreover, it's presumptuous to assume one jurisdiction can bottle up software ingenuity or constrain global innovation. If America surrenders AI leadership, other nations will readily fill the void.

While healthy people can afford the luxury of endless deliberation, the sick cannot. With more than 800K people dying from cancer each month, discovering breakthroughs even one month sooner can save lives and spare immeasurable suffering.

Intelligent people may disagree. I respect different opinions and hope others can as well.

TNBC & Non-smoking Lung Cancer Datasets

Novel datasets for TNBC and non-smoking lung cancer could power tens to hundreds of studies and hopefully set a new precedent for tackling tumor subtypes. See here and here for details.

Contact Information

Clarence Hu

X: x.com/panabee
Email: clarence --at-- hotpot dot ai

References

[1] WHO Cancer Fact Sheet.
[2] Given controversies over defining "open source," the term "open research" reflects a desire to advance biomedicine while avoiding semantic debates.
[3] Extrachromosomal Amplification of Human Papillomavirus Episomes as a Mechanism of Cervical Carcinogenesis.
[4] Gaps and Opportunities to Improve Prevention of Human Papillomavirus-Related Cancers.
[5] Epstein-Barr virus provides a survival factor to Burkitt's lymphomas.
[6] Targeting Epstein-Barr Virus in Nasopharyngeal Carcinoma.
[7] EBV Infection and Its Regulated Metabolic Reprogramming in Nasopharyngeal Tumorigenesis.
[8] EBV infection-induced GPX4 promotes chemoresistance and tumor progression in nasopharyngeal carcinoma.