HotpotBio is the data lab and research group of Hotpot.ai dedicated to biomedicine.
Our lab philosophy is to address real-world use cases and advance AI models without Internet-scale datasets, employing a two-tier approach.
Labs like ours have long believed the future of AI will mirror the computing industry, where supercomputers tackle the most complex cases, but smartphones serve billions of people. While supermodels have dominated since GPT-3, the smart model paradigm is finally shifting from fringe to credible. Our first tier focuses on architecture and dataset research to meet the inference demands of pragmatic use cases.
This entails both creating base models and extending existing ones. In protein structure prediction, for example, we develop proprietary models to investigate sub-optimal dataset and architecture choices in current approaches. For radiology, we explore how models like Gemma 4 could enable a future where edge devices in rural and underserved communities act as AI radiologists, helping overworked general practitioners flag suspicious cases of potential lung cancer.
The second tier concentrates on system research. Our hypothesis is that certain problems are now constrained by model judgment, not model intellect. This research investigates how to construct systems for injecting the relevant priorities and facts for frontier models to reason over, then leading them toward solutions in an iterative cycle similar to PIs leading a lab or CEOs leading a company.
Our mindset is shaped by the observation that raw intellect and job performance are not perfectly correlated, much as raw physical abilities and athletic success are not perfectly correlated. Tom Brady, for example, holds many football records yet was dismissed early on due to ordinary physical abilities. His success stemmed in large part from understanding the opponent and situation, and executing the right play at the right time.
Office tasks exhibit comparable patterns. Chess grandmasters rarely become top investors, while the best CEOs and investors routinely hire people with higher IQs than themselves. What matters in business is the right information at the right time. The same principle applies broadly, reflecting how people specialize after college -- not to deepen raw cognitive abilities but to acquire domain knowledge for solving real-world problems. Specialization trains human experts on what to learn and when to apply this knowledge. In short, specialization yields judgment.
We have applied our systems research to develop a drug discovery platform for both novel therapies and drug repurposing strategies, and are currently testing the platform against a rare cancer type.
Unable to publish commercial research, we established HotpotBio to advance science in other ways. We draw inspiration from open source, where ephemeral teams innovate by attracting talent across organizational boundaries. Since biomedicine is characterized by sparse data and evolving facts, the field presents a high-impact opportunity for validating hypotheses aligned with our lab vision. Furthermore, ML research resembles biomedical research more than most realize.
Just as findings in one patient may not generalize to others due to genetic and lifestyle differences, ML findings -- even on core parameters like learning rate -- may not generalize due to architecture and training differences. In both fields, generalizability is far weaker than in physics or mathematics.
Both biomedical and ML research are constrained by data quality. Central to our effort is rethinking biomedical datasets and training approaches in clinical reasoning, oncology, neuroimmunology, drug development, and other specialty areas.
While occasional data errors are tolerable for general ML models, biomedicine is less forgiving: an uncommon variant that looks like noise may be what drives pathology. Training on imprecise medical information may cause misdiagnosis, clinical errors, or drug candidates with elevated toxicity. Complicating matters, evolving medical facts may invalidate training data and model knowledge. What was true last year may be false today. For instance, in April 2024 the U.S. Preventive Services Task Force revised its longstanding guidance and now recommends biennial mammograms starting at age 40 -- down from the previous benchmark of 50 -- for average-risk women, citing rising breast-cancer incidence in younger patients.
Ultimately, HotpotBio seeks to advance biomedical research by:
These are available for free in GPT, Claude, and Gemini. Open-source contributions are welcome.
Cancer is the second leading cause of death worldwide, claiming the lives of roughly 10 million people per year and devastating the lives of millions more [1].
With 8 billion people and only 12.7 million doctors, or roughly one doctor for every 630 people, personalized healthcare is impossible today. Human doctors alone cannot bridge this gap and provide the attentive care everyone deserves.
Quality datasets and benchmarks can unlock rapid progress in machine learning (ML), but most technologists lack medical expertise while most doctors lack technical expertise.
Without accurate and comprehensive datasets and evaluations, it is hard to train models and improve AI -- not unlike teaching students with poor textbooks and exams.
Our objective is to package medical knowledge into a format that ML engineers and researchers can use to advance AI in biomedicine, regardless of their medical background.
Concretely, this work involves investigating language model reasoning, developing datasets, and creating frameworks in collaboration with medical professionals from Stanford and other leading institutions.
Our research investigates the association between Epstein-Barr virus (EBV) and cancer, concentrating on the topics below.
Viruses cause cervical cancer, Burkitt lymphoma, nasopharyngeal cancer (NPC), and several other cancer types, but the data is inconclusive for more common cancer types like breast cancer and lung cancer [3-8].
We welcome contributors in the areas below.
Work can fit any schedule and take one of many forms:
Novel datasets for triple-negative breast cancer (TNBC) and lung cancer in non-smokers could power tens to hundreds of studies and hopefully set a new precedent for tackling tumor subtypes. See here and here for details.
Clarence Hu