Finding Promising Molecules and Biological Background With Balto

Drug discovery begins with an idea — a hunch that a certain pathway or protein could be the key to treating disease. But between that spark and a promising lead compound lies a swamp of friction: hours spent cross-searching literature, databases, and binding studies, just to confirm what’s already known.

Medicinal chemists routinely jump between a half-dozen disconnected tools: PubChem for physicochemical data, ChEMBL for assay results, PDB for binding evidence, Google Scholar for context. Each click leads to another tab, another export, another round of manual comparison.

That process is powerful — but feels pretty painful and slow for 2026.

I want to walk you through what this process looks like with Balto. We’re going to find a promising molecule and its biological background, with thorough analysis at every step, in less than an hour.

If you follow along through the walkthrough, you’ll see that Balto doesn’t replace scientific judgment. It lays out all the pertinent information, provides clarifications, and helps you chart the best path forward.

Ask a question, and Balto draws from curated sources like ChEMBL, PubChem, and literature databases, linking mechanistic data, binding affinities, and even known resistance mutations.

It’s as if every specialized search engine were already talking to each other, laying out information for and with you.

The result: scientists can move from “What if this target matters?” to “Here are the best candidate molecules to test” without ever leaving chat.

(You can then perform simulation, docking, and prediction of properties like ADMET directly in chat as well, but those are subjects for other walkthroughs.)

The first video below shows how that process works in practice. A warning: Balto goes deep “into the weeds” (in a good way). Here we’re primarily exploring Balto’s agentic research capabilities, and its output can get a little long-winded (and extremely thorough). If you want to nerd out about the current cutting edge of AI-aided drug discovery, take a look at the Balto chat itself.

Anyway, onwards…

Let’s identify a promising small molecule with supporting biological evidence for a new drug target in less than an hour.

Step 1 — Exploring the biological landscape

Our initial prompt for Balto:

Summarize the current biological and therapeutic landscape around KRAS G12D as a drug target.

Include:
- Validated mechanisms of inhibition (e.g., covalent, allosteric, protein-protein disruption)
- Key signaling pathways involved
- Known clinical or preclinical failures and resistance mechanisms
- Open questions or emerging strategies that appear promising but underexplored

Please prioritize recent primary literature and reviews and cite your sources.

Balto's full response:

KRAS G12D as a Drug Target: Current Biological and Therapeutic Landscape

Overview

KRAS G12D represents the most prevalent KRAS mutation in pancreatic ductal adenocarcinoma (PDAC, ~40% of cases) and occurs frequently in colorectal cancer (CRC) and non-small cell lung cancer (NSCLC). After decades of being considered “undruggable,” KRAS G12D has emerged as a tractable target through multiple mechanistic approaches, though no agent has yet achieved regulatory approval as of January 2026.

Validated Mechanisms of Inhibition

1. Non-Covalent Allosteric Inhibition (RAS-OFF State)

MRTX1133 pioneered direct G12D targeting via high-affinity, non-covalent binding to the switch-II pocket of GDP-loaded KRAS G12D. This approach stabilizes the inactive conformation, blocking nucleotide exchange and effector binding with sub-nanomolar potency and strong allele selectivity (Cancer Discovery, PMID: 36216931).

Preclinical achievements: MRTX1133 demonstrated widespread tumor regressions in PDAC xenografts and immunocompetent models, reprogrammed the tumor microenvironment by increasing CD8+ T-cell infiltration and FAS expression, and synergized with PD-1/PD-L1 checkpoint blockade (Immunity, PMID: 37625401).

Clinical outcome: The Phase 1/2 trial (NCT05737706) was terminated in March 2025 after Phase 1 due to highly variable, suboptimal pharmacokinetics—not safety concerns—highlighting translational challenges for non-covalent switch-II pocket binders in humans.

2. Covalent RAS-ON Inhibition via Tri-Complex (Molecular Glue) Catalysis

Zoldonrasib (RMC-9805) represents a paradigm shift: it acts as a molecular glue, recruiting cyclophilin A to form a neomorphic protein-protein interface with GTP-bound (active) KRAS G12D. This induced proximity catalyzes selective, irreversible covalent modification of the Asp12 residue, disabling downstream signaling (Science, July 2025; company announcement).

Clinical progress: As of January 2026, zoldonrasib holds FDA Breakthrough Therapy Designation for previously treated KRAS G12D-mutant NSCLC, based on a 61% objective response rate and 89% disease control rate at the 1200 mg QD dose in early-phase trials (AACR 2025; FDA announcement January 8, 2026). This is the most clinically advanced G12D-targeted therapy to date.

3. Targeted Protein Degradation (PROTACs)

Multiple KRAS G12D-selective degraders have been reported in 2024-2025, representing an event-driven pharmacology approach that eliminates the mutant protein entirely:

Compound 8o (J. Med. Chem. 2024): Built on MRTX1133 scaffolds, induced rapid, selective G12D degradation with strong pathway suppression and significant antitumor efficacy in AsPC-1 PDAC xenografts.
CH091138 (Eur. J. Med. Chem. 2025): Selectively degrades endogenous KRAS G12D (not WT or other mutants), downregulates KRAS in proteomics, and suppresses AsPC-1 tumors and G12D patient-derived organoids in a VHL- and proteasome-dependent manner.
ZJK-807 (J. Med. Chem. 2025): Overcomes resistance mutations affecting switch-II pocket inhibitors with selective cellular degradation and in vivo tumor growth inhibition.

4. Direct Covalent Targeting via Strain-Release Electrophiles

A 2024 chemistry breakthrough demonstrated that malolactone-based, strain-release electrophiles grafted onto switch-II pocket ligands can react selectively with the Asp12 carboxylate, forming stable covalent adducts in both GDP and GTP states (Nature Chemical Biology, 2024).

Key Signaling Pathways Involved

RAF→MEK→ERK (MAPK) Axis

KRAS G12D drives constitutive activation of the MAPK cascade, which is essential for PDAC initiation and maintenance. ERK drives the majority of the KRAS-dependent phosphoproteome in KRAS-mutant PDAC.

PI3K→AKT→mTORC1 Axis

KRAS G12D directly engages class I PI3Ks with enhanced affinity compared to wild-type KRAS, activating survival and metabolic reprogramming pathways.

Crosstalk and Adaptive Signaling

Loss of ERK-mediated negative feedback drives rebound signaling through wild-type RAS isoforms to both ERK and AKT. In KRAS G12D-mutant CRC, MRTX1133 triggers EGFR/pan-ERBB feedback that activates wild-type RAS and rescues MAPK and PI3K signaling.

Clinical and Preclinical Failures & Resistance Mechanisms

Clinical Failures to Date:

MRTX1133: Despite exceptional preclinical potency, the Phase 1/2 trial was terminated due to highly variable, suboptimal human PK without a safety signal.
Checkpoint blockade monotherapy: Limited benefit in PDAC outside MSI-H/dMMR subsets (~1-2% of cases).

Mechanisms of Resistance:

MAPK pathway reactivation via RTKs/EGFR/ERBB
Upstream RAS reactivation through SHP2/SOS1
Bypass signaling and parallel pathway engagement (PI3K/AKT/mTOR)
Metabolic and cell-state adaptations (autophagy-mediated glutathione synthesis)
Pocket and interface mutations (Q95/Y96 conferring resistance to pocket-binding inhibitors)
Tumor microenvironment and immune evasion

Open Questions and Emerging Strategies

Promising but Underexplored Approaches:

KRAS G12D inhibition + immunotherapy combinations in PDAC
Neoantigen vaccines at minimal residual disease (ELI-002)
Combination with stromal/chemokine remodeling
Optimizing tri-complex and molecular glue strategies
Multi-epitope TCR-T and HLA-loss countermeasures

Before touching a molecule, discovery starts with context.

Is the target druggable?

Which mechanisms have worked — and which have failed?

Where does resistance emerge, and what strategies might overcome it?

Rather than issuing a single broad search, Balto approaches this step the way an experienced scientist would: by iteratively refining its questions as new information emerges.

The initial prompt asked Balto to summarize the therapeutic landscape around KRAS G12D — not just listing inhibitors, but identifying validated mechanisms, clinical outcomes, resistance pathways, and open questions. From there, Balto decomposed the task into a sequence of focused research passes.

In response, Balto began by surveying known KRAS G12D inhibitors, then pivoted to clinical trial outcomes, followed by mechanistic studies explaining both success and failure. As resistance patterns emerged, Balto expanded its search to upstream and downstream signaling pathways. Finally, it explored emerging strategies — molecular glues, degraders, immunotherapy combinations — that attempt to address the limitations of first-generation approaches.

Internally, this looked like a loop of refinement:

initialize session memory
search inhibitor literature
assess clinical success and failure
analyze resistance mechanisms
map signaling pathways and adaptive feedback
explore emerging and underexplored strategies
update memory with consolidated findings

Each step informed the next. Rather than treating papers as isolated facts, Balto integrated them into a growing mental model of the target. It then saved these observations and supporting data to its memory to build on throughout the subsequent steps.
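The loop above can be sketched in a few lines of Python. This is only an illustration of the pattern, not Balto's implementation: `SessionMemory`, `run_search`, and `research_loop` are hypothetical names, and the search itself is a placeholder that just records how much prior context each pass received.

```python
from dataclasses import dataclass, field


@dataclass
class SessionMemory:
    """Hypothetical stand-in for session memory; the real structure is not described."""
    findings: dict[str, list[str]] = field(default_factory=dict)

    def update(self, topic: str, notes: list[str]) -> None:
        self.findings.setdefault(topic, []).extend(notes)


# The research passes from the list above, in order.
PASSES = [
    "inhibitor literature",
    "clinical success and failure",
    "resistance mechanisms",
    "signaling pathways and adaptive feedback",
    "emerging and underexplored strategies",
]


def run_search(topic: str, context: list[str]) -> list[str]:
    """Placeholder search. A real pass would call literature tools;
    this one only reports how many prior notes it was handed."""
    return [f"{topic}: searched with {len(context)} prior notes as context"]


def research_loop(target: str) -> SessionMemory:
    memory = SessionMemory()
    context = [f"target: {target}"]
    for topic in PASSES:
        notes = run_search(topic, context)
        memory.update(topic, notes)  # consolidate findings per topic
        context.extend(notes)        # each pass informs the next
    return memory


memory = research_loop("KRAS G12D")
for topic, notes in memory.findings.items():
    print(topic, "->", notes[0])
```

The point of the sketch is the growing `context` list: later passes see everything earlier passes found, which is what separates a refinement loop from five independent searches.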

To put this another way, the result is not a flat literature summary, but a structured view of the KRAS G12D landscape: which mechanisms are validated, where pharmacology has broken down, how tumors adapt, and where new opportunities may lie.

To make this information usable beyond the chat, Balto also exports ranked results — including top pathway papers, inhibitor studies, and clinical outcomes — as CSVs for further inspection. Chemists can scan, sort, and cross-reference these outputs just as they would any internal dataset.
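If you want to work with an export outside the chat, something like the following would do. The column names (`rank`, `category`, `title`, `year`) and the rows are placeholders I made up for illustration; check the header of the actual CSV before relying on them.

```python
import csv
import io

# Placeholder export. Real Balto CSVs may use different columns.
EXPORT = """rank,category,title,year
2,clinical outcome,Example trial report,2025
1,inhibitor study,Example inhibitor paper,2024
3,pathway paper,Example pathway review,2023
"""


def load_ranked(csv_text, category=None):
    """Parse the export, optionally filter by category, and sort by rank."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if category is not None:
        rows = [row for row in rows if row["category"] == category]
    return sorted(rows, key=lambda row: int(row["rank"]))


for row in load_ranked(EXPORT):
    print(row["rank"], row["category"], row["title"])
```

For a file on disk, pass `open(path, newline="").read()` as `csv_text`.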

In practice, this step compresses weeks of background research into minutes — giving scientists a clear foundation to decide which mechanisms are worth pursuing, and setting up the next step: identifying and comparing known active molecules.

Step 2 — Finding known actives and mechanistic anchors

Our prompt to Balto:

Using ChEMBL, identify known small molecules that directly target KRAS G12D or modulate KRAS signaling via SOS1 or SHP2.

For each compound, return:

- Target and mechanism
- Mutation or allele specificity
- Best reported biochemical and cellular potencies
- Assay context
- Development stage

Rank results by relevance to KRAS G12D biology and strength of evidence.

Balto's full response:

Summary of what I delivered:

538 KRAS G12D-specific activities analyzed
100 SOS1-KRAS inhibitors identified
200 SHP2 bioactivity records compiled
Ranked by relevance to KRAS G12D biology and evidence strength
12 top compounds profiled with all requested details

Key findings from the analysis:

Direct KRAS G12D Inhibitors:

MRTX-1133: 0.4 nM IC50, Phase 1 (terminated due to PK issues)
CHEMBL4855757: 0.1 nM IC50 (ultra-potent, preclinical)

SOS1 Inhibitors:

BI-1701963: Phase 1 active, pan-KRAS mechanism
MRTX0902: Phase 1, brain-penetrant, deprioritized by BMS
Top ChEMBL compounds: 1–2 nM IC50

SHP2 Inhibitors:

JAB-3312: Phase III (China), ORR 65–71% with glecirasib
TNO155: Phase 1b/2, ORR ~33% with JDQ443
Top ChEMBL compounds: 0.32–7 nM IC50

#1 by Pure Biochemical Potency: CHEMBL4855757

IC50: 0.1 nM (most potent G12D inhibitor in ChEMBL)

Limitations:

Only binding data - no cellular functional assay
No published name or structure disclosed
No mechanism details beyond G12D binding
No in vivo or clinical data

#1 by Potency + Mechanism Quality: MRTX-1133 (CHEMBL4858364) — RECOMMENDED

Biochemical IC50: 0.4 nM | Cellular IC50: 50 nM

Target & Mechanism: Non-covalent, allosteric inhibitor that binds the switch-II pocket of GDP-loaded KRAS G12D.

Why MRTX-1133 is the better #1:

Functional validation - works in cells, not just in a test tube
Defined mechanism - switch-II pocket allosteric inhibitor with structural basis
Clinical-grade - advanced to human trials with extensive preclinical validation
Comprehensive dataset - biochemical + cellular + in vivo + TME remodeling studies

Preclinical Achievements:

Tumor regressions in PDAC xenografts and syngeneic models
Reprograms tumor microenvironment (↑CD8+ T cells, ↑FAS, ↓myeloid)
Synergizes with PD-1/PD-L1 checkpoint blockade
Combination-ready: validated with anti-EGFR, SOS1i, autophagy inhibitors

With a biological hypothesis in hand, the next question is not “what molecules exist?” — it’s “which chemical strategies have already demonstrated real traction against this biology?”

Rather than issuing a narrow keyword search, the Step 2 prompt explicitly told Balto to build on what it had already learned. That single instruction mattered. It signaled that this was a continuation of the same discovery thread, not an isolated lookup.

As a result, Balto reused its session memory from Step 1 — including validated mechanisms, resistance pathways, and translational challenges — to scope a focused ChEMBL analysis. It did not re-derive the KRAS G12D landscape. It went straight to identifying compounds that meaningfully intersect with it.

Internally, this meant querying ChEMBL across multiple mechanistic classes in parallel: direct KRAS G12D binders, SOS1 disruptors that suppress upstream signaling, and SHP2 inhibitors that blunt adaptive feedback. Each class was evaluated not just by potency, but by biological relevance — allele specificity, assay context, and evidence strength.
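For readers who want to spot-check a record by hand, the public ChEMBL web services expose the same bioactivity data over REST. The sketch below only builds a query URL for one compound's IC50 activities; it says nothing about how Balto's CHEMBLSearcher issues its queries, and the `limit` default is an arbitrary choice.

```python
from urllib.parse import urlencode

CHEMBL_API = "https://www.ebi.ac.uk/chembl/api/data"


def activity_url(molecule_chembl_id, standard_type="IC50", limit=100):
    """Build a ChEMBL activity query for one molecule and one measurement type."""
    params = {
        "molecule_chembl_id": molecule_chembl_id,
        "standard_type": standard_type,
        "limit": limit,
    }
    return f"{CHEMBL_API}/activity.json?{urlencode(params)}"


# CHEMBL4858364 is the ID Balto reported for MRTX-1133.
url = activity_url("CHEMBL4858364")
print(url)
```

Opening the printed URL in a browser, or fetching it with any HTTP client, returns the activity records as JSON.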

Instead of returning a long tail of loosely related hits, Balto collapsed hundreds of bioactivity records into a ranked short list. Across 538 KRAS G12D–specific activity entries, 100 SOS1 inhibitors, and 200 SHP2 records, it identified a dozen compounds that anchor the current state of the field — spanning ultra-potent preclinical binders, clinically tested pathway modulators, and combination-validated SHP2 programs.

Crucially, Balto treated these compounds as mechanistic exemplars, not just names on a list. For each, it surfaced how the molecule works, where it has succeeded or failed clinically, and what limitations remain unresolved. In doing so, it turned ChEMBL from a database into a decision-making tool.
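To see why "most potent" and "best supported" can give different answers, here is a toy scoring scheme applied to the two direct G12D binders from Balto's response. The IC50 values and stages come from that response; the bonus weights are my own invention for illustration, not Balto's ranking method.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Compound:
    name: str
    biochem_ic50_nm: float
    cellular_ic50_nm: Optional[float]  # None when no cellular assay is reported
    stage: str


def pic50(ic50_nm):
    """Convert an IC50 in nM to pIC50 (negative log10 of the molar IC50)."""
    return 9.0 - math.log10(ic50_nm)


# Hypothetical weights: one point for cellular data, one for reaching the clinic.
STAGE_BONUS = {"preclinical": 0.0, "phase 1": 1.0}


def evidence_score(compound):
    score = pic50(compound.biochem_ic50_nm)
    if compound.cellular_ic50_nm is not None:
        score += 1.0
    return score + STAGE_BONUS.get(compound.stage, 0.0)


compounds = [
    Compound("CHEMBL4855757", 0.1, None, "preclinical"),
    Compound("MRTX-1133", 0.4, 50.0, "phase 1"),
]

by_potency = sorted(compounds, key=lambda c: pic50(c.biochem_ic50_nm), reverse=True)
by_evidence = sorted(compounds, key=evidence_score, reverse=True)

print("Top by potency: ", by_potency[0].name)
print("Top by evidence:", by_evidence[0].name)
```

With these weights the order flips: CHEMBL4855757 leads on raw potency, MRTX-1133 leads once functional and clinical evidence count. Different weights would shift the margin, which is exactly why the weighting deserves a scientist's scrutiny.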

By the end of Step 2, the chemist isn’t just aware of what’s been made — they understand which chemical strategies have been validated, which have stalled due to pharmacology or resistance, and which modalities appear most promising for further exploration.

This sets a clear transition point. Once candidate molecules are identified and ranked by biological relevance, the next question becomes unavoidable:

What differentiates these compounds chemically — and which properties might explain their successes or failures?

Step 3 — Comparing physicochemical profiles and developability signals

Our prompt to Balto:

Building on the top KRAS G12D-relevant compounds identified in the previous step, retrieve physicochemical properties and bioactivity summaries from PubChem

For each compound include:
- Molecular weight, cLogP, TPSA, HBD/HBA, rotatable bonds
- Aqueous solubility and permeability indicators (if available)
- Reported bioavailability, formulation notes, and exposure-related flags
- SMILES strings for downstream modeling and docking

Benchmark these properties against sotorasib and adagrasib, and highlight which parameters may explain known pharmacokinetic limitations or clinical outcomes.

Balto's full response:

Physicochemical & ADMET Profile Analysis: Top KRAS Compounds

Complete Property Comparison Table

Property	CHEMBL4855757	MRTX-1133	Sotorasib	Adagrasib	Ideal Range
Clinical Stage	Preclinical	Phase 1 (terminated)	FDA Approved	FDA Approved	—
Molecular Weight	572.3 Da	600.2 Da ⚠️	560.6 Da ✓	604.1 Da ⚠️	<500 Da
cLogP	3.56 ✓	3.91 ⚠️	3.37 ✓	3.48 ✓	0-3
LogS (solubility)	-4.74 ⚠️	-4.78 ⚠️	-3.99 ✓	-4.06 ✓	>-4
HBA (acceptors)	—	11 ⚠️	7 ✓	9 ⚠️	≤10
QED Score	0.33 ⚠️	0.32 ⚠️	0.36 ⚠️	0.36 ⚠️	>0.5

Key Findings: What Explains PK Limitations?

MRTX-1133 PK Failure Root Causes

The clinical termination of MRTX-1133 due to “variable/suboptimal PK” can be explained by multiple compounding factors:

A. Molecular Weight & Complexity Burden

MW = 600.2 Da (20% above the 500 Da limit in Lipinski’s Rule of Five)
High polarity: 11 hydrogen bond acceptors (vs. 7 for sotorasib)
Result: Poor oral absorption, high first-pass metabolism, variable bioavailability

B. Solubility Crisis

LogS = -4.78 (predicted aqueous solubility ~10 μM)
Sotorasib LogS = -3.99 (65 μM, 6.5× more soluble)

C. Poor Drug-Likeness

QED = 0.32 (where 1.0 = ideal drug)
Lower than sotorasib (0.36) and adagrasib (0.36)

Why Sotorasib and Adagrasib Succeeded

Despite also being large, complex molecules, the approved G12C inhibitors have critical advantages:

Sotorasib (Lumakras) - Superior Profile:

Better solubility: LogS -3.99 (6.5× > MRTX-1133)
Lower MW: 560.6 Da (40 Da lighter)
Fewer HBA: 7 vs. 11 (better permeability)
Optimal cLogP: 3.37 (better absorption/distribution balance)

Adagrasib (Krazati) - Acceptable Compromises:

Similar solubility: LogS -4.06 (slightly better than MRTX-1133)
No HBD: 0 donors (reduces crystal lattice energy, aids dissolution)
Required advanced formulation strategy (amorphous solid dispersion)
Higher dose to compensate (600 mg BID vs QD)

Recommendations for G12D Inhibitor Optimization

Priority 1: Improve Solubility — Target LogS > -4.0

Priority 2: Reduce Molecular Weight — Target MW < 550 Da

Priority 3: Optimize HBA Count — Target ≤8 HBA

Priority 4: Maintain Potency — Accept modest potency loss (5-10×) for 10× better PK

By this point, Balto has already identified which chemical strategies matter biologically.

Step 3 asks a harder — and more consequential — question:

Why do some of these molecules translate clinically, while others stall despite extraordinary potency?

Rather than treating PubChem as a static property lookup, the Step 3 prompt explicitly instructed Balto to build on the ranked compounds from Step 2 and benchmark them against known clinical outcomes. That framing mattered. It told Balto that this step was about explanation, not enumeration.
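The raw property lookup itself is easy to reproduce against PubChem's public PUG REST interface. The sketch below builds the request URL for a compound name and a set of computed properties; it is not a description of Balto's PubChemSearch tool. Predicted solubility (LogS) and QED are not among these PubChem properties, so those columns in Balto's table presumably come from elsewhere.

```python
from urllib.parse import quote

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"

# Computed properties matching the first bullet of the prompt.
PROPERTIES = [
    "MolecularWeight",
    "XLogP",
    "TPSA",
    "HBondDonorCount",
    "HBondAcceptorCount",
    "RotatableBondCount",
]


def property_url(name, properties=PROPERTIES):
    """Build a PUG REST URL returning the given properties as JSON."""
    return f"{PUG_REST}/compound/name/{quote(name)}/property/{','.join(properties)}/JSON"


for drug in ("sotorasib", "adagrasib"):
    print(property_url(drug))
```

Research codes such as MRTX-1133 may not resolve by name; in that case a compound ID (CID) lookup is the safer route.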

As a result, Balto didn’t simply return molecular weights and LogP values. It assembled a comparative, decision-oriented analysis — pulling physicochemical properties, estimating ADMET risk, and contextualizing each value against real-world outcomes for approved and failed programs.

Internally, Balto treated sotorasib and adagrasib as calibration points, not the main event. Their property profiles were used to anchor expectations: what ranges proved workable in humans, what compromises were tolerated, and which formulation strategies ultimately succeeded. Against that backdrop, Balto evaluated next-generation KRAS G12D inhibitors — including MRTX-1133 and ultra-potent preclinical binders — and highlighted where physicochemical red flags accumulated.
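A quick way to sanity-check a solubility comparison like this: because LogS is a logarithm, a difference in LogS translates directly into a fold difference in solubility. The sketch assumes LogS is log10 of molar solubility, which is the usual convention but is not stated in Balto's output.

```python
def fold_difference(log_s_a, log_s_b):
    """How many times more soluble A is than B, given log10 solubilities."""
    return 10 ** (log_s_a - log_s_b)


# LogS values from Balto's property table.
SOTORASIB_LOG_S = -3.99
MRTX1133_LOG_S = -4.78

ratio = fold_difference(SOTORASIB_LOG_S, MRTX1133_LOG_S)
print(f"Sotorasib vs MRTX-1133: about {ratio:.1f}x more soluble")
```

Under that assumption the gap works out to roughly 6-fold, in the same range as the 6.5× figure in Balto's response.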

This is where the workflow moves beyond “potency-first thinking.”

Balto surfaced how multiple small disadvantages compound:

marginal solubility
excess molecular weight
high hydrogen bond acceptor counts
structural complexity

Together, these explain why a molecule like MRTX-1133 could demonstrate breathtaking preclinical efficacy — yet fail in the clinic due to variable and suboptimal exposure.

Crucially, Balto didn’t stop at diagnosis. It translated these observations into actionable medicinal chemistry guidance: which parameters are most predictive of failure, which tradeoffs are hardest to fix, and where modest potency loss might be justified to achieve viable pharmacokinetics.
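Balto's first three optimization targets (LogS above -4.0, MW below 550 Da, at most 8 HBA) are simple enough to turn into a checklist. The sketch below applies them to the values in the property table; the function and dictionary names are mine, and a missing value is skipped rather than counted as a failure.

```python
# Targets from the "Recommendations for G12D Inhibitor Optimization" list.
TARGETS = {
    "LogS": lambda value: value > -4.0,
    "MW": lambda value: value < 550.0,
    "HBA": lambda value: value <= 8,
}

# Values from Balto's property comparison table (None = not reported).
PROFILES = {
    "CHEMBL4855757": {"LogS": -4.74, "MW": 572.3, "HBA": None},
    "MRTX-1133": {"LogS": -4.78, "MW": 600.2, "HBA": 11},
    "Sotorasib": {"LogS": -3.99, "MW": 560.6, "HBA": 7},
    "Adagrasib": {"LogS": -4.06, "MW": 604.1, "HBA": 9},
}


def missed_targets(profile):
    """Return the names of the targets this profile fails to meet."""
    return [
        name
        for name, meets in TARGETS.items()
        if profile.get(name) is not None and not meets(profile[name])
    ]


for compound, profile in PROFILES.items():
    print(compound, "misses:", missed_targets(profile) or "none")
```

Sotorasib misses only the MW target, while MRTX-1133 misses all three. Adagrasib also misses all three, so a checklist like this does not separate it from MRTX-1133 on its own; that fits Balto's note that adagrasib needed an advanced formulation and a higher dose to get there.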

By the end of Step 3, the chemist has more than a property table. They have a coherent hypothesis linking mechanism, structure, physicochemical profile, and clinical outcome — along with a clear sense of which candidates are worth pushing forward, and which require rethinking at the scaffold level.

At this point, one question naturally remains:

If these molecules are so large and complex, what exactly are they doing in the binding pocket — and where is that complexity coming from?

Answering that requires looking directly at experimental binding evidence and structure.

Step 4 — Explaining binding-driven complexity

Our prompt to Balto:

Using available experimental data, compare how leading KRAS inhibitors bind the switch-II pocket.

Specifically:
- Show biochemical binding data for sotorasib, adagrasib, MRTX-1133, and representative KRAS G12D inhibitors
- Retrieve associated PDB structures or cryo-EM models
- Highlight key ligand–protein interactions that drive affinity, selectivity, and residence time
- Explain how differences in G12C vs G12D binding mechanisms impose distinct structural and physicochemical constraints

Focus on how binding mode explains the molecular size, polarity, and complexity observed in G12D inhibitors.

Balto's full response:

Comprehensive Structural Analysis: KRAS Inhibitor Binding Modes

Biochemical Binding Data Summary

Parameter	Sotorasib (G12C)	Adagrasib (G12C)	MRTX-1133 (G12D)
Mechanism	Covalent GDP-state	Covalent GDP-state	Non-covalent (GDP/GTP)
KD (pre-covalent)	~220 nM	9.59 ± 2.09 nM	0.40 ± 0.11 nM
Residence Time	Protein turnover	Protein turnover	Est. 4-40 min
Selectivity (WT)	~700-fold	>6,000-fold	~6,000-fold

Key Insight: G12C covalent inhibitors benefit from permanent engagement (residence time = protein lifetime), while G12D non-covalent inhibitors must achieve sub-nanomolar KD with reversible binding.

Crystal Structure Analysis

Compound	PDB ID	Resolution	Key Features
Sotorasib	6OIM	1.65 Å	Covalent Cys12; H95/Y96/Q99 cryptic pocket
Adagrasib	6UT0	1.94 Å	Covalent Cys12; His95 H-bond; 8-Cl-naphthyl deep pocket
MRTX-1133	7T47	1.27 Å	Non-covalent; Asp12 salt bridge; state-compatible

Why G12D Inhibitors Are Larger: The Non-Covalent Affinity Problem

Covalent G12C Inhibitors:

Pre-covalent KD can be 10-200 nM (moderate)
Covalent bond adds ~5-7 kcal/mol “free” binding energy
Post-covalent residence time = protein lifetime (hours to days)
Result: Can use smaller, less complex scaffolds

Non-Covalent G12D Inhibitors:

Must achieve KD < 1 nM WITHOUT covalent bond
Requires ~10 kcal/mol total binding energy from:
- Salt bridge (Asp12): ~4-5 kcal/mol
- Hydrophobic burial: ~3-4 kcal/mol
- Multiple H-bonds: ~2-3 kcal/mol
- Shape complementarity: ~1-2 kcal/mol
Complex dissociates on minutes timescale
Result: Requires larger scaffold with more interaction points

The G12D Druggability Paradox

Non-Covalent Mechanism REQUIRES: → Extensive Interaction Network (to achieve <1 nM KD) → Large Molecular Size (>550 Da) + High Polarity (>10 HBA) → Poor Solubility + Poor Permeability → Challenging PK Profile → Clinical Translation Risk

Not a chemistry failure—an inevitable consequence of non-covalent G12D targeting.

Path Forward for G12D Inhibitors

Three Viable Strategies:

Tri-Complex Covalent (Zoldonrasib): Reintroduces covalency via CypA catalysis. Early clinical success (FDA BTD).
Targeted Protein Degradation (PROTACs): Circumvents pocket mutations. Event-driven pharmacology.
Formulation Innovation: Advanced formulations for existing scaffolds. Higher risk path.

By the end of Step 3, the picture was clear — and unsettling.

The most potent KRAS G12D inhibitors were not failing because of sloppy chemistry or avoidable mistakes. Their physicochemical liabilities were structurally imposed. The question was no longer whether these molecules were “too big” or “too polar,” but why they had to be.

Step 4 turns to binding evidence to answer that question directly.

Using BindingSearch, Balto aggregated experimental binding data alongside available structural models for leading KRAS inhibitors — including approved G12C drugs (sotorasib and adagrasib), clinical-stage G12D inhibitors (MRTX-1133), and representative high-potency G12D binders from the literature. This allowed potency, residence time, and binding mode to be examined together, rather than in isolation.
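If you want to open the structures yourself, the PDB IDs in Balto's table map onto download links at RCSB. The sketch below only builds those URLs (it does not fetch anything), and it is independent of however BindingSearch retrieves structures.

```python
RCSB_DOWNLOAD = "https://files.rcsb.org/download"

# PDB IDs as listed in Balto's crystal structure table.
STRUCTURES = {
    "Sotorasib": "6OIM",
    "Adagrasib": "6UT0",
    "MRTX-1133": "7T47",
}


def pdb_file_url(pdb_id):
    """Return the RCSB download URL for a four-character PDB ID."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    return f"{RCSB_DOWNLOAD}/{pdb_id}.pdb"


for compound, pdb_id in STRUCTURES.items():
    print(compound, pdb_file_url(pdb_id))
```

The downloaded files load directly into PyMOL, ChimeraX, or any other structure viewer for a side-by-side look at the switch-II pocket.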

What emerged was a clear mechanistic explanation for everything uncovered earlier.

Covalent G12C inhibitors benefit from a privileged anchor: Cys12. That single covalent bond compensates for lower intrinsic affinity, allowing smaller, more drug-like molecules to achieve durable inhibition with fewer interactions and reduced structural burden. In contrast, G12D inhibitors lack a comparable covalent handle. To achieve slow off-rates and sufficient pathway suppression, they must rely on extensive non-covalent interactions across the switch-II pocket.
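The link between affinity and off-rate can be made concrete. For a simple one-step binding model, k_off = K_D × k_on, and residence time is 1 / k_off. The sketch below applies that to the 0.40 nM K_D from Balto's table. The association rates (10^6 to 10^7 per molar per second) are my assumption, chosen as a typical range for small molecules; they are not from Balto's output.

```python
def residence_time_min(kd_nm, kon_per_m_per_s):
    """Residence time in minutes for a one-step reversible binder."""
    koff_per_s = kd_nm * 1e-9 * kon_per_m_per_s  # k_off = K_D * k_on
    return 1.0 / koff_per_s / 60.0


MRTX1133_KD_NM = 0.40  # from Balto's binding table

# Assumed association rates, not measured values.
fast_on = residence_time_min(MRTX1133_KD_NM, 1e7)
slow_on = residence_time_min(MRTX1133_KD_NM, 1e6)

print(f"k_on = 1e7 /M/s -> about {fast_on:.0f} min")
print(f"k_on = 1e6 /M/s -> about {slow_on:.0f} min")
```

Under those assumed rates the residence time lands between roughly 4 and 40 minutes, in line with the estimate in Balto's table. A covalent inhibitor has no comparable clock: once the bond forms, engagement lasts as long as the protein does.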

Those interactions come at a cost.

Balto highlighted how G12D binders require larger contact surfaces, more hydrogen bond acceptors, and tighter geometric complementarity — all of which translate directly into increased molecular weight, polarity, and rigidity. The very features that enable sub-nanomolar binding also degrade solubility, permeability, and oral exposure.
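To put a number on what "sub-nanomolar" demands, the standard relation ΔG = RT ln(K_D) converts a dissociation constant into a binding free energy. The sketch uses the K_D values from Balto's binding table and assumes 25 °C (298.15 K), which is my choice rather than a stated assay condition.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol*K)
TEMPERATURE_K = 298.15       # assumed temperature (25 C)


def binding_free_energy(kd_nm):
    """Binding free energy in kcal/mol from a dissociation constant in nM."""
    return R_KCAL_PER_MOL_K * TEMPERATURE_K * math.log(kd_nm * 1e-9)


dg_sotorasib = binding_free_energy(220.0)  # pre-covalent K_D, ~220 nM
dg_mrtx1133 = binding_free_energy(0.40)    # non-covalent K_D, 0.40 nM

print(f"Sotorasib (pre-covalent): {dg_sotorasib:.1f} kcal/mol")
print(f"MRTX-1133:                {dg_mrtx1133:.1f} kcal/mol")
print(f"Extra energy needed:      {dg_sotorasib - dg_mrtx1133:.1f} kcal/mol")
```

MRTX-1133 comes out near -12.8 kcal/mol, which sits inside the 10 to 14 kcal/mol spanned by the individual contributions Balto listed, and about 3.7 kcal/mol beyond what sotorasib needs before its covalent bond forms. Every bit of that extra energy has to come from additional contacts, which is where the added size and polarity originate.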

In other words, the physicochemical problems observed in Step 3 were not accidental. They were the downstream consequence of binding mode.

This structural perspective closes the loop on the investigation. It explains why certain optimization strategies struggle, why formulation alone often isn’t enough, and why alternative modalities — such as tri-complex “molecular glue” approaches or event-driven degradation — are gaining traction for KRAS G12D.

By the end of Step 4, the chemist isn’t just comparing compounds. They understand the mechanistic tradeoffs imposed by the target itself — and can reason about which strategies are likely to scale, and which will continue to fight the physics of the pocket.

Bringing it together: from hypothesis to a defensible lead

Drug discovery rarely fails because scientists lack ideas.

It fails because connecting those ideas to evidence takes so long that the truly good ones are slow to be vetted and pushed forward.

In this walkthrough, we didn’t try to design a drug. We did something more fundamental: we compressed early discovery thinking — the part that usually lives across dozens of tabs, spreadsheets, and half-finished notes — into a single, coherent flow.

Step by step, Balto acted less like a search engine and more like a junior scientist who remembers what they’ve learned.

With LiteratureSearch and WebResearch, Balto mapped the biological landscape: mechanisms, pathways, resistance, and open questions — not just what’s known, but what still matters.
With CHEMBLSearcher, it grounded that biology in chemistry, identifying which molecular strategies have actually shown traction and which have failed for reasons beyond potency.
With PubChemSearch, Balto explained why those failures occurred, linking physicochemical properties to real-world pharmacokinetic outcomes rather than treating properties as abstract numbers.
And with BindingSearch, it closed the loop structurally — showing how binding mode drives molecular complexity, selectivity, and ultimately developability.

By the end of the process, we didn’t just have a list of compounds (though we got that too). We had something far more valuable:

a clear biological rationale
a ranked set of mechanistically relevant molecules
an understanding of why certain approaches succeed or stall
and a defensible hypothesis for what a promising next molecule would need to look like — chemically and biologically

That’s the real outcome of early discovery.

Balto doesn’t replace medicinal chemistry judgment. It accelerates the part of the work where judgment matters most — by ensuring that every decision is grounded in integrated evidence rather than fragmented searches.

This first tutorial focused on finding promising molecules and their biological background. In future walkthroughs, we’ll build directly on this foundation — moving into structure-guided optimization, simulation, docking, and ADMET prediction — without ever having to restart from scratch.

Because discovery doesn’t happen in steps. It compounds.
