Documentation

untangle.bio is an AI-native platform for downstream process design in biotechnology. Generate optimal purification routes, run real-time simulations, and perform techno-economic analysis — all in one workspace.

Quick Start

Get up and running with untangle.bio in under 5 minutes. The platform follows a simple workflow: define your feed, set target products, generate routes, and analyze results.

🔬 Define Feed

Set flow rate, components, and properties. Choose from 70+ molecules or add custom components.

🚀 Generate Routes

AI explores thousands of combinations using genetic algorithms and expert rules.

📊 Analyze Results

Compare yield, purity, and CAPEX across routes. Export optimized designs.

Pro tip: Start with the Balanced optimization mode for your first project. It provides a good mix of yield and purity while keeping costs reasonable.

Scope & Limitations

untangle.bio is a conceptual process design tool, intended for early-stage route screening and feasibility assessment — not for detailed engineering or final process validation. Understanding what the simulator does and does not model will help you interpret results correctly.

What the simulator does

  • Steady-state mass balances across each unit operation based on separation efficiency, rejection coefficients, and component properties
  • Flow and concentration tracking through every stream in the flowsheet
  • pH-dependent solubility checks and precipitation warnings
  • Yield and purity estimates for each product at every step
  • Indicative capital and operating cost ranges based on literature-derived correlations

What the simulator does not do

  • No rigorous thermodynamics — phase equilibria, activity coefficients, and equation-of-state calculations are not performed. Real mixture non-idealities (salting-out effects, co-precipitation, ternary phase diagrams) are not captured.
  • No detailed transport modelling — concentration polarisation, fouling kinetics, and gel-layer effects in membrane operations are approximated by fixed rejection parameters rather than solved from first principles.
  • No chromatography band profiles — chromatographic separations are represented by an overall recovery and purity factor, not by breakthrough curves or plate-height models.
  • No reaction kinetics — enzymatic reactions, degradation, and aggregation during processing are not modelled.
  • Simplified crystallisation — crystal yield is estimated from a user-defined recovery fraction rather than nucleation and growth kinetics or supersaturation profiles.
  • No hydrodynamics — pressure drops, pump sizing, pipe velocities, and fluid dynamics are outside scope.

Engineering interpretation required. Results should be treated as indicative order-of-magnitude estimates. Promising routes identified by untangle.bio should be validated with detailed process modelling, pilot-scale experiments, and consultation with separation specialists before making engineering or investment decisions.

Platform Workflow

untangle.bio follows a proven engineering workflow that mirrors how process engineers actually work — from initial feed characterization to final economic evaluation.

1. Feed Stream Definition

Define your input stream with volumetric flow rate and component specifications. The platform includes an extensive molecule database with physical properties for accurate modeling:

  • Flow rate: L/hr, with automatic unit conversion
  • Components: Concentration (g/L), molecular weight, charge state
  • Properties: pKa, isoelectric point, diffusion coefficient
  • pH and temperature: Critical for precipitation modeling

2. Target Product Selection

Select one or multiple target products from your feed components. untangle.bio optimizes routes for maximum recovery and purity of specified products, with support for complex multi-product separations.

3. Route Generation

The AI engine generates thousands of candidate routes using a diversity-preserving genetic algorithm. Set constraints and optimization goals:

  • Minimum yield: Typically 10-90% depending on application
  • Minimum purity: Product specification requirements
  • Optimization mode: Yield, purity, balanced, or high selectivity
  • Algorithm: Quick enumeration or evolutionary search

4. Simulation & Mass Balance

Run steady-state mass balances with stream-level tracking of concentrations, pH, and flow rates. The simulation engine handles:

  • Conservation of mass and volume at every node
  • pH propagation through mixing and chemical addition
  • Precipitation warnings based on solubility limits
  • Real-time feasibility checking

Building Your Own Process Flow

Alongside the automated route generators, you can build and edit process flowsheets entirely by hand — drag nodes onto the canvas, wire them together, and run the simulation yourself. This is useful when you want to test a specific sequence, reproduce a literature process, or make targeted modifications to a generated route.

Step 1 — Place a Feed Stream

Drag a Feed Stream node from the left palette onto the canvas. Double-click it to open the feed configuration dialog. Set the volumetric flow rate, temperature, and pH, then add your components — either from the built-in molecule database or as custom entries with manually entered properties.

Step 2 — Add Unit Operation Nodes

Drag one or more unit operation nodes from the palette. Available operations are grouped by category:

  • Clarification: Disc centrifuge, depth filtration, microfiltration
  • Membrane separation: Ultrafiltration (10k / 30k / 100k MWCO), nanofiltration
  • Chromatography: Ion exchange (cation & anion), affinity, size exclusion, reverse phase
  • Thermal separation: Spray drying, freeze drying, vacuum tray drying
  • Phase change: Crystallization, precipitation
  • Reagent feed: Wash water (💧), NaOH solution (🔵), HCl solution (🔴)

Double-click any unit operation to configure its parameters (MWCO, pH target, wash volume, etc.).

Step 3 — Switch to Connect Mode

Press C (or click the Connect button in the toolbar) to enter Connect Mode. In this mode, hovering over a node reveals its connection handles. Click and drag from one handle to another to draw a stream edge.

Handle | Position | Meaning
output | Right side of feed node | Feed stream outlet; connect to the first unit operation's input
input | Left side of unit operation | Main process inlet
light | Right side of unit operation | Light-phase outlet: permeate, filtrate, mother liquor, volatiles
heavy | Bottom of unit operation | Heavy-phase outlet: retentate, concentrate, solid, crystals
dilution | Top of filtration nodes | Auxiliary water inlet for diafiltration; connect a Wash Water node here

Tip: Press V to return to Select Mode for moving nodes around. Use Ctrl + Z to undo and Ctrl + Y to redo.

Step 4 — Add Product and Waste Nodes

Every outlet of every unit operation must terminate at either a Product node or a Waste node — the simulator validates this before running. Drag these from the palette (Sources & Sinks section) and connect them to the appropriate outlets.

  • Connect the outlet carrying your target product to a Product node
  • Connect all other outlets to a Waste (Wastewater Treatment) node
  • Multiple unit operations can share the same Waste node

Step 5 — Add Reagent Feeds (optional)

To model diafiltration or pH adjustment, drag reagent feed nodes from the palette and connect them to the appropriate inlets:

  • 💧 Wash Water → connect to the dilution handle on any filtration node
  • 🔵 NaOH Solution → connect as an additional inlet for pH increase
  • 🔴 HCl Solution → connect as an additional inlet for pH decrease

Step 6 — Run the Simulation

Press F5 or click the Solve button in the toolbar. The simulator performs a steady-state mass balance through every node in sequence, propagating concentrations, flow rates, and pH along every stream. Results appear as labels on stream edges and as summary panels on each unit operation node.

Validation errors: If the simulator reports dangling outlets or unconnected streams, check that every outlet handle on every unit operation is connected to either a downstream node, a Product node, or a Waste node. Unconnected outlets prevent the simulation from running.
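
The dangling-outlet check described above can be pictured as a simple pass over the flowsheet graph. The sketch below is only an illustration under stated assumptions: the `OUTLET_HANDLES` mapping, the node kind names, and the edge tuple layout are hypothetical and are not taken from the platform.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical outlet handles per node kind (handle names follow the table above).
OUTLET_HANDLES: Dict[str, Set[str]] = {
    "feed": {"output"},
    "unit_op": {"light", "heavy"},
    "product": set(),
    "waste": set(),
}

# An edge is (source_node, source_handle, target_node).
Edge = Tuple[str, str, str]


def find_dangling_outlets(nodes: Dict[str, str], edges: List[Edge]) -> List[Tuple[str, str]]:
    """Return (node, handle) pairs for every outlet with no outgoing edge."""
    connected = {(src, handle) for src, handle, _ in edges}
    dangling = []
    for name, kind in nodes.items():
        for handle in sorted(OUTLET_HANDLES[kind]):
            if (name, handle) not in connected:
                dangling.append((name, handle))
    return dangling


# Example flowsheet: the light outlet of the ultrafiltration step is left open.
nodes = {"feed": "feed", "uf": "unit_op", "product": "product", "waste": "waste"}
edges = [("feed", "output", "uf"), ("uf", "heavy", "product")]
print(find_dangling_outlets(nodes, edges))  # [('uf', 'light')]
```

Connecting the `light` handle of `uf` to the Waste node would make the list empty, which corresponds to a flowsheet that is ready to solve.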

Feed Definition

Accurate feed characterization is critical for reliable route optimization. untangle.bio provides comprehensive tools for defining complex biotechnology feeds.

Component Database

The platform includes 70+ pre-characterized molecules across key categories:

  • Proteins: Antibodies, enzymes, therapeutic proteins
  • Organic acids: Citric, acetic, lactic, and others
  • Sugars: Glucose, sucrose, complex carbohydrates
  • Salts: Buffer components and ionic species
  • Cells: E. coli, CHO, yeast with size distributions

Database integration: Clicking any molecule automatically populates all relevant properties for separation modeling, including molecular weight, charge, and transport properties.

Route Generation

untangle.bio uses advanced algorithms to explore the vast space of possible purification sequences and identify optimal routes based on your criteria.

Genetic Algorithm Approach

The platform employs a diversity-preserving genetic algorithm optimized for breadth rather than convergence:

  • Population size: 600 genomes for maximum diversity
  • Generations: 50 iterations with fresh injection
  • Selection: Tournament selection with low elitism (3%)
  • Mutation: Multi-type operations (add, remove, replace steps)
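
To make these settings concrete, here is a minimal sketch of a diversity-oriented genetic algorithm using the listed population size, generation count, and elitism. It is not the platform's implementation: the operation list, the tournament size of 3, the 10% fresh-injection fraction, and `toy_fitness` are all assumptions chosen for illustration.

```python
import random
from typing import Callable, List

Route = List[str]

# Illustrative subset of unit operations, not the platform's full catalogue.
OPERATIONS = ["centrifuge", "microfiltration", "ultrafiltration",
              "ion_exchange", "crystallization", "spray_drying"]


def random_route(rng: random.Random, max_len: int = 5) -> Route:
    """Create a random route of 1..max_len steps."""
    return [rng.choice(OPERATIONS) for _ in range(rng.randint(1, max_len))]


def mutate(route: Route, rng: random.Random) -> Route:
    """Apply one mutation: add, remove, or replace a step."""
    child = list(route)
    kind = rng.choice(["add", "remove", "replace"])
    if kind == "add":
        child.insert(rng.randint(0, len(child)), rng.choice(OPERATIONS))
    elif kind == "remove" and len(child) > 1:
        child.pop(rng.randrange(len(child)))
    else:  # replace (also used when a one-step route cannot shrink)
        child[rng.randrange(len(child))] = rng.choice(OPERATIONS)
    return child


def tournament(population: List[Route], fitness: Callable[[Route], float],
               rng: random.Random, k: int = 3) -> Route:
    """Pick the fittest of k randomly sampled routes (k is an assumption)."""
    return max(rng.sample(population, k), key=fitness)


def evolve(fitness: Callable[[Route], float], population_size: int = 600,
           generations: int = 50, elitism: float = 0.03,
           injection: float = 0.10, seed: int = 0) -> List[Route]:
    rng = random.Random(seed)
    population = [random_route(rng) for _ in range(population_size)]
    n_elite = max(1, round(population_size * elitism))
    n_fresh = round(population_size * injection)  # assumed injection fraction
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        next_population = ranked[:n_elite]                                 # low elitism
        next_population += [random_route(rng) for _ in range(n_fresh)]    # fresh injection
        while len(next_population) < population_size:
            next_population.append(mutate(tournament(population, fitness, rng), rng))
        population = next_population
    return population


def toy_fitness(route: Route) -> float:
    """Placeholder score: rewards distinct operations, penalises length."""
    return len(set(route)) - 0.2 * len(route)


final_population = evolve(toy_fitness)
best = max(final_population, key=toy_fitness)
print(best, round(toy_fitness(best), 2))
```

Keeping elitism low and injecting fresh random routes each generation is what biases a search like this toward breadth: the population is never dominated by copies of a few top routes.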

Expert Rules Integration

All generated routes pass through 13 expert rules that eliminate physically impossible or economically infeasible combinations:

  • Chromatography requires prior clarification and at least 80% water content
  • Membrane operations require appropriate particle size reduction
  • Crystallization requires supersaturation conditions
  • Size-based separation must follow large-to-small ordering

Simulation Engine

The simulation engine performs steady-state mass balances with real-time validation of process feasibility and stream compatibility.

Mass Balance Methodology

untangle.bio uses a mass-flow-based approach for accurate modeling:

// Convert concentration to mass flow (g/hr = g/L × L/hr)
mass_flow_in = concentration × volumetric_flow

// Apply separation efficiency
retained_mass = mass_flow_in × rejection_coefficient
permeate_mass = mass_flow_in × (1 - rejection_coefficient)

// Enforce conservation (compare within a small tolerance, not exactly)
total_out = retained_mass + permeate_mass
assert(|total_out - mass_flow_in| < tolerance)
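
For readers who want to run the same calculation, the pseudocode above translates into a few lines of Python. This is a sketch for a single component at a single two-outlet step; the function name and the tolerance values are assumptions, and the conservation check uses `math.isclose` because exact equality on floating-point sums is unreliable.

```python
import math
from typing import Tuple


def split_component(concentration_g_per_l: float, flow_l_per_hr: float,
                    rejection: float) -> Tuple[float, float]:
    """Return (retained, permeate) mass flows in g/hr for one component."""
    if not 0.0 <= rejection <= 1.0:
        raise ValueError("rejection coefficient must lie between 0 and 1")
    mass_in = concentration_g_per_l * flow_l_per_hr   # g/L x L/hr = g/hr
    retained = mass_in * rejection
    permeate = mass_in * (1.0 - rejection)
    # Conservation check with a tolerance rather than exact equality.
    if not math.isclose(retained + permeate, mass_in, rel_tol=1e-9, abs_tol=1e-12):
        raise ArithmeticError("mass balance does not close")
    return retained, permeate


# 5 g/L at 100 L/hr with a rejection coefficient of 0.95
retained, permeate = split_component(5.0, 100.0, 0.95)
print(f"retained = {retained:.1f} g/hr, permeate = {permeate:.1f} g/hr")
```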

pH Tracking

pH is tracked throughout the entire process with buffer capacity weighting:

  • Volume-weighted mixing of streams
  • Chemical addition effects (NaOH, HCl)
  • Precipitation warnings near isoelectric points
  • Henderson-Hasselbalch equation for acid solubility
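
As an illustration of the last point, one common way to apply the Henderson-Hasselbalch relation to a monoprotic weak acid is S = S0 × (1 + 10^(pH − pKa)), where S0 is the intrinsic solubility of the undissociated form. The sketch below assumes that form; whether the platform uses exactly this expression is not stated here, and the function names are hypothetical.

```python
def weak_acid_solubility(intrinsic_solubility: float, pka: float, ph: float) -> float:
    """Total solubility (same units as intrinsic_solubility) of a monoprotic weak acid."""
    return intrinsic_solubility * (1.0 + 10.0 ** (ph - pka))


def precipitation_warning(concentration: float, intrinsic_solubility: float,
                          pka: float, ph: float) -> bool:
    """True when the stream concentration exceeds the estimated solubility."""
    return concentration > weak_acid_solubility(intrinsic_solubility, pka, ph)


# Placeholder values: S0 = 10 g/L, pKa = 4.0, stream concentration = 50 g/L
for ph in (4.0, 5.0):
    limit = weak_acid_solubility(10.0, 4.0, ph)
    print(ph, round(limit, 1), precipitation_warning(50.0, 10.0, 4.0, ph))
```

At pH equal to the pKa the estimated solubility is twice the intrinsic value, and it rises tenfold per pH unit above that, which is why a small pH shift can trigger or clear a precipitation warning.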

Techno-Economic Analysis

Built-in cost estimation provides immediate economic feedback on route alternatives using industry-standard methodologies.

Capital Cost (CAPEX)

Equipment costs are scaled using the power law, with an exponent that varies by operation type — generally following the "six-tenths rule" but calibrated individually to each technology class:

Equipment_Cost = Base_Cost × (Flow_Rate / Reference_Rate)^n
Total_CAPEX = Σ(Equipment_Cost × Lang_Factor)

The exponent n is not a fixed 0.6 for all equipment — it is calibrated per technology class. Chromatography columns and membrane systems (area-limited equipment) scale more favourably than thermal or cryogenic systems. As a rough guide: membrane and column operations sit in the lower range (~0.55–0.65), mechanical separators in the middle, and drying operations — especially freeze drying — at the higher end (~0.70–0.75).

Lang factors (1.5–3.0×) account for installation, instrumentation, and auxiliary equipment based on operation complexity. All base costs are referenced at 100 L/hr feed rate (2026 USD).
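
The scaling relation can be tried out with a short script. The sketch below assumes the 100 L/hr reference rate stated above; the base costs, exponents, and Lang factors in `train` are placeholders picked from within the quoted ranges, not values from the platform's cost database.

```python
from dataclasses import dataclass
from typing import Iterable

REFERENCE_RATE_L_PER_HR = 100.0  # base costs are referenced at 100 L/hr


@dataclass
class Equipment:
    name: str
    base_cost: float     # USD at the reference rate (placeholder value)
    exponent: float      # scaling exponent n for this technology class
    lang_factor: float   # installation multiplier


def equipment_cost(item: Equipment, flow_rate_l_per_hr: float) -> float:
    """Power-law scaling of purchased equipment cost with feed rate."""
    return item.base_cost * (flow_rate_l_per_hr / REFERENCE_RATE_L_PER_HR) ** item.exponent


def total_capex(items: Iterable[Equipment], flow_rate_l_per_hr: float) -> float:
    """Sum of installed costs: scaled equipment cost times Lang factor."""
    return sum(equipment_cost(i, flow_rate_l_per_hr) * i.lang_factor for i in items)


# Placeholder two-step train
train = [
    Equipment("ultrafiltration", base_cost=50_000.0, exponent=0.60, lang_factor=2.0),
    Equipment("freeze_drying", base_cost=200_000.0, exponent=0.75, lang_factor=3.0),
]

for rate in (100.0, 1000.0):
    print(f"{rate:>7.0f} L/hr -> total CAPEX = {total_capex(train, rate):,.0f} USD")
```

With n = 0.6, a tenfold increase in feed rate raises equipment cost by roughly a factor of four (10^0.6 ≈ 3.98), which is the economy of scale the six-tenths rule describes.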

Operating Cost (OPEX)

Variable costs include utilities, consumables, and labor:

  • Utilities: Power, steam, cooling water
  • Consumables: Resins, membranes, chemicals
  • Labor: Operation and maintenance time

Multi-Product Routes

untangle.bio supports complex separations where multiple valuable products are simultaneously recovered from a single feed stream through branching routes. Each product is tracked individually for yield and purity and exits at a dedicated product node.

Branching Logic

At every two-outlet unit operation, each product is assigned to whichever physical stream carries more of its mass — heavy (retentate/solid) or light (permeate/filtrate). Products that end up in different streams at the same step are considered separated at that step and branch into their own product nodes. Products that remain together continue downstream together.

  • Heavy outlet: retentate, concentrate, solid, precipitate
  • Light outlet: permeate, filtrate, mother liquor, volatiles
  • Each product is tracked through every step individually
  • A product node is created at the step and outlet where each product first separates

Metrics

  • Yield — total mass recovery: mass of all target products recovered / mass of all target products in feed
  • Purity — best individual product purity achieved, measured at each product's own exit step (excluding water)

Design constraint: For N selected products, the route must produce exactly N distinct product nodes — each product must exit through a unique (step, outlet) combination. Routes that fail to separate all products are automatically rejected.
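
The branching rule, the two metrics, and the N-product constraint can be expressed compactly. The following is a sketch under assumptions: the function names and data layout are hypothetical, ties between outlets are resolved toward the heavy outlet (the text does not say how ties are handled), and the numbers are placeholders.

```python
from typing import Dict, Tuple


def assign_outlet(heavy_mass: float, light_mass: float) -> str:
    """Assign a product to the outlet carrying more of its mass (tie -> heavy, assumed)."""
    return "heavy" if heavy_mass >= light_mass else "light"


def satisfies_n_product_constraint(exits: Dict[str, Tuple[int, str]]) -> bool:
    """True when every product leaves through its own (step, outlet) combination."""
    return len(set(exits.values())) == len(exits)


def overall_yield(recovered: Dict[str, float], feed: Dict[str, float]) -> float:
    """Mass of all target products recovered / mass of all target products in feed."""
    return sum(recovered.values()) / sum(feed.values())


def best_purity(exit_streams: Dict[str, Tuple[float, float]]) -> float:
    """Best individual purity; each entry is (product mass, total non-water mass) at exit."""
    return max(product / total for product, total in exit_streams.values())


# Placeholder two-product example (mass flows in g/hr)
feed = {"A": 100.0, "B": 50.0}
recovered = {"A": 90.0, "B": 40.0}
exit_streams = {"A": (90.0, 100.0), "B": (40.0, 80.0)}
exits = {"A": (1, assign_outlet(80.0, 20.0)), "B": (1, assign_outlet(5.0, 45.0))}

print(satisfies_n_product_constraint(exits))      # True
print(round(overall_yield(recovered, feed), 3))   # 0.867
print(best_purity(exit_streams))                  # 0.9
```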

Multi-Product Generation Algorithm

The multi-product generator uses a property-driven constructive search — not a genetic algorithm. It analyses the physical differences between products and systematically builds separation sequences that exploit the largest differences first.

Phase 1 — Property Analysis

For every pair of target products, the algorithm calculates key physical differences that determine which separation technologies can distinguish them:

  • MW ratio — larger MW / smaller MW (drives UF, NF, size exclusion)
  • Charge difference — |charge_A − charge_B| (drives ion exchange)
  • Solubility ratio — higher solubility / lower solubility (drives crystallization)
  • log P difference — |log_P_A − log_P_B| (drives reverse phase)
  • Boiling point difference — |BP_A − BP_B| in °C (drives drying)
  • Overall separability score — weighted combination of all differences

Properties are sourced from the molecule database, falling back to component data entered in the feed stream.
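
The pairwise differences listed above are straightforward to compute. The sketch below is illustrative only: the `Product` fields and all property values are placeholders, and the overall separability score is omitted because its weights are not given here.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Product:
    name: str
    mw: float             # g/mol
    charge: float         # net charge
    solubility: float     # g/L
    log_p: float
    boiling_point: float  # degrees C


def pair_differences(a: Product, b: Product) -> Dict[str, float]:
    """Property differences for one product pair, as defined in the list above."""
    return {
        "mw_ratio": max(a.mw, b.mw) / min(a.mw, b.mw),
        "charge_difference": abs(a.charge - b.charge),
        "solubility_ratio": max(a.solubility, b.solubility) / min(a.solubility, b.solubility),
        "log_p_difference": abs(a.log_p - b.log_p),
        "boiling_point_difference": abs(a.boiling_point - b.boiling_point),
    }


def analyse(products: List[Product]) -> Dict[Tuple[str, str], Dict[str, float]]:
    """Compute differences for every unordered pair of target products."""
    return {(a.name, b.name): pair_differences(a, b) for a, b in combinations(products, 2)}


# Placeholder products with made-up property values
products = [
    Product("X", mw=1500.0, charge=-2.0, solubility=400.0, log_p=-1.5, boiling_point=250.0),
    Product("Y", mw=150.0, charge=1.0, solubility=40.0, log_p=0.5, boiling_point=100.0),
    Product("Z", mw=300.0, charge=0.0, solubility=80.0, log_p=0.0, boiling_point=150.0),
]

for pair, diffs in analyse(products).items():
    print(pair, diffs)
```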

Phase 2 — Technology Scoring

Each separation technology is scored for every product pair based on how well it exploits the available property differences:

Technology | Applicable when | Score scales with
Ultrafiltration / Nanofiltration | Meaningful MW difference between soluble products | Magnitude of MW ratio
Ion exchange | Products carry different net charges | Magnitude of charge difference
Size exclusion chromatography | Large MW difference between soluble products | Magnitude of MW ratio
Reverse phase | Meaningful hydrophobicity difference; smaller molecules only | Magnitude of log P difference
Crystallization | Meaningful solubility difference between products | Magnitude of solubility ratio
Drying (spray / freeze / vacuum) | One product is volatile, the other is not | Boiling point difference
Centrifugation / Depth filtration / MF | At least one product is a particulate (cells, spores, etc.) | Always strongly preferred; particle/soluble separation is highly selective
Affinity chromatography | One product is a protein, the other is not | Fixed baseline score

Phase 3 — Route Construction

Routes are built using two complementary strategies, run in parallel:

Strategy 1 — Sequential greedy: Product pairs are sorted by overall separability score. Starting from the most separable pair, the algorithm picks the top-scoring operation for that pair, then finds the best next operation for any remaining unseparated products. Produces focused 2-step routes.

Strategy 2 — Exhaustive permutations: All orderings of products are tried, with each position in the sequence assigned the highest-scored operation for that product pair. For 3 products this explores all A→B→C orderings across all scored technology combinations.

Both strategies generate outlet handle variants (trying both heavy and light at each two-outlet step). Non-drying routes are evaluated first. Up to 2,500 candidates are produced and deduplicated.

Phase 4 — Simulation & Validation

Every candidate route is passed through the full mass-balance simulation engine, then subjected to a series of hard rejection gates in order:

  1. Expert rules — evaluated against the actual stream composition at each step inlet (not just the feed). Violations are rejected immediately.
  2. Products must separate — at least one two-outlet operation where products go to different outlets.
  3. N-product constraint — N selected products must produce exactly N distinct (step, outlet) groups; routes that leave two products sharing an outlet are rejected.
  4. No-op rejection — steps with selectivity ≈ 1, yield ≈ 1, and unchanged flow are rejected as doing nothing useful.
  5. Purity & yield thresholds — routes that pass all structural checks are yielded immediately via streaming, tagged as feasible or infeasible relative to the user's targets.

Streaming results: Routes are yielded to the UI as soon as each simulation completes — you see results appear live without waiting for all candidates to finish.
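
The gate-then-stream pattern described above maps naturally onto a Python generator. The sketch below is an assumption-laden illustration: `SimResult`, the gate functions, and the canned results standing in for the simulation engine are all hypothetical, and only two of the structural gates are shown.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


@dataclass
class SimResult:
    product_outlets: Dict[str, Tuple[int, str]]  # product -> (step, outlet)
    recovery: float                               # overall yield, 0..1
    purity: float                                 # best product purity, 0..1


Gate = Callable[[List[str], SimResult], bool]


def products_separate(route: List[str], result: SimResult) -> bool:
    """At least two products leave through different outlets."""
    return len(set(result.product_outlets.values())) > 1


def n_product_constraint(route: List[str], result: SimResult) -> bool:
    """N products must occupy N distinct (step, outlet) groups."""
    outlets = list(result.product_outlets.values())
    return len(set(outlets)) == len(outlets)


def screen(candidates: Iterable[List[str]], simulate: Callable[[List[str]], SimResult],
           gates: List[Gate], min_yield: float,
           min_purity: float) -> Iterator[Tuple[List[str], SimResult, bool]]:
    """Yield each surviving route as soon as it is simulated, tagged feasible or not."""
    for route in candidates:
        result = simulate(route)
        if not all(gate(route, result) for gate in gates):  # stops at the first failed gate
            continue
        feasible = result.recovery >= min_yield and result.purity >= min_purity
        yield route, result, feasible


# Canned results stand in for the real simulation engine (placeholder numbers).
canned = {
    ("ultrafiltration",): SimResult({"A": (0, "heavy"), "B": (0, "light")}, 0.92, 0.97),
    ("depth_filtration",): SimResult({"A": (0, "light"), "B": (0, "light")}, 0.99, 0.50),
    ("ion_exchange",): SimResult({"A": (0, "heavy"), "B": (0, "light")}, 0.60, 0.99),
}
candidates = [list(key) for key in canned]

for route, result, feasible in screen(candidates, lambda r: canned[tuple(r)],
                                      [products_separate, n_product_constraint],
                                      min_yield=0.80, min_purity=0.95):
    print(route, "feasible" if feasible else "infeasible")
```

Because `screen` is a generator, a caller can display each route the moment it is produced instead of waiting for the full candidate list to finish.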

Expert Rules System

The platform incorporates decades of downstream processing knowledge through 13 expert rules that prevent infeasible designs.

Core Prerequisites

  • Chromatography prerequisites: Requires clarification and high water content
  • Membrane fouling prevention: Cell removal before ultrafiltration
  • Crystallization thermodynamics: Supersaturation verification
  • Drying constraints: Must be final operation

Process Optimization Rules

  • Size-based separation order (large → small particles)
  • Concentration before expensive operations
  • Maximum consecutive operations of same type
  • Protein denaturation prevention (no reverse phase for MW > 1500 Da)
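
Rules of this kind are easy to picture as small predicate functions applied to a candidate route. The sketch below covers three of the rules listed above and is not the platform's rule engine: the operation names, the consecutive-step limit of 2, and the use of identical operation names to define "same type" are assumptions.

```python
from itertools import groupby
from typing import List

DRYING = {"spray_drying", "freeze_drying", "vacuum_tray_drying"}
REVERSE_PHASE_MAX_MW_DA = 1500.0  # threshold stated in the rule above


def drying_is_final(route: List[str]) -> bool:
    """Drying may only appear as the last operation."""
    return all(op not in DRYING for op in route[:-1])


def reverse_phase_allowed(route: List[str], max_product_mw_da: float) -> bool:
    """No reverse phase when the largest target product exceeds 1500 Da."""
    return "reverse_phase" not in route or max_product_mw_da <= REVERSE_PHASE_MAX_MW_DA


def within_consecutive_limit(route: List[str], limit: int = 2) -> bool:
    """Reject runs of the same operation longer than `limit` (limit is an assumption)."""
    return all(len(list(group)) <= limit for _, group in groupby(route))


def passes_rules(route: List[str], max_product_mw_da: float) -> bool:
    return (drying_is_final(route)
            and reverse_phase_allowed(route, max_product_mw_da)
            and within_consecutive_limit(route))


print(passes_rules(["microfiltration", "ultrafiltration", "spray_drying"], 150_000.0))  # True
print(passes_rules(["spray_drying", "ultrafiltration"], 500.0))                         # False
print(passes_rules(["microfiltration", "reverse_phase"], 150_000.0))                    # False
```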

Molecule Database

The built-in molecule database currently covers a limited set of common biotech components — proteins, sugars, organic acids, amino acids, salts, alcohols, and cell types. It is actively being expanded over time based on user feedback and real-world process cases.

For testing purposes: If your molecule is not in the database yet, you can add it manually directly in the feed stream dialog. Enter the component name and as many physical properties as you know (MW, charge, solubility, pKa, log P, etc.). The simulator will use whatever properties you provide — missing values are handled gracefully, though accuracy improves with more complete data.

Note that molecules added this way are local to your simulation only — they are not automatically added to the central database. To request a molecule be added for all users, reach out via LinkedIn.

Property Categories

  • Basic: MW, charge, typical concentrations
  • Solubility: Water solubility, pH-dependent solubility
  • Transport: Diffusion coefficients, viscosity effects
  • Thermodynamic: Heat capacity, formation enthalpy
  • Chemical: pKa values, log P, isoelectric points

Suggest a Molecule

The database is continuously expanding. If you work with a molecule that is missing, reach out on LinkedIn — feedback from practitioners directly shapes what gets added next.

Corjan van den Berg — Revyve

Ready to start designing processes? Launch the workspace and begin optimizing your downstream operations.
