Your mission
Build agentic AI systems that reason, plan, and act inside real materials discovery workflows
Most agent systems live in clean environments: browsers, codebases, or synthetic benchmarks. At Dunia, agents must reason about messy reality: experiments that fail, data that contradicts itself, and physical systems that don’t reset cleanly.
As ML Engineer, Agents & Reasoning, you build the systems that make AI act responsibly inside that reality. You design agents that decide what to do next, use tools intelligently, recover from failure, and know when they don’t know.
Your work sits at the boundary between cognition and control.
Your tasks will include:
Build agentic decision-making systems for discovery
- Design and implement agentic systems that plan, reason, and act across materials discovery workflows
- Develop agents that operate over experiments, simulations, and scientific datasets, selecting next actions under uncertainty
- Define how autonomy is scoped, when humans stay in the loop, and how decisions are escalated
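As a rough sketch of what scoped autonomy and escalation can look like in code (the `Decision` type, the `route` helper, and the thresholds are all illustrative assumptions, not part of Dunia's actual stack):

```python
from dataclasses import dataclass


@dataclass
class Decision:
    action: str
    confidence: float  # agent's self-estimated confidence in [0, 1]

# Illustrative thresholds; real values would come from domain experts.
AUTO_THRESHOLD = 0.9      # act autonomously above this confidence
ESCALATE_THRESHOLD = 0.5  # below this, hand the decision off entirely


def route(decision: Decision) -> str:
    """Scope autonomy: act, ask for review, or escalate to a human."""
    if decision.confidence >= AUTO_THRESHOLD:
        return "execute"
    if decision.confidence >= ESCALATE_THRESHOLD:
        return "human_review"  # human stays in the loop
    return "escalate"          # agent knows it doesn't know
```

The key design choice is that low confidence is a first-class output, not a failure mode.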
Ground reasoning in scientific and physical reality
- Implement planning, control logic, and uncertainty-aware decision-making tailored to physical systems
- Encode operational, experimental, and safety constraints directly into agent behavior
- Define stopping criteria, fallback strategies, and recovery mechanisms to prevent brittle behavior
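A minimal illustration of stopping criteria, bounded retries, and a fallback path, with a random stand-in for the real experiment call (step budget and retry counts are made-up numbers):

```python
import random

MAX_STEPS = 10   # stopping criterion: hard budget on actions taken
MAX_RETRIES = 2  # recovery: bounded retries before falling back


def run_experiment_step(step: int) -> bool:
    """Stand-in for a real experiment call; may fail nondeterministically."""
    return random.random() > 0.3


def agent_loop() -> str:
    for step in range(MAX_STEPS):
        for attempt in range(MAX_RETRIES + 1):
            if run_experiment_step(step):
                break  # step succeeded, move to the next action
        else:
            # Recovery exhausted: fall back instead of acting brittlely.
            return "fallback: pause and request human intervention"
    return "completed within budget"
```

The point of the sketch is structural: every loop has an exit, and exhaustion triggers a fallback rather than undefined behavior.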
Turn models into action
- Collaborate closely with AI researchers to embed predictive models into agent workflows
- Work with lab, automation, and software teams to connect agents to real experimental and simulation systems
- Ensure agent outputs translate into executable actions, not just recommendations
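One way to read "executable actions, not just recommendations" is a tool registry that either executes a model's intent or refuses outright; the tool names and parameters below are hypothetical:

```python
from typing import Callable, Dict

# Hypothetical registry mapping agent intents to executable actions.
TOOLS: Dict[str, Callable[[dict], str]] = {
    "run_simulation": lambda p: f"queued simulation with {p}",
    "schedule_synthesis": lambda p: f"scheduled synthesis of {p['material']}",
}


def execute(intent: str, params: dict) -> str:
    """Turn a model recommendation into an executable action, or refuse."""
    tool = TOOLS.get(intent)
    if tool is None:
        # No silent free-form action: unknown intents fail loudly.
        raise ValueError(f"unknown tool: {intent}")
    return tool(params)
```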
Measure what matters
- Build evaluation frameworks that assess decision quality, learning efficiency, and system behavior, not just model accuracy
- Analyze failure cases and iterate on system design based on real-world outcomes
- Help define what “good decisions” mean in scientific discovery contexts
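Two toy metrics in this spirit, measuring decision quality and learning efficiency rather than model accuracy (both are illustrative, not Dunia's evaluation framework):

```python
from typing import List, Optional


def simple_regret(chosen_values: List[float], best_value: float) -> float:
    """Decision quality as regret: gap between the best achievable
    outcome and the best outcome the agent actually found."""
    return best_value - max(chosen_values)


def sample_efficiency(chosen_values: List[float], target: float) -> Optional[int]:
    """Learning efficiency: how many experiments were needed to
    reach a target outcome (None if it was never reached)."""
    for i, value in enumerate(chosen_values, start=1):
        if value >= target:
            return i
    return None
```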
Ship reliable, production-grade systems
- Translate research concepts into robust, maintainable ML systems
- Instrument agents with logging, monitoring, and diagnostics for observability and debugging
- Take ownership of systems from prototype through deployment and operation
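As one minimal sketch of that instrumentation, a wrapper that emits a structured log line for every agent step (the wrapper and its field names are assumptions for illustration):

```python
import json
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent")


def logged_step(step_name: str, fn, *args):
    """Wrap an agent step with structured logs for observability."""
    start = time.monotonic()
    status = "error"
    try:
        result = fn(*args)
        status = "ok"
        return result
    finally:
        # One JSON record per step: easy to query when debugging.
        log.info(json.dumps({
            "step": step_name,
            "status": status,
            "duration_s": round(time.monotonic() - start, 4),
        }))
```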