Buzzword of the Month

Agentic AI

What is Agentic AI?

Agentic AI is software that uses a language model to plan a sequence of actions, invoke tools or APIs, check results, and iterate toward a goal. It’s closer to workflow automation with probabilistic judgment than independent intelligence. It’s overhyped because polished demos hide brittle planning, verification gaps, and the operational overhead needed to keep agents safe and useful.
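
The plan/act/check/iterate loop described above can be sketched in a few lines. This is a minimal illustration, not a real framework: `plan_steps`, `call_tool`, and `check` are hypothetical placeholders standing in for a model call, real tool APIs, and an explicit success test.

```python
def run_agent(goal, plan_steps, call_tool, check, max_iters=5):
    """Iterate toward `goal`: draft a plan, execute each step via a
    tool call, verify the result, and replan on failure.
    All injected callables are hypothetical placeholders."""
    for attempt in range(max_iters):
        plan = plan_steps(goal)              # model proposes a step list
        results = []
        for step in plan:
            results.append(call_tool(step))  # invoke a tool or API
        if check(goal, results):             # explicit, code-level success test
            return results
    # stop rule: bounded retries, then fail loudly instead of looping
    raise RuntimeError(f"Gave up on {goal!r} after {max_iters} attempts")
```

The brittleness discussed below lives almost entirely in `plan_steps` (probabilistic) and `check` (often missing in practice); the loop itself is ordinary workflow automation.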

What is the adoption maturity?

Agents can deliver real gains on repetitive, well-specified tasks with clear success checks and reliable systems. They are not a drop-in replacement for expert judgment or messy cross-system operations. Treat them as guarded workflow automation with logging, review, and explicit stop rules.

What are the barriers to adoption?

  • Brittle planning: Plans break with vague inputs or stale context
  • Verification gap: Hard to prove success without explicit checks and ground truth
  • Tool reliability: Flaky APIs and poor errors cause loops or silent failures
  • Safety and permissions: Secrets, spend, and scopes must be tightly controlled
  • Observability: Step logs, replay, and alerts are required to operate safely
  • Evaluation data: Few teams have task-specific test sets and metrics
  • Change management: Roles and processes must adapt to supervision and escalation
  • Cost and latency: Multiple model and tool calls add time and expense

Are there specific use cases where it works?

  • Customer support triage and drafting within strict templates
  • Finance and back-office checks such as duplicate-charge detection within limits
  • Developer productivity tasks like PR checklists, changelog updates, boilerplate scaffolding
  • Sales-ops hygiene: meeting note cleanup, approved-source enrichment, next-step creation
  • IT helpdesk flows: account unlock guidance, device compliance checks, knowledge routing

Are there specific use cases where it doesn’t work?

  • Open-ended research or strategy with subjective outcomes
  • High-stakes irreversible actions like payments or production schema changes
  • Multi-system coordination with inconsistent identifiers and partial records
  • Fast-changing or sparse domains where context is thin or outdated
  • Any task without a binary success test the system can self-verify
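
The last point above is the sharpest filter: if you cannot write the success test as code, the agent cannot self-verify. A toy example of such a binary check, borrowing the duplicate-charge use case from earlier (the record fields `amount`, `merchant`, and `date` are hypothetical):

```python
def has_duplicate_charge(charges):
    """Binary success test an agent can run on its own output:
    flag any two charges with the same amount, merchant, and date.
    Field names are illustrative, not a real schema."""
    seen = set()
    for charge in charges:
        key = (charge["amount"], charge["merchant"], charge["date"])
        if key in seen:
            return True   # duplicate found: check fails closed
        seen.add(key)
    return False
```

Open-ended research or strategy work admits no such function, which is why those tasks sit on the "doesn't work" list.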

What questions do you need to ask before considering adoption over the next 12 months?

  • Outcome test: What can the system verify as success without a person?
  • Scope clarity: Can you write a five-to-ten-step plan with a binary end check?
  • Tool catalog: Which APIs are deterministic and return useful error messages?
  • Stop rules: When must the system escalate or abandon a path?
  • Safety rails: How will you constrain secrets, permissions, and spend?
  • Audit and replay: Can you log and replay every prompt, tool call, and output?
  • Evaluation plan: What dataset and metrics will you track weekly?
  • Ownership: Who fixes failures, reviews escalations, and improves prompts and tools?
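
Several of these questions (stop rules, safety rails, audit and replay) come down to wrapping every tool call in a guard. A minimal sketch, assuming a hypothetical `tool` callable and a simple call budget; a production version would add scoped credentials and persistent storage:

```python
import time

def guarded_call(tool, args, log, budget):
    """Run one tool call under explicit guards: a call budget as a
    spend proxy, a full audit-log entry for replay, and a stop rule
    that escalates to a person instead of looping. `tool`, `args`,
    and the budget shape are illustrative assumptions."""
    if budget["calls_left"] <= 0:
        log.append({"event": "escalate", "reason": "budget exhausted"})
        return None                       # stop rule: hand off to a human
    budget["calls_left"] -= 1
    entry = {"event": "tool_call", "args": args, "ts": time.time()}
    try:
        entry["result"] = tool(**args)
    except Exception as exc:
        entry["error"] = str(exc)         # flaky API: record, don't retry blindly
    log.append(entry)                     # audit trail, replayable later
    return entry.get("result")
```

Whoever owns the agent (the last bullet) reviews the `escalate` entries and tunes the budget, which is where the change-management cost shows up in practice.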

Case Studies

Successful: Klarna reports its AI assistant now handles about two-thirds of customer-service chats, doing work equivalent to roughly 700 full-time agents and contributing to a projected profit boost; both the company and multiple outlets have detailed the shift publicly. This is a textbook “templated support with guardrails” fit: bounded workflows, clear checks, reliable backend systems, and full auditability.

Unsuccessful: Air Canada was ordered to compensate a customer after its website chatbot provided wrong guidance on bereavement fares; the tribunal ruled the airline responsible for information presented by its chatbot. It’s a cautionary tale about deploying conversational systems without robust verification, governance, and escalation paths.

Our Thinking

Our recent client work