AI agents that carry out tasks with little human oversight are generating a lot of hype. These “action-taking” agents pursue goals by making decisions and interacting with digital environments or tools. They combine LLMs with structured workflows and tool access.
Big Tech has already dived into the agent market. Microsoft has Copilot agents, agents for scientists to accelerate R&D and a GitHub agent for coding. Google recently unveiled a shopping agent. OpenAI has Operator, which can handle repetitive tasks like form filling, and its Deep Research tool. Salesforce has Agentforce, a tool that coaches sales employees and makes product recommendations to customers.
With leading foundation models maturing, seed and early-stage investments by the top VC firms in Q1 consolidated around AI agents and their underlying infrastructure. Y Combinator, the Silicon Valley tech accelerator, is going all-in on agents, with the technology making up nearly 50% of its latest class of startups. To cash in on the buzz, AI startups are increasingly describing their products as agents.
As this technology advances, beware of “agent-washing.” Most of the solutions on the market now are not really agents, but repackaged AI assistants, robotic process automation tools, and chatbots. Many vendors are using the term “agent” as a marketing ploy; of thousands of products reviewed by consulting firm Gartner, only 130 were “real” agentic AI.
And accuracy concerns linger. One of Salesforce’s own researchers said that AI agents often underdeliver on basic CRM tasks: agents successfully completed single-function tasks only 58% of the time and multi-step tasks only 35% of the time. User feedback has been negative, and there are complaints that Salesforce AI is prone to hallucinations.
Anthropic also recently tested a new AI agent operating with significant economic autonomy. It put Claude in charge of running a physical store, and while the agent demonstrated some impressive capabilities, it failed to turn a profit, was easily manipulated, hallucinated conversations, and had what researchers called an “identity crisis.”
This technology is still new, and the current research suggests that deploying autonomous AI in business contexts requires troubleshooting flaws that do not exist in traditional software, building safeguards for novel problems we are only beginning to understand, and overcoming challenges in integrating agents into existing legacy systems. Gartner predicts that 40% of agent projects will be cancelled within just two years due to rising costs, lack of ROI, and poor risk controls.
The Center for Democracy and Technology identified six key considerations for companies integrating agents: security and misuse, user privacy, user control, technical and legal infrastructure for agent governance, the impact of human-like agents on real people, and responsibility for agent harms.
Questions to consider
How is a company governing the risks associated with agent adoption? What risk controls are in place? Does the Board have oversight?
How are companies evolving their security defenses to account for agent adoption? Are current security failure rates acceptable?
What privacy controls exist for the agentic technology being used? What information do the agents retain across sessions? How long is data stored? What kinds of inferences is the information used to make?
What tools do users have to supervise and control the agents? Can they assess, pause, or override agent actions? Do the agents expose their reasoning and planned actions for user approval? Are there any mandatory confirmation steps?
Has the company assessed the legal risks associated with implementing agents, including liability for errors that cause financial, reputational, or third-party harms?
What design choices are being made to encourage users to build emotional relationships with agents, or to prevent them from doing so?
Has the company been transparent with users and other stakeholders about how their information and other assets may be exposed to agents? Are users clearly informed when they are dealing with an AI system?
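
For readers who want a concrete picture of the supervision controls raised in the questions above (pausing an agent, requiring confirmation before sensitive actions, and keeping a record of what it did), the following is a minimal sketch in Python. It is illustrative only and not drawn from any vendor's product: the names `PlannedAction`, `SupervisedAgent`, `approve`, and `high_risk` are hypothetical, and a real deployment would need authentication, persistent logging, and far richer policy rules.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PlannedAction:
    """One step the agent proposes, along with its stated reasoning (hypothetical type)."""
    description: str
    reasoning: str
    high_risk: bool = False  # assumption: flagged for e.g. spending money or sharing data


@dataclass
class SupervisedAgent:
    """Wraps action execution with pause and mandatory-confirmation controls (hypothetical)."""
    approve: Callable[[PlannedAction], bool]  # callback that asks the user to confirm
    paused: bool = False
    audit_log: List[str] = field(default_factory=list)

    def pause(self) -> None:
        self.paused = True

    def resume(self) -> None:
        self.paused = False

    def execute(self, action: PlannedAction, run: Callable[[], str]) -> str:
        # A paused agent performs no actions at all.
        if self.paused:
            self.audit_log.append(f"BLOCKED (paused): {action.description}")
            return "blocked"
        # High-risk actions run only after explicit user confirmation.
        if action.high_risk and not self.approve(action):
            self.audit_log.append(f"REJECTED by user: {action.description}")
            return "rejected"
        result = run()
        self.audit_log.append(f"DONE: {action.description} -> {result}")
        return result


# Usage: a user who declines every confirmation request.
agent = SupervisedAgent(approve=lambda action: False)
refund = PlannedAction(
    description="Issue a refund",
    reasoning="Customer reported a duplicate charge",
    high_risk=True,
)
print(agent.execute(refund, lambda: "refund issued"))  # prints "rejected"
```

The design choice worth noting is that the confirmation check sits in front of the action rather than after it, and that every outcome, including blocked and rejected actions, is written to the audit log so that humans can later review what the agent attempted.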


