Silicon Valley’s quest to automate everything is relentless, which explains its latest obsession: Auto-GPT.
Essentially, Auto-GPT uses the versatility of OpenAI’s latest AI models to interact with software and services online, allowing it to “autonomously” perform tasks like X and Y. But as we are learning with large language models, this capability seems to be as wide as an ocean but as deep as a puddle.
Auto-GPT – which you may have seen popping up on social media recently – is an open source app created by game developer Toran Bruce Richards which uses OpenAI’s text-generating models, primarily GPT-3.5 and GPT-4, to act “autonomously”.
There is no magic in that autonomy. Auto-GPT simply handles the follow-ups to an initial prompt given to OpenAI’s models, both asking and answering them until a task is complete.
Auto-GPT is basically GPT-3.5 and GPT-4 combined with a companion bot that instructs GPT-3.5 and GPT-4 what to do. A user tells Auto-GPT what its goal is and the bot in turn uses GPT-3.5 and GPT-4 and various programs to perform each step necessary to achieve the stated goal.
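The ask-and-answer loop described above can be sketched in a few lines of Python. This is only an illustration, not Auto-GPT’s actual code: `run_agent`, `fake_model`, the `DONE` stop signal, and the step limit are all hypothetical names and conventions chosen for the example, and the model is replaced by a canned stand-in so that no API call is made.

```python
from typing import Callable, List


def run_agent(goal: str, model: Callable[[str], str], max_steps: int = 5) -> List[str]:
    """Keep prompting the model with its own previous answer until it says it is done."""
    history: List[str] = []
    prompt = f"Goal: {goal}\nWhat is the first step?"
    for _ in range(max_steps):
        reply = model(prompt)
        history.append(reply)
        # Assumed convention: the model starts its reply with DONE when finished.
        if reply.strip().upper().startswith("DONE"):
            break
        # The follow-up prompt is written by the program, not typed by a person.
        prompt = f"Goal: {goal}\nLast step: {reply}\nWhat is the next step?"
    return history


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a call to a text-generating model."""
    if "Last step: research competitors" in prompt:
        return "DONE: plan drafted"
    return "research competitors"


if __name__ == "__main__":
    for step in run_agent("help me grow my flower business", fake_model):
        print(step)
```

The step limit matters in a sketch like this: without it, a model that never signals completion would keep the loop (and the API bill) running indefinitely.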
What makes Auto-GPT quite capable is its ability to interact with apps, software, and services, both online and locally, such as web browsers and word processors. For example, given a prompt like “help me grow my flower business,” Auto-GPT can develop a somewhat plausible advertising strategy and build a basic website.
As Joe Koen, a software developer who has been experimenting with Auto-GPT, explained to AapkaDost via email, Auto-GPT essentially automates multi-step projects that would otherwise have required back-and-forth queries with a chatbot-oriented AI model such as OpenAI’s ChatGPT.
“Auto-GPT defines an agent that communicates with OpenAI’s API,” says Koen. “The purpose of this agent is to execute various commands that the AI generates in response to the agent’s requests. The user is prompted for input to specify the AI’s role and objectives before the agent begins executing commands.”
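To make Koen’s description concrete, here is a rough sketch of an agent that executes commands found in a model’s reply. It assumes, purely for illustration, that the reply is a JSON object with `command` and `args` fields; the registry, the two toy commands, and the error strings are hypothetical and are not taken from Auto-GPT’s source.

```python
import json
from typing import Callable, Dict

# Registry of commands the agent is willing to run (hypothetical examples).
COMMANDS: Dict[str, Callable[..., str]] = {}


def command(name: str):
    """Decorator that registers a function under a command name."""
    def register(fn: Callable[..., str]) -> Callable[..., str]:
        COMMANDS[name] = fn
        return fn
    return register


@command("echo")
def echo(text: str) -> str:
    return text


@command("add")
def add(a: float, b: float) -> str:
    return str(a + b)


def execute(model_reply: str) -> str:
    """Parse a JSON reply and run the command it names, if it is registered."""
    try:
        request = json.loads(model_reply)
    except json.JSONDecodeError:
        return "error: reply is not valid JSON"
    if not isinstance(request, dict) or not isinstance(request.get("command"), str):
        return "error: reply has no command name"
    handler = COMMANDS.get(request["command"])
    if handler is None:
        return f"error: unknown command {request['command']!r}"
    args = request.get("args", {})
    if not isinstance(args, dict):
        return "error: args must be an object"
    return handler(**args)


if __name__ == "__main__":
    print(execute('{"command": "add", "args": {"a": 2, "b": 3}}'))
```

Only registered commands can run in this sketch, which is one simple way to limit what a model-generated reply is allowed to trigger.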
In a terminal, users describe the name, role, and objective of the Auto-GPT agent and specify up to five ways to achieve that objective. For example:
- Name: Smartphone GPT
- Role: An AI designed to find the best smartphone
- Objective: Find the best smartphones on the market
- Goal 1: Do market research for different smartphones on the market today
- Goal 2: Get the top five smartphones and list their pros and cons
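A setup like the one above could be represented as a small data structure that enforces the five-goal limit and turns the fields into an opening prompt. The sketch below is an assumption-laden illustration: `AgentConfig`, `MAX_GOALS`, and the prompt layout are hypothetical, and Auto-GPT’s real configuration format may differ.

```python
from dataclasses import dataclass, field
from typing import List

MAX_GOALS = 5  # the limit described in the article; the constant name is made up


@dataclass
class AgentConfig:
    name: str
    role: str
    objective: str
    goals: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject configurations with more goals than the stated limit.
        if len(self.goals) > MAX_GOALS:
            raise ValueError(f"at most {MAX_GOALS} goals are allowed, got {len(self.goals)}")

    def to_prompt(self) -> str:
        """Render the configuration as plain text for the model (assumed layout)."""
        lines = [
            f"Name: {self.name}",
            f"Role: {self.role}",
            f"Objective: {self.objective}",
        ]
        lines += [f"{i}. {goal}" for i, goal in enumerate(self.goals, start=1)]
        return "\n".join(lines)


if __name__ == "__main__":
    config = AgentConfig(
        name="Smartphone GPT",
        role="An AI designed to find the best smartphone",
        objective="Find the best smartphones on the market",
        goals=[
            "Do market research for different smartphones on the market today",
            "List the pros and cons of the top five smartphones",
        ],
    )
    print(config.to_prompt())
```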
Behind the scenes, Auto-GPT relies on functions such as memory management to perform tasks, along with GPT-4 and GPT-3.5 for text generation, file storage, and summarization.
Auto-GPT can also be connected to speech synthesizers, such as those from ElevenLabs, so that it can place “calls,” for example.
Auto-GPT is publicly available on GitHub, but it requires some configuration and knowledge to get started. To use it, Auto-GPT must be installed in a development environment such as Docker and registered with an API key from OpenAI – which requires a paid OpenAI account.
It may be worth it – although the jury is out on that. Early adopters have used Auto-GPT to take on the kinds of mundane tasks that are better delegated to a bot. For example, Auto-GPT can handle tasks like debugging code and writing an email, or more advanced things like creating a business plan for a new startup.
“If Auto-GPT encounters obstacles or can’t complete the task, it develops new prompts to navigate the situation and determine the appropriate next steps,” Adnan Masood, the lead architect of UST, a technical consulting firm, told AapkaDost in an email. “Large language models excel at generating human responses, but rely on user cues and interactions to deliver the desired results. Auto-GPT, on the other hand, takes advantage of the advanced capabilities of OpenAI’s API to work independently without user intervention.”
In recent weeks, new apps have appeared to make Auto-GPT even easier to use, such as AgentGPT and GodMode, which provide a simple interface where users can enter what they want to achieve directly on a browser page. Note that both, like Auto-GPT, require an API key from OpenAI to unlock their full capabilities.
However, like any powerful tool, Auto-GPT has its limitations – and risks.
Depending on the objective it is given, Auto-GPT can behave in very…unexpected ways. A Reddit user claims that, given a budget of $100 to spend within a server instance, Auto-GPT created a wiki page about cats, exploited a bug in the instance to gain admin-level access, and took over the Python environment in which it was running – and then “killed” itself.
There’s also ChaosGPT, a modified version of Auto-GPT with goals like “destroy humanity” and “establish global dominance.” Unsurprisingly, ChaosGPT hasn’t come close to bringing about the robot apocalypse yet, but it has tweeted rather unflatteringly about humanity.
Arguably more dangerous than Auto-GPT attempting to “destroy humanity”, however, are the unexpected problems that can crop up in otherwise perfectly normal scenarios. Because it’s built on OpenAI’s language models — models that, like all language models, are prone to inaccuracies — it can make mistakes.
That’s not the only problem. After successfully completing a task, Auto-GPT usually doesn’t remember how to perform it for later use, and – even when it does – it often won’t remember to use that knowledge. Auto-GPT also struggles to break complex tasks down into simpler sub-tasks and to understand how different goals overlap.
“Auto-GPT illustrates the power and unknown risks of generative AI,” said Clara Shih, the CEO of Salesforce’s Service Cloud and an Auto-GPT enthusiast, via email. “For enterprises, it is especially important to incorporate a human-in-the-loop approach when developing and using generative AI technologies such as Auto-GPT.”
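One simple reading of a human-in-the-loop approach is an approval gate: every action the agent proposes is shown to a person, and only approved actions run. The following sketch assumes that reading; `run_with_approval` and its parameters are hypothetical names for illustration, not part of Auto-GPT or any Salesforce product.

```python
from typing import Callable, List, Tuple


def run_with_approval(
    proposed: List[str],
    approve: Callable[[str], bool],
    execute: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Run each proposed action only if the approver says yes; record the outcome."""
    results: List[Tuple[str, str]] = []
    for action in proposed:
        if approve(action):
            results.append((action, execute(action)))
        else:
            # Rejected actions are logged but never executed.
            results.append((action, "skipped: not approved"))
    return results


if __name__ == "__main__":
    def ask_human(action: str) -> bool:
        # A person types y to allow the action; anything else rejects it.
        return input(f"Run {action!r}? [y/N] ").strip().lower() == "y"

    def pretend_execute(action: str) -> str:
        return f"ran {action}"

    for action, outcome in run_with_approval(
        ["draft email", "delete files"], ask_human, pretend_execute
    ):
        print(action, "->", outcome)
```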