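In the era of large language models (LLMs), LLM-based intelligent agents have made remarkable progress over the past year.
In Chinese, "Agent" is translated as 代理 (proxy) or 智能体 (intelligent entity).
The definition and properties of an agent vary across disciplines and cultural backgrounds. Generally, an agent is an autonomous individual that can exercise its own will, make decisions, and take actions, rather than merely responding passively to external stimuli. Humans are the most complex agents on this planet.
Since the mid-1980s, research on agents in the field of artificial intelligence has grown significantly. On this basis, Wooldridge defined artificial intelligence as a field that aims to design and build computer-based agents exhibiting intelligent behavior.
In essence, an AI agent is a concrete realization of the agent concept.
As shown in Figure 1, an AI agent is an artificial entity that perceives its environment through sensors, makes decisions, and responds accordingly.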
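The technical evolution of AI agent research falls into the following main stages.
In the early stage of AI research, the dominant approach was symbolic AI, which uses logical rules and symbolic representations to encapsulate knowledge and support reasoning. The architecture of a symbolic agent is shown in Figure 2:
The knowledge-based expert systems of the past are the most common symbolic agents. Such a system consists mainly of a knowledge base, an inference engine, and an interpreter.
However, just as decision engines have gradually been displaced by AI models, hand-built decision logic is usually too rigid to be of practical value.
Unlike symbolic agents, reactive agents do not rely on complex symbolic reasoning. They focus on the interaction between the agent and its environment and prioritize fast, real-time responses. Reactive agents typically use a predefined rule set to guide their behavior, as shown in Figure 3:
Compared with symbolic agents, reactive agents use simpler strategies. As an analogy, a symbolic agent resembles a compiler whose decision engine holds a large number of logical inference rules, whereas a reactive agent is essentially a pile of if/else statements that read environment data and make quick judgments.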
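To make the contrast concrete, here is a minimal sketch of both styles. It is my own illustration rather than code from any particular system: the rules, fact names, and temperature thresholds are all hypothetical. The symbolic side derives new facts from a small rule base by forward chaining, while the reactive side maps a sensor reading straight to an action.

```python
# Symbolic agent: derive new facts from a knowledge base by forward chaining.
# The rules below are hypothetical and exist only for illustration.
RULES = [
    ({"has_fever", "has_cough"}, "has_flu"),
    ({"has_flu"}, "needs_rest"),
]


def forward_chain(facts, rules):
    """Apply rules repeatedly until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


# Reactive agent: map the current percept directly to an action with if/else.
def reactive_thermostat(temperature_c):
    if temperature_c < 18:
        return "heat_on"
    elif temperature_c > 26:
        return "cool_on"
    return "idle"


derived = forward_chain({"has_fever", "has_cough"}, RULES)
print(sorted(derived))
print(reactive_thermostat(15))
```

The symbolic agent needs a rule base and an inference loop, whereas the reactive agent keeps no state and reasons about nothing beyond the current reading.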
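Before LLMs appeared, reinforcement-learning-based agents were a research hotspot; the best-known example is probably AlphaGo. The main concern of this field is how to let an agent learn through interaction with its environment so as to maximize cumulative reward on a specific task.
With the arrival of deep learning, deep neural networks were combined with reinforcement learning, which lets agents learn complex policies from high-dimensional inputs, as shown in Figure 4.
However, reinforcement learning suffers from long training times, low sample efficiency, and unstable models in complex real-world environments.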
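The idea of learning from interaction to maximize cumulative reward can be illustrated with tabular Q-learning on a toy problem. This is only a sketch under my own assumptions: the five-state corridor, the reward of 1.0 at the goal, and the values of `ALPHA` and `GAMMA` are made up for illustration, and systems such as AlphaGo are far more complex.

```python
import random

N_STATES = 5            # states 0..4; state 4 is the goal (terminal)
ACTIONS = (0, 1)        # 0 = move left, 1 = move right
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor (assumed values)


def step(state, action):
    """Deterministic toy environment: reward 1.0 only when the goal is reached."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train(episodes=200, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)  # purely random exploration
            nxt, reward, done = step(state, action)
            # Q-learning update: move toward reward + discounted best next value.
            target = reward + (0.0 if done else GAMMA * max(q[nxt]))
            q[state][action] += ALPHA * (target - q[state][action])
            state = nxt
    return q


q_table = train()
# Greedy policy for the non-terminal states.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)  # → [1, 1, 1, 1]
```

Even in this tiny environment the agent needs many episodes of trial and error before the values settle, which hints at the sample-efficiency problem mentioned above.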
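In recent years, large language models (LLMs) have attracted enormous attention and shown great potential. As a result, a new research area has emerged that uses an LLM as the agent's core controller, with the goal of giving the agent human-level decision-making ability.
This is the focus of the article and is described in detail next.
LLM-based agents come in many architectural forms. However, the core modules of all of them include memory, planning, and action.
A unified framework has been proposed, as shown in Figure 5. It comprises a profile module, a memory module, a planning module, and an action module.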
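An agent usually takes on a predefined identity when performing a task, such as a teacher or a domain expert. The profile module defines a detailed profile of the role the agent plays; the profile is written into the prompt to influence the behavior of the LLM.
An agent profile generally contains basic information (such as age, gender, and occupation), psychological information about personality, and information describing social relationships between agents. Which information to include depends mainly on the application scenario.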
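As a rough sketch of how such a profile might end up in a prompt, the snippet below renders a small profile object into text. The `AgentProfile` class, its fields, and the sample values are my own hypothetical choices, not part of any framework.

```python
from dataclasses import dataclass, field


@dataclass
class AgentProfile:
    """Hypothetical profile record: basic info, personality, social relations."""
    name: str
    age: int
    occupation: str
    personality: str
    relationships: dict = field(default_factory=dict)

    def to_prompt(self) -> str:
        """Render the profile as text that can be prepended to an LLM prompt."""
        lines = [
            f"You are {self.name}, a {self.age}-year-old {self.occupation}.",
            f"Personality: {self.personality}.",
        ]
        for other, relation in self.relationships.items():
            lines.append(f"{other} is your {relation}.")
        return "\n".join(lines)


profile = AgentProfile(
    name="Alice",
    age=35,
    occupation="high-school physics teacher",
    personality="patient and precise",
    relationships={"Bob": "colleague"},
)
system_prompt = profile.to_prompt()
print(system_prompt)
```

Which fields are worth including depends on the application; a simulation of social behavior would lean on the relationship entries, while a tutoring agent might only need the occupation line.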
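The memory mechanism of LLM-based agents is modeled on human memory. Human memory can be divided into short-term memory, which holds information briefly, and long-term memory, which consolidates information over a longer period.
In an LLM, short-term memory corresponds to the input that fits inside the context window, whose size is limited by the transformer architecture. Long-term memory resembles an external vector store that the agent can quickly query and retrieve from as needed.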
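The two kinds of memory can be sketched in a few lines. This is an illustrative toy under loud assumptions: `embed` is a bag-of-words stand-in for a real embedding model, the list `long_term` stands in for a real vector database, and the `AgentMemory` class is hypothetical.

```python
import math
from collections import Counter, deque


def embed(text):
    """Toy 'embedding': bag-of-words counts (a stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class AgentMemory:
    def __init__(self, window=3):
        self.short_term = deque(maxlen=window)  # mimics a bounded context window
        self.long_term = []                     # mimics an external vector store

    def remember(self, text):
        self.short_term.append(text)
        self.long_term.append((embed(text), text))

    def recall(self, query, k=1):
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.long_term, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for _, text in ranked[:k]]


agent_memory = AgentMemory(window=2)
for note in [
    "the user likes green tea",
    "the meeting is on friday",
    "the server runs ubuntu",
]:
    agent_memory.remember(note)

print(list(agent_memory.short_term))
print(agent_memory.recall("when is the meeting"))
```

The short-term buffer silently drops the oldest note once the window is full, while the long-term store keeps everything and relies on similarity search to surface what is relevant.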
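When facing a complex task, humans tend to break it down into simpler subtasks and solve them one by one. The planning module aims to give the agent this human ability and thereby make its behavior more capable. Chain of thought is one common planning strategy.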
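A simple way to picture task decomposition is to prompt the model for numbered steps and parse the reply. The sketch below is hypothetical throughout: `fake_llm` returns a canned answer in place of a real model call, and the prompt wording is my own.

```python
import re


def fake_llm(prompt):
    """Stand-in for a real LLM call; returns a canned numbered plan."""
    return "1. Find the ingredients\n2. Chop the vegetables\n3. Cook the dish"


def plan(task, llm):
    """Ask the model to decompose a task, then parse the numbered steps."""
    prompt = (
        "Break the task below into short numbered steps, one per line.\n"
        f"Task: {task}\n"
        "Let's think step by step."
    )
    reply = llm(prompt)
    # Keep only lines that look like "1. something".
    pattern = re.compile(r"^\s*\d+\.\s*(.+)$", re.MULTILINE)
    return [match.group(1).strip() for match in pattern.finditer(reply)]


steps = plan("Make dinner", fake_llm)
print(steps)
```

With a real model in place of `fake_llm`, the parsed steps could be handed to the action module one at a time.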
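The action module turns the agent's decisions into concrete outputs. It interacts directly with the environment and is influenced by the profile, memory, and planning modules. The action module can be divided into four parts.
There are also other frameworks. One proposes a general conceptual framework for LLM-based agents consisting of three key parts: brain, perception, and action, as shown in Figure 6.
The brain module acts as the controller and handles basic tasks such as memory, thinking, and decision-making. The perception module interprets and processes multimodal information from the external environment, while the action module carries out responses and uses tools to interact with the environment.
Here is an example of the workflow: suppose someone asks whether it will rain today. The perception module converts the query into a format the LLM can understand. The brain module then reasons from the current weather and online weather reports. Finally, the action module responds and hands the person an umbrella. Through this process, the agent can continuously receive feedback and interact with its environment.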
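The rain example can be written down as three small functions chained together. This is a sketch only: the rule inside `think` is a stub standing in for LLM reasoning, and the `rain_probability` field and its 0.5 threshold are hypothetical.

```python
def perceive(raw_input):
    """Perception: normalize raw input into text the brain can work with."""
    return raw_input.strip().lower()


def think(query, weather_report):
    """Brain: a stub standing in for LLM reasoning over the query and known facts."""
    if "rain" in query:
        if weather_report["rain_probability"] >= 0.5:
            return "rain_expected"
        return "no_rain"
    return "unknown"


def act(decision):
    """Action: turn the decision into a reply plus an optional physical action."""
    if decision == "rain_expected":
        return "Yes, rain is likely today.", "hand_over_umbrella"
    if decision == "no_rain":
        return "No rain is expected today.", None
    return "I am not sure.", None


report = {"rain_probability": 0.8}  # hypothetical weather data
reply, physical_action = act(think(perceive("  Will it RAIN today? "), report))
print(reply, physical_action)
```

In a real agent the output of `act` would change the environment, and the next call to `perceive` would pick up that change as feedback.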
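By domain, applications of LLM-based agents fall into three categories: social science, natural science, and engineering, as shown in Figure 7.
By application scenario, they can also be divided into single-agent, multi-agent, and human-agent interaction settings, as shown in Figure 8.
A single agent has diverse capabilities. When multiple agents interact, they can improve performance through cooperative or adversarial interaction. In human-agent interaction, human feedback helps the agent carry out tasks more effectively.
Next, we implement an agent demo with LangChain and Python. The overall architecture of the demo is shown below:
Here I use Anaconda to set up the environment with the following commands: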
```sh
(base) Florian: conda create -n agent python=3.11
(base) Florian: conda activate agent
(agent) Florian: pip install langchain
(agent) Florian: pip install langchain_openai
(agent) Florian: pip install duckduckgo-search
```
The final library versions are as follows:
```text
langchain 0.1.15
langchain-community 0.0.32
langchain-core 0.1.41
langchain-openai 0.1.2
langchain-text-splitters 0.0.1
duckduckgo_search 5.3.0
```
```python
import os

os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"

from langchain.agents import AgentExecutor, Tool, ZeroShotAgent
from langchain.chains import LLMChain
from langchain.memory import ConversationBufferMemory
from langchain_openai import OpenAI
from langchain_community.utilities import DuckDuckGoSearchAPIWrapper
```
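Here I use the DuckDuckGo library as a tool available to the agent, which gives the agent search capability.
`tools` defines a list of tools containing a single tool named "Search", which uses the `search.run` function.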
```python
search = DuckDuckGoSearchAPIWrapper()
tools = [
    Tool(
        name="Search",
        func=search.run,
        description="useful for when you need to answer questions about current events",
    )
]
```
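As mentioned earlier, the planning and action modules are the core of an agent. Here I use the ReAct algorithm to build them. The algorithm is illustrated in the figure below:
Take cooking a dish as an example. Between chopping the vegetables and turning on the gas there is a piece of reasoning, or inner monologue: "Now that the vegetables are chopped, I need to cook them next, so I have to turn on the gas." If something unexpected happens during cooking, such as finding that there is no salt, the reasoning goes: "There is no salt, so today I will season with pepper instead," and then we go and get the pepper.
From this insight, the authors propose a method:
Take the following question as an example:
Aside from the Apple Remote, what other devices can control the related software? The related software is the program the Apple Remote was originally able to control.
Explanation:
With the ReAct method, that is, reasoning plus acting, the result is:
Thought 1: I need to search for the Apple Remote and find the software it was originally able to control.
Action 1: Search [Apple Remote]
Result 1: The Apple Remote is a remote control ... originally able to control "Front Row" ...
Thought 2: The Apple Remote could originally control Front Row. Next I need to search for Front Row and find out what other devices can control it.
Action 2: Search [Front Row]
Result 2: No results; try "Front Row Seat to Earth" and "Front Row software".
Thought 3: Front Row was not found; I can search for "Front Row software".
Action 3: Search [Front Row software]
Result 3: Front Row is a discontinued piece of software ... it can be controlled by the Apple Remote and the keyboard function keys.
Thought 4: Now I know the answer.
Action 4: Finish [keyboard function keys]
The answer is correct. Through explicit reasoning combined with actions, the LLM agent found the answer on its own. The whole process feels like a not particularly bright child who has to write down each thought, then combine all thoughts and observations, and only then take the corresponding action. But the method clearly works: it found the answer in the end.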
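The thought/action/observation cycle above can be sketched as a small loop. Everything here is a stand-in of my own: `make_scripted_llm` replays canned replies instead of calling a model, `fake_search` returns canned text instead of querying the web, and the parsing is much simpler than what LangChain does internally.

```python
import re


def make_scripted_llm(replies):
    """Return a fake LLM that replays canned replies, one per call."""
    iterator = iter(replies)
    return lambda prompt: next(iterator)


def fake_search(query):
    """Canned search results standing in for a real search tool."""
    pages = {
        "Apple Remote": "The Apple Remote was originally designed to control Front Row.",
        "Front Row software": "Front Row can be controlled by the Apple Remote "
                              "or the keyboard function keys.",
    }
    return pages.get(query, "No results.")


def react(question, llm, tools, max_steps=5):
    """Alternate reasoning and tool calls until the model gives a final answer."""
    scratchpad = ""
    for _ in range(max_steps):
        reply = llm(f"Question: {question}\n{scratchpad}")
        final = re.search(r"Final Answer:\s*(.+)", reply)
        if final:
            return final.group(1).strip()
        action = re.search(r"Action:\s*(.+)", reply).group(1).strip()
        action_input = re.search(r"Action Input:\s*(.+)", reply).group(1).strip()
        observation = tools[action](action_input)
        # Feed the thought, action, and observation back in on the next turn.
        scratchpad += f"{reply}\nObservation: {observation}\n"
    return None


llm = make_scripted_llm([
    "Thought: I need to find the software the Apple Remote first controlled.\n"
    "Action: Search\nAction Input: Apple Remote",
    "Thought: It was Front Row. I should look up what else controls it.\n"
    "Action: Search\nAction Input: Front Row software",
    "Thought: I now know the final answer.\nFinal Answer: keyboard function keys",
])
answer = react(
    "What else can control the software the Apple Remote first controlled?",
    llm,
    {"Search": fake_search},
)
print(answer)
```

The growing `scratchpad` is what lets each new thought build on earlier observations; `max_steps` guards against a model that never produces a final answer.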
```python
prefix = """Have a conversation with a human, answering the following questions as best you can. You have access to the following tools:"""
suffix = """Begin!"
{chat_history}
Question: {input}
{agent_scratchpad}"""

# Use `ZeroShotAgent.create_prompt` to build the prompt used to interact with the LLM.
prompt = ZeroShotAgent.create_prompt(
    tools,
    prefix=prefix,
    suffix=suffix,
    input_variables=["input", "chat_history", "agent_scratchpad"],
)
```
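This defines the prefix and suffix of the conversation, along with placeholders for the chat history, the user input, and the agent's thought process. The prompt itself is created with the `ZeroShotAgent.create_prompt` method.
Readers may wonder where the ReAct algorithm is actually used: it is in the prompt!
Here is its content: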
```text
Have a conversation with a human, answering the following questions as best you can.
You have access to the following tools:
Search: useful for when you need to answer questions about current events
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [Search]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!"
{chat_history}
Question: {input}
{agent_scratchpad}
```
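The prompt uses the single defined tool, Search, and ends with three variables: `{chat_history}` for the conversation so far, `{input}` for the user's question, and `{agent_scratchpad}` for the agent's intermediate thoughts.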
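To see roughly what the model receives on a later turn, the three placeholders can be filled in by hand. This is plain Python string formatting for illustration only, not LangChain's actual rendering code, and the sample history and scratchpad text are made up.

```python
# The tail of the prompt, with the same three placeholders as above.
template = (
    "Begin!\n"
    "{chat_history}\n"
    "Question: {input}\n"
    "{agent_scratchpad}"
)

# Hypothetical values for one turn of the conversation.
filled = template.format(
    chat_history="Human: How many people live in canada?\nAI: About 40 million.",
    input="what is their capital?",
    agent_scratchpad=(
        "Thought: I should use the Search tool.\n"
        "Action: Search\n"
        "Action Input: Capital of Canada\n"
        "Observation: Ottawa is the capital city of Canada.\n"
        "Thought:"
    ),
)
print(filled)
```

Because the earlier exchange appears in the chat history slot, the model has what it needs to resolve "their" to Canada, and the scratchpad carries the reasoning steps taken so far within the current question.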
LangChain already provides a default memory module:
```python
memory = ConversationBufferMemory(memory_key="chat_history")
```
This creates a `ConversationBufferMemory` instance that stores the conversation history.
```python
# Create an `LLMChain` that uses the OpenAI model and the prompt built earlier.
llm_chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)

# Create a `ZeroShotAgent` that uses the LLM chain and the tool list.
agent = ZeroShotAgent(llm_chain=llm_chain, tools=tools, verbose=True)

# Create an `AgentExecutor`, which is used to run the agent.
agent_executor = AgentExecutor.from_agent_and_tools(
    agent=agent, tools=tools, verbose=True, memory=memory
)
```
```python
agent_executor.run(input="How many people live in canada?")
agent_executor.run(input="what is their national anthem called?")
agent_executor.run(input="what is their capital?")
```
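Here the agent executor is run three times in a row, each time with a different input. The second and third runs test the agent's memory, that is, whether it can use information from earlier exchanges to answer follow-up questions.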
The complete code is as follows:

```python
import os

os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"

from langchain.agents import AgentExecutor, Tool, ZeroShotAgent
from langchain.chains import LLMChain
from langchain.memory import ConversationBufferMemory
from langchain_openai import OpenAI
from langchain_community.utilities import DuckDuckGoSearchAPIWrapper

search = DuckDuckGoSearchAPIWrapper()
tools = [
    Tool(
        name="Search",
        func=search.run,
        description="useful for when you need to answer questions about current events",
    )
]

prefix = """Have a conversation with a human, answering the following questions as best you can. You have access to the following tools:"""
suffix = """Begin!"
{chat_history}
Question: {input}
{agent_scratchpad}"""

prompt = ZeroShotAgent.create_prompt(
    tools,
    prefix=prefix,
    suffix=suffix,
    input_variables=["input", "chat_history", "agent_scratchpad"],
)

memory = ConversationBufferMemory(memory_key="chat_history")

llm_chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)
agent = ZeroShotAgent(llm_chain=llm_chain, tools=tools, verbose=True)
agent_executor = AgentExecutor.from_agent_and_tools(
    agent=agent, tools=tools, verbose=True, memory=memory
)

agent_executor.run(input="How many people live in canada?")
# To test the memory of this agent, we can ask a followup question that relies on information in the previous exchange to be answered correctly.
agent_executor.run(input="what is their national anthem called?")
agent_executor.run(input="what is their capital?")
```
Here I used a gpt-3.5 API; the output is as follows:
> Entering new AgentExecutor chain...
Thought: I should use the Search tool to find the most recent population data for Canada.
Action: Search
Action Input: "Population of Canada"
Observation: Canada population density map (2014) Top left: The Quebec City-Windsor Corridor is the most densely inhabited and heavily industrialized region accounting for nearly 50 percent of the total population Canada ranks 37th by population among countries of the world, comprising about 0.5% of the world's total, with 40 million Canadians. Despite being the second-largest country by total area ... As of July 1, 2023, NPRs were estimated to represent 5.5% of the population of Canada. Among provinces, this proportion was highest in British Columbia (7.3%) and Ontario (6.3%) and lowest in Newfoundland and Labrador (2.4%) and Saskatchewan (2.5%). The 2.2 million NPRs now outnumber the 1.8 million Indigenous people enumerated during the 2021 ... Historical population of Canada. Statistics Canada conducts a country-wide census that collects demographic data every five years on the first and sixth year of each decade. The 2021 Canadian census enumerated a total population of 36,991,981, an increase of around 5.2 percent over the 2016 figure. It is estimated that Canada's population surpassed 40 million in 2023 and 41 million in 2024. Canada's population reaches 40 million. On June 16, 2023, Statistics Canada announced that Canada's population passed the 40 million mark according to the Canada's population clock (real-time model). Today's release of total demographic estimates and related data tables for a reference date of July 1, 2023, is the first since reaching that ... Canada's population was estimated at 40,528,396 on October 1, 2023, an increase of 430,635 people (+1.1%) from July 1. This was the highest population growth rate in any quarter since the second quarter of 1957 (+1.2%), when Canada's population grew by 198,000 people. At the time, Canada's population was 16.7 million people, and this rapid population growth resulted from the high number of ...
Thought: Based on the data, I can see that the population of Canada is estimated to be around 40 million as of October 1, 2023.
Final Answer: The estimated population of Canada as of October 1, 2023 is 40 million.
> Finished chain.
> Entering new AgentExecutor chain...
Thought: I should use the search tool to find the answer.
Action: Search
Action Input: "Canada national anthem"
Observation: O Canada, national anthem of Canada.It was proclaimed the official national anthem on July 1, 1980. "God Save the Queen" remains the royal anthem of Canada. The music, written by Calixa Lavallée (1842-91), a concert pianist and native of Verchères, Quebec, was commissioned in 1880 on the occasion of a visit to Quebec by John Douglas Sutherland Campbell, marquess of Lorne (later 9th ... Learn about the history and lyrics of Canada's national anthem 'O Canada', which has both French and English versions. The song was composed by Calixa Lavallée in 1880 and was proclaimed the official anthem in 1980. It replaced 'God Save the Queen', which is Canada's royal anthem. O Canada (French: Ô Canada) is the national anthem of Canada. The song was originally commissioned by Lieutenant Governor of Quebec Théodore Robitaille for t... National Anthem of Canada - O Canada (English only) - featuring new lyricsOther versions:Bilingual: https://www.youtube.com/watch?v=wBCuyeoSURoFrench only: h... Enjoy this virtual choir rendition of 'O Canada' arranged by George Alfred Grant-Shaefer . Make sure to subscribe for more virtual choir videos!After 100 yea...
Thought: I now know the final answer.
Final Answer: The national anthem of Canada is "O Canada".
> Finished chain.
> Entering new AgentExecutor chain...
Thought: I should use the Search tool to find the answer.
Action: Search
Action Input: "Capital of Canada"
Observation: Ottawa is the capital city of Canada.It is located in the southern portion of the province of Ontario, at the confluence of the Ottawa River and the Rideau River.Ottawa borders Gatineau, Quebec, and forms the core of the Ottawa-Gatineau census metropolitan area (CMA) and the National Capital Region (NCR). As of 2021, Ottawa had a city population of 1,017,449 and a metropolitan population of ... Ottawa, city, capital of Canada, located in southeastern Ontario.In the eastern extreme of the province, Ottawa is situated on the south bank of the Ottawa River across from Gatineau, Quebec, at the confluence of the Ottawa (Outaouais), Gatineau, and Rideau rivers.The Ottawa River (some 790 miles [1,270 km] long), the principal tributary of the St. Lawrence River, was a key factor in the city ... Skyline of Toronto. The national capital is Ottawa, Canada's fourth largest city. It lies some 250 miles (400 km) northeast of Toronto and 125 miles (200 km) west of Montreal, respectively Canada's first and second cities in terms of population and economic, cultural, and educational importance. The third largest city is Vancouver, a centre ... Learn about Canada's location, climate, terrain, natural resources, and major lakes and rivers. Find out the population distribution, ethnic groups, languages, and religions of Canada. The national capital, Ottawa, is prominently marked in the province of Ontario. Where is Canada? Canada is the largest country in North America. Canada is bordered by non-contiguous US state of Alaska in the northwest and by 12 other US states in the south. The border of Canada with the US is the longest bi-national land border in the world.
Thought: I now know the final answer.
Final Answer: The capital of Canada is Ottawa.
> Finished chain.