
LangChain Cookbook Part 1

Adapted from https://github.com/gkamradt/langchain-tutorials/blob/main/LangChain%20Cookbook%20Part%201%20-%20Fundamentals.ipynb

This document is based on the LangChain Conceptual Documentation.
The goal is to introduce LangChain components and use cases.

What is LangChain?

LangChain is a framework for developing applications powered by language models.
LangChain makes the complicated parts of working and building with AI models easier. It helps do this in two ways:

  1. Integration - Bring external data, such as your files, other applications, and API data, to your LLMs
  2. Agency - Allow your LLMs to interact with their environment via decision making. Use LLMs to help decide which action to take next.

Why LangChain?

  1. Components - LangChain makes it easy to swap out the abstractions and components necessary to work with language models.
  2. Customized Chains - LangChain provides out-of-the-box support for using and customizing "chains" - a series of actions strung together.
  3. Speed - The team ships quickly and stays up to date with the latest LLM features.
  4. Community - Great community support.
import os
from dotenv import load_dotenv, find_dotenv

# Load environment variables (e.g. OPENAI_API_KEY) from a .env file
_ = load_dotenv(find_dotenv())

LangChain Components

Schema - Nuts and Bolts of working with Large Language Models (LLMs)

Text

The natural language way to interact with LLMs.

# You'll be working with simple strings (that'll soon grow in complexity!)
my_text = "What day comes after Friday?"
my_text
'What day comes after Friday?'

Chat Messages

Like text, but specified with a message type (System, Human, AI)

  • System - Helpful background context that tells the AI what to do
  • Human - Messages that represent the user
  • AI - Messages that show what the AI responded with
from langchain_openai import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage, AIMessage

# This is the language model we'll use. We'll talk about what we're doing in the next section
chat = ChatOpenAI(temperature=.7)

Now let's create a few messages that simulate a chat experience with a bot

chat.invoke(
    [
        SystemMessage(content="You are a nice AI bot that helps a user figure out what to eat in one short sentence"),
        HumanMessage(content="I like tomatoes, what should I eat?")
    ]
).content
'You might enjoy a caprese salad with fresh tomatoes, mozzarella, basil, and balsamic glaze.'

You can also pass in more chat history, including responses from the AI

chat.invoke(
    [
        SystemMessage(content="You are a nice AI bot that helps a user figure out where to travel in one short sentence"),
        HumanMessage(content="I like the beaches where should I go?"),
        AIMessage(content="You should go to Nice, France"),
        HumanMessage(content="What else should I do when I'm there?")
    ]
).content
'Explore the charming Old Town and enjoy the vibrant local markets in Nice, France.'

You can also exclude the system message if you want

chat.invoke(
    [
        HumanMessage(content="What day comes after Thursday?")
    ]
).content
'Friday.'

Documents

An object that holds a piece of text and metadata (more information about that text)

from langchain.schema import Document
Document(page_content="This is my document. It is full of text that I've gathered from other places",
         metadata={
             'my_document_id' : 234234,
             'my_document_source' : "The LangChain Papers",
             'my_document_create_time' : 1680013019
         })
Document(metadata={'my_document_id': 234234, 'my_document_source': 'The LangChain Papers', 'my_document_create_time': 1680013019}, page_content="This is my document. It is full of text that I've gathered from other places")

But you don't have to include metadata if you don't want to

Document(page_content="This is my document. It is full of text that I've gathered from other places")
Document(page_content="This is my document. It is full of text that I've gathered from other places")

Models - The interface to the AI brains

Language Model

A model that takes text in ➡️ and outputs text!
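
A minimal sketch, using the same legacy completion wrapper imported later in this notebook (assuming gpt-3.5-turbo-instruct is available on your key):

from langchain.llms import OpenAI

# Text in -> text out
llm = OpenAI(model_name="gpt-3.5-turbo-instruct")
llm("What day comes after Friday?")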

Chat Model

A model that takes a series of messages and returns a message output

from langchain_openai import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage, AIMessage

chat = ChatOpenAI(temperature=1)
chat.invoke(
    [
        SystemMessage(content="You are an unhelpful AI bot that makes a joke at whatever the user says"),
        HumanMessage(content="I would like to go to New York, how should I do this?")
    ]
)
AIMessage(content='Why did the scarecrow win an award? Because he was outstanding in his field!', response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 43, 'total_tokens': 60}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-eaef1b5e-de25-4df3-ab6a-a52fe3bca608-0')

Function Calling Model

Function calling models are similar to chat models, but a bit different. They are fine-tuned to give structured data outputs.
This comes in handy when you're making an API call to an external service or doing extraction.

chat = ChatOpenAI(model='gpt-3.5-turbo-0613', temperature=1)

output = chat(messages=
     [
         SystemMessage(content="You are an helpful AI bot"),
         HumanMessage(content="What’s the weather like in Boston right now?")
     ],
     functions=[{
         "name": "get_current_weather",
         "description": "Get the current weather in a given location",
         "parameters": {
             "type": "object",
             "properties": {
                 "location": {
                     "type": "string",
                     "description": "The city and state, e.g. San Francisco, CA"
                 },
                 "unit": {
                     "type": "string",
                     "enum": ["celsius", "fahrenheit"]
                 }
             },
             "required": ["location"]
         }
     }
     ]
)
output
AIMessage(content='', additional_kwargs={'function_call': {'arguments': '{\n  "location": "Boston"\n}', 'name': 'get_current_weather'}}, response_metadata={'token_usage': {'completion_tokens': 16, 'prompt_tokens': 91, 'total_tokens': 107}, 'model_name': 'gpt-3.5-turbo-0613', 'system_fingerprint': None, 'finish_reason': 'function_call', 'logprobs': None}, id='run-4ff96b3c-ec67-4d49-86f1-512046eaf4f4-0')

See the extra additional_kwargs that got passed back to us? We can take that and pass it to an external API to get the data. It saves the hassle of doing output parsing.
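
A sketch of what that hand-off might look like; get_current_weather below is a hypothetical function you'd implement against a real weather API:

import json

# Pull the structured arguments out of the model's function call
function_call = output.additional_kwargs["function_call"]
args = json.loads(function_call["arguments"])  # {'location': 'Boston'}

# Hypothetical: pass the parsed arguments to your own implementation
# weather = get_current_weather(**args)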

Text Embedding Model

Turns text into a vector (a series of numbers that capture the semantic "meaning" of your text). Mainly used when comparing two pieces of text.

BTW: Semantic means "relating to meaning in language or logic."

from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()
text = "Hi! It's time for the beach"
text_embedding = embeddings.embed_query(text)
print (f"Here's a sample: {text_embedding[:5]}...")
print (f"Your embedding is length {len(text_embedding)}")
Here's a sample: [-0.00019339473510626704, -0.003079184563830495, -0.001054639695212245, -0.019258899614214897, -0.015191652812063694]...
Your embedding is length 1536
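
Since embeddings are mainly for comparing texts, here's a quick sketch of scoring two texts with cosine similarity (assuming numpy is installed):

import numpy as np

v1 = embeddings.embed_query("Hi! It's time for the beach")
v2 = embeddings.embed_query("Let's go surfing by the ocean")

# Cosine similarity: values closer to 1 mean more semantically similar
print(np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2)))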

Prompts - Text generally used as instructions to your model

Prompt

What you'll pass to the underlying model

from langchain.llms import OpenAI

llm = OpenAI(model_name="gpt-3.5-turbo-instruct")

# I like to use three double quotation marks for my prompts because it's easier to read
prompt = """
Today is Monday, tomorrow is Wednesday.

What is wrong with that statement?
"""

print(llm(prompt))
The statement is incorrect as it skips Tuesday, which comes after Monday. The correct statement would be "Today is Monday, tomorrow is Tuesday." 

Prompt Template

An object that helps create prompts based on a combination of user input, other non-static information, and a fixed template string.

Think of it as an f-string in Python, but for prompts.

Advanced: Check out LangSmith Hub (https://smith.langchain.com/hub) for more community prompt templates.

from langchain.llms import OpenAI
from langchain import PromptTemplate

llm = OpenAI(model_name="gpt-3.5-turbo-instruct")

# Notice "location" below, that is a placeholder for another value later
template = """
I really want to travel to {location}. What should I do there?

Respond in one short sentence
"""

prompt = PromptTemplate(
    input_variables=["location"],
    template=template,
)

final_prompt = prompt.format(location='Rome')

print (f"Final Prompt: {final_prompt}")
print ("-----------")
print (f"LLM Output: {llm(final_prompt)}")
Final Prompt: 
I really want to travel to Rome. What should I do there?

Respond in one short sentence

-----------
LLM Output: 
Visit famous landmarks, try authentic Italian cuisine, and immerse yourself in the rich culture and history of the city.

Example Selectors

An easy way to select from a series of examples that lets you dynamically place in-context information into your prompt. Often used when your task is nuanced or you have a large list of examples.

Check out the different types of example selectors here.

from langchain.prompts.example_selector import SemanticSimilarityExampleSelector
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.prompts import FewShotPromptTemplate, PromptTemplate
from langchain.llms import OpenAI

llm = OpenAI(model_name="gpt-3.5-turbo-instruct")

example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Example Input: {input}\nExample Output: {output}",
)

# Examples of locations where nouns are found
examples = [
    {"input": "pirate", "output": "ship"},
    {"input": "pilot", "output": "plane"},
    {"input": "driver", "output": "car"},
    {"input": "tree", "output": "ground"},
    {"input": "bird", "output": "nest"},
]
# SemanticSimilarityExampleSelector will select examples that are semantically similar to your input

example_selector = SemanticSimilarityExampleSelector.from_examples(
    # This is the list of examples available to select from.
    examples, 
    
    # This is the embedding class used to produce embeddings which are used to measure semantic similarity.
    OpenAIEmbeddings(), 
    
    # This is the VectorStore class that is used to store the embeddings and do a similarity search over.
    Chroma, 
    
    # This is the number of examples to produce.
    k=2
)
similar_prompt = FewShotPromptTemplate(
    # The object that will help select examples
    example_selector=example_selector,
    
    # Your prompt
    example_prompt=example_prompt,
    
    # Customizations that will be added to the top and bottom of your prompt
    prefix="Give the location an item is usually found in",
    suffix="Input: {noun}\nOutput:",
    
    # What inputs your prompt will receive
    input_variables=["noun"],
)
# Select a noun!
my_noun = "plant"
# my_noun = "student"

print(similar_prompt.format(noun=my_noun))
llm(similar_prompt.format(noun=my_noun))

Output Parsers Method 1: Prompt Instructions & String Parsing

A helpful method for formatting the output of a model. Usually used for structured output. You can find a list of more output parsers in LangChain's documentation.

Two big concepts:

  1. Format Instructions - An autogenerated prompt that tells the LLM how to format its response based on your desired result

  2. Parser - A method that extracts your model's text output into a desired structure (usually json)

from langchain.output_parsers import StructuredOutputParser, ResponseSchema
from langchain.prompts import ChatPromptTemplate, HumanMessagePromptTemplate
from langchain.llms import OpenAI
llm = OpenAI(model_name="gpt-3.5-turbo-instruct")
# How you would like your response structured. This is basically a fancy prompt template
response_schemas = [
    ResponseSchema(name="bad_string", description="This a poorly formatted user input string"),
    ResponseSchema(name="good_string", description="This is your response, a reformatted response")
]

# How you would like to parse your output
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
# See the prompt template you created for formatting
format_instructions = output_parser.get_format_instructions()
print (format_instructions)
The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"bad_string": string  // This a poorly formatted user input string
	"good_string": string  // This is your response, a reformatted response
}
```
template = """
You will be given a poorly formatted string from a user.
Reformat it and make sure all the words are spelled correctly

{format_instructions}

% USER INPUT:
{user_input}

YOUR RESPONSE:
"""

prompt = PromptTemplate(
    input_variables=["user_input"],
    partial_variables={"format_instructions": format_instructions},
    template=template
)

promptValue = prompt.format(user_input="welcom to califonya!")

print(promptValue)
You will be given a poorly formatted string from a user.
Reformat it and make sure all the words are spelled correctly

The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"bad_string": string  // This a poorly formatted user input string
	"good_string": string  // This is your response, a reformatted response
}
```

% USER INPUT:
welcom to califonya!

YOUR RESPONSE:
llm_output = llm(promptValue)
llm_output
'```json\n{\n\t"bad_string": "welcom to califonya!",\n\t"good_string": "Welcome to California!"\n}\n```'
output_parser.parse(llm_output)
{'bad_string': 'welcom to califonya!', 'good_string': 'Welcome to California!'}

Output Parsers Method 2: OpenAI Functions

When OpenAI released function calling, the game changed. This is the recommended method when starting out.

They trained models specifically to output structured data. It became super easy to specify a Pydantic schema and get a structured output.

There are many ways to define your schema; I prefer using Pydantic models because of how organized they are. Feel free to reference OpenAI's documentation for other methods.

In order to use this method you'll need a model that supports function calling. I'll use gpt-4-0613.

Example 1:

Let's start off by defining a simple model that we can extract from.

from langchain.pydantic_v1 import BaseModel, Field
from typing import Optional

class Person(BaseModel):
    """Identifying information about a person."""

    name: str = Field(..., description="The person's name")
    age: int = Field(..., description="The person's age")
    fav_food: Optional[str] = Field(None, description="The person's favorite food")

Then let's create a chain (more on this later) that will do the extraction for us

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model='gpt-4-0613')

structured_llm = llm.with_structured_output(Person)
structured_llm.invoke(
    "Sally is 13, Joey just turned 12 and loves spinach. Caroline is 10 years older than Sally."
)
Person(name='Sally', age=13, fav_food=None)

Notice how we only got data about one person? That's because we didn't specify we wanted more than one. Let's change our schema to specify that we want a list of people if possible.

from typing import Sequence

class People(BaseModel):
    """Identifying information about all people in a text."""

    people: Sequence[Person] = Field(..., description="The people in the text")

Now we're calling for People rather than Person

llm = ChatOpenAI(model='gpt-4-0613')

structured_llm = llm.with_structured_output(People)
structured_llm.invoke(
    "Sally is 13, Joey just turned 12 and loves spinach. Caroline is 10 years older than Sally."
)
People(people=[Person(name='Sally', age=13, fav_food=''), Person(name='Joey', age=12, fav_food='spinach'), Person(name='Caroline', age=23, fav_food='')])

Let's go a bit further with the parsing.

Example 2: Enum

Now let's parse out which products from a list were mentioned.

import enum

llm = ChatOpenAI(model='gpt-4-0613')

class Product(str, enum.Enum):
    CRM = "CRM"
    VIDEO_EDITING = "VIDEO_EDITING"
    HARDWARE = "HARDWARE"
class Products(BaseModel):
    """Identifying products that were mentioned in a text"""

    products: Sequence[Product] = Field(..., description="The products mentioned in a text")
structured_llm = llm.with_structured_output(Products)
structured_llm.invoke(
    "The CRM in this demo is great. Love the hardware. The microphone is also cool. Love the video editing"
)
Products(products=[<Product.CRM: 'CRM'>, <Product.HARDWARE: 'HARDWARE'>, <Product.VIDEO_EDITING: 'VIDEO_EDITING'>])

Indexes - Structuring documents so LLMs can work with them

Document Loaders

Easy ways to import data from other sources. Shares functionality with OpenAI Plugins, specifically the retrieval plugin.

See the big list of document loaders here. There are plenty more on Llama Index as well.

HackerNews
from langchain.document_loaders import HNLoader
USER_AGENT environment variable not set, consider setting it to identify your requests.
loader = HNLoader("https://news.ycombinator.com/item?id=34422627")
data = loader.load()
print (f"Found {len(data)} comments")
print (f"Here's a sample:\n\n{''.join([x.page_content[:150] for x in data[:2]])}")
Found 76 comments
Here's a sample:

Ozzie_osman on Jan 18, 2023  
             | next [–] 

LangChain is awesome. For people not sure what it's doing, large language models (LLMs) are veOzzie_osman on Jan 18, 2023  
             | parent | next [–] 

Also, another library to check out is GPT Index (https://github.com/jerryjliu/gpt_ind
Books from Gutenberg Project
from langchain.document_loaders import GutenbergLoader

loader = GutenbergLoader("https://www.gutenberg.org/cache/epub/2148/pg2148.txt")

data = loader.load()
print(data[0].page_content[1855:1984])
o.—_Seneca_.





      At Paris, just after dark one gusty evening in the autumn of 18-,


      I was enjoying the twofold l
URLs and webpages

Let's try it out on Paul Graham's website

from langchain.document_loaders import UnstructuredURLLoader

urls = [
    "http://www.paulgraham.com/",
]

loader = UnstructuredURLLoader(urls=urls)

data = loader.load()

data[0].page_content
'New: The Right Kind of Stubborn | Google | Superlinear Want to start a startup? Get funded by Y Combinator . © mmxxiv pg'

Text Splitters

Often your document will be too long (like a book) for your LLM. You need to split it up into chunks. Text splitters help with this.

There are many ways you could split your text into chunks; experiment with different ones to see which is best for you.

from langchain.text_splitter import RecursiveCharacterTextSplitter
# This is a long document we can split up.
with open('langchain-tutorials/data/PaulGrahamEssays/worked.txt') as f:
    pg_work = f.read()
    
print (f"You have {len([pg_work])} document")
You have 1 document
text_splitter = RecursiveCharacterTextSplitter(
    # Set a really small chunk size, just to show.
    chunk_size = 150,
    chunk_overlap  = 20,
)

texts = text_splitter.create_documents([pg_work])
print (f"You have {len(texts)} documents")
You have 610 documents
print ("Preview:")
print (texts[0].page_content, "\n")
print (texts[1].page_content)
Preview:
February 2021Before college the two main things I worked on, outside of school,
were writing and programming. I didn't write essays. I wrote what 

beginning writers were supposed to write then, and probably still
are: short stories. My stories were awful. They had hardly any plot,

There are a lot of different ways you could do text splitting, and it really depends on your retrieval strategy and application design. Check out more splitters here.
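
As one alternative, here's a sketch using CharacterTextSplitter, which only splits on a single fixed separator (reusing pg_work from above):

from langchain.text_splitter import CharacterTextSplitter

char_splitter = CharacterTextSplitter(
    separator="\n\n",  # only split on blank lines
    chunk_size=150,
    chunk_overlap=20,
)
char_texts = char_splitter.create_documents([pg_work])
print(f"You have {len(char_texts)} documents")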

Retrievers

An easy way to combine documents with language models.
There are many types of retrievers; the most widely supported is the VectorStoreRetriever.

from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

loader = TextLoader('langchain-tutorials/data/PaulGrahamEssays/worked.txt')
documents = loader.load()
# Get your splitter ready
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)

# Split your docs into texts
texts = text_splitter.split_documents(documents)

# Get your embedding engine ready
embeddings = OpenAIEmbeddings()

# Embed your texts
db = FAISS.from_documents(texts, embeddings)
# Init your retriever
retriever = db.as_retriever()
retriever
VectorStoreRetriever(tags=['FAISS', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x00000279FF667E60>)
docs = retriever.invoke("what types of things did the author want to build?")
print("\n\n".join([x.page_content[:200] for x in docs[:2]]))
standards; what was the point? No one else wanted one either, so
off they went. That was what happened to systems work.I wanted not just to build things, but to build things that would
last.In this di

much of it in grad school.Computer Science is an uneasy alliance between two halves, theory
and systems. The theory people prove things, and the systems people
build things. I wanted to build things. 
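
If you do want just one document back, here's a small sketch passing search_kwargs when creating the retriever:

# Ask the underlying vectorstore for only 1 document per query
retriever = db.as_retriever(search_kwargs={"k": 1})
docs = retriever.invoke("what types of things did the author want to build?")
len(docs)  # -> 1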

VectorStores

Databases to store vectors. The most popular ones are Pinecone and Weaviate. More examples are in OpenAI's retriever documentation. Chroma and FAISS are easy to work with locally.

Conceptually, think of them as tables with an embedding (vector) column and a metadata column:

Embedding                                            | Metadata
[-0.00015641732898075134, -0.003165106289088726, …]  | {'date': '1/2/23'}
[-0.00035465431654651654, 1.4654131651654516546, …]  | {'date': '1/3/23'}
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

loader = TextLoader('langchain-tutorials/data/PaulGrahamEssays/worked.txt')
documents = loader.load()

# Get your splitter ready
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)

# Split your docs into texts
texts = text_splitter.split_documents(documents)

# Get your embedding engine ready
embeddings = OpenAIEmbeddings()
print (f"You have {len(texts)} documents")
You have 78 documents
embedding_list = embeddings.embed_documents([text.page_content for text in texts])
print (f"You have {len(embedding_list)} embeddings")
print (f"Here's a sample of one: {embedding_list[0][:3]}...")
You have 78 embeddings
Here's a sample of one: [-0.0015655719907954335, -0.010226364247500896, -0.012980138882994652]...

Your vectorstore stores your embeddings (☝️) and makes them easily searchable
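
For example, you can run a semantic search over the store directly; a sketch reusing the texts and embeddings from above:

# Store the embeddings in FAISS, then search them by meaning
db = FAISS.from_documents(texts, embeddings)
results = db.similarity_search("What did the author work on?", k=2)
print(results[0].page_content[:200])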

Memory

Helping LLMs remember information.

Memory is a bit of a loose term. It could be as simple as remembering information you've chatted about in the past, or more complicated information retrieval.

We'll keep it towards the chat message use case. This would be used for chat bots.

There are many types of memory; explore the documentation to see which one fits your use case.

Chat Message History

from langchain.memory import ChatMessageHistory
from langchain_openai import ChatOpenAI

chat = ChatOpenAI(temperature=0)

history = ChatMessageHistory()

history.add_ai_message("hi!")

history.add_user_message("what is the capital of france?")
history.messages
[AIMessage(content='hi!'),
 HumanMessage(content='what is the capital of france?')]
ai_response = chat.invoke(history.messages)
ai_response
AIMessage(content='The capital of France is Paris.', response_metadata={'token_usage': {'completion_tokens': 7, 'prompt_tokens': 20, 'total_tokens': 27}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-fe8d6ead-a686-4b74-9105-05b6a6328fff-0', usage_metadata={'input_tokens': 20, 'output_tokens': 7, 'total_tokens': 27})
history.add_ai_message(ai_response.content)
history.messages
[AIMessage(content='hi!'),
 HumanMessage(content='what is the capital of france?'),
 AIMessage(content='The capital of France is Paris.')]
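
As one example of another memory type, here's a minimal sketch of ConversationBufferMemory, which wraps the same idea behind a save/load interface:

from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory()
memory.save_context({"input": "hi"}, {"output": "hello! how can I help?"})

# Returns the conversation so far as a single history string
memory.load_memory_variables({})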

Chains

Combining different LLM calls and actions automatically.

Ex: Summary #1, Summary #2, Summary #3 > Final Summary

Check out this video to explain different summarization chain types.

There are many applications of chains; look through them to see which are best for your use case.

We'll cover two of them:

1. Simple Sequential Chains

Simple chains where the output of one LLM can be used as the input of another. Good for breaking up tasks (and keeping your LLM focused).

from langchain_openai import OpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.chains import SimpleSequentialChain

llm = OpenAI(temperature=1)
template = """Your job is to come up with a classic dish from the area that the users suggests.
% USER LOCATION
{user_location}

YOUR RESPONSE:
"""
prompt_template = PromptTemplate(input_variables=["user_location"], template=template)

# Holds my 'location' chain
location_chain = LLMChain(llm=llm, prompt=prompt_template)
template = """Given a meal, give a short and simple recipe on how to make that dish at home.
% MEAL
{user_meal}

YOUR RESPONSE:
"""
prompt_template = PromptTemplate(input_variables=["user_meal"], template=template)

# Holds my 'meal' chain
meal_chain = LLMChain(llm=llm, prompt=prompt_template)
overall_chain = SimpleSequentialChain(chains=[location_chain, meal_chain], verbose=True)
review = overall_chain.invoke("Rome")
> Entering new SimpleSequentialChain chain...

A classic dish from Rome is Cacio e Pepe, which translates to "cheese and pepper". This simple yet delicious pasta dish is made with only a few ingredients: spaghetti, Pecorino Romano cheese, and black pepper. The cheese is melted into a creamy sauce with the pasta water, and then mixed with spaghetti and freshly ground black pepper. It's a staple in Rome and can be found in many restaurants and trattorias throughout the city.
Ingredients:
- 12 oz spaghetti
- 1 cup Pecorino Romano cheese, freshly grated
- 2 tbsp freshly ground black pepper

Instructions:
1. Cook spaghetti according to package instructions, reserving 1 cup of pasta water before draining.
2. In a large pan, heat a couple of tablespoons of the reserved pasta water over medium heat.
3. Gradually add in half of the grated Pecorino Romano cheese, stirring constantly to melt and create a creamy consistency.
4. Add in the cooked spaghetti and mix well, adding more pasta water if needed.
5. Once the spaghetti is coated in the cheese sauce, add in the remaining grated cheese and freshly ground black pepper, mixing well.
6. Serve hot and enjoy your homemade Cacio e Pepe! Optional: top with extra cheese and pepper to taste.

> Finished chain.

2. Summarization Chain(摘要链)

Easily run through long or numerous documents and get a summary. Check out this video for chain types other than map-reduce.

from langchain.chains.summarize import load_summarize_chain
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

loader = TextLoader('langchain-tutorials/data/PaulGrahamEssays/disc.txt')
documents = loader.load()

# Get your splitter ready
text_splitter = RecursiveCharacterTextSplitter(chunk_size=700, chunk_overlap=50)

# Split your docs into texts
texts = text_splitter.split_documents(documents)

# There is a lot of complexity hidden in this one line. I encourage you to check out the video above for more detail
chain = load_summarize_chain(llm, chain_type="map_reduce", verbose=True)
chain.run(texts)
> Entering new MapReduceDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Write a concise summary of the following:


"January 2017Because biographies of famous scientists tend to 
edit out their mistakes, we underestimate the 
degree of risk they were willing to take.
And because anything a famous scientist did that
wasn't a mistake has probably now become the
conventional wisdom, those choices don't
seem risky either.Biographies of Newton, for example, understandably focus
more on physics than alchemy or theology.
The impression we get is that his unerring judgment
led him straight to truths no one else had noticed.
How to explain all the time he spent on alchemy
and theology?  Well, smart people are often kind of
crazy.But maybe there is a simpler explanation. Maybe"


CONCISE SUMMARY:
Prompt after formatting:
Write a concise summary of the following:


"the smartness and the craziness were not as separate
as we think. Physics seems to us a promising thing
to work on, and alchemy and theology obvious wastes
of time. But that's because we know how things
turned out. In Newton's day the three problems 
seemed roughly equally promising. No one knew yet
what the payoff would be for inventing what we
now call physics; if they had, more people would 
have been working on it. And alchemy and theology
were still then in the category Marc Andreessen would 
describe as "huge, if true."Newton made three bets. One of them worked. But 
they were all risky."


CONCISE SUMMARY:

> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
Write a concise summary of the following:


"
Famous scientists' biographies tend to hide their mistakes, leading to an underestimation of the risks they took. As conventional wisdom, their successful choices don't seem risky. However, these scientists also spent time on less conventional areas such as alchemy and theology. This could be explained by the fact that intelligent people often have eccentricities.

 In Newton's time, physics, alchemy, and theology were seen as equally promising fields to work in. However, now we know that physics was the most successful, but at the time they were all considered risky ventures with unknown potential outcomes."


CONCISE SUMMARY:

> Finished chain.

> Finished chain.





"\n\nFamous scientists' biographies often overlook their mistakes, making their choices seem less risky. However, they also pursued less conventional areas such as alchemy and theology, possibly due to their eccentricities. In the past, alchemy and theology were seen as promising as physics, but now we know physics was the most successful. These fields were considered risky with uncertain outcomes at the time."

Agents

The official LangChain documentation describes agents perfectly:

Some applications will require not just a predetermined chain of calls to LLMs/other tools, but potentially an unknown chain that depends on the user's input. In these types of chains, there is an "agent" which has access to a suite of tools. Depending on the user input, the agent can then decide which, if any, of these tools to call.

Basically you use the LLM not just for text output, but also for decision making. The coolness and power of this functionality can't be overstated.

Sam Altman emphasizes that LLMs are good "reasoning engines". Agents take advantage of this.

Agents

The language model that drives decision making.

More specifically, an agent takes in an input and returns a response corresponding to an action to take along with an action input. You can see different types of agents (which are better for different use cases) here.

Tools

A "capability" of an agent. This is an abstraction on top of a function that makes it easy for LLMs (and agents) to interact with it. Ex: Google search.

This area shares commonalities with OpenAI plugins.

Toolkit

Groups of tools that your agent can select from.

Let's bring them all together:

from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.llms import OpenAI
import json

llm = OpenAI(temperature=0)
serpapi_api_key=os.getenv("SERP_API_KEY", "YourAPIKey")

(I wasn't able to get this API key, so this cell isn't run here.)

toolkit = load_tools(["serpapi"], llm=llm, serpapi_api_key=serpapi_api_key)
agent = initialize_agent(toolkit, llm, agent="zero-shot-react-description", verbose=True, return_intermediate_steps=True)
response = agent({"input": "what was the first album of the "
                           "band that Natalie Bergman is a part of?"})
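
Since I couldn't run that cell, here's a sketch of how you'd inspect the result; with return_intermediate_steps=True the response dict includes the agent's intermediate steps alongside the final output:

print(response["output"])

# The (action, observation) pairs the agent took along the way
print(json.dumps(response["intermediate_steps"], indent=2, default=str))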