VectorStore-Backed Memory

Ricardo Reis
5 min read · May 28, 2023

VectorStoreRetrieverMemory stores memories in a VectorDB and queries the top-K most "salient" documents every time it is called.

Here, "memories" refers to fragments of conversations or other pertinent information the AI should remember. "VectorDB" is the database where these memories are stored as numeric vectors. The "top-K most 'salient' documents" are the K documents (where K is a number) that stand out most, i.e. are most relevant, for a given query or interaction. So every time VectorStoreRetrieverMemory is called, it looks up and returns the K most relevant documents stored in the database.

"VectorStore-Backed Memory" refers to a memory-storage technique in which data is kept in a structure called a VectorStore. In this method, conversation fragments (or any other relevant information) are converted into numeric vectors and stored in a vector database.

This approach allows data to be retrieved efficiently based on semantic similarity, that is, on the content and meaning of the information rather than on exact keyword matches alone. This is particularly useful in AI applications such as chatbots, where the ability to understand and refer back to contextual information from earlier in the conversation can significantly improve the quality of the interaction.
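To make "semantic similarity" concrete, here is a minimal standalone sketch (not part of LangChain) that ranks stored memories against a query by cosine similarity of their embedding vectors; the toy 2-D vectors stand in for real embeddings such as those produced by OpenAIEmbeddings.

import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Similarity of two embedding vectors, independent of their magnitude
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_k(query_vec: np.ndarray, memory_vecs: list, k: int) -> list:
    # Indices of the k stored vectors most similar to the query
    scores = [cosine_similarity(query_vec, v) for v in memory_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

# Toy 2-D "embeddings": the sports query lands closest to the soccer memory
memories = {"pizza": np.array([0.9, 0.1]), "soccer": np.array([0.1, 0.9])}
query = np.array([0.2, 0.8])
best = top_k(query, list(memories.values()), k=1)[0]
print(list(memories)[best])  # soccer

(The FAISS index used below ranks by L2 distance rather than cosine similarity, but the top-K idea is the same.)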

VectorStoreRetrieverMemory differs from most other Memory classes in that it does not explicitly track the order of interactions.

In this case, the "docs" are snippets of earlier conversation. This can be useful for referring back to relevant information the AI was given near the start of the conversation.

from datetime import datetime
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.memory import VectorStoreRetrieverMemory
from langchain.chains import ConversationChain
from langchain.prompts import PromptTemplate

Initialize your VectorStore

Depending on the store you choose, this step may look different. Consult the relevant VectorStore documentation for details (a Chroma-based alternative is sketched after the FAISS example below).

import faiss

from langchain.docstore import InMemoryDocstore
from langchain.vectorstores import FAISS


embedding_size = 1536 # Dimensions of the OpenAIEmbeddings
index = faiss.IndexFlatL2(embedding_size)
embedding_fn = OpenAIEmbeddings().embed_query
vectorstore = FAISS(embedding_fn, index, InMemoryDocstore({}), {})
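
For comparison, here is a rough sketch of the equivalent initialization backed by Chroma instead of FAISS. It assumes the chromadb package is installed; Chroma manages its own index, so there is no manual index setup.

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma

# Chroma computes and stores the embeddings itself
vectorstore = Chroma(embedding_function=OpenAIEmbeddings())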

Create your VectorStoreRetrieverMemory

The memory object is instantiated from any VectorStoreRetriever.

# In actual usage, you would set `k` to be a higher value, but we use k=1 to show that
# the vector lookup still returns the semantically relevant information
retriever = vectorstore.as_retriever(search_kwargs=dict(k=1))
memory = VectorStoreRetrieverMemory(retriever=retriever)
# When added to an agent, the memory object can save pertinent information from conversations or tools used
memory.save_context({"input": "My favorite food is pizza"}, {"output": "thats good to know"})
memory.save_context({"input": "My favorite sport is soccer"}, {"output": "..."})
memory.save_context({"input": "I don't the Celtics"}, {"output": "ok"}) #
# Notice the first result returned is the memory pertaining to tax help, which the language model deems more semantically relevant
# to a 1099 than the other documents, despite them both containing numbers.
print(memory.load_memory_variables({"prompt": "what sport should i watch?"})["history"])
input: My favorite sport is soccer
output: ...
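
Under the hood, each save_context call stores a single document whose text combines the input and the output. You can confirm this by querying the retriever directly (a quick inspection sketch, not something needed in normal use):

# Each saved exchange is stored as one Document on the retriever
docs = retriever.get_relevant_documents("what sport should i watch?")
print(docs[0].page_content)
# input: My favorite sport is soccer
# output: ...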

Using it in a chain

Let's walk through an example, again setting verbose=True so we can see the prompt.

llm = OpenAI(temperature=0) # Can be any valid LLM
_DEFAULT_TEMPLATE = """The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.
Relevant pieces of previous conversation:
{history}

(You do not need to use these pieces of information if not relevant)

Current conversation:
Human: {input}
AI:"""
PROMPT = PromptTemplate(
    input_variables=["history", "input"], template=_DEFAULT_TEMPLATE
)
conversation_with_summary = ConversationChain(
    llm=llm,
    prompt=PROMPT,
    memory=memory,
    verbose=True
)
conversation_with_summary.predict(input="Hi, my name is Perry, what's up?")
> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.
Relevant pieces of previous conversation:
input: My favorite food is pizza
output: thats good to know

(You do not need to use these pieces of information if not relevant)

Current conversation:
Human: Hi, my name is Perry, what's up?
AI:
> Finished chain.
" Hi Perry, I'm doing well. How about you?"
# Here, the sports-related content is surfaced
conversation_with_summary.predict(input="what's my favorite sport?")
> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.
Relevant pieces of previous conversation:
input: My favorite sport is soccer
output: ...

(You do not need to use these pieces of information if not relevant)

Current conversation:
Human: what's my favorite sport?
AI:
> Finished chain.
' You told me earlier that your favorite sport is soccer.'
# Even though the language model is stateless, since relevant memory is fetched, it can "reason" about the time.
# Timestamping memories and data is useful in general to let the agent determine temporal relevance (see the timestamping sketch at the end of this post)
conversation_with_summary.predict(input="Whats my favorite food")
> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.
Relevant pieces of previous conversation:
input: My favorite food is pizza
output: thats good to know

(You do not need to use these pieces of information if not relevant)

Current conversation:
Human: Whats my favorite food
AI:
> Finished chain.
' You said your favorite food is pizza.'
# The memories from the conversation are automatically stored,
# since this query best matches the introduction chat above,
# the agent is able to 'remember' the user's name.
conversation_with_summary.predict(input="What's my name?")
> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.
Relevant pieces of previous conversation:
input: Hi, my name is Perry, what's up?
response: Hi Perry, I'm doing well. How about you?

(You do not need to use these pieces of information if not relevant)

Current conversation:
Human: What's my name?
AI:
> Finished chain.
' Your name is Perry.'
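
The comments above mention that timestamping memories helps an agent judge temporal relevance, which is presumably why datetime is imported at the top even though the example never uses it. One simple approach, sketched here rather than built into LangChain, is to prefix each saved input with a timestamp so retrieved snippets carry their own dates:

# A sketch: embed a timestamp in the saved text so retrieved memories
# carry their own dates (not a built-in LangChain feature)
now = datetime.now().strftime("%Y-%m-%d %H:%M")
memory.save_context(
    {"input": f"[{now}] My favorite band is Radiohead"},
    {"output": "noted"},
)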
