History Manager
📘 HistoryManager Documentation¶
The HistoryManager is a critical component responsible for managing the conversation history, tool call results, and references during the orchestration process. It ensures that the history provided to the LLM is optimized to fit within the token window constraints while maintaining a coherent and complete context for the conversation.
🔑 Key Responsibilities (Expanded)¶
1. Conversation History Management¶
-
Tracking and Storing History:
TheHistoryManagermaintains a detailed record of all user messages, assistant responses, and tool call results. This includes both the current conversation loop and any prior interactions stored in the database.- User Messages: Captures the original user input and any rendered versions (e.g., processed with Jinja templates).
- Assistant Responses: Tracks the assistant's replies, including system-generated messages and tool call results.
- Tool Call Results: Logs the outputs of tools invoked during the conversation.
-
Combining Uploaded Content:
Uploaded files, such as documents or images, are integrated into the conversation history. This ensures that the LLM has access to all relevant context when generating responses.- Uploaded content is processed and merged with the conversation history.
- A portion of the token window is reserved for this content, as configured in the
UploadedContentConfig.
-
Unified View:
TheHistoryManagercreates a cohesive history by merging uploaded content, user messages, and assistant responses. This unified view is essential for providing the LLM with a complete context for generating accurate and relevant answers.
2. Token Window Optimization¶
-
Token Limit Awareness:
Each LLM has a fixed token limit for the input it can process in a single API call. TheHistoryManagerensures that the conversation history fits within this limit by dynamically adjusting its size. -
Dynamic Reduction with Loop Token Reducer:
The Loop Token Reducer is a specialized component used to reduce the size of the history dynamically. It prioritizes the most relevant references and messages while discarding less critical information.- Reference Reduction: Limits the number of references included in the history to avoid exceeding the token window.
- Message Prioritization: Ensures that the most recent and relevant messages are retained.
-
Balancing Uploaded Content and History:
TheHistoryManagerallocates a portion of the token window for uploaded content and the remaining portion for conversation history. This balance is configurable and ensures that both types of information are represented effectively. -
Optimization Goals:
- Maximize the amount of relevant information provided to the LLM.
- Ensure that the history remains coherent and complete, even after reduction.
3. Tool Call Integration¶
-
Appending Tool Call Queries:
When the orchestrator invokes tools, theHistoryManagerappends the tool call queries to the history. This ensures that the LLM has a record of the tools it requested and their purposes. -
Appending Tool Call Results:
After the tools return their results, theHistoryManagerappends these outputs to the history. This includes:- Successful Results: The content or references generated by the tool.
- Failed Results: Error messages indicating why the tool call failed.
-
Temporary and Persistent Storage:
- Tool call queries and results are temporarily stored during the current conversation loop.
- Persistent storage of tool call data is not yet implemented but is a planned improvement.
-
Context for Subsequent Interactions:
By integrating tool call queries and results into the history, theHistoryManagerensures that the LLM has the necessary context for follow-up interactions.
4. Post-Processing and Cleanup¶
-
Removing Unnecessary Content:
TheHistoryManagerhandles post-processing steps to clean up the history. This includes removing content that is not directly relevant to the LLM's understanding of the conversation. Examples include:- Follow-Up Questions: Generated by post-processors but not part of the LLM's output.
- Stock Tickers or Graphs: Added for user display but irrelevant to the LLM.
-
Using
remove_from_text:
Theremove_from_textfunction is used to strip out post-processed content from the history. This ensures that the LLM is not confused by content it did not generate. -
Maintaining Coherence:
Post-processing ensures that the history remains coherent and focused on the conversation's core context. This improves the LLM's ability to generate accurate and relevant responses. -
Dynamic Adjustments:
Post-processing is applied dynamically during each loop of the orchestrator, ensuring that the history is always optimized for the current interaction.
🛠️ Key Functionalities¶
1. Adding Tool Call Results¶
-
add_tool_call_results(tool_call_results: list[ToolCallResponse])
Appends the results of tool calls to the history. If a tool call fails, an error message is added instead.
-
_append_tool_call_result_to_history(tool_response: ToolCallResponse)
Adds a successful tool call result to the history.
2. Retrieving History for Model Calls¶
get_history_for_model_call(...)
Retrieves the conversation history formatted for the LLM, ensuring it fits within the token window. This includes:- The original user message.
- The rendered user message (processed via Jinja templates).
- The rendered system message.
- The
remove_from_textfunctions are handed over function to clean up the history from post processing steps or other artifacts in the messages that are not produced by the LLM. In order to not confuse the LLM with these text snippets that have not been produced by it. E.g. follow-up questions.
3. Token Window Management¶
- The Loop Token Reducer is used to dynamically adjust the size of the history to fit within the LLM's token limit. This involves:
- Reducing the number of references included in the history.
- Prioritizing the most relevant chunks and messages.
- Ensuring that the history remains coherent and complete.
4. Appending Tool Calls¶
_append_tool_calls_to_history(tool_calls: list[LanguageModelFunction])
Adds tool call queries to the history.
5. Assistant Message Management¶
add_assistant_message(message: LanguageModelAssistantMessage)
Appends an assistant message to the history.
🛠️ Areas for Improvement¶
- Tool Call and Tool Message Persistence
-
Currently, tool calls and tool messages are not saved in the database. This limits the ability to reconstruct past interactions fully.
-
Uploaded Content Correlation
-
Uploaded images and files are not directly linked to user messages in the history. This makes it difficult to reconstruct the context of uploaded content.
-
Code Cleanup
-
The history construction logic, especially for database interactions, requires refactoring for better maintainability and clarity.
-
Enhanced Reference Management
- Improve the integration with the
ReferenceManagerto better handle references across multiple iterations and tools.