_plan_or_execute(): generate an answer or request tool calls
Decides how to call the model:
- Builds messages via _compose_message_plan_execution() (which renders prompts and history). These messages are the instructions and context from which the LLM derives its next action: either calling tools to gather more information or streaming a message.
- Case 1: If forced tools exist and it’s the first iteration → call the LLM once per toolChoice, then merge tool calls and references across the responses (the option merge is sketched just after this list).
- Case 2: If it’s the last iteration → do not provide tools; force the model to answer.
- Case 3: Default → provide tools; let the model decide to call tools or stream content.
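Case 1 issues one LLM call per forced tool, each request carrying its own toolChoice entry in the options. Here is a minimal sketch of how that option merge behaves; the option names and toolChoice values are made-up placeholders, and only the `| {"toolChoice": opt}` pattern is taken from the listing below:

```python
# Illustration of the per-call option merge used for forced tools.
# "additional_llm_options" and the toolChoice values below are made-up
# placeholders, not the real configuration of this codebase.
additional_llm_options = {"maxTokens": 1024}

forced_tools = [
    {"type": "function", "function": {"name": "web_search"}},
    {"type": "function", "function": {"name": "internal_search"}},
]

# Python 3.9+ dict union: the right-hand side wins on key collisions,
# so every call shares the base options but carries its own toolChoice.
payloads = [additional_llm_options | {"toolChoice": opt} for opt in forced_tools]

for payload in payloads:
    print(payload)
```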
The method always returns a LanguageModelStreamResponse, which can include:
- message.text and/or
- tool_calls
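To make that contract concrete, here is a hedged sketch of the response shape this section relies on; the field names mirror the listing below, but the actual class in the SDK may define more fields:

```python
# Hedged sketch of the response contract; the real
# LanguageModelStreamResponse in the SDK may differ.
from dataclasses import dataclass, field
from typing import Any, List, Optional

@dataclass
class StreamMessage:
    text: str = ""  # streamed answer text, if any
    references: List[Any] = field(default_factory=list)  # cited sources

@dataclass
class LanguageModelStreamResponse:
    message: StreamMessage = field(default_factory=StreamMessage)
    tool_calls: Optional[List[Any]] = None  # None when no tools were requested
```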
This makes heavy use of the function complete_with_references_async, which is described in detail here: LINK.
In a nutshell, it either streams a response from the LLM and/or returns the LLM's decision that more tools need to be used before it can answer.
Should it stream, it makes sure the references are automatically cited correctly and with proper linkage, so they can be displayed to the user.
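A hedged sketch of how a caller consumes that dual outcome; chat_service, messages, and tools are placeholders here, and only complete_with_references_async and the response fields come from the listing below:

```python
# Illustrative only: branches on the two outcomes described above.
async def one_iteration(chat_service, messages, tools):
    response = await chat_service.complete_with_references_async(
        messages=messages,
        model_name="my-model",  # placeholder model name
        tools=tools,
    )
    if response.tool_calls:
        # The LLM decided it needs more information: return the calls so
        # the outer loop can execute them and extend the history.
        return response.tool_calls
    # Otherwise the LLM streamed a final answer whose references are
    # already cited and linked for display.
    return response.message.text, response.message.references
```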
Flow diagram:
Detailed description of Step A from the Main Control flow:
```python
@track()
async def _plan_or_execute(self) -> LanguageModelStreamResponse:
    self._logger.info("Planning or executing the loop.")
    messages = await self._compose_message_plan_execution()
    self._logger.info("Done composing message plan execution.")

    # Case 1: force tool calls, but only in the first iteration.
    if (
        len(self._tool_manager.get_forced_tools()) > 0
        and self.current_iteration_index == 0
    ):
        self._logger.info("It needs forced tool calls.")
        self._logger.info(f"Forced tools: {self._tool_manager.get_forced_tools()}")
        # One LLM call per forced tool, each with its own toolChoice.
        responses = [
            await self._chat_service.complete_with_references_async(
                messages=messages,
                model_name=self._config.space.language_model.name,
                tools=self._tool_manager.get_tool_definitions(),
                content_chunks=self._reference_manager.get_chunks(),
                start_text=self.start_text,
                debug_info=self._debug_info_manager.get(),
                temperature=self._config.agent.experimental.temperature,
                other_options=self._config.agent.experimental.additional_llm_options
                | {"toolChoice": opt},
            )
            for opt in self._tool_manager.get_forced_tools()
        ]
        # Merge tool calls and references from all responses into the first one.
        tool_calls = []
        references = []
        for r in responses:
            if r.tool_calls:
                tool_calls.extend(r.tool_calls)
            references.extend(r.message.references)
        stream_response = responses[0]
        stream_response.tool_calls = tool_calls if len(tool_calls) > 0 else None
        stream_response.message.references = references

    # Case 2: last iteration -> no tools offered, the model must answer now.
    elif self.current_iteration_index == self._config.agent.max_loop_iterations - 1:
        self._logger.info("We are in the last iteration; we need to produce an answer now.")
        stream_response = await self._chat_service.complete_with_references_async(
            messages=messages,
            model_name=self._config.space.language_model.name,
            content_chunks=self._reference_manager.get_chunks(),
            start_text=self.start_text,
            debug_info=self._debug_info_manager.get(),
            temperature=self._config.agent.experimental.temperature,
            other_options=self._config.agent.experimental.additional_llm_options,
        )

    # Case 3: default -> offer tools and let the model decide.
    else:
        self._logger.info(
            f"We are in iteration {self.current_iteration_index}; asking the model "
            "whether it should use tools or just stream."
        )
        stream_response = await self._chat_service.complete_with_references_async(
            messages=messages,
            model_name=self._config.space.language_model.name,
            tools=self._tool_manager.get_tool_definitions(),
            content_chunks=self._reference_manager.get_chunks(),
            start_text=self.start_text,
            debug_info=self._debug_info_manager.get(),
            temperature=self._config.agent.experimental.temperature,
            other_options=self._config.agent.experimental.additional_llm_options,
        )

    return stream_response
```
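Note the fan-out in Case 1: chat-completion APIs typically accept only a single toolChoice per request, so forcing several tools means several requests whose tool calls and references are merged back into the first response object. For orientation, here is a hedged sketch of how the surrounding Main Control flow might consume the return value; run_loop and _execute_tool_calls are hypothetical names, not the actual loop, which is documented separately:

```python
# Hypothetical caller of _plan_or_execute(), for orientation only; the
# real Main Control flow may differ in detail.
async def run_loop(agent, config):
    for iteration in range(config.agent.max_loop_iterations):
        agent.current_iteration_index = iteration
        stream_response = await agent._plan_or_execute()

        if not stream_response.tool_calls:
            # The model streamed a final answer; Step A is done.
            return stream_response

        # Otherwise execute the requested tools so their results feed
        # into the next _compose_message_plan_execution() round.
        await agent._execute_tool_calls(stream_response.tool_calls)
```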