Sorry, I'm not sure exactly why you decided to go with the conversation-memory-based approach. The good news is that it's not required for this use case. The point of the optimization is to save cost by reducing the token count of each request, and to improve overall performance by asking only for what is required.
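
To make the cost argument concrete, here is a rough sketch of how a prompt that carries conversation history compares with one that sends only the current request. Everything in it is an assumption for illustration: the helper names (`estimate_tokens`, `build_prompt_with_memory`, `build_stateless_prompt`) are hypothetical, the sample messages are made up, and the word count is only a crude stand-in for a real tokenizer.

```python
from typing import List


def estimate_tokens(text: str) -> int:
    """Crude token estimate: counts whitespace-separated words.

    Assumption: a real tokenizer will give different numbers, but the
    relative comparison between the two prompts still holds.
    """
    return len(text.split())


def build_prompt_with_memory(history: List[str], question: str) -> str:
    """Hypothetical memory-based prompt: prior turns are resent every time."""
    return "\n".join(history + [question])


def build_stateless_prompt(question: str) -> str:
    """Hypothetical stateless prompt: only the current request is sent."""
    return question


# Made-up sample conversation, used only to compare sizes.
history = [
    "User: What is the capital of France?",
    "Assistant: The capital of France is Paris.",
    "User: And what about Germany?",
    "Assistant: The capital of Germany is Berlin.",
]
question = "User: Summarize this ticket in one sentence."

with_memory = estimate_tokens(build_prompt_with_memory(history, question))
stateless = estimate_tokens(build_stateless_prompt(question))

print(f"with memory: ~{with_memory} tokens, stateless: ~{stateless} tokens")
```

With memory, the prompt grows with every turn, so the cost per request keeps rising. If the use case does not depend on earlier turns, the stateless prompt stays the same size no matter how long the session runs.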