I'm building a sales and technical support agent for a customer of mine, and the problem I'm facing is that with Ollama 3.2 the responses are taking very long (2-4 min) and the number of tokens they use is huge. Has anyone encountered this type of symptom, and if so, could you please guide me towards a solution?