I need to build a chatbot that guides people through using an app with some fairly complicated features.
I need to use local AI for privacy.
I need to use RAG for user manuals and educational videos.
I am thinking of a model like Llama 3.1 8B, plus some other local models for transcription and embedding.
Do you think such models, particularly Llama 3.1 8B and the available small embedding models, can perform well in a conversation that guides users through configuring a relatively complex app?
Would Llama 3.1 8B strictly follow a complex system prompt and use tools correctly? Can any other SLMs handle this well?
What about small embedding models?
Any recommendations for small transcription models?
Regards
Shadi Ghaith