Being exposed to AI and LLMs through my job, and being an active investor myself, I have always been fascinated by the question: can we leverage both together?
In a cost-effective, experimental way (and partly because of work) I have been exploring RAG systems, and after using one on the investor conference material of Navin Fluorine, I was truly amazed that it actually worked in practice once I put in around 20+ conference documents.
Has anyone tried something like this? I would love some input. The system I am building can handle many types of documents, but I only started less than two weeks ago, ran into some issues, and have solved most of them during the coding process.
I would like to discuss it with the community, and if the discussion is fruitful I would also love to share it. (Note: I built this as a side project but also spent some company time on it, so I need to ask my employer before sharing, as it was part of their PoC.)
As a note, this is my first post and first thread, so I would also welcome comments on how to write better.
Can you please explain? You put in 20+ conference-call documents; what prompt did you give, and how did it perform?
There is no limit on the number of documents; I have put in around 15+ docs, though ingestion took a while.
As for the prompt, it was earlier designed with a chat-like persona; now I am switching to a financial-analyst persona (testing is still required).
In the current scheme I was using whole con-call transcripts, but I am also testing a summarization + embedding technique with a better model.
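To make the summarize-then-embed idea concrete, here is a minimal sketch: embed a short summary of each chunk for retrieval, but keep the full chunk as the context handed to the LLM. The `summarize()` and `embed()` functions below are placeholder stand-ins for a real summarization model and a real embedding model; this is an illustration of the indexing shape, not the author's actual pipeline.

```python
# Summarize-then-embed indexing sketch: retrieval runs on the summary
# vector, but the full chunk is what the answering LLM sees.

def summarize(text: str) -> str:
    # Placeholder: a real system would call an LLM (e.g. via Ollama) here.
    return text.split(".")[0] + "."

def embed(text: str) -> list[float]:
    # Placeholder: a real system would call an embedding model here.
    # Toy 4-dim "vector" from character statistics, for illustration only.
    return [len(text) % 7, text.count(" ") % 5, ord(text[0]) % 3, 1.0]

def build_index(chunks: list[str]) -> list[dict]:
    index = []
    for chunk in chunks:
        summary = summarize(chunk)
        index.append({
            "vector": embed(summary),   # retrieval happens on the summary
            "summary": summary,
            "full_text": chunk,         # the LLM still gets the full chunk
        })
    return index

index = build_index([
    "Revenue grew 18% this quarter. Management credited new specialty molecules.",
    "Capex guidance was raised. Two new plants are planned for FY25.",
])
```

The design choice here is that summaries tend to embed more cleanly than long noisy transcript chunks, while the full text preserves the numbers the model needs to answer with.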
For example, I was using TinyLlama, but using Mistral, Phi, or even Llama 2 70B could yield better results.
As an example, it was able to explain the growth, describe what molecules the company makes, and answer other things like future capex.
But right now, comparison questions don't work.
I am also trying to find a Claude 2-like large-context model so that a great amount of data can fit in the context.
Also, for testing I would need some people's help to exercise the functionality.
Everything is dockerized, so using just Postman a person can ask questions, upload PDFs, etc.
I know making a nice UI would make things easier, but I am a back-end-first person, and building a front end would only delay things at this stage.
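For testers without Postman, the same calls can be driven from a few lines of Python. The endpoint path (`/ask`) and field names below are illustrative assumptions, not the project's actual API; the sketch just shows the kind of JSON body such a question endpoint would take.

```python
# Build the JSON body for a hypothetical POST /ask endpoint of the
# dockerized RAG service. The schema here is an assumption.
import json

def ask_payload(question: str, top_k: int = 8) -> str:
    """Serialize a question and a retrieval limit as a JSON request body."""
    return json.dumps({"question": question, "top_k": top_k})

body = ask_payload("What molecules does the company make?")
# With the container running, this could then be sent with, e.g.:
#   requests.post("http://localhost:8080/ask", data=body,
#                 headers={"Content-Type": "application/json"})
```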
This sounds very interesting.
Where can I download the Docker image to run similar tests?
Maybe we can form a small group here to share our findings.
I need to re-evaluate the summarization approach and change the prompt; after that I will share the repo link here.
Sounds very interesting. I have tried Phi-2 on LM Studio. How are you uploading the documents? Are you also training the model?
What metrics are you using to evaluate your model's responses? I work extensively on LLMs, and from my experience the model response is not always reliable: the model gives you incorrect responses confidently, and the more data you provide, the greater the chance of hallucination. So I think it's very important to put some metrics in place for response evaluation, especially when analyzing financial documents where the numbers are of utmost importance.
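One simple metric in this spirit, useful when numbers matter most: check that every number in a reference answer also appears in the model's answer. This is only a sanity check, not a full evaluation harness, and the regex is a rough assumption about how figures appear in the text.

```python
# Numeric-recall check for financial QA: what fraction of the numbers
# in the reference answer show up verbatim in the model's answer?
import re

def numbers_in(text: str) -> set[str]:
    # Capture integers, decimals, and percentages, e.g. "18", "4.5", "40%".
    return set(re.findall(r"\d+(?:\.\d+)?%?", text))

def numeric_recall(model_answer: str, reference: str) -> float:
    expected = numbers_in(reference)
    if not expected:
        return 1.0  # nothing numeric to verify
    found = numbers_in(model_answer)
    return len(expected & found) / len(expected)

score = numeric_recall(
    "Revenue grew 18% and capex was 4.5 billion.",
    "Revenue growth was 18%; capex 4.5 billion.",
)
```

Note this is deliberately strict: "18 percent" would not match "18%", so it under-credits paraphrases rather than over-crediting hallucinated figures.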
That is exactly what I want to work on: I want some more real-world questions and answers to check it against.
Phi-2 and others work as well; for this I am using both LM Studio and Ollama support.
I am building a RAG system that uses embedding extraction to talk to a database of these PDFs. As for hallucination, I was earlier using TinyLlama, and after moving to Phi-2 it works better.
I will need suggestions on how to test, since in my organisation I am not responsible for testing, mainly for integration and building.
In the future, I could add finance- or chat-finance-based fine-tuned models to solve this.
In RAG, the best approach I have found is to stuff accurate, good data into the context using a relevance threshold and a result limit.
For example, with Navin Fluorine after Phi-2 the settings were a limit of 8 and 40% relevance, while for ITC it was 20 and 60%.
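The "relevance + limit" context stuffing described above can be sketched in a few lines: keep only hits above a similarity threshold, then cap how many go into the prompt. The default values (0.4 / 8) mirror the numbers mentioned for one company, but they are tuning knobs, not fixed settings.

```python
# Select context for the prompt: filter retrieved hits by a minimum
# relevance score, sort best-first, and cap the count.

def select_context(hits: list[tuple[str, float]],
                   min_score: float = 0.4,
                   limit: int = 8) -> list[str]:
    relevant = [(text, score) for text, score in hits if score >= min_score]
    relevant.sort(key=lambda h: h[1], reverse=True)
    return [text for text, _ in relevant[:limit]]

hits = [("chunk A", 0.72), ("chunk B", 0.35), ("chunk C", 0.55)]
context = select_context(hits, min_score=0.4, limit=8)
```

In a real deployment these scores would come from the vector store (e.g. Qdrant's similarity search), and the threshold/limit pair would be tuned per document set, as the differing Navin Fluorine and ITC settings suggest.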
For the tech stack I am using Qdrant, Hugging Face text embeddings, Semantic Kernel, and a .NET API, plus Ollama, with everything in a Docker container and all data stored locally.
This week I am planning to publish a fork of the project to my personal Git, so we can all contribute and make it useful and fast.
I think model selection and chunking could solve a great deal of the hallucination problems.
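On the chunking point, here is a minimal fixed-size chunker with overlap, one common way to keep retrieved passages self-contained so the model hallucinates less across chunk boundaries. The sizes are illustrative and would be tuned to the embedding model and the LLM's context window; this is a sketch, not the project's actual chunker.

```python
# Fixed-size character chunking with overlap: consecutive chunks share
# `overlap` characters so sentences cut at a boundary still appear
# whole in at least one chunk.

def chunk(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    step = size - overlap
    return [text[i:i + size]
            for i in range(0, max(len(text) - overlap, 1), step)]

pieces = chunk("x" * 500, size=200, overlap=50)
```

Token-based or sentence-aware chunking usually works better for transcripts than raw character counts, but the overlap idea carries over unchanged.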