GenAI is everywhere you look, and organizations across industries are putting pressure on their teams to join the race – 77% of business leaders fear they’re already missing out on the benefits of GenAI.
Data teams are scrambling to answer the call. But building a generative AI model that actually drives business value is hard.
And in the long run, a quick integration with the OpenAI API won’t cut it. It’s GenAI, but where’s the moat? Why should users pick you over ChatGPT?
That quick check of the box feels like a step forward. Still, if you aren’t already thinking about how to connect LLMs with your proprietary data and business context to actually drive differentiated value, you’re behind.
That’s not hyperbole. This week, I’ve talked with half a dozen data leaders on this topic alone. It wasn’t lost on any of them that this is a race. At the finish line, there are going to be winners and losers: the Blockbusters and the Netflixes.
If you feel like the starter’s gun has gone off, but your team is still at the starting line stretching and chatting about “bubbles” and “hype,” I’ve rounded up five hard truths to help shake off the complacency.
1. Your Generative AI Features Are Not Well Adopted, and You’re Slow to Monetize
“Barr, if generative AI is so important, why are the current features we’ve implemented so poorly adopted?”
Well, there are a few reasons. One, your AI initiative wasn’t built to respond to an influx of well-defined user problems. For most data teams, that’s because you’re racing, it’s early, and you want to gain some experience. However, it won’t be long before your users have a problem that GenAI solves best, and when that happens, you will see much better adoption than you get from a tiger team brainstorming ways to tie GenAI to a use case.
And because it’s early, the generative AI features that have been integrated are just “ChatGPT but over here.”
Let me give you an example. Think about a productivity application you might use every day to share organizational knowledge. An app like this might offer a feature to execute commands like “Summarize this,” “Make longer,” or “Change tone” on blocks of unstructured text. One command equals one AI credit.
Yes, that’s helpful, but it’s not differentiated.
Maybe the team decides to buy some AI credits, or perhaps they simply click over to another tab and ask ChatGPT. I don’t want to overlook or discount the benefit of not exposing proprietary data to ChatGPT. Still, it’s a smaller solution and vision than what’s being painted on earnings calls across the country.
That pesky middle step from concept to value.
So consider: What’s your GenAI differentiator and value add? Let me give you a hint: high-quality proprietary data.
That’s why RAG (or sometimes, a fine-tuned model) is so important for GenAI initiatives. It gives the LLM access to the enterprise’s proprietary data. (I’ll explain why below.)
2. You’re Scared To Do More With Gen AI
It’s true: generative AI is intimidating.
Sure, you could integrate your AI model more deeply into your organization’s processes, but that feels risky. Let’s face it: ChatGPT hallucinates and can’t be predicted. There’s a knowledge cutoff that leaves users susceptible to out-of-date output. There are legal repercussions to data mishandling and providing consumers with misinformation, even if accidental.
Sounds real enough, right? Llama 2 sure thinks so.
Your data mishaps have consequences. And that’s why it’s essential to know exactly what you are feeding GenAI and that the data is accurate.
In an anonymous survey we sent to data leaders asking how far away their teams are from enabling a GenAI use case, one response was, “I don’t think our infrastructure is the thing holding us back. We’re treading quite cautiously here – with the landscape moving so fast and the risk of reputational damage from a ‘rogue’ chatbot, we’re holding fire and waiting for the hype to die down a bit!”
This sentiment is widely shared among the data leaders I speak to. If the data team has suddenly surfaced customer-facing, secure data, then they’re on the hook. Data governance is a massive consideration and a high bar to clear.
These are real risks that need solutions, but you won’t solve them by sitting on the sideline. There is also a real risk of watching your business being fundamentally disrupted by the team that figured it out first.
Grounding LLMs in your proprietary data with fine-tuning and RAG is a big piece of this puzzle, but it’s not easy…
3. RAG Is Hard
I believe that RAG (retrieval augmented generation) and fine-tuning are the centerpieces of the future of enterprise generative AI. However, although RAG is the simpler approach in most cases, developing RAG apps can still be complex.
Can’t we all just start RAGing? What’s the big deal?
RAG might seem like the obvious solution for customizing your LLM. But RAG development comes with a learning curve, even for your most talented data engineers. They need to know prompt engineering, vector databases and embedding vectors, data modeling, data orchestration, and data pipelines, all for RAG. And, because it’s new (introduced by Meta AI in 2020), many companies just don’t yet have enough experience with it to establish best practices.
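To make those moving parts a little more concrete, here is a minimal, toy sketch of the retrieval and prompt-assembly half of a RAG app in Python. Everything in it is an assumption for illustration: `embed` is a bag-of-words stand-in for a real embedding model, `ToyVectorStore` is an in-memory stand-in for a vector database, and the sample documents are made up. It stops at building the grounded prompt; sending that prompt to an LLM is left out.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': word counts. A real app would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self.items = []  # list of (document, vector) pairs

    def add(self, doc):
        self.items.append((doc, embed(doc)))

    def search(self, query, k=2):
        """Return the k documents most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]


def build_prompt(question, passages):
    """Prompt engineering step: ground the question in retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below. "
        "If the answer is not there, say so.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Made-up 'proprietary' documents, for illustration only.
store = ToyVectorStore()
for doc in [
    "Refunds are issued within 14 days of a return request.",
    "Enterprise plans include single sign-on and audit logs.",
    "Support hours are 9am to 5pm Eastern, Monday through Friday.",
]:
    store.add(doc)

question = "How long do refunds take?"
top = store.search(question, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Even in this toy version, the quality of the answer is capped by the quality of what gets retrieved, which is why the data modeling, orchestration, and pipeline work feeding the store matters as much as the prompt itself.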