Reading Time: 5 mins
In an intriguing podcast, Gramener’s CEO, S Anand, engaged with Kathirmani Sukumar, Co-founder of Quelit, to discuss the profound impact of Large Language Models (LLMs).
The discussion started with Kathir outlining his plan to explore the impact of LLMs across various sectors, probing into their effects on companies and individuals in the dynamic field of data science.
Watch the full podcast below.
LLMs: A Gift from Aliens Without an Instruction Manual
Responding to the opening question about his opinion of and experience with LLMs, Anand highlighted the non-deterministic nature of Large Language Models.
Drawing an analogy, he portrayed LLMs as powerful entities, saying, “I found it easier when I stopped thinking of LLMs as machines, but rather as humans or aliens that we just don’t understand.” This alien comparison emphasized LLMs’ complex and elusive nature, challenging traditional views on computing.
Anand pointed to the reproducibility challenge in LLM outputs, noting that the same question might yield varied answers.
He stated, “How LLMs come up with results are also non-trivial,” citing instances where a simple prompt and subsequent queries could lead to diverse responses. His observation underlined the need to adapt our mental models when dealing with the intricate workings of LLMs.
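In practice, some of that run-to-run variation can be reduced, though not eliminated, through decoding settings. Here is a minimal sketch assuming the OpenAI Python client; the model name and prompt are illustrative, not from the podcast:

```python
# Minimal sketch: reducing (not eliminating) output variation with the OpenAI
# Python client. Model name and prompt are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",          # assumed model, for illustration only
        messages=[{"role": "user", "content": question}],
        temperature=0,                # greedy decoding: fewer, not zero, variations
        seed=42,                      # best-effort reproducibility hint
    )
    return response.choices[0].message.content

# The same question can still yield slightly different answers across runs.
print(ask("Summarise the risks of non-deterministic model outputs in one line."))
```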
Navigating the Complexity of LLM Outputs
Kathir dived further into the complexities of LLMs, addressing concerns about reproducibility and the consistency of results.
He posed a scenario where the sentiment analysis score for a given passage might fluctuate and asked Anand how to handle such variations.
Anand emphasized LLMs’ human-centric nature, stating, “Humans are the ones defining these.” He suggested approaches such as framing questions better, crafting studies, sampling effectively, or providing pre-questions to reduce ambiguity.
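One way to operationalise the sampling suggestion is to query the model several times and aggregate the scores. The sketch below assumes an OpenAI-style client; the scoring prompt, model, and scale are assumptions, not the approach described in the podcast:

```python
# Rough sketch: stabilising a fluctuating sentiment score by sampling the model
# several times and taking the median. Prompt wording and model are assumptions.
import statistics
from openai import OpenAI

client = OpenAI()

def sentiment_score(passage: str) -> float:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{
            "role": "user",
            "content": "Rate the sentiment of this passage from -1 (negative) "
                       f"to 1 (positive). Reply with a number only.\n\n{passage}",
        }],
        temperature=0.7,
    )
    return float(response.choices[0].message.content.strip())

def stable_sentiment(passage: str, samples: int = 5) -> float:
    # Multiple samples smooth out run-to-run variation in the raw score.
    return statistics.median(sentiment_score(passage) for _ in range(samples))
```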
LLM Integration at Gramener
Kathir asked about Gramener’s involvement with Large Language Models (LLMs) and any recent company projects.
Anand revealed that the company currently boasts four applications in production and a staggering 18 ongoing pilot projects. The rapid pace of implementation shows Gramener’s commitment to harnessing the potential of LLMs.
Must Read: Find out the top Generative AI projects brewing at Gramener.
Kathir then inquired about Gramener’s approach to LLMs, specifically questioning whether the company experimented with open-source models or primarily focused on platforms like OpenAI.
“For generation, it’s more OpenAI or Azure OpenAI,” said Anand, emphasizing the importance of the deployment environment. For embeddings, Gramener has leaned towards open models such as Microsoft’s E5, noting their comparable, if not superior, performance in certain cases.
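For readers unfamiliar with E5, here is a minimal sketch of running it locally with the sentence-transformers library; the model ID and example sentences are illustrative, and the “query:”/“passage:” prefixes follow the E5 model card’s recommendation:

```python
# Minimal sketch: open embeddings with an E5 model via sentence-transformers.
# Model ID and example texts are illustrative, not Gramener's configuration.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("intfloat/e5-base-v2")

query = "query: flexible working hours"
passages = [
    "passage: Employees value the freedom to experiment with new tools.",
    "passage: Work-life balance came up repeatedly in the survey.",
]

query_vec = model.encode(query, normalize_embeddings=True)
passage_vecs = model.encode(passages, normalize_embeddings=True)
print(util.cos_sim(query_vec, passage_vecs))  # similarity of the query to each passage
```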
The podcast further explored Gramener’s model selection process, asking whether the company began with lower-end models or directly explored the highest-tier options.
Anand clarified their strategy, revealing that “during evaluation, we scrutinized the highest models to understand capabilities. In contrast, for testing and prototyping, we opted for lower-end models to manage costs and computational efforts. However, when it came to production, we reverted to the highest-end models, ensuring optimal quality.”
Exploring Small LLMs
When asked about using smaller LLMs, Anand responded, “The lower-end local models work fine, but not for the final generation or a generalized task.”
While acknowledging their limitations in standalone tasks compared to giants like GPT, Anand highlighted their efficacy in specific scenarios. “Lower-end LLMs found utility in internal tasks, such as enhancing chatbot queries or performing targeted extractions,” said Anand.
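As a hedged illustration of that kind of internal task, a small local model could rewrite a terse chatbot query into a fuller question. The sketch below uses Hugging Face transformers; the model choice and prompt are assumptions, not Gramener’s actual setup:

```python
# Sketch: a small local model handling an internal task such as query rewriting.
# flan-t5-small is an illustrative choice; it is not the model Gramener used.
from transformers import pipeline

rewriter = pipeline("text2text-generation", model="google/flan-t5-small")

user_query = "sales last qtr north region?"
prompt = f"Rewrite this search query as a clear, complete question: {user_query}"
print(rewriter(prompt, max_new_tokens=40)[0]["generated_text"])
```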
Also Read: How to mitigate LLM hallucinations while producing Gen AI outputs.
Entering the LLM Domain: User or Creator?
Regarding the practical implications of LLMs for budding data scientists, particularly in Natural Language Processing (NLP), Kathir was curious to know the future trajectory for newcomers.
Anand mentioned, “For individuals already immersed in the industry, continuing with modeling might make sense. However, for newcomers, the landscape could lean towards becoming users rather than creators.”
Anand also highlighted that the opportunities appeared more abundant in utilizing existing models, emphasizing that while creating models is valuable, the current opportunities seem more concentrated in their application.
Accessibility Challenges: Running LLMs Locally
The conversation then steered towards the challenges faced by small startups and individuals wanting to engage with LLMs, particularly considering the high costs and resource requirements.
Addressing concerns about affordability and accessibility, Anand acknowledged the substantial investment required for high-end GPUs and the associated costs. However, he countered this by pointing out that certain LLMs, such as ChatGPT, come at a fraction of the cost.
He also suggested utilizing cloud services provided by major platforms like Azure, Google, or Amazon. He said, “These platforms host data securely, and running LLMs through their services is a practical and cost-effective approach for small startups or individuals lacking substantial funding.”
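As an illustration of that route, the same OpenAI Python client can point at an Azure-hosted deployment instead of local hardware. The endpoint, API version, and deployment name below are placeholders:

```python
# Minimal sketch: calling a hosted model through Azure OpenAI instead of running
# one locally. Endpoint, API version, and deployment name are placeholders.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",          # placeholder API version
)

response = client.chat.completions.create(
    model="my-gpt-deployment",         # the Azure deployment name, not a model ID
    messages=[{"role": "user", "content": "Hello from a small startup."}],
)
print(response.choices[0].message.content)
```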
Exploring LLMs for Storytelling and Visualization
Kathir inquired about Anand’s exploration of LLMs in storytelling and data visualization.
Anand shared how, at Gramener, they experimented with clustering content using LLMs, taking the example of the Gramener Employee Survey.
They posed questions like “What works well at Gramener?” and categorized the responses. Anand highlighted topics like freedom to experiment, work-life balance, collaboration, and leadership as strong clusters. This method, powered by LLMs, allowed them to gain valuable insights into employee sentiments.
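A rough sketch of how free-text survey answers can be clustered with embeddings is shown below; the model choice, cluster count, and sample answers are assumptions, since the podcast does not describe Gramener’s exact pipeline:

```python
# Rough sketch: clustering open-ended survey answers by embedding them and
# running k-means. Model, cluster count, and sample answers are assumptions.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

answers = [
    "I love the freedom to experiment with new ideas.",
    "Work-life balance here is genuinely good.",
    "Leadership is approachable and transparent.",
    "Flexible hours help me manage family time.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(answers, normalize_embeddings=True)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(embeddings)
for answer, label in zip(answers, kmeans.labels_):
    print(label, answer)  # answers sharing a label form one theme/cluster
```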
He mentioned their use of LLMs by exploring public reviews of Quaker Oats on Amazon.
By categorizing the reviews into topics like innovation, portability, and nutritional value, they identified the aspects customers loved and the areas needing improvement, such as portion size, packaging, and ingredients.
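A hedged sketch of such topic tagging with an LLM prompt follows; the topic list, review text, and model are illustrative placeholders, not the exact prompt used in the Quaker Oats analysis:

```python
# Sketch: tagging a product review with predefined topics via an LLM prompt.
# The topic list, review text, and model are illustrative placeholders.
from openai import OpenAI

client = OpenAI()
TOPICS = ["portion size", "packaging", "ingredients", "nutritional value", "portability"]

def tag_review(review: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"Which of these topics does the review mention: {TOPICS}? "
                       f"Reply with a comma-separated list.\n\nReview: {review}",
        }],
        temperature=0,
    )
    return response.choices[0].message.content

print(tag_review("Tasty, but the single-serve packs are far too small."))
```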
Anand showcased how LLMs facilitated dynamic trend analysis. For instance, the trend data on portion size showed improvement over time. Notably, the chart was generated entirely by ChatGPT, demonstrating the potential of LLMs to automate data visualization without any human code editing.
Moving beyond corporate data, Anand shared an intriguing example involving G20 documents. Using topic modeling, they assessed how often specific topics, like multilateral development, finance, and climate-smart agriculture, were mentioned in the G20 documents.
This was done for the Gates Foundation’s interest in the G20’s discussions, revealing critical gaps, such as the limited focus on health services, finance, accessibility, and research.
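The counting side of such an exercise can be sketched very simply by matching a list of topic phrases against document text; the phrases and the substring-matching rule below are assumptions, and real topic modeling would be considerably more sophisticated:

```python
# Simple sketch: counting how often each topic is mentioned across documents.
# Topic phrases and the substring-matching rule are illustrative simplifications.
from collections import Counter

topics = ["multilateral development", "climate-smart agriculture", "health services"]
documents = [
    "The communique stresses multilateral development banks and climate-smart agriculture.",
    "Finance ministers discussed multilateral development funding at length.",
]

counts = Counter()
for doc in documents:
    text = doc.lower()
    for topic in topics:
        counts[topic] += text.count(topic)

print(counts)  # low counts would flag under-discussed topics such as health services
```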
In this way, Anand showcased the versatility of LLMs in extracting meaningful insights, driving innovative storytelling, and automating complex data visualizations for various domains, from employee sentiments to global policy discussions.
The evolving landscape of LLMs opens new avenues for exploration and understanding in the realm of data analysis and interpretation.
Anand’s Upcoming Ambitions in NLP
Anand expressed a keen interest in bidirectional embeddings, aiming to subtract specific embeddings to explore differences in content. Reflecting on the clustering of Gramener’s feedback, he envisions this kind of subtraction distilling the essential insights.
Anand poses a thought-provoking scenario: “Are love and hate very different from each other or very similar if you take out emotions?”
This kind of embedding arithmetic fascinates him. Moving forward, Anand explores the challenges of converting embeddings back to words, pointing out the scarcity of effective bidirectional encoders. His exploration extends to the embedding space, contemplating shorter representations for efficient generation and navigating the complex terrain of language model possibilities.
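A minimal sketch of the kind of embedding arithmetic he describes is shown below, comparing “love” and “hate” before and after removing an “emotion” direction. The model and word choices are illustrative; this is a toy example, not his experiment:

```python
# Toy sketch of embedding arithmetic: compare "love" and "hate" before and after
# removing an "emotion" direction. Model and word choices are illustrative.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
love, hate, emotion = model.encode(["love", "hate", "emotion"], normalize_embeddings=True)

print("raw similarity:", util.cos_sim(love, hate).item())
# Subtract the shared "emotion" component and compare what remains.
print("after subtraction:", util.cos_sim(love - emotion, hate - emotion).item())
```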
Anand wonders, “How do stories move in the embedding space?” He imagines a data story like a journey through operations, asking, “Is there a theory? Are there patterns for good or not-so-good stories?”
Anand is eager to map this embedding space and uncover the structure of storytelling: how stories flow, and what makes them better or worse in the world of language models.
He playfully asks, “How do we frame better questions using LLMs?” Exploring the use of language models for social study support, Anand wonders if we can determine someone’s political leaning or psychological profile based on their writings. It’s a quest to make LLMs more human-like in their responses.
Then, he dives into the world of agents, treating LLMs like team members with distinct personalities. Anand’s excitement is palpable as he shares an experiment using GPT Engineer to create a collaborative team for coding and content creation. “Now I have a little team working on this,” he says, envisioning copywriters, editors, and SEO optimizers collaborating seamlessly.
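A rough sketch of such role-based “team members” built from plain chat calls is given below; the roles, prompts, and hand-off pipeline are assumptions for illustration and are not how GPT Engineer works internally:

```python
# Rough sketch: a tiny "team" of LLM roles passing work to each other.
# Role prompts and the pipeline are assumptions, not GPT Engineer's internals.
from openai import OpenAI

client = OpenAI()

def run_role(system_prompt: str, task: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": task},
        ],
    )
    return response.choices[0].message.content

draft = run_role("You are a copywriter.", "Write a short blurb about our LLM podcast.")
edited = run_role("You are an editor. Tighten the text.", draft)
final = run_role("You are an SEO optimizer. Add keywords naturally.", edited)
print(final)
```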
As Anand delves into these unexplored realms, his curiosity becomes contagious. He’s not just exploring technology; he’s shaping the future of how we interact with language models, turning them into dynamic teammates in our creative endeavors.
End Notes
As Anand embarks on these significant experiments, the future of LLMs in storytelling, data exploration, and collaborative agent-based interactions seems promising and filled with possibilities.