⚙️ DataChat has solved for the risks of GenAI analytics
Good Morning.
Welcome to this special weekend edition of The Deep View, presented in partnership with DataChat, which is applying the power of genAI to enterprise data analytics.
Why LLMs are useful for analytics
Data analytics refers to the oft-laborious process of sifting through enormous quantities of data to identify patterns and trends that can inform business decisions and improve operational efficiency.
Large Language Models (LLMs), meanwhile, excel at identifying patterns and relevant text in massive amounts of textual data. They can also generate code from prompts written in plain English.
This positions LLMs as a perfect tool not just for data analysts, but for business users who consult analytics daily yet rarely take part in creating them. Anyone can load a dataset into an LLM, ask questions, and get insights — but not without significant risks.
The question is how to harness the analytical powers of LLMs while making their strengths and weaknesses transparent and manageable for users.
The potential problems of LLMs
There are several obstacles to LLM adoption in the enterprise.
For one, multiple surveys and studies (Lucidworks, F5) have found that business leaders consider generative AI to be a data privacy and security risk. And they’re not wrong to be worried — studies have found that any data shared with models like GPT-4 could be accessed and used by third parties, something that has led security experts to recommend users “refrain from sharing any sensitive information on any AI language model.”
The other major concern is model reliability. In part, this involves the aforementioned issues with security; if companies can’t share confidential data with a model, the model can’t analyze that data, or at least not without complicated workarounds.
But this issue of reliability goes deeper, into the architecture of LLMs. These models have long been susceptible to hallucinations: a propensity to output false information as though it were fact.
One study found that the “hallucination problem … poses serious risks for individuals relying on AI for medical, legal and daily decision-making.” That problem extends to analytical work. For instance, one model was asked to analyze why the Kansas City Chiefs won Super Bowl LVIII using news coverage. The LLM cited false player statistics and injuries and pulled events from the 2023 Super Bowl (like Patrick Mahomes’ ankle sprain) to explain the 2024 game.
Workers, meanwhile, have grown increasingly concerned that AI adoption will threaten their jobs. The World Economic Forum estimated last year that 44% of workers’ skills “will be disrupted in the next five years” due to AI. OpenAI itself predicted last year that 19% of workers might soon see half of their tasks “impacted” by AI.
The challenge for enterprise adopters of AI is to thread the needle between the security risks, potential hallucinations, and impacts on employees.
The key lies in transparency, explainability, security and trust.
All of these are core tenets of DataChat’s mission.
DataChat’s new paradigm
DataChat was founded in 2017, five years before the launch of ChatGPT. While many companies leaped to incorporate generative AI into their business strategy following ChatGPT’s release, DataChat was built with conversational analytics in mind from the start.
According to co-founder Dr. Rogers Jeffrey Leo John, the pitch was for people to “solve data science problems by writing directions in English.”
At the platform’s core is GEL (Guided English Language), a controlled natural language (CNL) that allows DataChat to “abstract away programming languages” such as Python and SQL. Users don’t actually need to learn or use GEL; it simply ensures that each step in an analysis is easily understood and reproducible.
This means that domain experts, the people who know their business best, can engage with their data to uncover actionable insights and opportunities without a technical background or knowledge of programming languages.
“Every organization has untapped data, but few employees besides data scientists have the tools and skills to make use of it. DataChat solves that problem,” Leo John said. “We think the people who know the business best, regardless of their department or technical background, should have the tools to question data on their own terms.”
Transforming what it means to interact with data
Businesses cannot thrive without data. At the core of any successful business strategy are intelligent answers to crucial questions, answers that are only possible with data.
However, a spreadsheet filled with dates and numbers is not some magical key to business insights.
Through its conversational AI platform, DataChat can turn that stream of incomprehensible data into actionable insights. A user just types what they want to know, in plain English, and DataChat generates an answer. Behind the scenes, the platform leverages an LLM but doesn’t share anything besides the table schema. The data stays where it is. The resulting insights can save corporations millions of dollars.
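To make the schema-only idea concrete, here is a minimal Python sketch of how a prompt could be assembled from column names and types alone. It is an illustration under stated assumptions, not DataChat’s implementation: the `schema_only_prompt` function, the `sales` table and the prompt wording are all hypothetical, and SQLite stands in for whatever data store a company actually uses.

```python
import sqlite3


def schema_only_prompt(conn: sqlite3.Connection, table: str, question: str) -> str:
    """Build an LLM prompt from column names and types, with no row values.

    Hypothetical helper for illustration only.
    """
    # PRAGMA table_info yields one row per column:
    # (cid, name, declared type, notnull, default value, pk)
    columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    if not columns:
        raise ValueError(f"unknown table: {table}")

    lines = [f"Table: {table}", "Columns:"]
    for _cid, name, col_type, *_rest in columns:
        lines.append(f"- {name} ({col_type or 'UNKNOWN'})")
    lines.append(f"Question: {question}")
    lines.append("Answer with a query that uses only these columns.")
    return "\n".join(lines)


# Demo with an in-memory table (made-up data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, sold_on DATE, revenue REAL)")
conn.execute("INSERT INTO sales VALUES ('Midwest', '2024-02-11', 1250.5)")

prompt = schema_only_prompt(conn, "sales", "Which region had the highest revenue?")
print(prompt)
```

The prompt names the `region`, `sold_on` and `revenue` columns, but the row containing `'Midwest'` never leaves the database. Any query the model proposes would then be run locally against the data.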
Transparency, explainability and trust
DataChat has built-in safeguards against the risks of data leaks and hallucinations.
GEL, which sits at the core of DataChat’s functionality, is designed to detect LLM hallucinations and take corrective action. The GEL layer allows DataChat either to recover from hallucinations using heuristics in its post-processing layer or to inform the user, preventing the spread of inaccurate outputs. This layer also provides protection against prompt injection attacks.
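The article does not describe DataChat’s heuristics, so the following Python sketch only illustrates the general shape of a "recover or inform the user" check. It assumes the simplest possible case: the model has referenced column names, and some of them may not exist. The `check_columns` function, the similarity cutoff and the sample schema are hypothetical.

```python
import difflib


def check_columns(referenced, schema_columns, cutoff=0.8):
    """Sort LLM-referenced column names into resolved, corrected and unresolved.

    Hypothetical post-processing step: exact matches pass through, near
    misses are repaired by string similarity, and anything else is flagged
    so the user can be told instead of receiving a fabricated answer.
    """
    resolved, corrections, unresolved = [], {}, []
    for name in referenced:
        if name in schema_columns:
            resolved.append(name)
            continue
        # Heuristic recovery: take the closest real column, if one is close enough.
        match = difflib.get_close_matches(name, schema_columns, n=1, cutoff=cutoff)
        if match:
            corrections[name] = match[0]
            resolved.append(match[0])
        else:
            unresolved.append(name)
    return resolved, corrections, unresolved


schema = ["region", "sold_on", "revenue"]
resolved, corrections, unresolved = check_columns(
    ["region", "revenu", "profit_margin"], schema
)
print(resolved)      # columns that are safe to use
print(corrections)   # near misses that were repaired
print(unresolved)    # names with no plausible match; surface these to the user
```

Here the misspelled `revenu` is close enough to `revenue` to be repaired, while `profit_margin` has no counterpart in the schema and would be reported back rather than silently guessed at.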
DataChat always shows its work, providing a confidence score on each answer to eliminate the black box of generative AI.
“Reproducibility is the difference between making pretty graphs and doing true data science,” Leo John said. “So, we baked reproducibility into DataChat. When users converse with DataChat, the platform automatically documents the conversation as steps in a data science process. That way, others can validate the workflow or reapply it to new data.”
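One way to picture a conversation documented as replayable steps is the short Python sketch below. It is a toy model built on assumptions, not DataChat’s internal format: the `Workflow` class, its method names and the sample data are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable

Rows = list[dict]


@dataclass
class Workflow:
    """An ordered log of analysis steps that can be re-run on new data."""

    steps: list[tuple[str, Callable[[Rows], Rows]]] = field(default_factory=list)

    def record(self, description: str, fn: Callable[[Rows], Rows]) -> None:
        # Store the plain-English description next to the operation it stands for.
        self.steps.append((description, fn))

    def replay(self, rows: Rows) -> Rows:
        # Apply every recorded step, in order, to whatever data is supplied.
        for _description, fn in self.steps:
            rows = fn(rows)
        return rows

    def describe(self) -> list[str]:
        return [f"{i}. {desc}" for i, (desc, _fn) in enumerate(self.steps, start=1)]


wf = Workflow()
wf.record(
    "Keep rows where revenue is above 100",
    lambda rows: [r for r in rows if r["revenue"] > 100],
)
wf.record(
    "Sort by revenue, highest first",
    lambda rows: sorted(rows, key=lambda r: r["revenue"], reverse=True),
)

q1 = [
    {"region": "East", "revenue": 90},
    {"region": "West", "revenue": 300},
    {"region": "South", "revenue": 150},
]
q2 = [{"region": "North", "revenue": 500}, {"region": "East", "revenue": 40}]

print(wf.describe())
print(wf.replay(q1))  # the analysis as originally run
print(wf.replay(q2))  # the same steps reapplied to new data
```

Because each step pairs a readable description with the operation behind it, a colleague can audit the list from `describe()` and then call `replay()` on a fresh dataset to check that the same process holds.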
DataChat offers an extra layer of security: the platform never shares customer data with third-party LLMs. This ensures that your data remains secure and private, allowing the model to be useful without risk.
The goal is not to replace human workers or achieve total autonomy. Instead, the aim is to give people the power to engage proactively with their data, on their timeline and terms.
“Questions are the true driver of knowledge,” CEO Viken Eldemir said. “People want to leverage AI to help with insights but we still need that human in the loop to ask those questions, to see the answers and then to iterate on the answers to ask more questions.”