What is Barista Live Generative Answers?

By Pat Calhoun, Chief Executive Officer
November 28, 2023

Espressive recently announced Barista Live Generative Answers (LGA), which has raised a number of questions related to how it works.

When Espressive was founded 7 years ago, we built our own language model, which we called the Employee Language Cloud (ELC). The ELC comprises a number of technologies and our data model, enabling our virtual agent, Espressive Barista, to understand the language of employees. Through general use by our customers, the ELC has grown to recognize well over 4B phrases across 15 enterprise departments.

Once the employee interaction was understood, Barista would always attempt to resolve the employee's problem via automation. When automation was not possible, Barista would then leverage "researched content". What is researched content? The Espressive ML Operations team would create "answers" to tens of thousands of questions employees would ask (e.g., "How do I set up out of office in Outlook?"). This eliminated the need for customers to create knowledge, or content, where answers would be universally the same across the industry. When researched content was not available, Barista would then attempt to source answers from the enterprise's own internal knowledge.

The concept of researched content was very popular, but it introduced two issues:

  1. There were times customers had their own content for a topic, which they wanted to use instead of our researched content. Essentially, use knowledge first. Certainly, this was possible with Barista, but it required some additional configuration.
  2. While our researched content was impressive, it would occasionally become out of date as software or hardware features evolved. The ML Ops team would of course get notified of these changes, but we needed a better way to get the most recent content for virtually any question updated immediately.

Separately, nearly two years ago, we embarked on a complete redesign of our AI engine to improve our overall understanding of the human language as well as our deflection rates. This was done through what we call the Experience Selector.

What is the Experience Selector and how is it better? Unlike the previous serial process, the Experience Selector makes a real-time decision on how to best handle an interaction, with the goal of maximizing the overall experience and achieving a deflection. This means the Experience Selector can now, with a high degree of accuracy, predict:

  • If resolution is best achieved via automation,
  • If the interaction is related to a topic where the answer is an internal one, meaning it should only come from a knowledge article,
  • If the answer should first be sourced from internal knowledge before looking at researched content or a large language model (LLM),
  • If the answer should only be sourced externally (meaning researched content or an LLM), and lastly,
  • If all the above are not possible or effective, then human help is achieved by creating a ticket on behalf of the employee.
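To make the cascade above concrete, here is a minimal sketch of an Experience Selector-style decision in Python. The function name, the toy keyword predicates, and the return labels are all illustrative assumptions on my part; the real Experience Selector uses trained models, not keyword matching.

```python
# Hypothetical sketch of an Experience Selector-style decision cascade.
# The keyword sets below stand in for what would really be ML classifiers.

AUTOMATABLE = {"reset password", "unlock account"}          # resolvable via automation
INTERNAL_TOPICS = {"laptop refresh policy", "pto policy"}   # internal-only answers

def select_experience(text, has_internal_knowledge=False):
    """Route an employee interaction to the best resolution path."""
    t = text.lower()
    if any(phrase in t for phrase in AUTOMATABLE):
        return "automation"
    if any(phrase in t for phrase in INTERNAL_TOPICS):
        return "knowledge_article"            # must come from internal knowledge
    if has_internal_knowledge:
        return "knowledge_then_external"      # internal knowledge first, then LGA
    # External-only topic: researched content or an LLM.
    # If that also fails, the real system would open a ticket.
    return "live_generative_answers"

print(select_experience("What is our laptop refresh policy?"))
```

The key design point is that the decision is made once, up front, per interaction, rather than trying each path serially.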

While working on the Experience Selector, we also recognized an opportunity to improve our researched content, making it more dynamic and giving it much broader reach. This led us to build a new capability, which we call Barista Live Generative Answers. Live Generative Answers is part of the Experience Selector and is how we answer questions once we determine that the question would best be answered from an external source, such as the public domain or an LLM.

Does this mean that all employee interactions leverage Live Generative Answers? No. The Experience Selector recognizes whether an interaction is internal vs. externally focused and only uses LGA for interactions where answers are not specific to an organization. For example, something like “What is our laptop refresh policy?” is a question that really requires internal content, and the Experience Selector would only source answers from knowledge articles (or open a ticket). However, something like “How do I accept a change in Microsoft Word?” is something for which public domain content would be ideal if there is no internal knowledge for the topic.

Why does Live Generative Answers not simply use LLMs for all answers? Certainly, LLMs are impressive in the amount of content they have, and the broad range of questions they can answer. It is, however, important to note that their content is only as good as their training data. In the case of ChatGPT, it was last trained in January 2022 (and April 2023 for the new GPT-4 Turbo model), and any new software or hardware features introduced since that time will be “made up”, which the industry calls “hallucinations”. This means the LLM makes up answers based on its understanding of the question, but for technical questions, these answers are likely incorrect and misleading.

For that reason, Live Generative Answers first decides whether to leverage a public domain answer or use an LLM. With access to public domain information, Barista can source answers from the whole internet. That said, Barista is also smart enough to restrict public domain access to only trusted sources (e.g., all Microsoft-related questions must come from Microsoft's own sites).
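A trusted-source restriction like the one just described is essentially a domain allowlist. Here is a minimal sketch; the domain list and function name are my own illustrative assumptions, not Espressive's actual configuration.

```python
from urllib.parse import urlparse

# Illustrative allowlist: restrict public-domain sourcing to trusted
# vendor domains. The entries are examples, not Espressive's real list.
TRUSTED_DOMAINS = {"microsoft.com", "support.microsoft.com"}

def is_trusted_source(url):
    """Return True if the URL's host is on (or under) a trusted domain."""
    host = urlparse(url).netloc.lower()
    return host in TRUSTED_DOMAINS or any(
        host.endswith("." + domain) for domain in TRUSTED_DOMAINS
    )

print(is_trusted_source("https://support.microsoft.com/en-us/word"))
```

Matching on the host suffix (rather than substring) avoids accepting look-alike domains such as `microsoft.com.evil.example`.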

However, when Live Generative Answers determines that the generative capabilities of an LLM are best suited, it leverages a private LLM. Unlike consumer versions of Google’s Bard or OpenAI’s ChatGPT, none of the employee’s interactions are used in the private LLM’s training corpus.

In short, Live Generative Answers is such a fundamental innovation that Barista can now understand virtually any topic under the sun.

These are exciting times at Espressive, and if you'd like to see LGA in action yourself, request a demo here.
