ACCESSWIRE
30 Mar 2023, 22:43 GMT+10
Compared to GPT-4 and ChatGPT-3.5 , a smaller LLM can perform just as well as ChatGPT3.5 on dialog-based use cases but also check generated responses for hallucinations.
PALO ALTO, CA / ACCESSWIRE / March 30, 2023 / Got It AI announced today its enterprise-ready LLM named ELMAR (for Enterprise Language Model ARchitecture). ELMAR is an order of magnitude smaller than GPT-3 and can run on premise with integrations to any knowledge base for dialog-based chatbot Q&A applications. Key innovations in the architecture enable commercial use because the ELMAR is not based on Facebook Research's LLaMA or Stanford Alpaca. Additionally, truth checking on responses and post-processing to prevent the user from seeing incorrect responses lowers the final incorrect response rate seen by the user to a very low level. ELMAR can run on low-cost hardware compared to very large language models which require expensive hardware, and will be available for pilots with enterprise beta testers who sign up at https://www.got-it.ai/elmar.
In addition, Got It AI is publishing results of LLM hallucination rates based on a 100-article dataset with a Q&A test set for GPT-4, GPT-3, ChatGPT, GPT-J/Dolly, Stanford Alpaca and Got It AI's own ELMAR. Got It's Truth Checker AI, capable of catching 90% of hallucinations, announced earlier in January, was used to compare the hallucination rates, and the results were validated by humans. A playground to test TruthChecker functionality with this data set and test set will be made available at https://www.got-it.ai/truthchecker.
Hallucination Rate Test Results (Model, Size, Hallucination Rate, **After Truth Checking)
* ELMAR and Alpaca are fine-tuned on the 100-article data set since these models are significantly smaller than OpenAI GPT-3 and GPT-4. Fine-tuning on OpenAI models is not considered because they already have been trained on large conversational data sets, and fine-tuned large OpenAI models are not as cost-effective to run as Alpaca and ELMAR. We attempt to compare equivalent price/performance of models to the greatest extent possible.
** After Truth Checking percentage assumes 90% catch rate and prevention of sending response to user.
'Recently, it was suggested that smaller and older models like GPT-J can deliver ChatGPT-like experiences. In our experiments, we did not find this to be the case. Despite fine-tuning, such models performed significantly worse than other more advanced models. It is not just about the data, but also about modern model architectures and training techniques,' said Chandra Khatri, Head of Conversational AI Research at Got It AI. 'In our testing, smaller open source LLMs perform poorly on specific tasks unless they are fine-tuned on target datasets. For example, when we used Alpaca, an open source model, for a Q&A task on our target 100-article set, it resulted in a significant fraction of answers being incorrect or hallucinations, but did better after fine-tuning. ELMAR, when fine-tuned on the same dataset, produced accurate results, equivalent to ChatGPT-3.'
'Enterprises are shutting off access to ChatGPT because they want guardrails. The key guardrails we're offering are in two critical areas: a truth checking model on the responses generated, and a pre-processing model for filtering out sensitive information sent to dialog-based chatbots,' said Peter Relan, Chairman of Got It AI. 'While LLaMA and Alpaca are research constrained licenses, our work enables commercial use. The ability to run ELMAR on premise creates a sense of control and safety around the use of generative AI models in the enterprise. It opens up a number of chatbot use cases involving potentially sensitive data: Internal Knowledge Base Queries, Customer Service Agent Assist, IT and HR Helpdesks, Customer Support, Sales Training, and many more. The availability of a solution for generative AI hallucinations is another key factor enabling enterprise adoption.'
Key Features and Benefits of on-prem ELMAR
Media Enquiries: Contact Peter Brooks at peterjbrooks@msn.com
David Chu
david@got-it.ai
4082212176
Peter Brooks
peterjbrooks@msn.com
SOURCE: Got It AI
Get a daily dose of Boston Star news through our daily email, its complimentary and keeps you fully up to date with world and business news as well.
Publish news of your business, community or sports group, personnel appointments, major event and more by submitting a news release to Boston Star.
More InformationWASHINGTON D.C.: The Biden administration is expected to approve a deal allowing General Electric to assemble jet engines in India ...
TAIPEI, Taiwan: Taiwan and the United States are set to sign the first deal under a new trade framework, marking ...
NEW YORK, New York - U.S. stocks started the week under pressure on Monday. "Markets are catching their breath after ...
NEW YORK CITY, New York: A JPMorgan Chase & Co spokesperson has confirmed that as part of the bank's efforts ...
WASHINGTON D.C.: The Pentagon said this week that the satellite communications service started by billionaire Elon Musk, SpaceX's Starlink, has ...
CHARLESTON, South Carolina: During a media tour this week at Boeing's facilities in Charleston, South Carolina, company CEO Dave Calhoun ...
NEW YORK CITY, New York: A JPMorgan Chase & Co spokesperson has confirmed that as part of the bank's efforts ...
WASHINGTON D.C.: The Federal Communications Commission (FCC) has said that more than 8.3 million US homes and businesses do not ...
WASHINGTON D.C.: A Transportation Security Administration (TSA) report released this week revealed that over the Memorial Day weekend, the start ...
WASHINGTON D.C.: The Defense POW/MIA Accounting Agency has announced that 2nd Lt. James Litherland of South Williamsport, Pennsylvania, an Army ...
Cleveland Guardians outfielder Will Brennan boasts an unremarkable track record, but the 25-year-old Colorado native may hold the key to ...
After his team finished a seven-game road trip with a 4-1 victory over the Boston Red Sox on Monday, Tampa ...