OpenAI enhanced ChatGPT’s capabilities by connecting it to the internet. Now, You.com aims to do the same for all large language models (LLMs). The company has introduced a set of APIs designed to give LLMs like Meta’s Llama 2 real-time access to the open web, or to specific segments of it. Starting at $100 per month, You.com’s APIs improve LLMs’ responses to user queries (e.g., “Which holidays are this week?”) by incorporating current context from the internet.
Companies such as LlamaIndex, Anthropic, and Cohere have incorporated the APIs into their models.
You.com Introduces New APIs Connecting Large Language Models
Richard Socher, CEO and founder of You.com, said in an email interview with TechCrunch, “[We’ve] received many requests for an API with these capabilities. When inquiring about recent events, such as a Super Bowl score on the day of the event, our API searches the web for those scores. You can then instantly add that information to the LLM, allowing it to provide more accurate answers to your questions.”
Most LLMs learn from publicly accessible, static data gathered from web pages, ebooks, and other sources. This training is sufficient for tasks like composing emails or drafting cover letters and essays. However, this approach confines their knowledge to the timeframe of the data; an LLM trained before September 2021 wouldn’t be aware of recent events.
You.com’s new APIs address this limitation by building an index of longer website snippets. This sets them apart from standard search APIs from Bing and Google, which, according to Socher, only offer brief snippets “designed to entice someone to click a link.” LLMs can use this tailored index to answer questions, pinpointing relevant snippets and summarizing them to provide up-to-date responses.
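The retrieve-then-summarize pattern described here can be sketched in a few lines of Python. This is a minimal illustration, not You.com’s actual API: the `Snippet` type, the `rank_snippets` and `build_prompt` helpers, the example.com URLs, and the snippet texts are all hypothetical, and the word-overlap ranking is a toy stand-in for a real search index. The sketch picks the snippets that best match a question and folds them into a prompt that an LLM could then answer from.

```python
import re
from dataclasses import dataclass


@dataclass
class Snippet:
    """A passage of web text plus the page it came from (hypothetical type)."""
    url: str
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of distinct words."""
    return set(re.findall(r"\w+", text.lower()))


def rank_snippets(query: str, snippets: list[Snippet], top_k: int = 2) -> list[Snippet]:
    """Return the top_k snippets sharing the most words with the query.

    This is a toy relevance score; a real index would rank far better.
    """
    query_terms = tokenize(query)
    return sorted(
        snippets,
        key=lambda s: len(query_terms & tokenize(s.text)),
        reverse=True,
    )[:top_k]


def build_prompt(query: str, snippets: list[Snippet]) -> str:
    """Fold numbered snippets into a prompt so the model can cite them."""
    context = "\n".join(
        f"[{i}] {s.text} (source: {s.url})" for i, s in enumerate(snippets, start=1)
    )
    return (
        "Answer the question using only the numbered web snippets below, "
        "citing them by number.\n\n"
        f"Snippets:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


# Made-up snippets standing in for fresh search results.
SNIPPETS = [
    Snippet("https://example.com/dining", "Sushi restaurants are often closed on Mondays."),
    Snippet("https://example.com/holidays", "Two public holidays are observed this week."),
    Snippet("https://example.com/banks", "Banks close early before holidays this week."),
]
QUERY = "Which holidays are this week?"

if __name__ == "__main__":
    print(build_prompt(QUERY, rank_snippets(QUERY, SNIPPETS)))
```

In a real deployment, the hard-coded list would be replaced by results from a search API, and the finished prompt would be sent to the model.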
Navigating the Risks and Advancements in LLMs and APIs
Using an LLM with web access carries risks, regardless of which APIs it relies on. The live web is less curated than a static training dataset and, by implication, less filtered. Search results can be manipulated, and they do not necessarily represent the whole web. Because ranking algorithms often prioritize websites that use modern technologies such as encryption, mobile support, and schema markup, high-quality content on other websites may go unnoticed.
Socher acknowledged weaknesses in You.com’s API, especially in handling localized “near me” queries (e.g., “Where’s good sushi near me?”), as the API lacks knowledge of LLM users’ locations. However, improvements are underway, according to Socher, including enhancements that will enable You.com’s APIs to code and “produce much more complex answers” with traceable citations.
“We’ll soon integrate news and general web search to simplify the experience for companies using our APIs,” he added. “By incorporating our API into creators’ solutions, their responses will be more relevant and helpful for end users. The solution can then verify facts by turning to the web.”
These new APIs make me wonder whether search is becoming the next battleground in generative AI. As open-source LLMs reach the capabilities of their closed-source counterparts, the quality of the search engine behind those closed-source LLMs (Bing in ChatGPT’s case, Google in Bard’s) becomes a more compelling differentiator, unless APIs like You.com’s manage to level the playing field.
Admittedly, no API is perfect, and You.com surely has flaws beyond those mentioned by Socher. However, I would argue that competition is always beneficial.
The new You.com APIs start at $100 per month, which covers 14,200 API calls, following a 60-day trial that includes 1,000 free calls per month. For larger enterprise deals, You.com offers customized packages with annual subscriptions and associated discounts.
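For a rough sense of what that pricing means per request, the arithmetic can be checked with a short Python sketch. It assumes the 14,200 calls are a monthly allowance that is used in full; the constant and function names are illustrative only.

```python
MONTHLY_PRICE_USD = 100   # entry-tier price quoted above
CALLS_INCLUDED = 14_200   # assumed to be a monthly allowance


def cost_per_call(price_usd: float, calls: int) -> float:
    """Average price of one API call if the whole allowance is used."""
    return price_usd / calls


PER_CALL_USD = cost_per_call(MONTHLY_PRICE_USD, CALLS_INCLUDED)

if __name__ == "__main__":
    print(f"${PER_CALL_USD:.4f} per call, ${PER_CALL_USD * 1000:.2f} per 1,000 calls")
```

Under that assumption, the entry tier works out to roughly 0.7 cents per call, or about $7 per 1,000 calls.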