Brave announced a new privacy-focused AI search engine called Answer with AI that works with a proprietary search index of billions of websites. The company's current search engine already handles 10 billion search queries a year. This means Brave's AI-powered search engine is one of the largest AI search engines online.
Many in the search marketing and e-commerce communities have expressed concerns about the future of the web with AI search engines. Brave's AI search engine still shows links and, most importantly, does not by default answer commercial or transactional queries with AI, which should be good news for SEOs and online businesses. Brave says it values the web ecosystem and will monitor website visit patterns.
Search Engine Journal interviewed Josep M. Pujol, head of search at Brave, who answered questions about the search index, how it works with AI and, most importantly, what SEOs and business owners need to know to improve rankings.
AI answers powered by Brave
Unlike other AI search solutions, Brave's AI search engine is powered entirely by its own search index of crawled and ranked websites. The entire underlying technology, from the search index to the large language models (LLMs) to the retrieval-augmented generation (RAG) technology, was all developed by Brave. This is particularly good from a privacy perspective, and it makes Brave's search results unique, further distinguishing it from me-too search engine alternatives.
Search technology
The search engine itself is built entirely in-house. Josep M. Pujol, Head of Search at Brave, said:
“We have query-time access to all of our indexes, more than 20 billion pages, which means we are extracting arbitrary information (schemas, tables, snippets, descriptions, etc.) in real time. We can also make very granular decisions about which data to use, from entire paragraphs or texts on a page to single sentences or rows in a table.
Given that we have an entire search engine at our disposal, the focus is not on retrieval but on selection and ranking. Additionally, for the pages in our index, we have access to the same information used for ranking, such as scores and popularity. This is essential for selecting the more relevant sources.”
Retrieval-augmented generation (RAG)
The search engine works with a search index and large language models, with retrieval-augmented generation (RAG) technology sitting between them to keep answers fresh and fact-based. I asked about RAG, and Josep confirmed that this is how it works.
He answered:
“You are correct that our new feature uses RAG. In fact, this technique was already used in the previous Summarizer feature released in March 2023. However, this new feature expands on both the quantity and the quality of the data used in the content of the prompt.”
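The retrieve-then-generate flow that RAG refers to can be sketched in a few lines of Python. This is only an illustration of the general technique, not Brave's implementation: the `Passage` type, the three-entry `INDEX`, the word-overlap scoring and the prompt wording are all assumptions made for the example, and the final call to a language model is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    url: str
    text: str


# Hypothetical stand-in for a search index (a real one holds billions of pages).
INDEX = [
    Passage("https://example.com/a", "Brave Search launched the Summarizer feature in March 2023."),
    Passage("https://example.com/b", "Mixtral 8x7B is a sparse mixture-of-experts language model."),
    Passage("https://example.com/c", "Sourdough bread needs a long fermentation."),
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words without punctuation."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return {w for w in words if w}


def retrieve(query: str, index: list[Passage], k: int = 2) -> list[Passage]:
    """Return up to k passages that share the most words with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(p.text)), p) for p in index]
    scored = [(score, p) for score, p in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:k]]


def build_prompt(query: str, passages: list[Passage]) -> str:
    """Place the retrieved passages in the prompt so the model answers from them."""
    sources = "\n".join(
        f"[{i}] {p.text} ({p.url})" for i, p in enumerate(passages, start=1)
    )
    return (
        "Answer using only the sources below.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {query}\nAnswer:"
    )


query = "When did the Summarizer feature launch?"
print(build_prompt(query, retrieve(query, INDEX)))
```

The printed prompt is what would be handed to the language model. Because the answer is grounded in passages fetched at query time, it can stay current without retraining the model.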
Large language model used
I asked about the language models used in the new AI search engine and how they are deployed.
“The models are deployed on AWS p4 instances using vLLM.
A combination of Mixtral 8x7B and Mistral 7B is used as the main LLM.
However, we also run multiple custom-trained transformer models for auxiliary tasks such as semantic matching and question answering. These models have stricter latency requirements (10-20 ms) and are much smaller.
These auxiliary tasks are critical to this feature, as they are responsible for selecting the data that ultimately appears in the LLM prompt. This data can be query-dependent snippets of text, schemas, tabular data, or internal structured data from rich snippets. What matters is not being able to retrieve a lot of data but being able to choose which candidates to add to the prompt context.
For example, the query “French presidents by political party” processes 220 KB of raw data containing 462 rows selected from 47 tables and 7 schemas. The prompt size is approximately 6500 tokens, and the final response is only 876 bytes.
So we can say that with “Answer with AI” we went from 20 billion pages to a few thousand tokens.”
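The selection step described above, in which a large pool of candidate snippets and table rows is cut down to a prompt of a few thousand tokens, can be pictured as choosing the best-scoring candidates that fit a token budget. The sketch below is a simple greedy version under stated assumptions: the `Candidate` type, the relevance scores and the four-characters-per-token estimate are hypothetical, and Brave's actual selection models are not public.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    text: str
    score: float  # hypothetical relevance score from a ranking model


def estimate_tokens(text: str) -> int:
    """Rough heuristic (an assumption): about four characters per token."""
    return max(1, len(text) // 4)


def select_for_prompt(candidates: list[Candidate], budget_tokens: int) -> list[Candidate]:
    """Greedily keep the highest-scoring candidates that still fit in the budget."""
    chosen: list[Candidate] = []
    used = 0
    for cand in sorted(candidates, key=lambda c: c.score, reverse=True):
        cost = estimate_tokens(cand.text)
        if used + cost <= budget_tokens:
            chosen.append(cand)
            used += cost
    return chosen


# Made-up candidates for a query about French presidents and their parties.
candidates = [
    Candidate("Emmanuel Macron | Renaissance", 0.92),
    Candidate("François Hollande | Socialist Party", 0.88),
    Candidate("Recipe: classic French onion soup with gruyère", 0.05),
]

for cand in select_for_prompt(candidates, budget_tokens=16):
    print(cand.text)
```

A greedy pass like this is easy to follow but not optimal. A production system would more likely combine several signals and deduplicate overlapping rows before filling the prompt.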
How AI handles local search results
Next, I asked how the new search engine would handle local searches. I asked Josep if he could share some scenarios and example queries where the AI answer engine surfaces local businesses. For example, if I run a query about the best burgers in San Francisco, will the AI answer engine provide an answer and links for it? Would this be useful for someone planning a business or vacation trip?
Josep replied:
“The Brave Search index has over 1 billion location-based schemas from which we can extract over 100 million companies and other points of interest.
Answer with AI is an umbrella term for search, LLMs, and multiple specialized machine learning models and services for retrieving, ranking, cleaning, combining, and representing information. I mention this because the LLM does not make all the decisions. Currently, we primarily use it to synthesize unstructured and structured information. This happens in query-time operations as well as in offline operations.
In some cases, the end result may feel heavily influenced by the LLM (this is the case when the answer to the user's question is a single point of interest, e.g. “Check in for farro food”). In other cases, its work is more nuanced (e.g. “best hamburger”), such as generating business descriptions across different web references or consolidating business categories under a consistent taxonomy.”
Tips for ranking well
I then asked whether using Schema.org structured data can help a site rank better on Brave, and whether he had any other tips for SEOs and online businesses.
He answered:
“We definitely pay special attention to schema.org structured data when building the context for the LLM prompt. Be prepared: the more comprehensive these schemas are, the more accurate the answer will be.
That said, AI-powered answers can also show data about your business that isn't in these schemas, but it's always a good idea to repeat the information in different formats.
Some businesses rely solely on aggregators (Yelp, TripAdvisor, Yellow Pages) for their business information. There are benefits to adding schemas to your own business website, even if it's just for crawling bots.”
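As an illustration of the advice above, the following Python sketch builds schema.org `LocalBusiness` markup as JSON-LD and wraps it in the script tag that a business site would place in its HTML. The business details are invented, and the choice of properties is only an example: the interview does not say which schema.org properties Brave reads, so treat the field list as an assumption.

```python
import json


def local_business_jsonld(name, street, city, region, postal_code,
                          country, telephone, url, opening_hours):
    """Build a schema.org LocalBusiness description as a plain dict."""
    return {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": name,
        "url": url,
        "telephone": telephone,
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": city,
            "addressRegion": region,
            "postalCode": postal_code,
            "addressCountry": country,
        },
        "openingHours": opening_hours,
    }


def to_script_tag(data):
    """Serialize the dict as JSON-LD inside a script tag for the page's HTML."""
    # Escape "</" so a string value cannot close the script element early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


# Invented example business; every value here is a placeholder.
burger_place = local_business_jsonld(
    name="Example Burgers",
    street="123 Example St",
    city="San Francisco",
    region="CA",
    postal_code="94100",
    country="US",
    telephone="+1-415-555-0100",
    url="https://example.com",
    opening_hours=["Mo-Su 11:00-22:00"],
)
print(to_script_tag(burger_place))
```

The same facts (name, address, hours) would normally also appear as visible text on the page, which matches the advice to repeat information in different formats.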
Plan for AI search in Brave Browser
Brave shared that it plans to integrate new AI search functionality directly into the Brave Browser at some point in the near future.
Josep explained:
“Soon, we will be integrating our AI answer engine with Brave Leo (the AI assistant built into the Brave browser). Users will have the option to send the answer to Leo and continue their session there.”
Other facts
Brave's announcement also shared the following facts about the new search engine:
“Brave Search's generative answers are more than just text. Deep index and model integration allows for online, contextual, and named entity enrichment (adding more context to people, places, or things) when generating answers. This means that the answer combines generated text with other media types such as information cards and images.
The Brave Search answer engine can also combine data from the index with geographically local results to provide rich information about points of interest. To date, the Brave Search index has over 1 billion location-based schemas from which we can extract over 100 million companies and other points of interest. These lists are larger than any public dataset, meaning our answer engine can provide rich, instant results for points of interest around the world.”
Try the new AI search at http://search.brave.com/.