Search Disrupted Newsletter (Issue 7)
Apple Delays Siri because Sesame? Gemini is living life live. Build an LLM from scratch (for fun and educating yourself). Crawl and Chat.
Siri’s Delayed and Sesame’s Gain
I’ve mentioned a few times that Apple is betting big on the next version of Siri. I imagine all the Apple PMs inside their giant glass donut of a headquarters swearing up and down that they’re not going to be first at AI, but best at it.
I mention this again as Apple is delaying its smarter, more personal Siri because they’re “not ready” to launch it.
Perhaps not coincidentally, Sesame released a demo of a genuinely mind-blowing “Siri-like” interface.
You should go try it right now: https://www.sesame.com/research/crossing_the_uncanny_valley_of_voice#demo
The level of fidelity and the “realness” of it was just shocking.
As with all of these AI tools, “actually sounds like a real person” is now the floor for what people will expect from future technologies.
Build an LLM from Scratch
Large Language Models (LLMs) are the core of the changing AI and search landscape and are often considered an unknowable black box.
You can shine a light on the black box by building your own, which is what I’ve been doing by following along with the “Writing an LLM from scratch” blog post series by Giles Thomas.
What I love about this approach is that it demystifies what can seem like magic.
By building a simple version yourself, you gain intuition about:
- How tokens are processed and predicted
- The attention mechanism that powers transformers
- Why these models can seem so intelligent yet make bizarre mistakes
This isn’t going to result in an actually useful LLM, but understanding the fundamentals helps you better recognize the limits and utility of the models in production use.
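To make the attention bullet above a little more concrete, here is a minimal sketch of causal scaled dot-product attention in plain Python. It is not code from the blog series; the function names and the three toy token vectors are made up for illustration, and it assumes the queries, keys, and values are handed in directly, where a real model would first produce them with learned weight matrices.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def causal_attention(queries, keys, values):
    """Scaled dot-product attention where each token sees only itself and earlier tokens."""
    dim = len(keys[0])
    outputs = []
    for i, q in enumerate(queries):
        # Score the current token against every token up to and including itself.
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(dim)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # Blend the visible value vectors using the attention weights.
        mixed = [
            sum(w * values[j][d] for j, w in enumerate(weights))
            for d in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs

# Three made-up 2-dimensional token vectors (an assumption for this demo),
# used as queries, keys, and values alike.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = causal_attention(tokens, tokens, tokens)
for row in attended:
    print([round(x, 3) for x in row])
```

The first token can only attend to itself, so its output is its own vector. Later tokens come out as weighted blends of everything before them, which is the "looking back at context" behavior that makes transformers work.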
Knowatoa adds new Perplexity models
It’s been really interesting to see the moves that Perplexity has made regarding its internal search models. Although it isn’t a “foundation” model company, it is agile in adopting open-source models and putting them into production.
Knowatoa users on paid plans can now check their Perplexity search rankings on the latest Sonar Perplexity model.
This integration allows you to see how your content performs specifically in Perplexity’s search results, which is becoming increasingly important as more users adopt AI-powered search tools. The Sonar model represents Perplexity’s latest approach to ranking and surfacing content.
Crawl and Chat
JR Oakes (whom I first met at last year’s amazing TechSEOConnect conference in Raleigh) released a new open-source tool called “Crawl and Chat”. This tool allows you to crawl a website and then chat with it.
LinkedIn Announcement: https://www.linkedin.com/posts/jr-oakes-3660546_rag-activity-7303907804526829569-euvA
GitHub Repo: https://github.com/jroakes/CrawlnChat
The tool is particularly useful for:
- Content audits and analysis
- Competitive research
- Quick information extraction from large websites
- Understanding the structure and focus of a site
It’s open source, and he’s looking for pull requests and feedback.
Grounding LLMs
Someone smarter than me once said, “LLMs are good at all the things computers aren’t and bad at all the things computers are good at.”
We’re also seeing more services that combine LLMs with other tools to get the best of both; in other words, grounding LLMs.
For search, this most often means comparing the results from an LLM to the results from a traditional search engine for citations and links.
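As a rough illustration of that search comparison, here is a small sketch that checks which URLs cited by an LLM also show up in a traditional search engine's results. Everything here is hypothetical: the function names, the normalization rules, and the example.com URLs are assumptions for the demo, not how any particular product does it.

```python
from urllib.parse import urlsplit

def normalize(url):
    """Reduce a URL to host + path so trivial differences don't block a match."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/")
    return host + path

def check_citations(llm_citations, search_results):
    """Split LLM-cited URLs into those a search engine also returned and those it did not."""
    known = {normalize(u) for u in search_results}
    grounded = [u for u in llm_citations if normalize(u) in known]
    ungrounded = [u for u in llm_citations if normalize(u) not in known]
    return grounded, ungrounded

# Made-up inputs: URLs an LLM cited, and URLs a search engine returned.
llm_citations = [
    "https://www.example.com/pricing/",
    "https://example.com/blog/made-up-post",
]
search_results = [
    "http://example.com/pricing",
    "https://example.com/about",
]

grounded, ungrounded = check_citations(llm_citations, search_results)
print("Backed by search results:", grounded)
print("Not found in search results:", ungrounded)
```

A citation that no search engine can find isn't automatically wrong, but it is a good candidate for a closer look.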
In other areas, it’s things like the Mayo Clinic using a system with “Reverse RAG” to ground LLMs in structured medical knowledge - https://venturebeat.com/ai/mayo-clinic-secret-weapon-against-ai-hallucinations-reverse-rag-in-action/
Thanks this week
Cheers to Tyler Einberger for his kind words about our last newsletter and to Ann Smarty (with whom I had previously done a livestream on AI Search) for her thoughts and suggestions on how we can improve Knowatoa.
Cheers!

p.s. It would really help me out if you could Follow me on LinkedIn