AI Devtools Are Eating the World

In this inaugural edition, we look at why AI code search is trending, GPT-4 rumours, Stanford’s Human Preferences Dataset and new AI tools for developers.

Louis’ AI Devtools: Tuesday 28th February, 2023

Two years ago, YC backed my cofounder Gabe and me to build bloop, a natural language code search engine. Each week I’ll sift through the noise to try and understand what might stick to the wall and why.

What’s Trending: AI Code Search 🔥

Copilot dominates the IDE’s edit pane, but AI code search is everywhere else. The space has exploded in the last few months: there are tons of new demos, and established projects that were only semi-useful a few months ago now feel like they could be reaching a tipping point. Latency is the key factor influencing how these products are designed, since davinci-003 and Claude are both slow by traditional search benchmarks. Below is a brief overview of a few companies in the space and how they’re tackling this issue:

Phind - Search engine for developers, with LLM-generated explanations of web search results. Uses their own model under the hood (possibly a fine-tuned flan-t5), which produces summaries comparable to davinci-003 but about 3x faster (from eyeballing their site).

Buildt - In-IDE semantic code search. Code is split into chunks and embedded for retrieval at search time. This makes search much faster than if there were an LLM in the loop. They implement lots of interesting techniques to make semantic search more accurate for this task, like a bias matrix.

bloop - Oh look, it’s my company. AI code search for large codebases. We have a hybrid approach that you could describe as “GPT-assisted semantic code search”. Local models embed code at index time and we use Anthropic’s Claude to rank and explain the semantic results. The result is both fast and slow: users see the semantic results first, and the LLM-generated explanation arrives later.

Note: all of the above are under (very) active development and approaches have been massively simplified and may have changed.
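
For anyone who wants to see the chunk-and-embed idea in code, here is a minimal sketch. It is not how any of the products above actually work: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the fixed-size line chunking is deliberately naive, and every name in it (`chunk_code`, `build_index`, `search`, `EXAMPLE_SOURCE`) is made up for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical mini "codebase" used only for this illustration.
EXAMPLE_SOURCE = """def parse_config(path):
    with open(path) as f:
        return json.load(f)

def send_email(to, subject, body):
    server = smtplib.SMTP("localhost")
    server.sendmail("noreply@example.com", to, body)

def resize_image(img, width, height):
    return img.resize((width, height))
"""


def chunk_code(source: str, max_lines: int = 4) -> list[str]:
    """Split source text into fixed-size line chunks (a naive strategy)."""
    lines = source.splitlines()
    return ["\n".join(lines[i:i + max_lines]) for i in range(0, len(lines), max_lines)]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag of lowercase word tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(source: str, max_lines: int = 4) -> list[tuple[str, Counter]]:
    """Index time: chunk the code and embed every chunk once."""
    return [(chunk, embed(chunk)) for chunk in chunk_code(source, max_lines)]


def search(index: list[tuple[str, Counter]], query: str, top_k: int = 1) -> list[str]:
    """Search time: embed only the query and rank chunks by similarity."""
    query_vector = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


if __name__ == "__main__":
    index = build_index(EXAMPLE_SOURCE)
    print(search(index, "send an email")[0])
```

The point of the split is that the expensive work (embedding every chunk) happens once at index time, so a query only costs one embedding plus a similarity ranking. An LLM can then be layered on top of the returned chunks to rank or explain them, at the price of extra latency.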

New and useful tools 🐣 

  • Raycast AI - Add AI everywhere on macOS, and also create your own AI macros

  • kapa.ai - Technical support discord bot for communities

  • Enhance AI - Add LLM completions to any site with 2 lines of JS

  • Wizi AI - Code search for frontend developers

  • what the cron? - Turn natural language into cron

Research and resources 🛠️ 

Stanford released its Human Preferences Dataset (SHP), a large open-source dataset derived from Reddit that can be used to train RLHF reward models. Conveniently, they’ve also released a fine-tuned flan-t5 model.

Facebook released LLaMA, a 65B-parameter LLM which supposedly has GPT-3 level performance. It’s almost a devtools-problem dream, except that, despite their stated ‘commitment to open science’, the weights are only available via a sign-up form and for non-commercial use 😭

DAIR.AI released a huge prompt engineering guide. It’s got a very comprehensive library of papers on the subject.

Someone found an unprotected Google Doc with what are probably GPT-4 details and pricing. It mentions a new ‘GPT-3.5 Turbo’ model, which is consistent with the recent update to ChatGPT Plus that made completions much faster. There’s also a reference to a new ‘DV’ model, which I assume is GPT-4, with a max context length of 32k tokens. Improvements in both speed and context length are especially important for engineers using AI devtools on larger codebases, where the models need to consider more possible answers.

Jobs 👷 

  • Shameless plug for bloop, we’re hiring a Front End Engineer

  • Have an AI devtools job? I’d love to feature it

Of course there are far too many developments happening to summarise in one newsletter. If you’d like to see something in the next edition, please reach out!