CommunityNews

CommunityNews

What I Learned Building a Free Semantic Search Tool for GitHub and Why I Failed

Over the last few months, I have built and launched a free semantic search tool for GitHub called SemHub. In this blog post, I share what I’ve learned and why I’ve failed, so that other builders can learn from my experience. This blog post runs long and I have sign-posted each section. I have marked the sections that I consider the particularly insightful with an asterisk (*).

I have also summarized my key lessons here:

  1. Default to pgvector, avoid premature optimization.
  2. You probably can get away with shorter embeddings if you’re using Matryoshka embedding models.
  3. Filtering with vector search may be harder than you expect.
  4. If you love full stack TypeScript and use AWS, you’ll love SST. One day, I wish I can recommend Cloudflare in equally strong terms too.
  5. Building is only half the battle. You have to solve a big enough problem and meet your users where they’re at.

Read in full here:

Where Next?

Popular General Dev topics Top

First poster: bot
Neovim nightly, v0.5.0 and v0.4.4 has been released. Link: Release Nvim development (prerelease) build · neovim/neovim · GitHub Link:...
New
First poster: dyowee
Everyone seems to be striving for ‘clean’ code at the moment. You can’t read a blog post without the author telling you how clean their a...
New
First poster: bot
The overengineered Solution to my Pigeon Problem. TL;DR: I built a wifi-equipped water gun to shoot the pigeons on my balcony, controlle...
New
First poster: dani
The pool of talented C++ developers is running dry. Highly sought after, rarely provided.
New
First poster: Korbin73
Whatever happened to Elm, anyway?. I see this question pop up quite frequently in lots of different arenas - folks are curious as to wha...
New
First poster: joeb
GitHub - crablang/crab: A community fork of a language named after a plant fungus. All of the memory-safe features you love, now with 100...
New
New
CommunityNews
GitHub - ItzCrazyKns/Perplexica: Perplexica is an AI-powered search engine. It is an Open source alternative to Perplexity AI. Perplexic...
New
First poster: dyowee
olmOCR is an open-source tool for converting PDFs to text with high accuracy, preserving reading order and supporting tables, equations, ...
New
CommunityNews
GitSyncPad is an innovative micro keypad designed for effortless Git version control. Execute commands like git add, git commit, and git ...
New

Other popular topics Top

ohm
Which, if any, games do you play? On what platform? I just bought (and completed) Minecraft Dungeons for my Nintendo Switch. Other than ...
New
AstonJ
We have a thread about the keyboards we have, but what about nice keyboards we come across that we want? If you have seen any that look n...
New
AstonJ
In case anyone else is wondering why Ruby 3 doesn’t show when you do asdf list-all ruby :man_facepalming: do this first: asdf plugin-upd...
New
rustkas
Intensively researching Erlang books and additional resources on it, I have found that the topic of using Regular Expressions is either c...
New
Maartz
Hi folks, I don’t know if I saw this here but, here’s a new programming language, called Roc Reminds me a bit of Elm and thus Haskell. ...
New
PragmaticBookshelf
Rails 7 completely redefines what it means to produce fantastic user experiences and provides a way to achieve all the benefits of single...
New
AstonJ
Was just curious to see if any were around, found this one: I got 51/100: Not sure if it was meant to buy I am sure at times the b...
New
PragmaticBookshelf
Author Spotlight Jamis Buck @jamis This month, we have the pleasure of spotlighting author Jamis Buck, who has written Mazes for Prog...
New
PragmaticBookshelf
Author Spotlight: VM Brasseur @vmbrasseur We have a treat for you today! We turn the spotlight onto Open Source as we sit down with V...
New
DevotionGeo
I have always used antique keyboards like Cherry MX 1800 or Cherry MX 8100 and almost always have modified the switches in some way, like...
New