Brands.place is a tool that makes it easier to find brands - and therefore products - that are worth buying. Users can submit brands, whose stores are then crawled for keywords. The site makes it easy to search for "hiking socks" and see a list of companies making hiking socks outside of the usual big-box brands and brandless Amazon sellers.
I built it because despite having the internet at our fingertips, it’s challenging to find things that are thoughtfully designed and meet our own unique needs. For example, I once spent six months shopping - on and off - for a backpack. Say what you will about that kind of dedication, but it’s the perfect bag for me. It serves as my daily laptop bag, my camping backpack, and my day trip bag, all in one. The hardest part of finding it was knowing where to look.
Luckily, I had the itch to tackle a web project with multiple moving parts. My background is in systems software, so everything from picking a font to thinking about database queries was firmly outside my comfort zone. I had seen a modern web stack at a previous company, but had yet to understand how it all worked end to end. I set out to learn as much as I could, and hopefully make something that I could share with others.
I knew from the start that Brands.place would be a side project. It needed to be easy to maintain and cheap to run. Once it was done, I would only be able to give it an hour or two a week. Anything complicated would need to be automated or documented, and the site would need to be efficient enough to run on cheap server instances.
At a high level, the site is a set of components: a website, a search server, a database, a web scraper, and a deployment script that work together to serve requests, load new brands, and keep everything running.
The frontend communicates with the backend as a gRPC client. I named the API subsystem transit. If you're unfamiliar, traditional web APIs exchange data with HTTP requests and JSON, in a design pattern called REST. gRPC abstracts the web requests away from the application and sends encoded binary messages in place of human-readable text. The encoded data results in substantially smaller messages than REST. Even so, browser support is still maturing and it is unusual to see gRPC as part of a web API. To keep compatibility with older browsers, I used Envoy to translate ordinary HTTP requests into spec-compliant gRPC sessions. I wanted the website to run on low-cost instances, and was willing to work around bugs if it meant cheaper hosting.
A graph database, dgraph, stores the brand and user data. Traditional relational databases store information in tables and rows, and link rows together with IDs. Each ID lookup involves searching the table for that ID, so if a particular keyword is associated with n brands, it takes n lookups on the brand table to get the data behind each ID. Graph databases instead start from a root node and link any associated data by its actual address, skipping the search process for every node except the root. Because of that direct access, dgraph can distribute data across any number of machines in a cluster and execute complicated queries without the overhead of a traditional database. Queries do need a root node, so each brand is tagged with keywords during indexing. A search on brands.place then costs only one slow lookup per keyword, no matter how many brands are associated with it. The structure of the database also allows deeply nested queries, like finding all keywords associated with a user's favorite brands or finding brands similar to another brand, to run in nearly constant time on a sufficiently powerful cluster.
Between the gateway, the backend itself, and the database, there are five services running the backend. The original plan was to start with docker-compose in development and move to Kubernetes in production, but after testing a production deployment and seeing a $70 bill for the month, it became clear that Kubernetes carried far too much overhead for the project. Failover, redundancy, and load balancers were completely unnecessary. I moved back to docker-compose and a simple deploy script, bringing hosting costs down to $5/month. In the process, I noticed average latency drop from 20ms to 10ms per request.
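A compose file for a stack like this might look roughly as follows. This is a sketch, not the project's actual configuration: the service names, images, ports, and the split of dgraph into its usual zero/alpha/ratel trio are all assumptions on my part.

```yaml
# Hypothetical sketch of a five-service compose file.
version: "3"
services:
  envoy:            # gateway: translates browser requests to gRPC
    image: envoyproxy/envoy:v1.17.0
    ports: ["443:8080"]
  transit:          # the gRPC backend
    image: registry.example.com/brands/transit:latest
    depends_on: [dgraph-alpha]
  dgraph-zero:      # dgraph cluster coordinator
    image: dgraph/dgraph:latest
    command: dgraph zero
  dgraph-alpha:     # dgraph data node
    image: dgraph/dgraph:latest
    command: dgraph alpha --zero dgraph-zero:5080
  dgraph-ratel:     # dgraph query UI
    image: dgraph/dgraph:latest
    command: dgraph-ratel
```

On a single small instance, one file like this replaces everything Kubernetes was doing, minus the scheduler, control plane, and per-node overhead.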
Deploying the code is an involved process, so I made sure I couldn't break it. A script kicks off the process, first making sure that any necessary infrastructure (certificates, servers, the firewall, and the load balancer) has been created with terraform. The website is built and pushed to a free static content host. The backend container is built and packaged, and then the proxy and backend containers are pushed to a package repository. The host is sent temporary keys and pulls the newly created packages from the repository. Finally, the host starts the proxy, backend, and database.
Brands tend to use standard ecommerce website tools, so metadata is typically structured the same way across brand websites, making it easy to scrape high-level keywords from the brand's description. But brands often do not explain what products they sell in that description. To properly index products, the scraper crawls every product page and passes each product title to a third-party keyword extractor called prose. Prose tends to under-extract terms from titles, and tends to over-extract from the longer product descriptions. This is an ongoing issue, and a big blocker to a full release. Furthermore, crawling many store fronts from the same host significantly increases the chance that the ecommerce platform flags the scraper for abuse. As an early design decision, the scraper was put in its own package so that it could easily be run on a different instance to get around rate limits, and be resilient to crashes from malicious user submissions or bugs.
Learnings and Observations
As expected, I learned a ton while building the tool, both in software and in overall systems architecture. The journey was more important to me this time, and I'm happy with how I built it; it was good practice in making wise engineering decisions and thinking through tradeoffs.
Standards are standard for a reason
I wanted to try something new with this project, so it didn't bother me that I was using new and sometimes unsupported software, though it certainly slowed down development. gRPC is faster, but when a query takes 10ms total, even an 80% increase in request time would still be sufficiently fast. Configuring and deploying the Envoy proxy took up a significant chunk of time early on, and the extra complexity made debugging transit slightly more painful when things broke. If the backend had no dependencies aside from the database, standard “app” hosts like Heroku or App Engine would have allowed me to deploy easily and cut hosting costs further, even after accounting for the higher computational load of JSON.
In the same way, dgraph is a newer project that is still figuring out its own documentation, and has little real community support. SQL has tools for migrations, query generation, and debugging that do not yet exist for dgraph and would have helped get things moving faster.
Focus on one thing at a time
I was working with new concepts, so switching between disjoint areas of the codebase slowed me down more than expected. At one point in the middle of development, I found myself relearning Envoy's config to make a minor change, then relearning both docker-compose and Kubernetes to fix the development and production environments. I was making a lot of mistakes for the first time, so building incrementally and jumping around did allow me to build exactly what I needed in each area, but at the cost of extra debug time. Some back and forth is to be expected, but minimizing context switching with more planning could have sped up some of the early design.
Frontend is just not my thing. Code that I write rarely comes with a visual interface, but at the same time I wanted other people to feel happy while using it. After dealing with awkward input fields, background colors, and obscure browser bugs, I was ready to be done. It would have been better to leave the UI for last, because I didn't need anything fancy to test the rest of the system. I stuck with it because I wanted to understand what I was doing. In retrospect, I had confused my intent to learn with my desire to make something nice. There is a time for each, but I was fighting myself and taking time away from areas that needed more work. If this were my second or third frontend project and I wanted autocomplete or live searching, I could do it. Diving in without knowing where the bottom is can turn out poorly, and this was a good reminder to take small steps when wading into unknown waters.
Stay close to what you love
I tend to do my best work when I'm really interested in the topic. This is common, I think. Looking back, my best work went into writing performant queries and building out the deployment process, both relatively systems-heavy topics. Consequently, that's where I wrote most of the tests, help, and documentation. Tagging brands with keywords, laying out the homepage, and figuring out search took the fun out of the project. I'm glad I stuck with it, but those areas still need help if the project ever grows.
Creating something from scratch is always rewarding, and I am happy with the new experience. It was an especially good exercise in making architectural decisions for an entire project. Lower-level software doesn't always give us the joy of seeing everyday people using and appreciating our work. So, while I had the vision of a beautiful website in the same class as any Silicon Valley unicorn, I can live with what it is. Even in its current state, I'm sure that Brands.place will one day become a staple of quality in the world of online shopping. But for now, I'm just happy that it's fast, cheap to run, and almost shows me what I'm looking for.