Erik Bernhardsson
24 November 00:00
Storm in the stratosphere: how the cloud will be reshuffled
   Here’s a theory I have about cloud vendors (AWS, Azure, GCP): Cloud vendors will increasingly focus on the lowest layers in the stack: basically leasing capacity in their data centers through an API. Other pure-software providers will build all the stuff on top of it.
23 July 00:00
What is the right level of specialization? For data teams and anyone else.
   This isn’t as much of a blog post as an elaboration of a tweet I posted the other day: I think this specialization of data teams into 99 different roles (data scientist, data engineer, analytics engineer, ML engineer etc) is generally a bad thing driven by the fact that tools are bad and too hard...
07 July 00:00
Building a data team at a mid-stage startup: a short story
   I guess I should really call this a parable. The backdrop is: you have been brought in to grow a tiny data team (4 people) at a mid-stage startup (10M annual revenue), although this story could take place at many different types of companies.
19 April 00:00
Software infrastructure 2.0: a wishlist
   Software infrastructure (by which I include everything ending with aaS, or anything remotely similar to it) is an exciting field, in particular because (despite what the neo-luddites may say) it keeps getting better every year. I love working with something that moves so quickly.
01 April 00:00
What’s Erik up to?
   I joined Better in early 2015 because I thought the team was crazy enough to actually change one of the largest industries in the US. For six years, I ran the tech team, hiring 300 people, probably doing 2,000 interviews, and according to GitHub I added 646,941 lines of code and removed 339,164.
16 December 00:00
Giving more tools to software engineers: the reorganization of the factory
   It’s a popular attitude among developers to rant about our tools and how broken things are. Maybe I’m an optimistic person, because my viewpoint is the complete opposite. I had my first job as a software engineer in 1999, and in the last two decades I’ve seen software engineering changing in ways...
06 October 00:00
Developer experience as a competitive advantage
   I spent a ton of time looking at different software providers, both as a CTO, and as a nerd advanced consumer who builds stuff in my spare time. In the last 10 years, there has been an order of magnitude more products that cater directly to developers, through APIs, SDKs, and tooling.
23 September 00:00
Mortality statistics and Sweden’s dry tinder effect
   We live in a year of about 350,000 amateur epidemiologists and I have no desire to join that club. But I read something about COVID-19 deaths that I thought was interesting and wanted to see if I could replicate it through data.
08 June 00:00
How to set compensation using commonsense principles
   Compensation has always been one of the most confusing parts of management to me. Getting it right is obviously extremely important. Compensation is what drives our entire economy, and you could look at the market for labor as one gigantic resource-allocating machine in the same way as people look...
10 March 00:00
Never attribute to stupidity that which is adequately explained by opportunity cost
   Hanlon’s razor is a classic aphorism I’m sure you have heard before: Never attribute to malice that which can be adequately explained by stupidity. I’ve found that neither malice nor stupidity is the most common reason when you don’t understand why something is in a certain way.
13 January 00:00
How to hire smarter than the market: a toy model
   Let’s consider a toy model where you’re hiring for two things and that those are equally valuable. It’s not very important what those are, so let’s just call them thing A and thing B for now.
19 December 00:00
What can startups learn from Koch Industries?
   I recently finished the excellent book Kochland. This isn’t my first interest in Koch: I read The Science of Success by Charles Koch himself a couple of years ago. Charles Koch inherited a tiny company in 1967 and turned it into one of the world’s largest ones.
09 December 00:00
We’re hiring at Better
   Just a quick note that my team is always hiring at Better. A lot of new people have been joining the team here in NYC lately: the tech team has actually grown from 35 to 60 in just 3 months.
16 October 00:00
Buffet lines are terrible, but let’s try to improve them using computer simulations
   My company has a buffet every Friday, and the lines grow to epic proportions when the food arrives. I’ve suspected for years that the classic buffet line system is a deeply flawed and inefficient method, and every time I’m stuck in the line has made me more convinced.
26 September 00:00
Miscellaneous unsolicited (and possibly biased) career advice
   No one asked for this, but I’m something like 12 years into my career and have had my fair share of mistakes and luck so I thought I’d share some. Honestly, I feel like I’ve mostly benefitted from luck.
05 August 00:00
Modeling conversion rates using Weibull and gamma distributions
   This is a blog post originally featured on the Better engineering blog. If you want to link to this article or share it, please go to the original post URL. Separately, I’m sorry it’s been so long with no posts on this blog.
15 April 00:00
Why software projects take longer than you think: a statistical model
   Anyone who has built software for a while knows that estimating how long something is going to take is hard. It’s hard to come up with an unbiased estimate of how long something will take, when fundamentally the work in itself is about solving something.
21 February 00:00
Headcount goals, feature factories, and when to hire those mythical 10x people
   When I started building up a tech team for Better, I made a very conscious decision to pay at the high end to get people. I thought this made more sense: they cost a bit more money to hire, but output usually more than compensates for it.
10 January 00:00
Data architecture vs backend architecture
   A modern tech stack typically involves at least a frontend and backend but relatively quickly also grows to include a data platform. This typically grows out of the need for ad-hoc analysis and reporting but possibly evolves into a whole oil refinery of cronjobs, dashboards, bulk data copying, and...
08 October 00:00
The hacker’s guide to uncertainty estimates
   It started with a tweet: "New years resolution: every plot I make during 2018 will contain uncertainty estimates" - Erik Bernhardsson (@bernhardsson), January 7, 2018. Why? Because I’ve been sitting in 100,000,000 meetings where people endlessly debate whether the monthly number of widgets is going up...
30 August 00:00
I don’t want to learn your garbage query language
   This is a bit of a rant but I really don’t like software that invents its own query language. There’s a trillion different ORMs out there. Another trillion databases with their own query language. Another trillion SaaS products where the only way to query is to learn some random query DSL they made...
16 August 00:00
Business secrets from terrible people
   I get bored reading management books very easily and lately I’ve been reading about a wide range of almost arbitrary topics. One of the lenses I tend to read through is to see different management styles in different environments.
17 June 00:00
New approximate nearest neighbor benchmarks
   As some of you may know, one of my side interests is approximate nearest neighbor algorithms. I’m the author of Annoy, a library with 3,500 stars on Github as of today. It offers fast approximate search for nearest neighbors with the additional benefit that you can load data super fast from disk...
04 June 00:00
Missing the point about microservices: it’s about testing and deploying independently
   Ok, so I have to first preface this whole blog post by a few things: I really struggle with the term microservices. I can’t put my finger on exactly why. Maybe because the term is hopelessly ill-defined, maybe because it’s gotten picked up by the hype train.
02 May 00:00
Interviewing is a noisy prediction problem
   I have done roughly 2,000 interviews in my life. When I started recruiting, I had so much confidence in my ability to assess people. Let me just throw a couple of algorithm questions at a candidate and then I’ll tell you if they are good or not.
27 March 00:00
Waiting time, load factor, and queueing theory: why you need to cut your systems a bit of slack
   I’ve been reading up on operations research lately, including queueing theory. It started out as a way to understand the very complex mortgage process (I work at a mortgage startup) but it’s turned into my little hammer and now I see nails everywhere.
07 March 00:00
Lessons from content marketing myself (aka blogging) for five years
   I started writing this blog in late 2012, partly because I felt like it would help me improve my English and my writing skills, partly because I kept having a lot of random ideas in my head and I wanted to write them down somewhere.
15 February 00:00
New benchmarks for approximate nearest neighbors
   UPDATE(2018-06-17): There is a later blog post with newer benchmarks. One of my super nerdy interests is approximate algorithms for nearest neighbors in high-dimensional spaces. The problem is simple. You have, say, 1M points in some high-dimensional space.
28 January 00:00
I’m looking for data engineers
   I’m interrupting the regular programming for a quick announcement: we’re looking for data engineers at Better. You would be the first one to join and would work a lot directly with me. Some fun things you could work on (these are all projects I’m working on right now):
17 January 00:00
Books I consumed in 2017
   Turns out having a toddler isn’t super compatible with reading. I used to read 100 books a year as a teenager, but it has slowly deteriorated to maybe 20-30 books, at most. And I don’t even finish all of them because life is too short.
03 January 00:00
Plotting author statistics for Git repos using Git of Theseus
   I spent a few days during the holidays fixing up a bunch of semi-dormant open source projects and I have a couple of blog posts in the pipeline about various updates. First up, I made a number of fixes to Git of Theseus which is a tool (written in Python) that generates statistics about Git...
29 December 00:00
Toxic meeting culture
   I spent six years at a company that went from 50 people to 1500 and one contributing factor leading to my departure was that I went from a maker to a person stuck in meetings every day.
12 December 00:00
Learning from users faster using machine learning
   I had an interesting idea a few weeks ago, best explained through an example. Let’s say you’re running an e-commerce site (I kind of do) and you want to optimize the number of purchases. Let’s also say we try to learn as much as we can from users, both using A/B tests but also using just basic...
26 November 00:00
Annoy 1.10 released, with Hamming distance and Windows support
   I’ve been a bit bad at posting things with a regular cadence lately, partly because I’m trying to adjust to having a toddler, partly because the hunt for clicks has caused such a high bar for me that I feel like I have to post something Pulitzer-worthy.
30 October 00:00
Why conversion matters: a toy model
   There are often close relationships between top level business metrics. For instance, it’s well known that retention has a super strong impact on the valuation of a subscription business. Or that the % of occupied seats is super important for an airline.
26 September 00:00
On the Equifax breach and how to really prevent identity theft
   A funny thing about being a foreigner is how you realize people take broken things for granted. I’m going to go out on a limb here claiming that the US has a pretty dumb banking system.
06 September 00:00
The number of letters in the word for each number
   Just for fun, I generated these graphs of the number of letters in the word for each number. I really spent about 10 minutes on this (ok...possibly also another 40 minutes tweaking the plots): More languages
29 August 00:00
The software engineering rule of 3
   Here’s a dumb extremely accurate rule I’m postulating for software engineering projects: you need at least 3 examples before you solve the right problem. This is what I’ve noticed: Don’t factor out shared code between two classes.
19 August 00:00
Machine, Platform, Crowd
   I just bought Machine, Platform, Crowd: Harnessing Our Digital Future and discovered that it mentions my blog - in particular the post When machine learning matters. Ok, I lied a little bit. I didn’t discover it serendipitously.
14 August 00:00
Google diversity memo, global warming, Pascal’s wager, and other stuff
   There’s about 765 million blog posts about the diversity memo that leaked out of Google a couple of weeks ago. I think the case for any biological difference is pretty weak, and it bothers me when people refer to an interest gap as anything else than caused by the environment.
12 July 00:00
Fun with trigonometry: the world’s most twisted coastline
   I just spent a few days in Italy, on the Ligurian coast. Even though we were on the west side of Italy, the Mediterranean sea was to the east, because the house was situated on a long bay.
06 July 00:00
Optimizing for iteration speed
   I’ve written before about the importance of iterating quickly but I didn’t necessarily talk about some concrete things you can do. When I’ve built up the tech team at Better, I’ve intentionally optimized for fast iteration speed above almost everything else.
09 June 00:00
   Remember when everyone had a really ugly blog with a blogroll? Anyway, just think the word is funny. I follow a few hundred blogs using Feedly and Reeder and have been reading a few hundred thousand blog posts over the last 10 years.
23 May 00:00
Conversion rates - you are (most likely) computing them wrong
   How hard can it be to compute conversion rate? Take the total number of users that converted and divide it by the total number of users. Done. Except... it’s a lot more complicated when you have any sort of significant time lag.
09 April 00:00
The mathematical principles of management
   I’ve read about 100 management books by now but if there’s something that always bothered me it’s the lack of first principles thinking. Basically it’s a ton of heuristics. And heuristics are great, but when you present heuristics as true objectives, it kind of clouds the underlying objectives (and...
15 March 00:00
The eigenvector of Why we moved from language X to language Y
   I was reading yet another blog post titled Why our team moved from <language X> to <language Y> (I forgot which one) and I started wondering if you can generalize it a bit. Is it possible to generate an N × N contingency table of moving from language X to language Y?
17 February 00:00
Why I went into the mortgage industry
   I just realized last Thursday that I have spent two full years at Better, incidentally on the same day as we announced a 15M round led by Kleiner Perkins. So it was a good point to reflect a bit and think back - what the F led me to abandon my role managing the machine learning team at Spotify?
01 February 00:00
Language pitch
   Here’s a fun analysis that I did of the pitch (aka. frequency) of various languages. Certain languages are simply pronounced with lower or higher pitch. Whether this is a feature of the language or more a cultural thing is a good question, but there are some substantial differences between...
10 January 00:00
Functional programming is the libertarianism of software engineering
   This is a pretty dumb post, in which I argue that functional programming has a lot of the bad parts of libertarianism and a lot of the good parts: Both ideologies strive to eliminate [the] state.
05 December 00:00
The half-life of code & the ship of Theseus
   As a project evolves, does the new code just add on top of the old code? Or does it replace the old code slowly over time? In order to understand this, I built a little thing to analyze Git projects, with help from the formidable GitPython project.