atlas news
    
David’s blog
31 May, 10:28
How to set up a reverse SSH tunnel with Amazon Web Services
David Lindelöf    When the startup shut down there were still dozens of netbooks out there in the wild collecting data on the residential houses fitted with our adaptive heating control algorithms, hopelessly attempting to connect to our VPN server that didn’t exist anymore in order to upload all that data to our...
17 May, 06:16
Deep silence or deep work
David Lindelöf    It’s Monday afternoon. It’s a holiday but I have a couple of things to catch up on from last week that I didn’t finish. The rest of the family is either on holiday camp or taking a nap in the bedroom. I’m working from home. But the home is anything but silent. I can hear the...
3 May, 06:33
Is The Ratio of Normal Variables Normal?
David Lindelöf    In Trustworthy Online Controlled Experiments I came across this quote, referring to a ratio metric $M = \frac{X}{Y}$, which states that: “Because X and Y are jointly bivariate normal in the limit, M, as the ratio of the two averages, is also normally distributed.” That’s only partially true....
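To make the "only partially true" point concrete, here is a minimal simulation sketch. It is not taken from the post: the function names, sample size, seed, and the 5-IQR threshold are all assumptions chosen for illustration. It relies on a standard fact, namely that the ratio of two independent zero-mean normal variables follows a Cauchy distribution, which has much heavier tails than a normal. When the denominator's mean is far from zero, the ratio behaves much more like a normal variable.

```python
import random
import statistics


def ratio_samples(mean_x, mean_y, n=20_000, seed=0):
    """Draw n samples of X / Y with X ~ N(mean_x, 1) and Y ~ N(mean_y, 1), independent."""
    rng = random.Random(seed)
    return [rng.gauss(mean_x, 1.0) / rng.gauss(mean_y, 1.0) for _ in range(n)]


def tail_fraction(samples, k=5.0):
    """Fraction of samples lying more than k interquartile ranges from the median."""
    q1, median, q3 = statistics.quantiles(samples, n=4)
    iqr = q3 - q1
    return sum(abs(s - median) > k * iqr for s in samples) / len(samples)


# Denominator centred on zero: the ratio is Cauchy, so extreme values are common.
heavy = tail_fraction(ratio_samples(0.0, 0.0))

# Denominator centred far from zero: the ratio is close to normal, extremes are rare.
light = tail_fraction(ratio_samples(10.0, 10.0))

print(f"tail fraction, zero-mean denominator:  {heavy:.4f}")
print(f"tail fraction, denominator mean of 10: {light:.4f}")
```

For a truly normal variable, 5 interquartile ranges is nearly 7 standard deviations, so the second fraction should be essentially zero, while the first should be several percent.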
20 April, 08:26
Working with that data scientist
David Lindelöf    In my current team we have decided to split up the work into a number of workstreams, which are in effect subteams responsible for different aspects of the product. One workstream might be responsible for product instrumentation, another for improving the recommendation algorithms, another...
5 April, 08:33
Controlling for covariates is not the same as slicing
David Lindelöf    To detect small effects in experiments you need to reduce the experimental noise as much as possible. You can do it by working with larger sample sizes, but that doesn’t scale well. A far better approach consists in controlling for covariates that are correlated with your response. I recently gave...
22 March, 10:14
Getting into data science
David Lindelöf    A while back I had the pleasure of addressing a team of user experience researchers at YouTube, and I got asked for a few resources that could help someone pretty good at science, math, and programming who wanted to get into data science. Here’s the list I gave. These have worked for me in the...
8 March, 09:12
The law of total probability applied to a conditional probability
David Lindelöf    Dear future self, I’ve just lost again about half an hour of my life trying to find a vaguely remembered formula that generalizes the law of total probability to the case of conditional probabilities. Here it is. You’re welcome. The law of total probability says that if you can decompose the set...
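For reference, a sketch of the standard statement, which may differ in notation from the one in the post. The symbols are assumptions chosen here: $A$ is the event of interest, $C$ is the conditioning event, and the $B_i$ form a partition of the sample space, with every $P(B_i \cap C) > 0$.

```latex
% Law of total probability (unconditional form)
P(A) = \sum_i P(A \mid B_i)\, P(B_i)

% Same law with every probability additionally conditioned on C
P(A \mid C) = \sum_i P(A \mid B_i, C)\, P(B_i \mid C)
```

The second line follows from applying the first under the probability measure $P(\cdot \mid C)$, since conditioning on $C$ yields a valid probability measure.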
20 February, 09:39
XKCD on Data Science
David Lindelöf    I’ve been collecting all XKCD comics related to Data Science and/or Statistics. Here they are, but if you think I’m missing any please let me know in the comments. Use at will in your data visualizations but remember to attribute. Sorted in reverse chronological order.
6 February, 10:14
Quick note about bootstrapping
David Lindelöf    Cross-validation (the act of keeping a subset of data to measure the performance of a model trained on the rest of the data) never sounded right to me. It just doesn’t feel optimal to retain an arbitrary fraction of the data when you train your model. Oh, and then you’re also supposed to keep another...
16 June, 20:58
The most under-rated programming books
David Lindelöf    Ask any programmer what their favourite programming book is, and their answer will be one of the usual suspects: Code Complete, The Pragmatic Programmer, or Design Patterns. And rightly so; these are outstanding and highly regarded works that belong on every programmer’s bookshelf. If you’re just...