Tag: projects
Efficient type validation for Python functions
April 18, 2021#projects #python
When it comes to writing complex pipelines running in production, it’s critical to have a clear understanding of what each function does, and how its outputs affect downstream functions. But despite our best efforts to write modular, well-tested functions, bugs love hiding in the handoffs between functions, and they can be hard to catch even with end-to-end tests.
Building a full-stack spam catching app 3. Frontend & Deployment
March 21, 2021#machine-learning #projects #python
Building a full-stack spam catching app 2. Backend
March 14, 2021#machine-learning #projects #python
Building a full-stack spam catching app 1. Context
March 11, 2021#machine-learning #projects #python
Visualizing the danger of multiple t-test comparisons
May 13, 2018#projects #r #statistics
It’s often tempting to make multiple t-test comparisons when running analyses with multiple groups. If you have three groups, this logic would look like “I’ll run a t-test to see if Group A is significantly different from Group B, then another to check if Group A is significantly different from Group C, then one more for whether Group B is different from Group C.” This logic, while seemingly intuitive, is seriously flawed. I’ll use an R function I wrote, false_pos
, to help visualize why multiple t-tests can lead to highly inflated false positive rates.
Linear regression via gradient descent
April 22, 2018#machine-learning #projects #r #statistics
After hearing so much about Andrew Ng’s famed Machine Learning Coursera course, I started taking the course and loved it. (His demeanor can make any topic sound reassuringly simple!) Early in the course, Ng covers linear regression via gradient descent. In other words, given a series of points, how can we find the line that best represents those points? And to take it a step further, how can we do that with machine learning?
Visualizing my daily commute
November 1, 2017#projects #r
I love data visualization, and one holiday my partner surprised me with the book Dear Data. The book is a series of weekly letters two data analysts wrote to one another with visualizations of data on random topics. One week they tracked the number of times they said “thank you,” for example; another week, they counted the number of times they looked at a clock. In their letters, they visualized their data. One of the most interesting parts of the book was seeing how differently they could plot the same type of data.
For loops vs. apply - a race in efficiency
July 13, 2017#projects #r
Welcome to the first Random R post, where we ask random programming questions and use R to figure them out. In this post we’ll look at the computational efficiency of for
loops versus the apply
function.