Overparametrized Linear Models

Posted on July 09 2017 in Statistics • Tagged with overparametrization, linear models, statistics, r

Surprisingly, quite a few data scientists overlook the importance of linear regression and the problem of overparametrization. In this post, I'm going to describe mathematically what it means for a model to be overparametrized and the general strategies used by a statistician to resolve non-uniqueness.
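To make the non-uniqueness concrete, here is a minimal sketch in plain Python. It assumes a hypothetical two-group layout, y = mu + alpha_g + error, that is not taken from the post itself; the design matrix and the coefficient values are made up for illustration. The intercept column equals the sum of the two group-indicator columns, so the design matrix is rank deficient and different coefficient vectors produce exactly the same fitted values.

```python
# Hypothetical one-way layout with two groups, overparametrized as
#   y = mu + alpha_g + error,  g in {1, 2}
# Columns: intercept, indicator(group 1), indicator(group 2).
# The intercept column is the sum of the other two, so X has rank 2, not 3.
X = [
    [1, 1, 0],  # observations in group 1
    [1, 1, 0],
    [1, 0, 1],  # observations in group 2
    [1, 0, 1],
]


def fitted(X, beta):
    """Return the matrix-vector product X @ beta using plain lists."""
    return [sum(x * b for x, b in zip(row, beta)) for row in X]


# Two different coefficient vectors (mu, alpha_1, alpha_2), illustrative values.
beta_a = [0.0, 3.0, 5.0]  # mu fixed at 0
beta_b = [3.0, 0.0, 2.0]  # alpha_1 fixed at 0 (reference-group constraint)

fitted_a = fitted(X, beta_a)
fitted_b = fitted(X, beta_b)

# Both parametrizations give the same fitted values, so the data alone
# cannot distinguish between them.
same_fit = fitted_a == fitted_b
print(fitted_a, fitted_b, same_fit)
```

Each of the two vectors corresponds to one possible constraint (fixing mu at zero, or fixing the first group effect at zero); imposing such a constraint is one common way to pick a single solution.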

Ranks, Subspaces, and Bases

Suppose …

Continue reading

Bayesian Zero-Inflated Poisson Model

Posted on July 05 2017 in Bayesian Statistics • Tagged with zero inflated poisson, mcmc, bayesian statistics, statistics, r

Wikipedia defines the model as follows:

The zero-inflated Poisson model concerns a random event containing excess zero-count data in unit time. For instance, the number of insurance claims within a population for a certain type of risk would be zero-inflated by those people who have not taken out insurance against the risk and thus …

Continue reading

Foursquare Location-Content-Aware Recommender System

Posted on June 26 2017 in Bayesian Statistics • Tagged with latent dirichlet allocation, hierarchical bayes, recommender system, statistics, r

Foursquare uses its unique location technology and foot traffic panel to produce personalized recommendations of spatial items such as restaurants. I recently read the paper "LCARS: A Spatial Item Recommender System" and implemented it from scratch in R.

The recommender system combines the querying user's interest and the local preference …

Continue reading

Bayesian Approach to Ranking Movies

Posted on March 03 2017 in Bayesian Statistics • Tagged with ranking, recommender system, statistics, machine learning, r

Many people like watching movies, and I do too. Recently I've discovered this somewhat sketchy Korean site that hosts quite a few movies online. I call it sketchy because I almost always see ads asking me to download a piece of software that's supposed to protect my computer from a …

Continue reading

Estimating the Total Daily Number of Customers at Bon Me

Posted on January 30 2016 in Bayesian Statistics • Tagged with bayesian statistics, hierarchical bayes, monte carlo simulation, statistics, machine learning, r, jags

Bon Me is one of my favorite places to grab a quick bite for lunch. Surprisingly, I didn't know it existed until early last year, despite it being so close to my workplace. I really recommend you try the miso-braised pulled pork on brown rice without cilantro. It's really addictive.

Anyhow …

Continue reading

Predicting the Fitbit Challenge Winner with Hierarchical Bayes

Posted on November 28 2015 in Bayesian Statistics • Tagged with bayesian statistics, hierarchical bayes, monte carlo simulation, statistics, machine learning, r, jags

Hi there! It's been a little more than a month since my last post. Thanksgiving was two days ago, and I had a good time with my family. How's everyone doing?

With the upcoming Cyber Monday, I'm sure many of you have cool electronics and gadgets in mind to look …

Continue reading

A/B Testing Part I

Posted on April 03 2015 in Statistics • Tagged with a/b testing, bayesian statistics, statistics, machine learning

This is my first post on a data science topic. Over the next few posts, starting with this one, I will talk about A/B testing, specifically how to do it right using Bayesian methods, in comparison with traditional frequentist hypothesis testing.

The basic concept …

Continue reading