A Standard Problem: Determining Sample Size

Recently, I was asked a straightforward question: "In an A/B test setting, how many samples do I have to collect in order to obtain significant results?" As usual in statistics, the answer is not quite as straightforward as the question, and it depends quite a bit on the framework. In this case, the A/B test was supposed to test whether the effect of a treatment on the success rate p had the assumed size e.

Continue reading

The Final Result

Let me start this article by showing you how I use the functionality described. I work as a Data Scientist and use org-mode in Emacs for a large number of everyday tasks. One of them is documenting new findings in datasets, in other software's documentation, on websites, and so on. To collect all this information easily in a single reference, I like to use screenshots:

Continue reading

The Setting: Avoiding 4 Weeks of Runtime

Recently, I was faced with a problem: I had written a rather complex simulation of a discrete-time queueing network, and I needed to run the entire simulation repeatedly, for several different parameter values, with many observations (about 2,000,000). The goal was to verify that a new estimation procedure for such queueing networks provides sensible results.

Continue reading

Suddenly, Attention

So, it seems like my blog is getting some attention. Recently, I was even featured on the front page of Hacker News: Number 17... only 16 more to go! Obviously, I am delighted and flattered by the number of people reading and discussing my blog. But since we're living in a material world, and I am a material guy, I kept wondering: "Is there any way to monetize this in an ethical way?

Continue reading

A couple of weeks ago, I started to work with Emacs, and I have grown fonder of it every day. Within a very short time, it has become my go-to editor for nearly everything I do on my computer, including (but not limited to) planning my todos (in org-mode, to be precise), setting up my agenda (org-mode again), taking memos during meetings, writing my (longer) e-mails, playing around with new stuff, writing blog posts (this is the first of these.

Continue reading

In a previous post, I showed you how to set up an AWS instance running the newest RStudio, R, Python, Julia and so forth, where the configuration of the instance can be freely chosen. However, there are quite a lot of possible instance configurations out there: there are different instance classes (General Purpose, Compute Optimized, RAM Optimized, … ) and different instance sizes within these classes. For General Purpose, or t2, there are, e.

Continue reading

Hello world! My name is Sebastian Schweer, and I am a Data Scientist. This job title is increasingly popular, but it is notoriously difficult to describe precisely what it entails. Let me show you one of my favourite definitions: Source. My job requires me to spend a lot of time each day writing code in various languages, mostly R but also Python and SAS. This inevitably leads me to spend a lot of time thinking about both code and the process of programming itself.

Continue reading

Sebastian Schweer

Theorist, engineer, consultant, storyteller.

Data Scientist (Teamlead)

Heidelberg, Germany