Twitch earnings and the Zipfian distribution

Recently, someone packaged the earnings subset of the Twitch leak into a neat space-separated file. Since it’s so easy to read by machine, I decided to make some simple graphs with it.

Twitch streamers earnings against rank, with Zipf and ordinary exponential fits.

One of the things I noticed is that it’s not really Zipfian. The data roughly follows a general power law (the ax^b curve, labelled “exponential” in the chart), but the tail is much flatter than a Zipfian distribution would predict. However, even that fit isn’t perfect: there’s a characteristic bulge where the exponent changes significantly at about the 400th rank. I have seen the same thing in other distributions, such as the word frequency chart on Wikipedia, so I am wondering whether I can make a new empirical distribution out of it.
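One rough way to check that change of exponent is to fit the same ax^b form separately on either side of the bend and compare the two exponents. The following is only a sketch under assumptions: the file layout (header row, rank in column 1, gross earnings in column 4) is taken from the script further down, the break at rank 400 is just my eyeballed estimate, and the names top_fit and rest_fit are made up for illustration.

```gnuplot
# Sketch only: compare the fitted exponent on either side of rank 400.
# Assumes earnings.csv has a header row, rank in column 1 and gross
# earnings in column 4. The break point of 400 is an eyeballed guess.
set key autotitle columnhead

# Power law for the top ranks (1 to 400).
top_fit(x) = top_a * x ** top_b
top_a = 3000000
top_b = -1
fit [1:400] top_fit(x) 'earnings.csv' using 1:4 via top_a, top_b

# Power law for the remaining ranks (400 onwards).
rest_fit(x) = rest_a * x ** rest_b
rest_a = 3000000
rest_b = -1
fit [400:] rest_fit(x) 'earnings.csv' using 1:4 via rest_a, rest_b

print sprintf("exponent for ranks 1 to 400: %f", top_b)
print sprintf("exponent for ranks 400 on:   %f", rest_b)
```

If the two printed exponents come out clearly different, that would support the idea that a single power law is not enough. The starting values may need adjusting for the tail fit to converge.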

I also only have the top 10 000 accounts, so I don’t know what the full distribution would look like. If someone has a more complete list, here’s the gnuplot source that I used to generate the chart above. It assumes that the data is in the current directory in a file called earnings.csv.

set terminal pngcairo font ",10" size 1024, 768
set output "twitch-earnings-gen.png"
set title "Twitch streamers earnings against rank"
set logscale xy
set grid
set key autotitle columnhead
set ylabel "Earnings/USD"
set xlabel "Rank"

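# General power law a * x ** b (the labels below call it "exponential").
expfit(x) = a * x ** b
# Starting values for the fit.
a = 3000000
b = -1
fit expfit(x) 'earnings.csv' using 1:4 via a, b

# Zipf's law: the same form with the exponent fixed at -1.
zipf(x) = zipf_a * x ** -1
zipf_a = 3000000
fit zipf(x) 'earnings.csv' using 1:4 via zipf_a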

set label sprintf("Fit type: exponential (ax^b) \na = %8.2f; b = %1.9f", a, b) \
    at graph 0.2, graph 0.2
set label sprintf("Fit type: Zipf (a/x) \na = %8.2f", zipf_a) \
    at graph 0.2, graph 0.1
plot 'earnings.csv' using 1:4, \
     expfit(x) title "Fit of GrossEarning", \
     zipf(x) title "Zipf fit of GrossEarning"
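One caveat with the script above: gnuplot’s fit minimises squared residuals on the raw values, and set logscale only changes how the chart is drawn, so the few top earners dominate the result. A possible variation is to fit a straight line to the logarithms instead, which weighs each rank more evenly on the log-log chart. This is a sketch, not what I used for the chart; the output file name and the names loglin, c and m are made up, and it assumes every earnings value is positive.

```gnuplot
# Sketch only: fit the power law in log space instead of on raw values.
# Assumes the same earnings.csv layout as above (header row, rank in
# column 1, gross earnings in column 4, all earnings positive).
set terminal pngcairo font ",10" size 1024, 768
set output "twitch-earnings-logfit.png"   # hypothetical file name
set logscale xy
set grid
set key autotitle columnhead

# log(y) = c + m * log(x) is the same curve as y = exp(c) * x ** m.
loglin(x) = c + m * x
c = 15    # roughly log(3000000)
m = -1
fit loglin(x) 'earnings.csv' using (log($1)):(log($4)) via c, m

# Plot the data with the fitted line converted back to a power law.
plot 'earnings.csv' using 1:4, \
     exp(c) * x ** m title "Log-space fit"
```

Here m plays the role of b in the original fit, so an m close to -1 would be the Zipfian case.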

Disclaimer – I will not publish any raw data that I have.
