In [1]:
%%capture
%config InlineBackend.figure_format = 'svg'
%matplotlib inline
import scipy as sc, pandas as pd, seaborn as sns
import gpflow as gp
import numpy as np
import matplotlib.pyplot as plt
In [2]:
sns.set()

Results of the Go rank survey

The following visualizations and tables are based on the data gathered in the March 2018 survey. The raw data is available as a .csv file.

For the plots, ranks are mapped to integers: kyu ranks become negative and dan ranks are shifted down by one, so that 1d corresponds to 0 (e.g. 1k -> -1, 2d -> 1). The tables are based on OGS ranks and range from 15k to 7d, since almost all responses lie in that interval.
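As a small illustration of the convention described above, the sketch below converts rank strings to integers and back. The names `rank_to_int` and `int_to_rank` are hypothetical helpers introduced here for illustration only; the inverse is not part of the original analysis, but it may be handy for labelling plot axes with rank names again.

```python
def rank_to_int(rank):
    """Map a rank string such as '5k' or '2d' to an integer (1k -> -1, 1d -> 0)."""
    value, suffix = int(rank[:-1]), rank[-1]
    return -value if suffix == 'k' else value - 1


def int_to_rank(n):
    """Inverse of rank_to_int: negative integers are kyu, the rest are dan."""
    return f"{-n}k" if n < 0 else f"{n + 1}d"


# Round-trip a few ranks covering the range used in the tables.
for rank in ['15k', '1k', '1d', '2d', '7d']:
    n = rank_to_int(rank)
    print(rank, '->', n, '->', int_to_rank(n))
```

With this convention adjacent ranks always differ by exactly one, including the step from 1k to 1d.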

In [3]:
data = pd.read_csv('Go Rank Survey March 2018.csv')
In [4]:
data.head()
Out[4]:
Timestamp OGS KGS DGS IGS Foxwq Tygem WBaduk GoQuest AGA EGF Korea China Japan
0 2018/03/25 10:04:02 AM GMT+3 1k 1k NaN 1k NaN 2d NaN NaN NaN NaN NaN NaN NaN
1 2018/03/25 10:15:30 AM GMT+3 5k 4k NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 2018/03/25 10:18:14 AM GMT+3 4k 3k NaN 3k NaN 1d NaN NaN NaN NaN NaN NaN NaN
3 2018/03/25 10:18:50 AM GMT+3 14k 14k NaN 14k NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 2018/03/25 10:31:37 AM GMT+3 NaN 5k NaN 6k 3k 4k NaN NaN NaN NaN NaN NaN NaN

Number of responses given for each server:

In [5]:
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 253 entries, 0 to 252
Data columns (total 14 columns):
Timestamp    253 non-null object
OGS          159 non-null object
KGS          170 non-null object
DGS          21 non-null object
IGS          73 non-null object
Foxwq        41 non-null object
Tygem        94 non-null object
WBaduk       33 non-null object
GoQuest      36 non-null float64
AGA          29 non-null object
EGF          92 non-null object
Korea        2 non-null object
China        8 non-null object
Japan        15 non-null object
dtypes: float64(1), object(13)
memory usage: 27.8+ KB
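If only the per-server counts are of interest, a more compact view than `info()` can be obtained by counting non-null cells per column. The sketch below uses a tiny stand-in table (`toy`, with made-up values) rather than the survey file, so it runs on its own; on the real data the same expression would be applied to `data`.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the survey table; the values are made up for illustration.
toy = pd.DataFrame({
    'OGS': ['1k', '5k', np.nan, '14k'],
    'KGS': ['1k', np.nan, np.nan, '14k'],
    'Tygem': ['2d', np.nan, np.nan, np.nan],
})

# Non-null answers per server, most-answered server first.
counts = toy.notna().sum().sort_values(ascending=False)
print(counts)
```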
In [6]:
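def mapping(x):
    """Map a rank string to an integer: 'Nk' -> -N, 'Nd' -> N-1 (so 1d -> 0)."""
    try:
        t = x[-1]
    except (TypeError, IndexError):
        # Non-string cells (NaN, numeric GoQuest ratings) pass through unchanged.
        return x
    if t == 'k':
        n = -int(x[:-1])
    else:
        n = int(x[:-1]) - 1
    return n

# Drop the Timestamp column and convert every rank to an integer.
X = data.iloc[:, 1:].copy()
X = X.applymap(mapping)
# Keep only servers with more than 5 responses.
X = X.loc[:, X.notna().sum() > 5]
# Number of respondents who gave a rank for every remaining server.
X.dropna().shape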
Out[6]:
(0, 12)
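The empty result `(0, 12)` means that no respondent gave a rank for all 12 remaining servers, so any comparison has to work pair by pair. One way to see how much data each pair shares is an overlap matrix, sketched below on a toy table (`toy`, with made-up integer ranks); the names `answered` and `overlap` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Toy table of integer ranks; no row has an answer in every column.
toy = pd.DataFrame({
    'OGS': [-1, -5, np.nan, -14],
    'KGS': [-1, -4, -5, np.nan],
    'Tygem': [np.nan, np.nan, -4, 0],
})

# As in the survey data, dropping incomplete rows leaves nothing.
print(toy.dropna().shape)

# 1 where a rank was given, 0 otherwise.
answered = toy.notna().astype(int)

# overlap.loc[a, b] counts respondents with a rank on both a and b;
# the diagonal holds the per-server response counts.
overlap = answered.T @ answered
print(overlap)
```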
In [7]:
# Drop two responses (row labels 164 and 95) treated as outliers
X.drop([164, 95], inplace=True)

Pair-wise plotting of server ranks with linear regression

In [8]:
%%capture --no-display
# Only plot servers with more than 15 responses.
sns.pairplot(X.loc[:, X.notna().sum() > 15], diag_kind='kde', kind='reg');
Out[8]:
<seaborn.axisgrid.PairGrid at 0x7ff0bcf85278>
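To read a number off one of the regression panels, the line for a single pair of servers can be fitted directly. The sketch below is a minimal example, not the computation seaborn performs internally: it takes the five OGS/KGS answers shown in the `data.head()` output, already converted to integers, keeps the respondents who gave both ranks, and fits a straight line with `np.polyfit`. The names `pair` and `both` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

# First five OGS/KGS answers from data.head(), as integers (1k -> -1, ...).
pair = pd.DataFrame({
    'OGS': [-1.0, -5.0, -4.0, -14.0, np.nan],
    'KGS': [-1.0, -4.0, -3.0, -14.0, -5.0],
})

# Keep only respondents who reported a rank on both servers.
both = pair.dropna()

# Least-squares line KGS = slope * OGS + intercept.
slope, intercept = np.polyfit(both['OGS'], both['KGS'], deg=1)
print(f"KGS ~ {slope:.2f} * OGS + {intercept:.2f} (n = {len(both)})")
```

With so few points the fit is only illustrative; on the full table the same steps would be applied to two columns of `X`.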