Commit bacd4a76 authored by Prof. Dr. Robert Jäschke

added first notebook

parent 53a60e37
* Examples for Jupyter Notebooks
A place to collect Jupyter notebooks.
All notebooks should work without additional files,
use standard Python libraries (pandas, scikit-learn, etc.), and
ideally pull their data from the web.
%% Cell type:markdown id: tags:
# Basic statistics using the top 50 faculty dataset
[Dataset of 2200 faculty in 50 top US Computer Science Graduate Programs](http://cs.brown.edu/people/apapouts/faculty_dataset.html)
## Preprocessing
Load the data and get an overview:
%% Cell type:code id: tags:
```
import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/brownhci/drafty/master-node/databaits/data/professors.csv')
df.head()
```
%% Cell type:markdown id: tags:
Clean the data (this cell is safe to re-run):
%% Cell type:code id: tags:
```
bad_rows = df['JoinYear'] == 'Full'  # that row (index 3115) contains an error: 'Full' is not a year
df = df[~bad_rows].copy()  # so let's remove it, by value rather than by a hard-coded index
df['JoinYear'] = pd.to_numeric(df['JoinYear'])  # required for DataFrame.hist()
```
%% Cell type:markdown id: tags:
## Exploration
Plot histograms of some of the columns:
%% Cell type:code id: tags:
```
df.hist(column='JoinYear')
```
%% Cell type:code id: tags:
```
df['Rank'].value_counts().plot(kind='bar')
```
%% Cell type:code id: tags:
```
df['University'].value_counts().plot(kind='bar', figsize=(15,5))
```
%% Cell type:code id: tags:
```
df['Gender'].value_counts().plot(kind='bar')
```
%% Cell type:markdown id: tags:
**Apparently, the data needs more cleansing ...**
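One common kind of cleansing for a categorical column such as `Gender` is to unify whitespace and capitalisation before counting. The cell below is a minimal sketch of that idea only: the helper `normalize_labels` and the toy values are hypothetical, made up for illustration, and the actual inconsistencies in the dataset may be of a different kind. In the notebook it would be applied as `df['Gender'] = normalize_labels(df['Gender'])`.
%% Cell type:code id: tags:
```python
import pandas as pd


def normalize_labels(series: pd.Series) -> pd.Series:
    """Strip surrounding whitespace and unify capitalisation of a text column.

    Missing values stay missing. (Hypothetical helper, not part of the dataset's tooling.)
    """
    return series.str.strip().str.capitalize()


# Made-up toy values, NOT taken from the dataset; they only illustrate the idea.
toy = pd.Series(['Male', 'male ', ' Female', 'female', None])
cleaned = normalize_labels(toy)

# Variants of the same label now fall into one category.
print(cleaned.value_counts(dropna=False))
```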
%% Cell type:markdown id: tags:
## Analysis
Questions that could be explored:
- Is the proportion of female staff increasing over time?
- Are higher ranks predominantly occupied by male staff?
- Which universities or fields have an almost equal gender distribution?
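
A possible starting point for these questions is sketched below. It relies on the column names used above (`JoinYear`, `Rank`, `University`, `Gender`); the label `'Female'`, the helper names, and the toy rows are assumptions made for illustration and are not taken from the dataset. On the real data, the helpers would be called with the cleaned `df` instead of `toy`.
%% Cell type:code id: tags:
```python
import pandas as pd


def female_share_by_year(df: pd.DataFrame, female_label: str = 'Female') -> pd.Series:
    """Share of rows labelled female per JoinYear.

    Rows with a missing Gender or JoinYear are ignored.
    `female_label` is an assumption about how the column is coded.
    """
    known = df.dropna(subset=['Gender', 'JoinYear'])
    return (known['Gender'] == female_label).groupby(known['JoinYear']).mean()


def gender_share_by(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Gender proportions within each value of `column` (each row sums to 1)."""
    return pd.crosstab(df[column], df['Gender'], normalize='index')


# Made-up toy rows, NOT real faculty data; they only demonstrate the helpers.
toy = pd.DataFrame({
    'JoinYear': [2000, 2000, 2010, 2010, 2010, 2010],
    'Rank': ['Full', 'Full', 'Assistant', 'Assistant', 'Associate', 'Full'],
    'Gender': ['Male', 'Male', 'Female', 'Male', 'Female', 'Male'],
})

share = female_share_by_year(toy)       # question 1: trend over time
by_rank = gender_share_by(toy, 'Rank')  # question 2: distribution per rank
# Question 3 could use gender_share_by(df, 'University') and sort by the
# distance of the female share from 0.5.
print(share)
print(by_rank)
```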