# Commenting Code

<br/>
<br/>

This notebook is available here: https://scm.cms.hu-berlin.de/ibi/python/-/blob/master/programmierspass/Kommentare.ipynb

<br/>



This notebook is available as a free work under the [Creative Commons Attribution-NonCommercial 3.0 Unported](http://creativecommons.org/licenses/by-nc/3.0/) license. You may copy, distribute, and modify the content as long as you credit the authors and do not use it for commercial purposes.

## Comments: What, How, Why?

### What?

- an explanation or annotation in the source code
- make the code easier for humans to understand
- compilers/interpreters generally ignore them

```python
#
# traverses a directory and collects title names
#
def traverse(directory, titles):
 ...
```

### How? → (Python) Syntax

#### Line comments

```python
# at the start of a line

print("Hello World!")  # at the end of a line
```

#### Block comments

```python
def dothis():
    """As the description of a function."""
    pass
```
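
Unlike `#` comments, a docstring is kept by the interpreter and can be queried at runtime, which is what `help()` and documentation generators build on. A minimal sketch modelled on the `dothis` function above (the `#` comment inside it is added here only for contrast):

```python
import inspect

def dothis():
    """As the description of a function."""
    # a line comment: discarded by the interpreter
    pass

# The docstring is stored on the function object ...
print(dothis.__doc__)
# ... and inspect.getdoc() returns it with indentation cleaned up.
print(inspect.getdoc(dothis))
```

The line comment, in contrast, leaves no trace on the function object.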

### Why comment code?

> Good code is its own best documentation. ([Steve McConnell](http://en.wikipedia.org/wiki/Steve_McConnell))

> Code is far better describing what code does than English, so just write clear code. ([Mike Grouchy](https://mikegrouchy.com/blog/yes-your-code-does-need-comments))

- many conflicting opinions, e.g.
  - code that is as self-explanatory as possible ("self-documenting") with few comments
  - extensive comments and explanations
- important:
  - comments must be correct and match the code
  - avoid superfluous or unhelpful comments
  - keep the target audience in mind!

```java
String s = "Wikipedia"; /* Assigns the value "Wikipedia" to the variable s. */
```
→ fine as introductory text for beginners, otherwise superfluous

## Purposes of comments

### Planning and reviewing code

In [None]:
# iterate over all entries in reverse (chronological!) order
for eintrag in reversed(logbuch):
    # process the entry
    update(eintrag)

- before programming, sketch the code to be written in a kind of pseudocode
- should describe the purpose of the code, not the code itself
- allows the finished code to be reviewed → does it fulfil its task?
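
As a sketch of this workflow: the numbered comments below were (hypothetically) written first as a plan, and the code was then filled in underneath each step. The function `average_word_length` is made up for illustration and is not part of the original notebook.

```python
def average_word_length(text):
    # 1. split the text into words
    words = text.split()
    # 2. an empty text has no average; report 0.0 instead of dividing by zero
    if not words:
        return 0.0
    # 3. divide the total number of characters by the number of words
    return sum(len(word) for word in words) / len(words)


print(average_word_length("ab cdef"))
```

A reviewer can then check each step of the plan against the line that implements it.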

## Describing code

> Don't document bad code – rewrite it. ([The Elements of Programming Style](https://en.wikipedia.org/wiki/The_Elements_of_Programming_Style_(book)), Kernighan & Plauger)

> Good comments don't repeat the code or explain it. They clarify its intent. Comments should explain, at a higher level of abstraction than the code, what you're trying to do. ([Code Complete](https://en.wikipedia.org/wiki/Code_Complete), McConnell)

```python
"status" : d["status"] # unused but required by standard
```

In [None]:
# We need a stable sort. Speed does not matter.
insertion_sort(liste)

- summarise the code or explain its intent
- NOT: repeat the code in natural language
- also: explain (the reason for) peculiarities
- also: describe the algorithm
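
The `insertion_sort` function called above is not defined in the notebook. A minimal sketch of what it might look like, with one comment that describes the algorithm and one that explains a peculiarity (assumed behaviour: sorts the list in place, in ascending order):

```python
def insertion_sort(items):
    """Sort the list `items` in place in ascending order (stable)."""
    # Algorithm: grow a sorted prefix items[:i]; take items[i] and shift
    # larger elements of the prefix one position to the right until the
    # gap is where items[i] belongs.
    for i in range(1, len(items)):
        current = items[i]
        j = i - 1
        # Strict ">" (not ">=") never moves an element past an equal one;
        # this is what keeps the sort stable.
        while j >= 0 and items[j] > current:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current


numbers = [3, 1, 2, 1]
insertion_sort(numbers)
print(numbers)  # [1, 1, 2, 3]
```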

## Embedding resources



In [None]:
# Graph:
#
#     5
#    / \
#   3---4
#   |\ /|
#   | X |
#   |/ \|
#   1---2
#
# Adjacency list:
nikolaus = {
    1: [2, 3, 4],
    2: [1, 3, 4],
    3: [1, 2, 4, 5],
    4: [1, 2, 3, 5],
    5: [3, 4]
}

- diagrams, flowcharts, tables, etc.

## Metadata

In [None]:
#
# Process (extract, filter, merge) Vossantos in an org mode file.
#
# Usage: Without any arguments, extracts all Vossanto candidates from
# the given org file.
#
# Author: rja
#
# Changes:
# 2019-12-18 (rja)
# - added field sourceImageLicense
# 2019-12-16 (rja)
# - added option "--images" to enrich URLs for Wikipedia Commons images
# ...

- names of the authors, version notes, licence information, link to the documentation, etc.
- notes on usage, on bugs/possible improvements, on how to report bugs, etc.

## Debugging

In [None]:
def get_matching_item(bibentry, items):
    # heuristic: rely on Crossref's ranking and take the first item
    item = items[0]
    # heuristic: do (almost) exact string matching on the title (only)
    if item.get("title")[0].casefold() == bibentry["title"].casefold():
        return item
    # debug output
    # print(bibentry["ID"], bibentry["title"])
    # print(" best Crossref match:", item.get("title"), item.get("DOI"))
    return None

- commented-out code is not executed
- debugging; deactivating "old" or alternative code

## Automatic generation of documentation

Source: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_curve.html

In [None]:
def roc_curve(y_true, y_score, *, pos_label=None, sample_weight=None,
              drop_intermediate=True):
    """Compute Receiver operating characteristic (ROC)

    Note: this implementation is restricted to the binary classification task.

    Read more in the :ref:`User Guide <roc_metrics>`.

    Parameters
    ----------

    y_true : array, shape = [n_samples]
        True binary labels. If labels are not either {-1, 1} or {0, 1}, then
        pos_label should be explicitly given.

    y_score : array, shape = [n_samples]
        Target scores, can either be probability estimates of the positive
        class, confidence values, or non-thresholded measure of decisions
        (as returned by "decision_function" on some classifiers).

    pos_label : int or str, default=None
        The label of the positive class.
        When ``pos_label=None``, if y_true is in {-1, 1} or {0, 1},
        ``pos_label`` is set to 1, otherwise an error will be raised.

    sample_weight : array-like of shape (n_samples,), default=None
        Sample weights.

    drop_intermediate : boolean, optional (default=True)
        Whether to drop some suboptimal thresholds which would not appear
        on a plotted ROC curve. This is useful in order to create lighter
        ROC curves.

        .. versionadded:: 0.17
           parameter *drop_intermediate*.

    Returns
    -------
    fpr : array, shape = [>2]
        Increasing false positive rates such that element i is the false
        positive rate of predictions with score >= thresholds[i].

    tpr : array, shape = [>2]
        Increasing true positive rates such that element i is the true
        positive rate of predictions with score >= thresholds[i].

    thresholds : array, shape = [n_thresholds]
        Decreasing thresholds on the decision function used to compute
        fpr and tpr. `thresholds[0]` represents no instances being predicted
        and is arbitrarily set to `max(y_score) + 1`.

    See also
    --------
    roc_auc_score : Compute the area under the ROC curve

    Notes
    -----
    Since the thresholds are sorted from low to high values, they
    are reversed upon returning them to ensure they correspond to both ``fpr``
    and ``tpr``, which are sorted in reversed order during their calculation.

    References
    ----------
    .. [1] `Wikipedia entry for the Receiver operating characteristic
            <https://en.wikipedia.org/wiki/Receiver_operating_characteristic>`_

    .. [2] Fawcett T. An introduction to ROC analysis[J]. Pattern Recognition
           Letters, 2006, 27(8):861-874.

    Examples
    --------
    >>> import numpy as np
    >>> from sklearn import metrics
    >>> y = np.array([1, 1, 2, 2])
    >>> scores = np.array([0.1, 0.4, 0.35, 0.8])
    >>> fpr, tpr, thresholds = metrics.roc_curve(y, scores, pos_label=2)
    >>> fpr
    array([0. , 0. , 0.5, 0.5, 1. ])
    >>> tpr
    array([0. , 0.5, 0.5, 1. , 1. ])
    >>> thresholds
    array([1.8 , 0.8 , 0.4 , 0.35, 0.1 ])

    """


## Directives for specific programs

In [None]:
#!/usr/bin/env python3
# -*- coding: UTF-8 -*-
# vim: tabstop=8 expandtab shiftwidth=4 softtabstop=4

print("Testing")

- "Shebang" (#!) → specifies the interpreter to use
- "Magic comments", e.g. to declare the encoding used
- ... or to configure the editor
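
Python itself reads the encoding comment: the standard library function `tokenize.detect_encoding` looks for it in the first two lines of a source file. A small sketch (the source bytes are made up for illustration; `latin-1` is declared because UTF-8 is the default in Python 3 anyway):

```python
import io
import tokenize

# A made-up source file whose second line declares its encoding.
source = (
    b"#!/usr/bin/env python3\n"
    b"# -*- coding: latin-1 -*-\n"
    b"print('Testing')\n"
)

# detect_encoding() reads at most two lines and returns the encoding name
# together with the raw lines it consumed.
encoding, lines_read = tokenize.detect_encoding(io.BytesIO(source).readline)
print(encoding)  # the normalised name: iso-8859-1
```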

## Stress relief :-)

```c
/*
 * Some dipshit decided to store some other bit of information
 * in the high byte of the file length. Truncate size in case
 * this CDROM was mounted with the cruft option.
 */
```
Source: https://github.com/torvalds/linux/blob/master/fs/isofs/inode.c#L1396

→ [Linux Swear Count](http://www.vidarholen.net/contents/wordcount/)

```c
// 
// Dear maintainer:
// 
// Once you are done trying to 'optimize' this routine,
// and have realized what a terrible mistake that was,
// please increment the following counter as a warning
// to the next guy:
// 
// total_hours_wasted_here = 42
// 
```
Source: https://stackoverflow.com/questions/184618/what-is-the-best-comment-in-source-code-you-have-ever-encountered

## Examples

- https://borgelt.net/bin/apriori (→ apriori.c)
- https://github.com/zedshaw/lamson/blob/master/lamson/server.py#L55
- https://github.com/weltliteratur/vossanto/blob/master/org.py
- https://scm.cms.hu-berlin.de/ibi/notebooks/-/blob/master/Mondrian.ipynb

## *Anleitung zum Unglücklichsein* or *How to write unmaintainable code*

> Any fool can tell the truth, but it requires a man of some sense to know how to lie well. ([Samuel Butler](https://en.wikipedia.org/wiki/Samuel_Butler_%28novelist%29) (1835 - 1902))

> Incorrect documentation is often worse than no documentation. ([Bertrand Meyer](https://en.wikipedia.org/wiki/Bertrand_Meyer))

Since the computer ignores comments and documentation, you can lie outrageously and do everything in your power to befuddle the poor maintenance programmer.

Source: Roedy Green, [How To Write Unmaintainable Code](https://web.archive.org/web/20120306115925/http://freeworld.thc.org/root/phun/unmaintain.html) ([original page](https://www.mindprod.com/jgloss/unmain.html))

### Lie in the comments

You don't have to actively lie, just fail to keep comments as up to date with the code.

In [None]:
import re

# remove control characters
def rm_ctrl(text):
    return re.sub(r"[\n\t\r]+", " ", text).strip()

### Document the obvious

Pepper the code with comments like `/* add 1 to i */` however, never document wooly stuff like the overall purpose of the package or method.

In [None]:
import math

def einefunktion(a, b):
    # add a² to b²
    c = a**2 + b**2
    # return the square root of the result
    return math.sqrt(c)

In [None]:
def einefunktion(a, b):
    # For two given side lengths a and b, compute the third side length
    # c of a right triangle using the Pythagorean theorem
    # (a² + b² = c²).
    c = a**2 + b**2
    return math.sqrt(c)

### Document How Not Why

Document only the details of what a program does, not what it is attempting to accomplish. That way, if there is a bug, the fixer will have no clue what the code should be doing.

In [None]:
def read_dict(lines):
    d = dict()
    for line in lines:
        if not line.startswith('#'):
            # only process lines that do not start with #
            key, val = line.strip().split('\t', 1)
            d[key] = val
    return d
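
For contrast, a sketch of the same function with comments that state the intent instead of paraphrasing the code. The description of the input format (tab-separated key/value pairs, `#` marking annotation lines) is an assumption inferred from the code, not something the notebook states.

```python
def read_dict(lines):
    """Parse lines of the form "key<TAB>value" into a dictionary."""
    d = dict()
    for line in lines:
        # Assumption: lines starting with '#' are annotations in the input
        # (e.g. a header) and carry no data, so they are skipped.
        if not line.startswith('#'):
            # Split only at the first tab so that values may contain tabs.
            key, val = line.strip().split('\t', 1)
            d[key] = val
    return d


print(read_dict(["# key\tvalue", "a\t1", "b\t2\t3"]))
```

With these comments, someone fixing a bug can tell whether skipping `#` lines is deliberate or an accident.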

### Avoid Documenting the "Obvious"

If, for example, you were writing an airline reservation system, make sure there are at least 25 places in the code that need to be modified if you were to add another airline. Never document where they are. People who come after you have no business modifying your code without thoroughly understanding every line of it.

In [None]:
sep = '\t' # column separator

def print_csv(parts):
    for part in parts:
        print(sep.join([part_to_string(part[p]) for p in part]))

def read_dict(lines):
    d = dict()
    for line in lines:
        if not line.startswith('#'):
            key, val = line.strip().split('\t', 1)
            d[key] = val
    return d
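
One way to avoid this kind of hidden coupling, sketched under the assumption that `read_dict` is meant to parse the same format that `print_csv` writes: both sides refer to one documented constant. `format_row` is a hypothetical helper, and `str` stands in for `part_to_string`, which the notebook does not define.

```python
# Column separator of the file format. Used for BOTH writing (format_row)
# and reading (read_dict); change it here and nowhere else.
SEP = '\t'

def format_row(part):
    # hypothetical helper; str() replaces the undefined part_to_string()
    return SEP.join(str(part[p]) for p in part)

def read_dict(lines):
    d = dict()
    for line in lines:
        if not line.startswith('#'):
            key, val = line.strip().split(SEP, 1)
            d[key] = val
    return d


row = format_row({"key": "a", "value": 1})
print(read_dict([row]))
```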

### On the Proper Use Of Documentation Templates

Consider function documentation prototypes used to allow automated documentation of the code. These prototypes should be copied from one function (or method or class) to another, but never fill in the fields. If for some reason you are forced to fill in the fields make sure that all parameters are named the same for all functions, and all cautions are the same but of course not related to the current function at all.

```java
/**
 * @param receiver
 * @param limit
 * @param offset
 * @param session
 *
 * @return a lists of posts of type R with the inbox content
 */
public List<Post<R>> getPostsFromInbox(final String receiver, final int limit, final int offset, final DBSession session) {
 ... 
}

/**
 * TODO: improve docs
 *
 * @param days
 * @param limit
 * @param offset
 * @param hashId
 * @param session
 *
 * @return list of posts
 */
public List<Post<R>> getPostsPopular(final int days, final int limit, final int offset, final HashID hashId, final DBSession session) {
 ...
}
```

### On the Proper Use of Design Documents

When implementing a very complicated algorithm, use the classic software engineering principles of doing a sound design before beginning coding. Write an extremely detailed design document that describes each step in a very complicated algorithm. The more detailed this document is, the better.

In fact, the design doc should break the algorithm down into a hierarchy of structured steps, described in a hierarchy of auto-numbered individual paragraphs in the document. Use headings at least 5 deep. Make sure that when you are done, you have broken the structure down so completely that there are over 500 such auto-numbered paragraphs. For example, one paragraph might be (this is a real example)

> 1.2.4.6.3.13 - Display all impacts for activity where selected mitigations can apply (short pseudocode omitted).

**then ...** (and this is the kicker) when you write the code, for each of these paragraphs you write a corresponding global function named:

```python
Act1_2_4_6_3_13() 
```

Do not document these functions. After all, that's what the design document is for!

Since the design doc is auto-numbered, it will be extremely difficult to keep it up to date with changes in the code (because the function names, of course, are static, not auto-numbered.) This isn't a problem for you because you will not try to keep the document up to date. In fact, do everything you can to destroy all traces of the document.

Those who come after you should only be able to find one or two contradictory, early drafts of the design document hidden on some dusty shelving in the back room near the dead 286 computers.

### Units of Measure

Never document the units of measure of any variable, input, output or parameter, e.g., feet, metres, cartons. This is not so important in bean counting, but it is very important in engineering work. As a corollary, never document the units of measure of any conversion constants, or how the values were derived. It is mild cheating, but very effective, to salt the code with some incorrect units of measure in the comments. If you are feeling particularly malicious, make up your **own** unit of measure; name it after yourself or some obscure person and never define it. If somebody challenges you, tell them you did so that you could use integer rather than floating point arithmetic.

In [None]:
import math

def wurfdistanz(geschwindigkeit, winkel):
    return geschwindigkeit**2 / 9.81 * math.sin(2*winkel)
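
For contrast, a sketch of the same function with its units documented. The units are an assumption: the constant 9.81 suggests metres and seconds, and `math.sin` expects its argument in radians.

```python
import math

G = 9.81  # gravitational acceleration in m/s² (near the Earth's surface)

def wurfdistanz(geschwindigkeit, winkel):
    """Return the range of a projectile launched from ground level, in metres.

    geschwindigkeit: launch speed in m/s
    winkel: launch angle above the horizontal in radians (not degrees)
    Air resistance is ignored.
    """
    return geschwindigkeit**2 / G * math.sin(2 * winkel)


# 10 m/s at 45 degrees; the angle is converted to radians explicitly
print(wurfdistanz(10, math.radians(45)))
```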

### Gotchas

Never document gotchas in the code. If you suspect there may be a bug in a class, keep it to yourself. If you have ideas about how the code should be reorganised or rewritten, for heaven's sake, do not write them down. Remember the words of Thumper in the movie Bambi "*If you can't say anything nice, don't say anything at all*". What if the programmer who wrote that code saw your comments? What if the owner of the company saw them? What if a customer did? You could get yourself fired. An anonymous comment that says "This needs to be fixed!" can do wonders, especially if it's not clear what the comment refers to. Keep it vague, and nobody will feel personally criticised.

In [None]:
print(key, ":", value) # FIXME

In [None]:
print(key, ":", value) # FIXME: ignore keys with empty values

### Documenting Variables
Never put a comment on a variable declaration. Facts about how the variable is used, its bounds, its legal values, its implied/displayed number of decimal points, its units of measure, its display format, its data entry rules (e.g. total fill, must enter), when its value can be trusted etc. should be gleaned from the procedural code. If your boss forces you to write comments, lard method bodies with them, but never comment a variable declaration, not even a temporary!

```c
const char *name;
int len;
char c;
unsigned long hash;
```

### Disparage In the Comments

Discourage any attempt to use external maintenance contractors by peppering your code with insulting references to other leading software companies, especially anyone who might be contracted to do the work, e.g.:
```java
/* The optimised inner loop.
 This stuff is too clever for the dullard at Software Services Inc., who would
 probably use 50 times as memory & time using the dumb routines in <math.h>.
*/
class clever_SSInc
{
 ...
} 
```

If possible, put insulting stuff in syntactically significant parts of the code, as well as just the comments so that management will probably break the code if they try to sanitise it before sending it out for maintenance.

### COMMENT AS IF IT WERE CØBØL ON PUNCH CARDS

Always refuse to accept advances in the development environment arena, especially SCIDs. Disbelieve rumors that all function and variable declarations are never more than one click away and always assume that code developed in Visual Studio 6.0 will be maintained by someone using edlin or vi. Insist on Draconian commenting rules to bury the source code proper.

### Monty Python Comments

On a method called makeSnafucated insert only the JavaDoc `/* make snafucated */`. Never define what *snafucated* means **anywhere**. Only a fool does not already know, with complete certainty, what *snafucated* means. For classic examples of this technique, consult the Sun AWT JavaDOC.

```java
/* make snafucated */
void makeSnafucated() {
 ...
}
```

## Exercises

### Describing comments

Search for source code on your computer or on the web and read the comments in the code.
1. Which *kinds of comments* can you find?
2. What is the *purpose* of the comments?

Keep searching until you have found two variants of each.

### Commenting code yourself

Add comments to the following code examples. For each comment, consider
- which *purpose* you want to achieve with the comment,
- which *kind* of comment you want to use, and
- which *target audience(s)* you want to address.

In [None]:
def boxsay(s, h="-", v="|", c="+"):
    print(c + (2+len(str(s)))*h + c)
    print(v, str(s), v)
    print(c + (2+len(str(s)))*h + c)

boxsay("Hallo Welt!")

In [None]:
def equal_chars(v, w):
    m = []
    for i in range(min(len(v), len(w))):
        if v[i] == w[i]:
            m.append(v[i])
        else:
            m.append("_")
    return m

equal_chars("Magdeburg", "Hannover")

In [None]:
import urllib.request
import re

with urllib.request.urlopen("https://de.wikipedia.org/wiki/IP-Adresse") as f:
    html = f.read().decode('utf8')

    for ip in sorted(re.findall(r"[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+", html)):
        print(ip)

In [None]:
import json

def chars_markdown(fname, celltype):
    with open(fname) as json_file:
        data = json.load(json_file)

    count = 0
    for cell in data['cells']:
        if cell['cell_type'] == celltype:
            for line in cell['source']:
                count += len(line.strip().replace(" ", ""))
    return count

chars_markdown("Kommentare.ipynb", "code")

In [None]:
from urllib import request
import json 

def get_dracor(corpus, play=None):
    url = "https://dracor.org/api/corpora/" + corpus
    if play is not None:
        url = url + "/play/" + play + "/spoken-text"
    with request.urlopen(url) as req:
        text = req.read().decode()
        if play is None:
            return json.loads(text)
        return text

get_dracor("ger")["description"]

In [None]:
from collections import Counter
import re

re_word = re.compile(r"\W")

def count_terms(text):
    nouns = Counter()
    upper = Counter()
    tokens = text.split(' ')
    for i, term in enumerate(tokens):
        if len(term) > 0 and term[0].isupper():
            if i > 0 and len(tokens[i-1]) > 0 and tokens[i-1][-1] not in ['.', '?', '!', ':']:
                term = re_word.sub("", term)
                nouns[term] += 1
        if term.isupper():
            term = re_word.sub("", term)
            if len(term) > 1:
                upper[term] += 1
    return nouns, upper

count_terms("Die Europäische Union hat Beobachterstatus in der G7, ist Mitglied in der G20 und vertritt ihre Mitgliedstaaten in der Welthandelsorganisation.")

In [None]:
import numpy as np
import matplotlib.pyplot as plt

k = 50 # number of horizontal points
n = 80 # number of vertical lines

plt.rcParams['figure.figsize'] = (15, 15)
plt.rcParams['figure.dpi'] = 140
plt.style.use('dark_background')
plt.axis('off')
plt.xlim(-100, 200)
plt.ylim(-60, 240)

xs = np.linspace(0, 100, k)

def sq(x, minx, maxx, miny, maxy, invert=False):
    x = (x - minx) / (maxx - minx)
    if invert:
        x = 1 - x
    if x < 0.5: # ascent
        fx = 2*x**2
    else: # descent
        fx = (1 - 2*(1-x)**2)
    return fx * (maxy - miny) + miny

def f(x, left, right, width, height):
    if x < left or x > right: # left + right
        return np.abs(np.random.normal(0, 0.5))
    if x < left + width: # ascent
        return np.abs(np.random.normal(0, 1.0)) + sq(x, left, left + width, 0, height)
    if x > right - width: # descent
        return np.abs(np.random.normal(0, 1.0)) + sq(x, right - width, right, 0, height, True)
    else: # middle
        return np.random.exponential(2) + height

for i in range(1, n+1):
    data = [f(x, 25, 75, 10, 5) + i*2 for x in xs]
    plt.plot(xs, data, color="white", zorder=n-i, linewidth=1)
    plt.fill_between(xs, data, color="black", zorder=n-i)

plt.show()

## References

- https://en.wikipedia.org/wiki/Comment_(computer_programming)
- Keyes, Jessica (2003). Software Engineering Handbook. CRC Press. ISBN 978-0-8493-1479-7.
- Roedy Green, [How To Write Unmaintainable Code](https://web.archive.org/web/20120306115925/http://freeworld.thc.org/root/phun/unmaintain.html) ([Originalseite](https://www.mindprod.com/jgloss/unmain.html))
- https://mikegrouchy.com/blog/yes-your-code-does-need-comments
- https://stackoverflow.com/questions/184618/what-is-the-best-comment-in-source-code-you-have-ever-encountered