As we enter the age of Big Data, there is a great deal of good that we, as data scientists, can do, but also the potential for great harm. There is now a free online course on this topic.

You will find here several articles on the validity and fairness of our analyses, and on privacy preservation. A high-level overview of the ideas can be found in this keynote at IEEE Big Data. All these thoughts can be succinctly summarized in a code of ethics that all data scientists should live by.

The Data Scientist's Code of Ethics

  1. When collecting/analyzing data, I will not surprise the subject of the data.
    • Informed consent is the standard
    • Data Destruction pledge
  2. I will own the outcome of my data analysis.
    • Address issues of algorithmic bias
    • Correct data errors as well as possible
    • Consider societal impact

Validity

Privacy

Fairness