An algorithm is a structured description of how to calculate something. Some of the most prominent examples of algorithms have been around for more than 2000 years, like Euclid’s algorithm, which gives you the greatest common divisor, or Eratosthenes’ sieve, which gives you all prime numbers up to a given maximum. These two algorithms do not contain any kind of value judgment. If I define a new method for selecting prime numbers – and many of those have been published! – every algorithm will come to the same solution. A number is either prime or not.
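To make this concrete, here is a minimal Python sketch of both classics, plus a second, deliberately different way of finding primes (plain trial division, which is my own choice for comparison). The function names are only illustrative. The point is that the two prime finders must agree, whatever method is chosen.

```python
import math


def gcd(a: int, b: int) -> int:
    """Euclid's algorithm: greatest common divisor of two non-negative integers."""
    while b:
        a, b = b, a % b
    return a


def primes_up_to(n: int) -> list[int]:
    """Sieve of Eratosthenes: all primes less than or equal to n."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= n:
        if is_prime[p]:
            # Cross out every multiple of p, starting at p squared.
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
        p += 1
    return [i for i, flag in enumerate(is_prime) if flag]


def primes_by_trial_division(n: int) -> list[int]:
    """A different method: test each candidate for divisors up to its square root."""
    return [k for k in range(2, n + 1)
            if all(k % d for d in range(2, math.isqrt(k) + 1))]
```

Calling `primes_up_to(100)` and `primes_by_trial_division(100)` yields the same list: there is no room for opinion here.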
But there is a different kind of algorithmic process that is far more common in our daily life. These are algorithms that have been chosen to solve some task that others would probably have solved in a different way. Obvious value judgments made by calculation, like credit scoring and rating, immediately come to mind when we think about ethics in the context of calculations. However, there is a multitude of “hidden” ethical algorithms that are far more pervasive.
One example that I encountered was given by Gary Wolf at the Quantified Self Conference in Amsterdam. Wolf told of his experiment of taking different step-counting gadgets and analyzing the differing results. His conclusion: there is no common concept of what is defined as “a step”. And he is right. The developers of the different gadgets have arbitrarily chosen one method or another to map the data collected by the gadgets’ gyroscopic sensors into distinct steps to be counted.
So the first value judgment comes with choosing a method.
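A toy illustration of how the choice of method alone changes the count. This is only a sketch under my own assumptions: the signal is made up, and neither function is the algorithm of any real gadget. One hypothetical developer counts every upward crossing of a threshold, another counts every local peak above a minimum height.

```python
def count_by_threshold(signal: list[float], threshold: float = 1.0) -> int:
    """Count a step each time the signal rises from below the threshold to it or above."""
    return sum(1 for prev, cur in zip(signal, signal[1:])
               if prev < threshold <= cur)


def count_by_peaks(signal: list[float], min_height: float = 0.5) -> int:
    """Count a step for each local maximum that reaches at least min_height."""
    return sum(1 for left, mid, right in zip(signal, signal[1:], signal[2:])
               if mid > left and mid > right and mid >= min_height)


# Invented sensor readings, purely for illustration.
signal = [0.1, 0.9, 0.2, 1.4, 0.3, 0.6, 1.3, 0.2, 1.1, 1.2, 0.1]

steps_a = count_by_threshold(signal)  # 3 steps
steps_b = count_by_peaks(signal)      # 4 steps
```

Same data, two defensible definitions, two different answers to “how many steps did I take?”.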
Many applications we use work on a fixed set of parameters – like the preselection of a mobile-optimized CSS when the web server encounters what it takes to be a mobile browser. Often we get the choice to switch to the “Web mode”, but there are still many sites that do not allow us to change the view unless we trick the server into believing that our browser is a “PC version” and not mobile. This is of course a very simple example, but the point should be clear: someone set a parameter without asking for our opinion.
The second way of having to deal with ethics is the setting of parameters.
A good example is given by Kraemer et al. in their paper. In medical imaging technologies like MRI, an image is calculated from data like tiny electromagnetic distortions. Most doctors (I asked some explicitly) take these images at face value (just as they have taken photographs without bothering much about the underlying technology). However, there are many parameters that the developers of such an algorithmic imaging technology have predefined and that affect the outcome in an important way. Whether a blood vessel is already clogged by arteriosclerosis or can still be regarded as healthy is a typical decision where we would like to be on the safe side and thus tend to underestimate the volume of the vessel, i.e. prefer a blurrier image, whereas a surgeon planning her cut might ask for a very sharp image that tends to overestimate the vessel’s volume.
The third value judgment is – as this illustrates – how to deal with uncertainty and misclassification.
This is what we call alpha and beta errors. Most people (especially in a business context) concentrate on the alpha error, that is, on minimizing false positives. But when we take the cost of a misjudgment into account, the false negative is often much more expensive. Employers, for example, tend to look for “the perfect” candidate and tend to turn down applications that raise their doubts. By doing so, they will obviously miss many opportunities for the best hire. The cost of firing someone who was hired under false expectations is far less than the cost of never getting the chance to learn about someone at all, someone who might have been the hidden beauty.
The problem with the two types of errors is that you can’t minimize both simultaneously. So we have to make a decision. This is always a value judgment, always ethical.
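The trade-off between the two types of errors can be sketched in a few lines of Python. Everything here is an assumption for illustration: the scores, the labels, the candidate thresholds and the cost weights are invented, and the function names are hypothetical. Moving the acceptance threshold trades false positives against false negatives, and which threshold is “best” depends entirely on how we price each error.

```python
def error_counts(scores, labels, threshold):
    """Return (false_positives, false_negatives) when accepting scores >= threshold."""
    fp = sum(1 for s, good in zip(scores, labels) if s >= threshold and not good)
    fn = sum(1 for s, good in zip(scores, labels) if s < threshold and good)
    return fp, fn


def cheapest_threshold(scores, labels, thresholds, cost_fp, cost_fn):
    """Pick the threshold with the lowest total cost for the given error prices."""
    def total_cost(t):
        fp, fn = error_counts(scores, labels, t)
        return fp * cost_fp + fn * cost_fn
    return min(thresholds, key=total_cost)


# Invented example: a score per candidate, and whether they were in fact a good hire.
scores = [0.2, 0.35, 0.4, 0.55, 0.6, 0.7, 0.8, 0.9]
labels = [False, False, True, False, True, True, False, True]
thresholds = [0.3, 0.5, 0.75]

for t in thresholds:
    print(t, error_counts(scores, labels, t))
```

A stricter threshold lowers the false positives and raises the false negatives. If a missed good candidate is priced five times higher than a bad hire, `cheapest_threshold(scores, labels, thresholds, cost_fp=1, cost_fn=5)` picks the most lenient threshold; swap the prices and it picks the strictest one. The cost weights are the value judgment.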
All three judgments – What method? What parameters? How to deal with misclassification? – are more often than not made implicitly. For many applications, the only way to understand these presumptions is to “open the black box” – hence to hack.
Given all that, I would like to demand three points of action:
– to the developers: you have to keep as many options open as possible and give others a chance to change the presets (and customers: you must insist on this when you order the programming of applications);
– to the educational systems: teach people to hack, to become curious about seeing behind things;
– to our legislative bodies: make hacking things legal. Don’t let copyright, DRM and the like be used against people who re-engineer things. Only what gets hacked gets tested. Let us have sovereignty over the things we have to deal with, let us shape our surroundings according to our ethics.
At the last re:publica conference I gave a talk and hosted a discussion on “Algorithm ethics” that was recorded (in German):
The most common words in the Tweets tagged #qseu13 posted over the weekend. Here you find another visualization: [Wordcloud]
Last weekend the 4th Conference on Quantified Self took place in Amsterdam. Quantified Self is a movement or direction of thought that summarizes many aspects of people’s datarization of their own lives. The term “QS” was coined by Kevin Kelly and Gary Wolf, who hosted the conference. Thus it cannot be denied that some roots of QS lie in the Bay Area techno-optimistic libertarianism best represented by Wired. However, a second root stems from people who started quantifying themselves to better deal with manifest health problems – be it bipolar disorder, insomnia or even Parkinson’s and cancer. In both aspects the self acts as both object and subject, first analyzing and then shaping itself. Both have to do with self-empowerment and acting on our human condition.
“For Quantified Self, ‘big data’ is more ‘near data’, data that surrounds us.”
Quantified Self can be viewed as taking action to reclaim the collection of personal data, not because of privacy but because of curiosity. Why not take the same approach that made Google, Amazon and the like so successful and use big data on yourself?
Tweets per hour during the conference weekend. Of course our physical life finds its expression in data ….
Since many QS people use off-the-shelf gadgets, it is important not only to get full access to the data collected but also to have transparency about the algorithms that are implemented within. As Gary Wolf pointed out, if two step counters vary in their results, it tells us one thing: there is no common concept of ‘What is a step?’. These questions of algorithm ethics become more pressing as our daily life becomes more and more dependent on algorithms, while we usually have no chance to see into that “black box” and the implicit value judgments that are programmed into it. (I just gave a talk on that specific topic at re:publica last Monday, which I will post here later.) I think that in no field do the problems of algorithms making ethical decisions become more obvious than when the data deals directly with yourself.
What self is there to be quantified?
What is the “me”? What is left when we deconstruct what we are used to regarding as “our self” into quanta? Is there a ghost in the shell? The idea of self-quantification implies an objective self that can be measured. With QS, the rather abstract outcomes of neuroscience or human genetics become tangible. The more we have quantitatively deconstructed ourselves, the less is left for mind/body dualism.
“One is obliged, moreover, to confess that perception, and that which depends on it, is inexplicable by mechanical reasons.” G. W. Leibniz
As a Catholic, I was never fond of the idea that our conscious mind would just be a Mechanical Turk. As a mathematician, I feel deep satisfaction in seeing our world, including my very own self, becoming datarizable – Pythagoras was right, after all! This dialectic between a suspicious dualism and materialistic reductionism was discussed in three sessions I attended – Whitney Boesel’s “The missing trackers”, Sarah Watson’s “The self in data” and Natasha Schüll’s “Algorithmic Selfhood”.
“Quantifying yourself is like art: constructing a kind of expression.”
Many projects I saw at #qseu13 can be classified as art projects in their effort to find the right language to express the usually inexpressible. But compared to most “classic” artists I know, the QS apologists are far less self-centered (which sounds more contradictory than it is) and much more focused on changing things, using data to find the sweet spot at which to set their levers.
What starts with counting your steps ends, consequently, in shaping yourself by technological means. Enhancing your bodily life with technology is the definition of becoming a cyborg, as my friend Enno Park points out. Enno got cochlear implants to overcome his deafness. He now advocates for cyborg rights – starting with his right to hack into his implants. Enno demands his right to tweak the technology that became part of his head.
Self-hacking will become as common as taking aspirin to cure a headache. Even more: we will have to become literate in the quantification techniques to keep up with others that would do it for us anyway: biometric security systems, medical imaging and auto-diagnosis. Expressing ourselves with our data will become part of our communication culture, as social media are today. So there will not be much of an alternative left for those who have doubts about quantifying themselves. “The cost of abstention will drive people to QS,” as Whitney Boesel mentioned.
Top Twitterers for #qseu13-conference: 1) Whitney Erin Boesel, 2) Maneesh Juneja 3) that’s me ;)
Crystals like the fluorite, calcite, or garnet here show properties that can easily be expressed in mathematical terms. This inspired the legendary Pythagoras and his students to postulate that the whole world is genuinely mathematical. Social behavior seems to be random; however, data science can help us detect laws and patterns that can be expressed in mathematical functions, like the shape of the crystals.
Our senses are adapted to detect in our environment what is necessary for our survival. In that way, evolution turns St. Augustine’s postulate that our world is naturally conceivable to our minds from its head onto its feet. What we define as laws of nature are just the mostly linear correlations and the most regular patterns we could observe in our world.
When I had my first computer with graphical capabilities (an Atari Mega ST) in 1986, I, like everybody else, started hacking fractals. Rather simple functions produced remarkably complex and unpredictable visualizations. It was clear that there might be many more patterns and laws to be discovered in nature as soon as we could enhance our minds and senses with the computer – structures and patterns way too subtle to be recognised with the naked eye. In that way, the computer became what the microscope or the telescope had been to the researchers at the dawn of modernity: an enhancement of our mind and senses.
“Number is an extension and separation of our most intimate and interrelating activity, our sense of touch” (McLuhan)
The word digital stems from digitus, Latin for finger. Counting is to separate, to cluster and summarize – as Bede the Venerable did with his fingers when he coined the term digit. With the Net, human behavior became trackable in unprecedented totality. Our lives are becoming digitized; everything we do becomes quantified, that is, put into quanta.
With the first graphically capable computers, we could suddenly experience the irritating complexity of the fractals. Now we can put almost anything into our calculations – and we find patterns and laws everywhere.
What is quantified can be fed into algorithms. Algorithms extend our mind into the realm of data. We are already used to algorithms recommending merchandise to us and handling many services at home or in business, like supporting our driving by navigating us around traffic jams. With data-based design and innovation processes, algorithms take part in shaping our things. Algorithms also start making ethical judgments – drones that decide autonomously on taking or sparing people’s lives, or – less dramatic but very effective nonetheless – financial services granting us a better or worse credit score. We have already mentioned “Posthuman Advertising” earlier.
The world is not only recognisable; the world in every detail is quantifiable. Our datarized world is the final victory of the Pythagoreans – all and everything to be expressed in mathematics. Data science in this way leads us to a revolution of mind similar to that of the time of Copernicus, Galileo and Kepler.
Mathematics is usually not regarded as a science but as part of philosophy – although it has some relation to the “real world” – as shown in this 18th-century engraving.
There is a reason why we differentiate between science and the humanities. And although sociology, experimental psychology and even history nowadays deploy many scientific methods, the difference is still fundamental. The humanities deal with correlations; the causalities are far more speculative than the “laws of nature” that are formulated in physics or chemistry. Also, the data that supports social research is always and inherently biased, no matter how much care we take with sampling, representativeness and other precautions.
In her remarkable talk at Strataconf, Kate Crawford warned us that we should always suspect our “Big Data” sources of being highly biased, since the standard tools for dealing with samples (as mentioned above) are usually neglected when the data is collected.
Nevertheless, even the most biased data gives us valuable information – we just have to be careful with generalizing. Of course this is only relevant for data relating to humans using some kind of technology or service (like websites collecting cookie data or people using some app on their phone). However, I am much more interested in the humanities’ side of data anyway: data describing human behavior, data as an additional dimension of people’s lives.
Given all this, I suggest calling this field of behavior data “Data Humanities” rather than “Data Science”.