Algorithms are more powerful than news editors

24 Mar

At SXSW, a session that caught a lot of attention was with the CEO of Upworthy, Eli Pariser, on the future of journalism. He said some astounding things, including the claim that, in news, “algorithms are now more powerful than editors”. What he meant was not only that algorithms now determine what news you are likely to receive but also, incredibly, how that news is written.

Algorithmic news delivery
With so many news sources online, from Facebook, Twitter, Google and lots of other news-specific sites, like Upworthy, what news you see is now likely to have been preselected for you by algorithms that get to know what you want and aggregate data to make decisions about what people like you want. An ensemble of algorithms, invisible but potent, determines what you’re fed. This may sound frightening but some argue that this is far better than the self-selected editorial class responsible for what you see on TV news and in newspapers. Any editorial process is subject to the biases inherent in the editorial group. Algorithms, arguably, can be designed to be more objective. Sure, it shifts the bias from editors to algorithm designers but at least there’s continuous improvement.

News junkies have never had it better
With 24 hour news channels and continuous feeds on the web “news junkies have never had it better”. If anything it’s a matter of realtime, editorial aggregation from multiple online sources.
Upworthy has 50-60 million users a month and is now more powerful than a lot of the most powerful editors in traditional media. Their key metrics have moved from unique visitors and page views to what they call “attention minutes”, based on importance, satisfaction and quality. But the gorilla on the online news front is still Facebook. They can choose to tweak their algorithms to attack any competitor, as they have such a massive audience. Whatever the outcome, there is no doubt that news is now data and data can be mined, repurposed, repackaged and delivered on a massive scale.

Algorithmic news production
More shocking is the fact that news is already being written by algorithmic software. Stats Monkey took data and statistical models from baseball, one of the most data-driven sports on the planet, mined that data for key plays and players, then hauled in weather reports and strung it all together into a factual report of the game. They added narrative arcs and styles, so that these stories had an angle – convincing win leading from the start, come from behind to win, to and fro to narrow win and so on. The stories could be written as straight reports, in a more humorous style, or from one side or the other. Quotes can be pulled in to make it seem as if the piece was written by a real journalist. You can even choose different narrative styles aimed at different audiences. The project came out of a joint effort by the Medill School of Journalism and the McCormick School of Engineering at Northwestern University through the Center for Innovation in Technology, Media and Journalism.
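To make the idea of choosing an ‘angle’ from data concrete, here is a minimal sketch in Python. It is not Stats Monkey’s code: the `Game` record, the three angle names and the sentence templates are all hypothetical, invented purely for illustration. It simply counts how often the lead changed hands and picks a matching template.

```python
from dataclasses import dataclass


@dataclass
class Game:
    """Hypothetical, minimal game record; a real box score holds far more."""
    home: str
    away: str
    # Cumulative (home_runs, away_runs) after each inning.
    running_score: list


# Illustrative templates, one per narrative arc named in the text.
TEMPLATES = {
    "led_from_start": "{winner} led from the start and beat {loser} {w}-{l}.",
    "comeback": "{winner} came from behind to beat {loser} {w}-{l}.",
    "to_and_fro": "{winner} edged {loser} {w}-{l} in a to-and-fro contest.",
}


def pick_angle(game: Game) -> str:
    """Choose a narrative arc from how often the lead changed hands."""
    final_home, final_away = game.running_score[-1]
    home_won = final_home > final_away
    # Lead after each inning, seen from the eventual winner's side.
    leads = [(h - a) if home_won else (a - h) for h, a in game.running_score]
    # Ignore tied innings, then count sign flips in the lead.
    signs = [1 if lead > 0 else -1 for lead in leads if lead != 0]
    changes = sum(1 for prev, cur in zip(signs, signs[1:]) if prev != cur)
    if changes >= 2:
        return "to_and_fro"
    if changes == 1:
        return "comeback"
    return "led_from_start"


def write_report(game: Game) -> str:
    """Fill the template for the chosen angle with the game's facts."""
    final_home, final_away = game.running_score[-1]
    home_won = final_home > final_away
    winner, loser = (game.home, game.away) if home_won else (game.away, game.home)
    w, l = max(final_home, final_away), min(final_home, final_away)
    return TEMPLATES[pick_angle(game)].format(winner=winner, loser=loser, w=w, l=l)


game = Game("Wildcats", "Owls", [(0, 2), (1, 2), (4, 2), (5, 3)])
print(write_report(game))
```

A real system would layer key plays, player statistics, quotes and tone on top of this, but the basic move is the same: derive the angle from the data, then let the angle select the language.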

Narrative Science
Out of this came Narrative Science, with its Quill product. They have moved beyond sport into financial reporting, as it is quite lucrative. They have refined and finessed the process. Its software mines the data, gets the facts, determines the angles, builds the structure and polishes the narrative. It’s often hard to tell whether the piece was written by a person or machine and the pieces can be syndicated out.

News and learning
Interesting stuff and it makes one wonder whether the same process can’t be used for knowledge. Currently, as a teacher or learner, you have to curate your own content, which comes in lumps of pre-set media – Wikipedia articles, papers, YouTube videos, images, graphics, diagrams, photographs. Imagine a software programme that searches, finds, filters and reconstructs knowledge, personalised for your own needs. We have, at present, algorithmic software that delivers content based on ensembles of algorithms that understand who you are and what you need next on your learning journey. The next step is to automate the build of the content itself.

Conclusion
Things are moving fast. In news we’ve gone from a fixed-time, once-a-day newspaper or TV news programme, to rolling news on TV, to web-delivered news and now algorithm-determined delivery of that news, even algorithmic software that produces news stories. As this gets better, it may well be the case that the news delivered by software has such good market intelligence from its instantaneous data mining that it is, by definition, better than that produced by a human writer. We may even see smart narrative arcs and styles that are beyond those of the standard hack. As Eli Pariser says, he fully expects “a piece of software to win the ‘Pulitzer Prize’”.

CogBooks White Paper – Big Data

20 Dec

Big Data, at all sorts of levels in learning, reveals secrets we never imagined we could discover. It reveals things to you the user, searcher, buyer and learner.
It also reveals things about you to the seller, ad vendors, tech giants and educational institutions. Big data is now big business, where megabytes mean megabucks. Given that less than 2% of all information is now non-digital, it is clear where the data mining will unearth its treasure: online.

As we do more online, searching, buying, selling, communicating, dating, banking, socializing and learning, we create more and more data that provides fuel for algorithms that improve with big numbers. The more you feed these algorithms the more useful they become.

To download this White Paper, click here.

Adaptive learning – Eight Key Questions

13 Dec

Jim Thompson gave a talk titled ‘Adaptive Learning: Eight Key Questions’ at Online Educa in Berlin this month.

Why is adaptive learning so important, he asked? He argued that students need improved learning effectiveness through personalised support and reduced learning times through personalised pathways. He also argued that instructors need enhanced data and tools for helping students and increased automation of teaching tasks. In addition, university and organizational administrators need higher completion rates, reduced delivery costs and increased use of data-driven methods.

The eight key questions he asked, to help you to define the type of adaptive learning tool more specifically, were:

1. When are the data gathered and recommendations made?

2. What data are used to drive adaptation?

3. What is adapted?

4. What methods drive adaptation?

5. What end use applications?

6. How extensive and scalable?

7. How open is the content model?

8. How accessible are the data and architecture?

Donald Clark – Why Adaptive Learning?

11 Dec

Donald Clark explains why algorithms and data can be used to great effect in delivering content in a personalised manner to learners, as well as providing optimal paths through learning content using powerful back end software.

Source: UFI charitable trust

Jim Thompson – Personalised Adaptive Learning at Turing Festival

3 Dec

Turing 2013 took place at the height of the Edinburgh Festival, the world’s largest arts and cultural event. The Festival brought together digital technology and the web within the world’s largest arts and creative gathering in a celebration of digital culture and creativity. Jim Thompson, CEO of CogBooks, was invited to speak at this prestigious event, presenting Personalized Adaptive Learning. The session was chaired by another CogBooks Director, Donald Clark.

Jim explains how adaptive learning is a game-changer in education, as smart software guides the learner through a network of content, constantly checking to see that each learner gets the content most suited to them at every moment of the learning experience. 

Can maths find you love? eHarmony’s love algorithm

18 Nov

Could maths find you love? The dating site eHarmony, who claim to have been responsible for a staggering half a million marriages, use algorithms for just that purpose. They claim they are responsible for over 500 marriages a day and have data from 44 million people looking for love. What’s new are the 200 items they collect in questionnaires from their premium customers. They claim that this data yields six variables:

1. Agreeableness

2. Closeness with partner

3. Sexual and romantic passion

4. Spirituality

5. Extraversion and openness

6. Optimism and happiness

The algorithm then looks for similar scores.
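As a rough illustration of what ‘looking for similar scores’ could mean, here is a minimal Python sketch. It is not eHarmony’s algorithm, which is not public. It assumes each person has a score from 1 to 10 on the six variables above and treats the closest match as the one with the smallest Euclidean distance; the trait keys, the scale and the distance measure are all assumptions made for illustration.

```python
import math

# Hypothetical keys for the six variables listed above.
TRAITS = [
    "agreeableness",
    "closeness",
    "passion",
    "spirituality",
    "extraversion_openness",
    "optimism_happiness",
]


def distance(a: dict, b: dict) -> float:
    """Euclidean distance between two people's trait scores."""
    return math.sqrt(sum((a[t] - b[t]) ** 2 for t in TRAITS))


def best_match(person: dict, candidates: dict) -> str:
    """Return the name of the candidate whose scores are most similar."""
    return min(candidates, key=lambda name: distance(person, candidates[name]))


me = {t: 5 for t in TRAITS}
candidates = {
    "Alex": {**{t: 5 for t in TRAITS}, "spirituality": 6},
    "Sam": {t: 1 for t in TRAITS},
}
print(best_match(me, candidates))
```

Note that this sketch embodies exactly the assumption that the psychology literature questions: that similarity between two people’s scores, rather than each person’s scores taken alone, is what predicts a good match.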

Psychology of matching
But it’s a complex business this match making, as the psychology literature shows. Agreeable, open and optimistic people may just be better at getting on with anyone and it is not clear that dyadic (matching) effects have any real predictive quality. However, Dr. Gonzaga, the Chief Scientist at eHarmony, claims that studies he’s performed show that couples who match are satisfied four years later.

Lack of controlled trials
This is all very well but what psychologists want to see is a controlled trial, and an interesting issue has arisen around the reason for the lack of any randomised controlled trials. eHarmony claim that it is just too difficult and unethical to randomly pair people who are looking for love, a betrayal of their trust in the service. They have a point. Fragile people looking for partners are not fodder for algorithmic tests.

Data gathering
If you look closely at the six variables, they go beyond the traditional well-established personality traits in psychology, especially on ‘sex and passion’, which eHarmony claims is a key variable. And who would argue? For general users they collect data on hundreds of traits, such as time spent on the site and response times to emails, also geographical data, as people in, say, Manhattan won’t go far for dates. This makes sense, as real people in the real world collect data on potential partners, mostly through awkward questioning on a first date, when it’s too late. eHarmony are in the Big Data game, like Google, Facebook and Amazon, gathering data and feeding that data through algorithms that make recommendations, based on their predictive power. They are right in looking to harvest Big Data, rather than take the traditional statistical route. Their data set may be messier but it will be large and it’s large data that counts. We are clearly only in the foothills of algorithmic matching but eHarmony seem to be leading the trek upwards and have established a good base camp, upon which further work can be based.

Love and learning
Does this have any import for learning? I think it does. We’ve gone through a hideous period in educational theory when ‘learning styles’ dominated the debate. We now know that these are dangerous fictions, with no real evidence-base, that pigeonhole learners and may actually inhibit learning. The good news is that the data one can gather on personality may well be useful in learning as it appears that, while learning styles do not exist, personality traits do.

In matching learners with learning experiences, data is also clearly useful.

1. One set of learner data could be personality type as this has a strong causal effect on actual motivation and behaviour, learning being a bundle of motivations and behaviours.

2. Algorithms can also be used to determine the learner’s background educational attainment i.e. what the learner already knows, the equivalent of competence levels from formative assessment.

3. During the learning process data can be gathered, similar to eHarmony’s approach, such as time on task, response times and so on. This can be used to guide the learner, dynamically, towards more useful and compatible content.

4. In peer-to-peer learning and assessment, dating algorithms may well be adapted for use in matching peers.

5. Data on failure may be equivalent to data on failed first dates or relationships and used for improvement of interpersonal skills for the next attempt.

Looking for knowledge may not be that different from looking for love!

Big Data by Mayer-Schönberger & Cukier

7 Nov

Big Data: A Revolution That Will Transform How We Live, Work and Think

When Andrew Ng, the co-founder of Coursera, looked at the data from his ‘Machine Learning’ MOOC, he noticed that around 2000 students had all given the same wrong answer – they had inverted two algebraic equations. What was wrong, of course, was the question. This is a simple example of an anomaly in a relatively small but complete data set that can be used to improve a course. The next stage is to look for weaknesses in the course in a more systematic way using algorithms designed to look specifically for repeatedly failed test items. This example comes from Big Data by Mayer-Schönberger & Cukier. This is the only educational example in the book, and although it is not a book about education, it is rich in examples which immediately suggest parallels in an educational context.
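The systematic search for repeatedly failed test items could look something like the following Python sketch. It is an illustration only, not Coursera’s method: the data layout (one dictionary of answers per student), the function name and the 50% threshold are all assumptions. It flags any item where a large share of students converged on the same wrong answer, which often points to a flawed question rather than flawed learners.

```python
from collections import Counter


def flag_suspect_items(responses, answer_key, threshold=0.5):
    """Flag items where many students gave the SAME wrong answer.

    responses:  list of {item_id: answer} dicts, one per student (assumed layout)
    answer_key: {item_id: correct_answer}
    threshold:  minimum share of students sharing one wrong answer (assumed value)
    """
    flagged = {}
    for item, correct in answer_key.items():
        answers = [r[item] for r in responses if item in r]
        wrong = Counter(a for a in answers if a != correct)
        if not wrong:
            continue
        common_wrong, count = wrong.most_common(1)[0]
        share = count / len(answers)
        if share >= threshold:
            flagged[item] = (common_wrong, share)
    return flagged


responses = [
    {"q1": "A", "q2": "D"},
    {"q1": "A", "q2": "D"},
    {"q1": "B", "q2": "D"},
    {"q1": "A", "q2": "C"},
]
answer_key = {"q1": "A", "q2": "C"}
print(flag_suspect_items(responses, answer_key))
```

With data from thousands of students, the same check runs unchanged; the point is that a complete data set lets the anomaly surface without anyone having to suspect the question in advance.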

Big data, big numbers, big solutions
Big Data reveals secrets we never imagined we could discover. It reveals things to you the user, searcher, buyer and learner. It also reveals things about you to the seller, ad vendors, tech giants and educational institutions. Big data is now big business, where megabytes mean megabucks. Given that less than 2% of all information is now non-digital, it is clear where the data mining will unearth its treasure: online. As we do more online, searching, buying, selling, communicating, dating, banking, socializing and learning, we create more and more data that provides fodder for algorithms that improve with big numbers. The more you feed these algorithms the more useful they become.

Among the fascinating examples is Google’s success with big data in their translation service, where a data set of over a trillion words provides the feed for translations between over a dozen languages. Amazon’s recommendation engine looks at what you bought, what you didn’t buy, how long you looked at things and what books are bought together. This big data driven engine accounts for a third of all Amazon sales. At Netflix, the recommendation engine accounts for an astonishing three quarters of all new orders. Target, the US retailer, know (creepily) when someone is pregnant without the mother-to-be telling them. This led to an irate father threatening legal action when his daughter received a mail voucher for baby clothes. He returned a few days later, sheepishly apologising!

Big data, big problems
Yet the authors here are not blindly evangelistic or naïve. The more data we produce, willingly, unwillingly or accidentally, the more we need to put our laws and regulations to the test, especially around privacy. Technology is always ahead of the sociology. Organizations that are transnational, state security organizations and many others need to reflect on the consequences of the data explosion. Indeed, there are two full chapters on the dangers of big data, on ‘risks’ and ‘control’. In education we need to be especially cautious, not only about who can use what data but also about the unintended consequences, when dealing with the reputations of institutions and teachers and the futures of learners. We cannot allow data to be too dictatorial.

Flip stats
What I especially liked about the book is the way it forces you to flip your thinking from traditional statistics, where scarce data is king and exactitude is the goal, to massive sets of unstructured data, where data mining and smart algorithms are the new monarchs. When data is scarce we tend to sample and randomize but this comes at a price – loss of detail. When N=ALL we can mine the data and messiness gets diluted in the big numbers. Note that in education N=ALL need not mean ‘big’. Datasets in courses, even across years, can be relatively small. The flip is that there’s no need to sample when you have massive amounts of data; the numbers do the talking.

This takes us back to a form of pure empiricism, where we infer from data what is the case and use this in decision making. In this sense it’s a form of pure science, with data as the mother lode and algorithms doing the heavy lifting. The shift from causality to correlation is another interesting mental reboot, where the big data algorithms are often simply pattern matching or identifying correlation, not causation. There is, of course, a danger here. The risks of inductive thinking, of inferences based simply on what has happened before, have been brought to the fore several times in recent stock-market crashes, bubbles and financial crises. Some smart deduction and causal analysis may also be necessary to counter the inductive trend.

It is no accident that this is happening now. The technological changes that have enabled the big data revolution amount to more than massive data production. The revolution is also the result of the plummeting cost of storage and processing, along with advances in smart algorithms that put big data to good use. The fact that we have mobile devices and that an increasing number of people spend an increasing amount of time online has provided the raw data.

Big data and learning
In medicine, we can spot potential epidemics, give more accurate diagnoses and better treatments, and provide preventative strategies. In astronomy, we can identify objects in space more quickly and more accurately. In commerce, we can build up knowledge of the buyer, so that more relevant recommendations can be made. So how relevant is big data to learning?

Perhaps we should be a bit realistic about the word ‘big’ in an educational context, as it is unlikely that many, other than a few large multinational, private companies will have the truly ‘big’ data. Skillsoft, Blackboard and others may be able to muster massive data sets, but a typical school, college or university may not. When we read books like this on big data we are talking about data sets which are many orders of magnitude bigger.

Nevertheless, data on learners across an institution or number of institutions may be useful, in terms of performance, possible course improvements, drop-outs and so on. At this organizational level, it is vital that institutions gather data that is much more fine-grained than just assessment scores and numbers of students who leave.

Even within delivered courses, large data sets (I prefer ‘large’ to ‘big’ in this context) may be useful in course design and delivery. When smart algorithms are applied to these data sets, real improvements in course design, and even real-time delivery of courseware, can be implemented.

Finally, at the level of the individual learner, data can be used to identify what that learner knows and doesn’t know to predict what they need to learn next, all calculated in real time.
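One way to picture this last, individual level is the short Python sketch below. It is a toy illustration, not a description of any real adaptive learning product: the mastery scores, the prerequisite map, the 0.8 mastery cut-off and the selection rule are all hypothetical. Given an estimate of what the learner knows, it recommends the unmastered topic whose prerequisites are all in place and which the learner is closest to mastering.

```python
def next_item(mastery, prerequisites, mastered_at=0.8):
    """Pick the next topic for one learner.

    mastery:       {topic: estimated mastery between 0 and 1} (hypothetical scale)
    prerequisites: {topic: [topics that should be mastered first]}
    mastered_at:   assumed cut-off above which a topic counts as known
    """
    ready = [
        topic
        for topic, score in mastery.items()
        if score < mastered_at
        and all(mastery.get(p, 0.0) >= mastered_at for p in prerequisites.get(topic, []))
    ]
    if not ready:
        return None  # everything is mastered, or nothing is unlocked yet
    # Prefer the topic the learner is closest to mastering.
    return max(ready, key=lambda topic: mastery[topic])


mastery = {"fractions": 0.9, "ratios": 0.5, "percentages": 0.3, "algebra": 0.1}
prerequisites = {
    "ratios": ["fractions"],
    "percentages": ["fractions"],
    "algebra": ["ratios"],
}
print(next_item(mastery, prerequisites))
```

In a live system the mastery estimates would be updated after every response, so the recommendation is recalculated in real time as the learner progresses.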

Conclusion
This is not a book for techies interested in the detail; it is an introductory, explanatory, case-driven text. It is readable, well structured and doesn’t shy away from the downsides of big data. If you’re an educationalist new to the idea, this is a useful introduction, as it has a good spread of examples and non-esoteric discussions around ideas such as correlation v causation, data v algorithms, predictive analytics and so on. Any area of human endeavor that produces online data will, inevitably, start to use that data to spot flaws and improve productivity. Education is no exception. Some would say it is the area of greatest need.

Predictive Analytics: the Power to Predict Who Will Click, Buy, Lie, or Die by Eric Siegel

28 Oct

You have been predicted…
Nice subtitle. Your credit risk, insurance quotes, sales potential, even your likelihood to marry, commit crimes, contract diseases, even die – all are being predicted. When you get cold called on the telephone, junk mailed or subjected to online ads, you’re being predicted. Or at least they’re trying. While you sleep, organizations, or at least their computers, are trying to predict what you will do. Note that this is different from forecasting, which works at the macro level; we’re talking here about you as an individual.

The power to predict used to be the realm of astrologers and tarot card readers. But as Siegel says, “I’m a Scorpio, and Scorpios don’t believe in astrology”. The new age of prediction lies in computing, where huge amounts of data are collected and smart algorithms do their smart analysis to produce probabilistic outcomes. It’s not that we can predict the future with certainty but that we can predict it with a probability, the mathematical expression of how certain we can be.

The Prediction Effect
Siegel’s principle is The Prediction Effect: A little prediction goes a long way. The perfect example is the Internet itself, almost wholly supported by the predictive analytics behind targeted advertising. But the meat in the sandwich comes in the three case study chapters. First up is Dan Steinberg’s work at Chase Bank around credit risk, which made the bank amazing profits. The downside, and I suspect the thought has struck you by now, is that it also failed to predict the various financial crises that seem to plague the predictive age. The final case study focuses on the predictive power of the persuaders – retailers and advertisers.

Jeopardy!
Siegel’s showcase example is IBM’s ‘Watson’, a computer that beat the two all-time ‘Jeopardy!’ champions from the 26 years of the show’s history. Named not after Sherlock’s sidekick but after an ex-IBM CEO, Watson was a 90 server, 2800 processor, 15 terabyte RAM computer that could execute 80 trillion operations a second. Jeopardy! presents answers and you, as a contestant, have to identify the question. This is a fiendishly difficult task, as the clues are complex natural language constructions, often deliberately obscure to make the show more entertaining. Interestingly, a few months before Watson appeared on the show a contestant, Roger Craig, a predictive modelling expert, won a record amount and came back a year later to win the Tournament of Champions. He used predictive algorithms to optimize his study time. He poured past Jeopardy! data into his computer and wrote a system that learnt from his performance, selecting questions where he was weakest and focusing on where he needed most help.

Predictive analytics
Siegel introduces us to linear models, where characteristics, as variables, are simply weighted and added together. Rule models take us through a set of rules to determine outcomes. Whatever the method, all methods are looking for a single predictive score for the individual, which acts as a guide for decision and action. This is all about decision making. But what makes predictive analytics really sing is what he calls the ‘ensemble effect’, multiple algorithms working in tandem.
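The three ideas in this section can be sketched in a few lines of Python. This is a toy illustration, not an example from Siegel’s book: the feature names, weights, rules and risk values are all invented, and averaging is only one simple way of combining models into an ensemble.

```python
def linear_score(features, weights, bias=0.0):
    """Linear model: weight each characteristic and add them together."""
    return bias + sum(weights[name] * value for name, value in features.items())


def rule_score(features):
    """Rule model: walk through hand-written rules (hypothetical thresholds)."""
    if features["missed_payments"] >= 2:
        return 0.9
    if features["years_as_customer"] < 1:
        return 0.6
    return 0.1


def ensemble_score(features, models):
    """Ensemble: average the scores of several models into one predictive score."""
    scores = [model(features) for model in models]
    return sum(scores) / len(scores)


# Invented weights: missed payments raise the risk score, loyalty lowers it.
weights = {"missed_payments": 0.25, "years_as_customer": -0.0625}
customer = {"missed_payments": 2, "years_as_customer": 4}

models = [lambda f: linear_score(f, weights), rule_score]
print(linear_score(customer, weights))   # score from the linear model alone
print(rule_score(customer))              # score from the rule model alone
print(ensemble_score(customer, models))  # single combined score
```

Each model sees the same individual and produces its own score; the ensemble reduces these to the single number that guides the decision.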

Machine learning
This gives predictive analytics its edge. Some methods use decision trees, others variables and equations. Whatever the method, the parameters are constantly tweaked toward better results. The problem is not that machines can’t learn but that they over-learn. This can lead to odd results. Remember always that the algorithms know nothing conceptually. They have no concept of a mortgage or date. Of course predictions can also go wrong, catastrophically wrong. The Mars Climate Orbiter was lost when one piece of software supplied data in Imperial units while another expected metric.

Learning
An important point about predictive analytics is that software literally learns to predict, using the data that is pouring back from you as users. Learning lies at the heart of Predictive Analytics and will surely be subject to predictive analytics in due course? If computers have already learnt how to predict your behaviors, and they do, can they learn how to predict what and how you learn?

For the moment, predictive analytics has been used by a number of universities, including the University of Alabama, Iowa State University, Arizona State University, Oklahoma State University and the University of Eindhoven, to predict drop-outs and intervene to prevent them.

Conclusion
The book wavers between good, clear analysis and, at times, rather wayward examples, where the author gets carried away. The Jeopardy! example is a case in point. As Jaron Lanier retorted in Technology Review, this was more of a stunt than a serious piece of research or progression in predictive analytics. It was all about trying to create the impression that ‘Watson’ was “the existence of a sui generis entity”. This is ‘celebrity’ analytics, not real science.

Nine Algorithms that changed the future by John MacCormick

21 Oct

Nine Algorithms That Changed the Future: The Ingenious Ideas That Drive Today's Computers

You’re reading this from a network, using software, on a device, all of which rely fundamentally on algorithms. This book allows you to see the vast portion of the software iceberg that lies beneath the surface, doing its clever but invisible thing, the real building blocks of contemporary computing – algorithms. We use algorithms all the time when we do even simple arithmetic. Multiply two numbers such as 23×4 and you’ll use a mental algorithm you probably learnt at school. Whenever you search, buy, do online banking, online dating, get online recommendations, see online ads; algorithms are doing their devilishly clever work. MacCormick deliberately chooses, not the classic algorithms you will find in coding courses around sorting and so on, but nine that do astounding things in the real world. These are the algorithms that continue to shape the future whenever you search, buy and socialize online.
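The mental algorithm for a sum like 23×4 can be written out explicitly. The Python sketch below is one plausible version of the schoolbook method (work from the right-hand digit, multiply, write down the units and carry the tens); the function name is my own and it only handles multiplying a non-negative whole number by a single digit.

```python
def multiply_by_digit(number: int, digit: int) -> int:
    """Schoolbook multiplication of a non-negative integer by one digit (0-9)."""
    result_digits = []
    carry = 0
    # Work from the rightmost digit to the leftmost, as on paper.
    for ch in reversed(str(number)):
        product = int(ch) * digit + carry
        result_digits.append(product % 10)  # write down the units
        carry = product // 10               # carry the tens
    if carry:
        result_digits.append(carry)
    return int("".join(str(d) for d in reversed(result_digits)))


print(multiply_by_digit(23, 4))  # 3x4=12: write 2, carry 1; 2x4+1=9: write 9 -> 92
```

The steps are so familiar that we rarely notice them, which is the point: an algorithm is just a precise recipe that gives the right answer every time.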

So what are MacCormick’s nine favorite, real-world algorithms?

9 algorithms that changed the future

1. Search Engine Indexing: Finding needles in the world’s biggest haystack. Search for something on the web and you’re querying an index of billions of documents and images. Not a trivial task and it needs smart algorithms to do it at all, never mind in a tiny, tiny fraction of a second.

2. PageRank: The technology that made Google one of the biggest companies in the world. Google has moved on, or at least greatly refined the algorithm(s) mentioned here. Nevertheless, the multiple algorithms that rank results when you search are very smart.

3. Public Key Cryptography: Sending secrets on a postcard is a description of how encryption works and keeps your credit card details safe when buying stuff. Amazon, eBay, PayPal, credit cards and the entire world of online retail would not exist without this algorithm.

4. Error Correcting Codes: Mistakes that fix themselves, so that sound, pictures and videos can be saved, stored and retrieved without loss, even from a CD or DVD but especially across networks, where these clever algorithms maintain quality.

5. Pattern Recognition: Learning from Experience on postcode readers, faces, license plates, in translation, speech recognition – pattern matching plucks out meaning from data. Mobile devices especially need to use these algorithms when you type on virtual keyboards or use handwriting software.

6. Data Compression: Something for nothing when we zip files, compress for transmission, decompress for use. Lossless and lossy compression, and decompression, magically squeeze big files into little files for transfer.

7. Databases: The quest for consistency describes how databases work. Again, the advent of big data means that the balance, in some contexts, has swung away from algorithms, towards the power of massive data sets. Nevertheless, when you use a database you use some clever algorithms.

8. Digital Signatures: Who really wrote this software? Some very smart algorithms are used to create signatures for individual users.

9. What is computable? Or more accurately what can’t be computed. Things don’t crash as often as they used to because algorithms catch the problems. However, software has its limits and some problems can’t be caught this way. Using proof by contradiction, MacCormick shows that there are undecidable problems that computers can never solve.

This last of the nine is a deliberate counterpoint, but it reads like an anomaly or personal enthusiasm, rather than a natural ninth algorithm. There’s a certainty about his abstract logic that is unwarranted, as we have quantum computing and philosophical arguments that allow us to question these certainties. It would have been much better to have written a chapter that shows the weaknesses of the algorithmic approach, such as our mistaken reliance on algorithms in financial predictions, creating illusory certainties, the production of false positives and so on.

His first eight algorithms are interesting and pretty much describe the major, practical algorithms that have shaped, and continue to shape, software and the behaviors of people when they are online, although I’d have made number nine the smart file sharing algorithms behind Napster and BitTorrent that irreversibly changed the music and other industries.

Tricks of the algorithmic trade
The algorithms he presents really are works of art that have been designed, tweaked and finessed in response to experiment with real target populations. They work because they’ve been proven to work in the real world. Of course, what’s seen as an algorithm is likely to be multiple algorithms with all sorts of fixes and tricks. I particularly liked the way he reveals what he calls the ‘tricks’ of the trade, such as checksum, prepare then commit, random surfer, hyperlink, leave it out, nearest neighbour, word location, repetition, shorter symbol, pinpoint, same as earlier, metaword, padlock, twenty questions – the smart tricks that make algorithms really sing. At these points the book gets beyond the rhetoric to the real facts.
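To give a flavour of one of these tricks, here is a minimal Python sketch of the ‘random surfer’ idea behind PageRank. It is a textbook-style simplification, not Google’s production algorithm or MacCormick’s own code: the tiny three-page web is invented, and the damping factor of 0.85 and the 50 iterations are conventional assumptions. A surfer follows a random link most of the time and occasionally jumps to a random page; a page’s rank is the share of time the surfer spends on it.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate page ranks with the random surfer model.

    links: {page: [pages it links to]}; every linked page must also be a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # With probability (1 - damping) the surfer jumps to a random page.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # Otherwise the surfer follows one of the page's links at random.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no links: the surfer jumps anywhere.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Invented mini-web: a and b both link to c, and c links back to a.
web = {"a": ["c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(web)
print(sorted(ranks, key=ranks.get, reverse=True))
```

Page c comes out on top because two pages link to it, and a outranks b because it is endorsed by the highly ranked c: links from important pages count for more.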

Applicability to learning
Almost all of his chosen algorithms are already being used by teachers, learners and researchers. The first two, on search and Google rankings, are obvious examples used by every teacher and learner, daily. This is a real learning tool, what I’d call a MOOP (Massive Open Online Pedagogy). Perhaps indirectly, whenever you use Amazon to buy a book, you use public key cryptography. Pattern matching is also used whenever you write, either in a search query or with predictive text or a spellchecker or translator. Error correction and data compression kick in whenever rich media are used. This is what has enabled another MOOP (Massive Open Online Pedagogy) – video in learning. Databases are used for many forms of content storage and, although you may not know it most of the time, whenever you access a learning management system, VLE or learning content, you will have been using algorithm-driven databases. In other words, algorithms already lie at the heart of online learning, albeit in an almost invisible and indirect way.

We are now entering an age of algorithms in learning, where targeted algorithms can be used to improve the teaching and learning experience. It is clear that we use mental algorithms almost every time we think, especially in a learning context. The hyperlink is another MOOP (Massive Open Online Pedagogy) that powers non-linear learning, where link analysis is used not only by Google but also by Facebook, Amazon and almost every other major online service. Knowing what you click on drives the major commercial models. In page ranking it improves the accuracy and reliability of the search query; in learning it can be used to draw inferences about the learner to present the optimal next item in their learning journey. It can also be used to improve the design of the course, even automatically as the network of content readjusts, whenever a new screen is presented, based on past data. Software algorithms can therefore be used to match new knowledge from a network of learning experiences to your existing knowledge network. Algorithms can be used to deal with personal behavioral habits around how long you were on task, spaced practice and so on. Learning is complex, algorithms are smart. Too much linear e-learning is too simple for the complex task of learning. We have seen how algorithms have been used in simulations and games to improve the user experience and performance. This is bound to happen in other species of online learning.

To see a useful breakdown of the Types of adaptive and algorithmic learning, Jim Thompson has written an excellent primer. For a general discussion of Algorithmic learning see this article by Donald Clark.

Conclusion
It’s an accessible book, as he’s writing for the lay person using analogies and simple progressions, upping the level of complexity as he goes. As he gradually reveals the secrets of his nine algorithms you can’t fail to admire the elegance of these carefully constructed, magic, mathematical spells. They are stunningly clever.

Bibliography
Types of Adaptive Learning by Jim Thompson
Algorithmic learning by Donald Clark

Automate This: How Algorithms came to rule the world by Christopher Steiner

13 Oct

When algorithms go wrong, they go badly wrong. Steiner starts his book with the story of the Amazon books priced at millions of dollars when two pricing algorithms went to war and tried to outbid each other. The stock market crash in May 2010 was another, when stocks dropped dramatically on the back of an automated trading glitch. But the book shows that these are outliers and that as the online revolution has accelerated, the often invisible application of algorithms has crept into a vast range of our online activities.
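To see how two simple pricing rules can spiral, here is a minimal sketch. It assumes one seller always prices slightly below its rival and the other always prices well above; the function name `price_war`, the starting prices and both multipliers are illustrative assumptions, not figures from the book.

```python
def price_war(start_a, start_b, undercut=0.998, markup=1.27, rounds=30):
    """Simulate two sellers that reprice against each other once per round."""
    a, b = start_a, start_b
    history = [(a, b)]
    for _ in range(rounds):
        a = round(b * undercut, 2)  # seller A prices just under seller B
        b = round(a * markup, 2)    # seller B prices well above seller A
        history.append((a, b))
    return history


history = price_war(40.00, 50.00)
for n, (a, b) in enumerate(history):
    if n % 10 == 0:
        print(f"round {n:2d}: A = {a:>12,.2f}  B = {b:>12,.2f}")
```

Because the two multipliers combine to more than 1, every round pushes both prices up, and with no human checking the output a sensible book price turns absurd within a few dozen rounds.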

A story unique to this book, compared to others I’ve read and reviewed, is his brief history of algorithms. This includes the Sumerians, Euclid, the origins of the term (al-Khwarizmi), Fibonacci, Leibniz, Gauss, Laplace and Boole. There’s nothing new under the sun, but in the 21st century ubiquitous computers and the internet have taken algorithms into the homes and minds of everyone who uses the web.

Algorithms in finance
His first detailed case study is a stock trading programme that used algorithms to beat the market and make millions for its inventor and investors. Thomas Peterffy changed trading forever by introducing this approach to the trading floor and it has led to a “virtual world of warring algorithms in what has become of Wall Street and our money”.

Algorithms in entertainment
This is followed by a less convincing case for algorithms in movies and music, where they are used to spot talent, identify hit movies and predict box office figures. After a summer of blockbuster turkeys this year, I’m not sure that this has withstood the harsh world of empirical verification.

Algorithms are being used to identify who you are, what you do and, more importantly, what is delivered to you in the services you use, such as Google, Facebook and Amazon. Huge businesses are built on algorithms. This, along with large amounts of data, is Google’s essential asset in search. Algorithmic recommendation engines account for one third of all Amazon sales. With Netflix, it’s nearer to 75%. Algorithms now lie at the heart of many of the biggest and most successful online business models.

Algorithmic learning
The book is not about the use of algorithms in learning but this may turn out to be one of their most potent applications. The mind does not store knowledge and skills in a linear, alphabetic or hierarchical manner. It makes sense, therefore, to deliver learning driven by algorithms sensitive to this need. For a more detailed treatment of this subject see our paper on Algorithmic learning.

More interesting is something that is already beginning to happen in earnest – algorithmic journalism and writing. You’ve probably already read sports stories written by computers, which draw on data gathered at sports events, existing databases, weather reports and so on, to produce a convincing report. We know they are convincing, as they’ve been tested against articles written by people, and readers can’t tell the difference.
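At its simplest, this kind of writing is data poured into templates, with the wording chosen by rules. The sketch below is a toy version of that idea, not how any real news-writing system works; the function `match_report`, the field names, the teams and the choice of verbs are all invented for illustration.

```python
def match_report(match):
    """Turn a dict of match data into a one-sentence report."""
    home, away = match["home"], match["away"]
    home_score, away_score = match["home_score"], match["away_score"]

    if home_score == away_score:
        lead = f"{home} and {away} drew {home_score}-{away_score}"
    else:
        winner, loser = (home, away) if home_score > away_score else (away, home)
        high, low = max(home_score, away_score), min(home_score, away_score)
        margin = high - low
        # The size of the win decides the verb, as a sub-editor might.
        if margin >= 3:
            verb = "thrashed"
        elif margin == 1:
            verb = "edged past"
        else:
            verb = "beat"
        lead = f"{winner} {verb} {loser} {high}-{low}"

    return f"{lead} in {match['weather']} conditions at {match['venue']}."


sample = {
    "home": "Rovers", "away": "United",
    "home_score": 4, "away_score": 1,
    "weather": "wet", "venue": "the Park",
}
report = match_report(sample)
print(report)
```

Real systems add far more data and far richer language rules, but the principle is the same: structured data in, readable prose out.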

Could learning content be constructed in the same way? The big costs in the production of online courseware are the writers and interactive designers. That’s where the skills shortages are. My friend Steve Rayson, who runs a sizeable e-learning company, feels that this “would take away the real limit in the growth of my business and online learning. It must eventually happen”.

Conclusion
The book’s finale is a rather uplifting story of how bright young mathematicians, economists and coders no longer yearn to work on Wall Street or in banks but in start-ups, incubators and business creation. This has been a long time coming but at last human talent is being directed, not towards the mere management of money, but towards new ways of creating jobs and shaping the future. The worry remains, as well described by Lanier and others, that the Age of the Algorithm may destroy more jobs than it creates. Nevertheless, for the moment, it holds the promise of getting us out of boom-bust cycles, where maths was forever blowing financial bubbles, and into maths that makes things work. I had forgotten that HAL stands for ‘Heuristically programmed ALgorithmic computer’. It turns out that HAL has become a reality. Indeed, we deal with thousands of useful HALs every time we go online.
