Analytics and AI in our society
Whenever new technologies have emerged, there has always been a tendency to be skeptical of these innovations. The same reflex applies to Artificial Intelligence (AI) and analytics, as the possibilities they offer are unfortunately often overshadowed by their negative aspects. It is not surprising that stories of AI gone wrong spread more rapidly on social media than success stories: racist chatbots, face recognition algorithms that claim to distinguish gay from straight people, self-driving cars driving off cliffs. Now that AI and analytics have become part of modern-day life, it is time to start taking an objective look at the positive aspects of AI.
The huge amounts of data available today can no longer be processed by individuals alone. Complex analytics that deliver faster and better results now play a very important role in the decisions we make and the services we deliver. We see AI playing a vital role across industries: from reducing waste, fraud, and abuse to optimizing sales, marketing, and customer service operations. Analytics and AI help improve disease identification, predict weather patterns, and analyze soil for the best crop usage. They are even used to predict which food we are going to eat and which partner we are going to marry, and they know that we are pregnant before our family does. In the field of AI and data analytics, the boundary between creepy and cool is paper thin.
Trusted analytics and the trust gap
Unexpected behaviors or results from AI initiatives can lead to generalized mistrust in data and analytics. Algorithms can be destructive when they produce inaccurate or biased results, an inherent concern amplified by the black box the user often faces. KPMG's 2019 CEO Outlook reported that 66% of the leaders surveyed had overlooked insights provided by computer-driven data analysis because these were contrary to their experience or intuition. No-one wants to say, “because the machine said so”. No-one wants to get AI wrong. People often lack confidence in the data, or distrust the researchers doing the analytics and even the techniques they use, simply because they do not understand them. Moreover, our survey showed that trust decreases across the data lifecycle: there is more trust during the first stages of the cycle, such as data preparation, than at the end, during the implementation of the model and the measurement of its effectiveness. Interestingly, the survey also reported that more investment in technology does not significantly help close the trust gap.
There are various concerns around the effective use of AI in organizations:
- AI systems do not always act in the way they were programmed to. Even if business processes are conducted by AI systems, the organization must be able to react to and manage situations in which the AI system breaks down or shows unplanned behavior. Furthermore, there may be unanticipated consequences if an AI system learns certain decision-making functions by having access to, and learning from, data not considered by the data scientist. It is therefore important to keep expertise on board, both to handle such events and to validate the outputs provided by AI.
- Another commonly cited example is that of chatbots launched by big tech players in recent years, which unexpectedly learned racist vocabulary from offensive tweets posted by other Twitter users. This is a data quality concern: we need to be conscious of the data provided to the model. Another illustration of a data quality issue concerns a European travel website that used pictures labelled by Indian annotators as “nice holiday” pictures. The annotators mainly identified big hotels with marble and air conditioning as vacation pictures, rather than sunny pictures. The data was therefore labelled inappropriately.
- Then, looking at optimization problems (e.g., optimally assigning the sick to hospital rooms, packages to delivery vehicles, caregivers to patients, or students to schools), a correct formulation of your objective is key. Take the example of assigning students to schools, a challenging optimization problem that has been the subject of more than 50 years of scientific research. It is clear that a single ideal algorithm does not exist, and design decisions have to be taken, such as: which objective(s) do we consider, and what weight do we give to each of them? There is a wide range of (potentially conflicting) objectives. Do we want to maximize the “benefit” for society as a whole from a regulatory/socio-demographic point of view (e.g., benchmarks with respect to the percentage of indicatorleerlingen[1], as is the situation in certain parts of Flanders and Belgium), or the “benefit” for parents, for instance by minimizing their driving distances? Formulating the objectives differently, or even slightly adjusting the weight of an objective, can significantly influence the outcome of the algorithm. When building a model, we must be aware of the assumptions behind it. In the field of predictive analytics, decisions on how to evaluate the performance and effectiveness of your (classification) model are crucial. Depending on the context, you want to minimize either the false negatives or the false positives by making a trade-off. In the medical field, we generally minimize the false negative rate at the expense of the false positive rate: it is considered safer not to miss a diseased patient, even if that means occasionally diagnosing patients with the disease although they are actually healthy. When a patient is diagnosed as positive, there is thus still a probability (the so-called “false positive rate”) that the model provided an incorrect evaluation. Similarly, roadside alcohol tests are generally tuned to tolerate false positives in order to minimize false negatives: people who are wrongly flagged as drunk can always request a second, confirmatory test, while tuning the test this way keeps the number of truly drunk drivers on the road to a minimum. Both effects, the sensitivity to objective weights and the false-positive/false-negative trade-off, are illustrated in the short code sketches after this list.
- Customers will not use a model if it is not resilient, i.e., if it is not optimized over the long term. A traditional built-in car GPS, for instance, will be outperformed by a smartphone navigation application that adjusts rapidly to a new context and to road modifications.
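To make the objective-weighting point concrete, here is a minimal sketch with entirely hypothetical numbers: three students, three school places, and a combined cost of w * distance + (1 - w) * benchmark_penalty. Nudging the weight w produces three different “optimal” assignments from the very same data.

```python
# Hypothetical illustration: how objective weights steer an assignment algorithm.
import numpy as np
from scipy.optimize import linear_sum_assignment

# Rows are students, columns are schools (all numbers are made up).
distance = np.array([[1, 4, 8],
                     [3, 2, 9],
                     [7, 6, 2]], dtype=float)   # driving distance (km)
benchmark = np.array([[5, 1, 2],
                      [2, 6, 1],
                      [1, 3, 7]], dtype=float)  # penalty for missing a socio-demographic benchmark

for w in (0.9, 0.5, 0.1):
    cost = w * distance + (1 - w) * benchmark
    students, schools = linear_sum_assignment(cost)   # minimum-cost assignment
    print(f"w={w}: student -> school {dict(zip(students, schools))}")
```

With these numbers, w=0.9 (distance-driven), w=0.5, and w=0.1 (benchmark-driven) each yield a different assignment, which is exactly why the choice and weighting of objectives is a design decision rather than a technical detail.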
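The false-negative/false-positive trade-off can be sketched in the same spirit, using synthetic prediction scores rather than real patient data: the same model produces very different error profiles depending on where the decision threshold is placed.

```python
# Synthetic illustration: moving the decision threshold trades false negatives
# against false positives for the same underlying model.
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical predicted disease probabilities: diseased patients tend to score higher.
scores_healthy = rng.beta(2, 5, 10_000)   # truly healthy patients
scores_diseased = rng.beta(5, 2, 10_000)  # truly diseased patients

for threshold in (0.2, 0.5, 0.8):
    fp_rate = (scores_healthy >= threshold).mean()   # healthy patients flagged as diseased
    fn_rate = (scores_diseased < threshold).mean()   # diseased patients missed
    print(f"threshold={threshold}: FP rate {fp_rate:.2f}, FN rate {fn_rate:.2f}")
```

A medical screening context would pick a low threshold (few missed patients, more false alarms), whereas a context where false alarms are costly would pick a high one.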
In the coming years, the key differentiator between companies will no longer be the performance of their models, but the trust they have established amongst their employees, customers, and other stakeholders. Organizations will not be able to leverage automated decisions if their employees do not trust the tools that support those decisions. Customers will not agree to provide their data if they are uncertain whether the algorithms are operating in their best interest, or if they do not trust the purpose of the data collection. Shareholders will not invest in a company if they are not confident in its integrity and ethics (which are reflected in the ethical use of the AI solutions it builds).
What is trust?
The benefits of AI will only fully emerge when algorithms become explainable (and hence understandable), in simple language, to anyone. The trust gap exists because there is little transparency around AI; instead, there is an inherent fear of the unknown surrounding this technology. Gaining trust also involves understanding AI models and protecting them (and their underlying data) from different types of adversarial attacks and unauthorized use. Lastly, it requires being innovative and encouraging experimentation without the fear of failure.
Stay tuned! In our next blog post, we will dive more deeply into the question of how to build trust in analytics and AI.
[1] A status attributed to a student who fulfils specific socio-demographic conditions. It is a Flemish concept. More information can be found here: https://onderwijs.vlaanderen.be/nl/wanneer-inschrijven