Bottom Line Up Front

Artificial Intelligence (AI) is a term without a standardized definition. In the broadest sense, AI means using computer systems to perform tasks that typically require human intelligence. There are numerous technologies that would be considered AI. A few of the more well-known ones include:

  1. Robotic Process Automation

  2. Machine Learning

  3. Natural Language Processing

  4. Chatbots

Overview

Artificial Intelligence (AI) is one of the most talked about topics in business and data science today. But the term has no universally accepted definition and, as a result, is widely misused. Perhaps the best way to cut through the hype and mysticism surrounding this topic is to simplify it and support it with accessible examples and use cases. The purpose of this post, therefore, is two-fold:

  1. To give a basic explanation of the term AI

  2. To give an overview of four common technologies considered to be AI

AI, in a Nutshell

Artificial Intelligence simply means using computer systems to perform tasks that typically require human intelligence. The top-of-mind example of AI for most people is self-driving cars. While this certainly is an example, it is a sophisticated use case. Simple tools like autocorrect, autofill, and chatbots are also examples of AI, though they often aren’t considered as such. In fact, AI is all around us and is a major factor in our daily lives. The recommendation pages on Netflix, the routes determined by Google Maps, the spam filters in email services… all of these are AI.

Because there are so many use cases under the AI umbrella, a good way to think about AI is to consider its use cases in terms of the type of data used: structured or unstructured. Structured data is derived from standardized, repeatable procedures and lends itself well to automation. Unstructured data is derived from multiple, often unique procedures and requires significantly more effort to build tools around.

Structured vs. Unstructured Data

A good way to understand the difference between structured and unstructured data is to imagine a scenario wherein both are generated and used. Consider, then, an insurance firm...

The process customers use to create an account with an insurance firm is inherently structured. One can go to the firm’s website and provide the needed data (name, address, plan requested, etc.). One could also meet an agent in person or over the phone, who will solicit the exact same information. As a result, the data created in opening an account tends to share a similar structure and sequence. Because the account opening process generates structured data, there are several Robotic Process Automation (RPA) opportunities available to use it.

Unlike opening an account, the process used to file an insurance claim can be extremely complicated. Consider, for example, a claim for a house fire. Such a claim could be initiated over the phone but would then require reports from multiple agencies to be uploaded or faxed to the firm (e.g. police report, appraiser’s damage assessment, etc.). The reports submitted could come in several formats (pictures, videos, written reports), each containing different information. Unlike opening an account, in which information requests are standardized, insurance claims are highly unstructured, easily involving dozens of file types that each carry different information.
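
To make the contrast concrete, here is a minimal sketch in Python. The field names and file names are hypothetical illustrations; the point is simply that the account record has a fixed, predictable shape, while the claim arrives as a heterogeneous collection of documents.

```python
# Structured: every new account yields the same fields in the same shape,
# so it maps cleanly onto a table or database row. (Hypothetical fields.)
new_account = {
    "name": "Jane Doe",
    "address": "123 Main St",
    "plan_requested": "Homeowners Basic",
}

# Unstructured: a single fire claim arrives as a mixed bag of documents
# with no common schema; each file must be handled on its own terms.
fire_claim_files = [
    "police_report.pdf",
    "appraiser_assessment.docx",
    "kitchen_damage.jpg",
    "walkthrough_video.mp4",
]

print(sorted(new_account.keys()))                 # predictable, repeatable fields
print(len(fire_claim_files), "heterogeneous files to process")
```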

Robotic Process Automation

Robotic Process Automation (RPA) is the use of programs to automate routine human actions. It is often used where procedures are too expensive or inefficient to be performed by people. To make this concrete, consider again the example of opening an account at an insurance firm.

Insurance firms conduct a handful of due diligence checks on customers before opening accounts for them (e.g. verifying they are indeed a living person, checking to see if they have other accounts with the firm, etc.). Insurance agents could copy and paste a given applicant’s information into several background check tools, but paying someone to copy and paste all day would be highly inefficient. To avoid this, insurance firms often rely on RPA tools to perform these basic tasks for them.
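
Here is a minimal sketch of what such an RPA-style script might do, assuming hypothetical lookup functions. The names verify_identity and find_existing_accounts are stand-ins for a firm’s real background-check tools, and commercial RPA platforms typically configure this kind of loop visually rather than in code.

```python
# A minimal RPA-style sketch: loop over new applicants and run the same
# routine checks a person would otherwise perform by copy-and-paste.

def verify_identity(applicant: dict) -> bool:
    # Placeholder: a real tool would query an identity-verification service.
    return bool(applicant.get("name") and applicant.get("address"))

def find_existing_accounts(applicant: dict) -> list:
    # Placeholder: a real tool would search the firm's account database.
    return []

applicants = [
    {"name": "Jane Doe", "address": "123 Main St"},
    {"name": "John Roe", "address": ""},
]

for applicant in applicants:
    checks = {
        "identity_verified": verify_identity(applicant),
        "existing_accounts": find_existing_accounts(applicant),
    }
    print(applicant["name"], checks)
```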

There are a handful of companies that sell subscriptions to RPA tools (e.g. Blue Prism, UiPath, WorkFusion). Most of these tools are easy to use and require little or no programming experience. If you work at a firm where you repeatedly perform the same routine tasks, it might be worth looking into these tools.

Machine Learning

Machine Learning (ML) involves using algorithms, APIs, development and training toolkits, and data to design, train, and deploy models into applications, processes, or other machines. The models generated from ML tools are usually geared towards predicting an outcome (e.g. the creditworthiness of a customer) or classifying something (e.g. categorizing MRI data as “normal” or “abnormal”). ML models can be divided into three groups:

  1. Supervised learning

  2. Unsupervised learning

  3. Reinforcement learning

Supervised learning models predict the value of an outcome measure based on a number of input measures. These models are increasingly useful in professional sports, where they heavily influence draft rankings. Not long ago, draft rankings were determined exclusively through the intuition of scouts, coaches, and owners. Now, measured data about individual athletes (e.g. past performance, size, speed, etc.) is studied, and predictions are made about how well an athlete will perform on a given team in a given position. These predictions heavily influence the demand for, and thus the rank of, a given athlete.
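
As a rough illustration, the sketch below fits a supervised model on a few made-up athlete records using scikit-learn. The attributes and ratings are invented; a real scouting model would use far more data and a more sophisticated algorithm.

```python
# A toy supervised-learning sketch: predict an athlete's performance
# rating from measured attributes. All numbers are synthetic.
from sklearn.linear_model import LinearRegression

# Input measures per athlete: [past performance, size (kg), speed (m/s)]
X = [
    [72, 95, 8.9],
    [65, 88, 9.4],
    [80, 102, 8.5],
    [58, 90, 9.1],
]
# Outcome measure: observed performance rating
y = [78, 70, 84, 62]

model = LinearRegression().fit(X, y)

# Predict the rating of a new prospect from their measurements.
prospect = [[70, 93, 9.0]]
print(f"Predicted rating: {model.predict(prospect)[0]:.1f}")
```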

Unsupervised learning models describe the associations and patterns among a set of input measures (where there is no outcome measure). These models are widely used by marketing and advertising teams. The infamous Cambridge Analytica incident is a good example: like many marketing and intelligence firms, Cambridge Analytica used social media data to cluster users into groups based on their beliefs and political affiliations.
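
A minimal clustering sketch, again with invented data: k-means groups users by two attitude scores without ever being told what the groups “mean.”

```python
# A toy unsupervised-learning sketch: cluster users by two synthetic
# attitude scores. No outcome labels are involved.
from sklearn.cluster import KMeans

# Each row: [score on issue A, score on issue B]
X = [
    [0.9, 0.1], [0.8, 0.2], [0.85, 0.15],   # one loose group
    [0.1, 0.9], [0.2, 0.8], [0.15, 0.85],   # another
]

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)           # cluster assignment per user
print(kmeans.cluster_centers_)  # the "profile" of each group
```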

Reinforcement learning models learn how best to interact with an environment based on a reward function (which describes whether a result is good or bad). For example, a bank might use a reinforcement learning model to nominate stocks for investment. The nominated stocks are reviewed by a professional investor, who categorizes them as “buy” or “don’t buy.” The results of the investor’s review are then fed back into the model, which generates better recommendations based on the feedback. In this way, the model is reinforced or “trained.”
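
The sketch below shows this feedback loop in miniature, using an epsilon-greedy bandit (a basic reinforcement-learning technique) with a simulated reviewer standing in for the professional investor. The stock names and probabilities are invented, and real trading models are vastly more involved.

```python
# A minimal reinforcement-learning sketch: learn which stock to nominate,
# with reward 1 when the (simulated) reviewer says "buy" and 0 otherwise.
import random

stocks = ["AAA", "BBB", "CCC"]
buy_probability = {"AAA": 0.2, "BBB": 0.7, "CCC": 0.4}  # hidden from the agent

values = {s: 0.0 for s in stocks}   # estimated reward per stock
counts = {s: 0 for s in stocks}
epsilon = 0.1                       # exploration rate

for _ in range(1000):
    # Explore occasionally; otherwise nominate the current best stock.
    if random.random() < epsilon:
        pick = random.choice(stocks)
    else:
        pick = max(stocks, key=values.get)
    # Simulated reviewer feedback: 1 = "buy", 0 = "don't buy".
    reward = 1 if random.random() < buy_probability[pick] else 0
    # Incremental average update: the model is "reinforced" by feedback.
    counts[pick] += 1
    values[pick] += (reward - values[pick]) / counts[pick]

print(max(values, key=values.get))  # usually "BBB" after enough feedback
```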

Natural Language Processing

Natural language processing (NLP) uses text analysis to understand sentence structure and meaning, sentiment, and intent. NLP is widely used in fraud detection. The example most people are familiar with (and grateful for) is the spam filter attached to email services. Among other techniques, these filters study the text of an email’s subject line to make a recommendation on whether the email is legitimate. Speech recognition tools are often built on top of the code used for NLP. Because this is such a big topic, however, I will address it in a separate post.
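
As a toy example of the idea, the sketch below trains a bag-of-words Naive Bayes classifier on a handful of invented subject lines. Production spam filters train on millions of messages and use many more signals than the subject line alone.

```python
# A toy NLP sketch: classify email subject lines as spam or legitimate
# using a bag-of-words model and Naive Bayes. Training data is invented.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

subjects = [
    "You have WON a free prize, claim now",
    "Limited offer: cheap meds, act fast",
    "Meeting notes from Tuesday's review",
    "Invoice attached for March services",
]
labels = ["spam", "spam", "legit", "legit"]

# Vectorize the text into word counts, then fit the classifier.
clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(subjects, labels)

print(clf.predict(["Claim your free prize today"]))  # likely 'spam'
```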

Chatbots

Chatbots are computer programs designed to simulate conversations with customers, usually over the internet. They are currently one of the most popular applications of AI. The business case for chatbots can be quite compelling: they can improve customer experience while simultaneously reducing cost through efficiencies and automation. There are, however, many factors that need to be considered when designing and implementing a chatbot. If not done properly, chatbots can irritate customers, damage the company’s brand and reputation, and add a tremendous burden to development teams. As with NLP, I will address this topic in a separate post.
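
For a flavor of the simplest possible approach, here is a rule-based chatbot sketch that matches keywords against canned responses. The keywords and replies are hypothetical; modern chatbots layer NLP and dialogue management on top of this basic idea.

```python
# A minimal rule-based chatbot sketch: match keywords in the customer's
# message and reply from a canned script, with a fallback to a human.

RESPONSES = {
    "claim": "I can help you start a claim. What type of loss occurred?",
    "account": "To open an account, I'll need your name and address.",
    "agent": "Connecting you to a human agent now.",
}

def reply(message: str) -> str:
    text = message.lower()
    for keyword, answer in RESPONSES.items():
        if keyword in text:
            return answer
    return "Sorry, I didn't catch that. Type 'agent' to reach a person."

print(reply("I need to file a claim for a house fire"))
print(reply("How's the weather?"))
```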

Concluding Remarks

AI is such a broad topic that it can be difficult to distill into a single post. In its broadest sense, AI means using computer systems to perform tasks that typically require human intelligence. There are dozens of technologies that could be considered AI; this post outlines only a few. Trying to create a framework for categorizing them all is a daunting task. To simplify it, I suggest categorizing them by the type of data used: structured or unstructured. Structured data lends itself well to robotic process automation, while unstructured data usually calls for more sophisticated tools such as machine learning.