Augmenting business processes and technology stacks with Artificial Intelligence (AI) has become mainstream across most industries in the age of Digital 4.0. AI has become an umbrella term for a range of techniques that allow a computer program to learn from available data (Machine Learning, ML) and make predictions, drawing on a combination of Natural Language Processing (NLP), data science, neural networks, and image and video processing. The underlying algorithms and statistical models for these AI techniques are packaged into freely available software libraries, for example Pythonʼs scikit-learn, Googleʼs TensorFlow, or Apache Sparkʼs MLlib.
For a business, implementing AI involves iteratively finding the AI techniques that are best suited for a use case and available datasets, and then finding a way to integrate these techniques into a repeatable business process. We break this iterative process down into six high-level stages: explore → experiment → dev test → integrate → roll out → govern.
Here we’re exploring the business use cases and the datasets available within our business domain that can support machine learning. Typically this exploration is more of a tech-business exercise, often done in Excel, with the objective of running an initial litmus test on the use case and its supporting data.
Output: Litmus-tested business use case(s) and available datasets, with any gaps in the datasets identified.
Timing: Should generally be a quick exercise taking no more than a few days to a couple of weeks.
Here we’re taking the initial datasets identified during Exploration and doing a more detailed data science exercise on them. RStudio and Jupyter notebooks are in the sweet spot for this type of work, where a Data Scientist is structuring the raw input data into machine learning features and identifying the algorithms that may be suitable given the size, shape and sparsity of the data – typically doing so in the comfort of their own laptop or a VM on-prem or in the cloud, working with snapshots of data.
Output: Convergence on the ML pipeline (feature selection & engineering, baseline accuracy, model selection) most appropriate for the outcomes required by the business.
Timing: Can take anywhere from a couple of weeks to a couple of months.
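As a rough sketch of what this experimentation loop looks like in practice (using scikit-learn, which is mentioned above; the dataset here is synthetic and the candidate models are illustrative assumptions, not a prescription):

```python
# Sketch of the Experiment stage: establish a baseline accuracy on a data
# snapshot, then compare candidate models against it.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Synthetic stand-in for the snapshot identified during Exploration.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Baseline: always predict the majority class.
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
baseline_acc = accuracy_score(y_test, baseline.predict(X_test))
print("baseline:", baseline_acc)

# Candidate models for the model-selection step.
for name, model in [("logreg", LogisticRegression(max_iter=1000)),
                    ("forest", RandomForestClassifier(random_state=42))]:
    model.fit(X_train, y_train)
    print(name, accuracy_score(y_test, model.predict(X_test)))
```

The point of the baseline is that a candidate model only earns its place in the pipeline if it clearly beats the naive majority-class predictor on the business-relevant metric.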
Here we’re executing the entire process of acquiring data from source locations (APIs, S3 buckets, SFTP servers, etc.), executing the ML pipeline, and then positioning the output data at the corresponding output locations.
Output: Actionable insights are produced and sent to the insight consumer (an application, some report, API, etc.).
Timing: The whole Dev Test cycle can take anywhere from a couple of weeks to 2-3 months depending on the complexity of connecting into input and output data sources.
Watch-out: Experienced teams generally run the Experimentation and Dev Test phases together in order to maximise capacity and deliver faster.
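A minimal sketch of the acquire → execute → position shape of a Dev Test run, with local files standing in for the real source and destination systems (APIs, S3, SFTP) and a stub scoring function standing in for the ML pipeline chosen during Experimentation – all names here are illustrative:

```python
# Dev Test skeleton: acquire input data, run the ML pipeline over it,
# and position the output where the insight consumer can pick it up.
import csv
import tempfile
from pathlib import Path

def acquire(source: Path) -> list[dict]:
    # In production this would pull from an API, S3 bucket, or SFTP server.
    with source.open() as f:
        return list(csv.DictReader(f))

def run_pipeline(rows: list[dict]) -> list[dict]:
    # Stand-in for the real ML pipeline; here it just derives a score.
    return [{**row, "score": float(row["value"]) * 2} for row in rows]

def position(rows: list[dict], dest: Path) -> None:
    # In production this would push to the output location.
    with dest.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)

workdir = Path(tempfile.mkdtemp())
src, dst = workdir / "input.csv", workdir / "output.csv"
src.write_text("id,value\n1,10\n2,25\n")

position(run_pipeline(acquire(src)), dst)
print(dst.read_text())
```

Keeping these three functions cleanly separated is what makes it cheap later to swap the local-file stand-ins for the real connectors, which is where most of the Dev Test elapsed time goes.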
This is where the rubber hits the road – integrating the entire dev-test pipeline and configuring it (DevOps) to run on your IT infrastructure (cloud, on-prem, etc.). This step tends to take longer than it should, typically because many more stakeholders are involved, including your infrastructure and IT teams, security, external penetration testing vendors, and so on.
Output: An IT and risk/compliance signed-off project ready to run inside real business processes.
Timing: Anywhere from 1 to 6 months.
Watch-out: One of the most common reasons ML programs slow down at this stage is that the stakeholders who get involved from here onwards often have their own transformation programs running, and it can take extra effort to align timing and the proposed IT approaches.
This step often gets ignored by pure-tech players, but in reality insights are only useful when they are integrated into a business process. Integration takes care of the technical aspects, but a true roll-out also needs to address the softer aspects of a technology roll-out, such as people training and awareness, and any relevant updates to existing business processes via convenient job aids and the like.
Output: Business and IT stakeholders are fully aware of what insights the ML project generates and how their work is affected by the change to the overall business functions and processes.
Timing: A good team of BAs can sort this out in a few days, with the proper awareness workshops and communications.
Roll-out and govern really go hand-in-hand. The purpose of Govern is to clearly lay out how the ML insights, and the insight generation process, will be governed from a change management point of view going forward. The insight generation process requires more technical change management oversight, whereas the insights themselves require more business-heavy governance, to make sure they continue to deliver business value.
Output: On-going business stakeholder confidence in the value that the insights continue to deliver.
Timing: On-going BAU for 3-5 years, until the end of the specific ML program.
Syntasa has created a soup-to-nuts software platform to help organisations manage their AI programs from one command centre, leverage their existing internal teams of Data Scientists & Data Engineers (as well as their access to external AI labs and boutique data science consultancies), and lower the cost of Managed Services vendors. The platform plugs nicely into clientsʼ existing data lakes and natively provides a mature MLOps framework.
If any of this resonates with you, email me (Apoorv Kashyap) and share your thoughts or experience implementing AI projects.