Before we understand the AI project cycle, let us understand what any project cycle is. For that we need to take an example of a project.
Assume you have been given a science project of developing a 3-D model of the water cycle. What would be the stages of doing this project?
I would like you to think for a moment before you read further so that you can identify the gaps between what you think you do and what you actually do. Trust me, all of us need to take these steps to complete this project successfully. Someone might skip a step but then they would need to come back to it later.
Steps to make a Water Cycle model
If I had to do this project here are the steps I would take:
- Write down the outcome that I want from the project. Yes, the final output that you need must be clear before you can work towards achieving that output.
- Make a rough sketch of the project on a piece of paper. I would try to make it as detailed as possible in terms of what all I need to depict. For example, if I want to show mountains, how high would it be? If I want to show a water body what type would it be – a river or lake and ocean?
- Next I will take some time to think of the materials that I would like to use for the model. While considering the materials, I will take into account what would be easier for me to work with. What are the materials that are easily available and I can get help from maybe my parents or siblings or friends if I am stuck somewhere. For instance, will it be easier for me to use mud to create the mountains or crumpled up paper?
- I need to source the materials. So I will make a list of materials with their quantity. A laundry list could look something like this:
- Newspapers
- Cardboard box
- Coloured papers – blue, white, green and brown
- Watercolor – Blue, white, yellow, green, red, black, Brown
- Paintbrush and an old tumbler for keeping water
- Glue
- Pencil and a ruler
- After I have sourced all the materials, I will start creating the actual model. I will make a list of sequential steps I need to take to create all the parts of the model. For example, I need to make mountains, water bodies, clouds, arrows, foliage, etc. You get the idea.
- While creating the model, I will keep referring to the rough sketch that I made in step 2 so that I do not stray too much from what I intended to do in the first place. Of course, remember that you can make suitable modifications because theoretical and practical phases are always different. In fact, they should be different just to make you realise that when you practically do things there are problems that give rise to new and sometimes better ideas. Be open to those ideas.
- After completing the model, I would probably ask others to take a look and see if it actually looks good before sharing it with my teacher.
If I want to use the right terms for all these steps, it would be this:
Requirement gathering —> Planning —> Designing —> Sourcing —> Developing —> Review
Any project cycle typically follows these steps. But why project “cycle?” Or rather, where is the cycle?
Well, in a typical project, the review phase leads to some new requirements and then the whole cycle starts from the first phase. Of course, this time the stages are shorter because only the suggested changes need to go through the cycle.
The initial 2-3 steps of requirement gathering, planning and designing might vary in order, depending upon the project at hand. For instance, there might be some projects that need designing first so that you are able to plan how to execute or what materials to source.
Now that you understand the fundamentals of how projects are executed end to end, let’s dive into the AI project cycle.
AI project cycle
An AI project cycle typically follows these phases:
- Project planning
- Data acquisition and exploration
- Data modelling
- Deployment and after
Phase 1: Project planning
AI project planning includes understanding the scope of the project. For this, you need to understand business requirements. Some of the questions that you need answered in this phase are:
- What is the outcome I want to achieve through this project?
- Who will benefit from this AI project?
- Who Will be responsible for executing the project?
There are many frameworks available for defining the scope of a problem. 4W Canvas is the most popular and easiest to understand. Read this blog post I wrote about using 4W Canvas to define scope of your problem.
But it’s not sufficient to understand the business requirements. You also need to plan how the project requirements would be met. In this stage you need to answer questions like:
- How much data will be needed?
- Who will be responsible for gathering data?
- What strategy is to be used to ensure successful project completion? There are many IT project methodologies. This is the time to determine the one that will be used. Some methodologies include waterfall, agile.
- What is the benchmark for measuring success?
Phase 2: Data acquisition and exploration
Strictly speaking, we are still in the planning phase, but specifically about data and that’s why it is considered the next step in the AI project cycle. This stage is unique to AI projects because data and its suitability is so crucial to them. In this phase you need to brainstorm about data, its frequency, suitability and safety.
Data
- What data is needed?
- What metadata is needed? Metadata is data about data. You know that the training data must be labelled or its meta data created so that the AI system can train easily. It is important to be clear at this stage about the meta data to be created for each piece of data.
- What should be the data format – text, images, audio, video, documents, etc.
Data frequency
Some projects need data just once; after training on them they are good to go. Other projects might need data on a regular basis.
For instance, if you are creating an AI system that helps the HR department of a company, the system needs data about all the employees of the organisation initially to start training. As new employees are added or some dynamic changes occur in existing employees like salary increase or change in profile, those data can be added to the system.
On the other hand if you are creating an AI-assisted business analytics model, it needs continuous data related to sales, marketing, regulatory changes in the market, competitors, etc. You can see that the frequency of data needed for the HR AI system is only when the data changes or new employees are added. On the other hand the business analytics model would need data to be fed at regular intervals like daily or weekly or monthly or quarterly.
Data ownership
It is important to identify the people who would be responsible for creating the training datasets and then maintaining it. You know that periodic checking of data is essential to ensure unethical data is eliminated. So the responsibility of who will take care of this periodic checking needs to be fixed right now.
Data safety and security
Finally, security and safety of data needs to be taken care of.
- What are the security measures?
- How safe is the data from hackers and unauthorised users?
- Who is responsible for keeping the data safe and secure.?
- What steps need to be taken to test the safety and security of data periodically?
Data acquisition
Once you know what data you need and how frequently, you need to actually acquire the data. Here you need to decide where you will get the data and how the needed data will be extracted. There are many training datasets available in public domain. Some projects can make do with them. Other AI models might need you to curate internal data (think customer surveys, internal project documentations, etc.) or acquire data through other sources.
Remember that when we talk of AI projects, the data we get are real life data that might not be exactly in the form that we want. For instance, from a 10 hour video footage we will need to extract information like people who have entered the premises, before it can be entered into the database.
Data exploration
Data exploration is the process of verifying that the data you have collected is as per the requirements established during data planning. Besides ensuring that the data is according to our specifications and error free, we also need to make a preliminary check that it will actually give the output we desire.
AI projects are very expensive and time consuming so it is prudent to perform some exploratory data analytics at this stage.
Once the data has been validated you need to prepare the data to be fed into the database. Remember the list of metadata we prepared for each piece of data that we need in our database? This is the time to create that metadata and enter it in the database. Also, data might have anomalies. For example, the quality is not very good or some data is missing. Through various tools these anomalies must be removed before the data can be accepted into the database. In this phase you might find yourself discarding lots of data if you have not been careful in the previous stages.
Phase 3: Data modelling
This is probably the most straightforward stage of the whole AI project cycle. In this stage the right algorithms are selected and an AI model for working with the data is built. The selection of the model depends upon the data being used and outcome desired. Examples of popular AI models:
- Neural networks
- Decision trees
- Random forest
- Logistic regression
- Expectation maximisation algorithm
- Hierarchical clustering algorithm
Phase 4: Deployment and after
Once we have chosen and trained the AI model, then comes the stage of project deployment. But before any deployment can take place there needs to be a testing or evaluation. The deployment is divided into these three steps or stages:
Project evaluation testing
Before deployment, the AI model must be tested to see if it is meeting the goals established in the first phase. The testing is also done to ensure that the AI model works not just with the data on which it is trained but new data that is added to it. This is essential for AI projects because, as you know, new data keeps getting added to the system, whatever the frequency.
Project deployment
Finally, the AI model gets deployed at the client site. Deployment logistics include deciding on
- Where the data is hosted
- Who takes charge of the system
- Who is responsible for its continued functioning and system tuning
And many more.
AI model tuning
You know that new data is continuously fed back to the AI model. For instance, if you have deployed an object recognition system, the new objects that it recognizes or fails to recognize are stored in the system as part of training data.
AI models typically have inbuilt systems to keep tuning themselves according to the new data sets. But periodically you need to ensure that the AI system is still meeting the established business goals. For instance, ensuring ethical and unbiased outcome is an important business goal that must be met at all times, at all costs. Since you have no control over the new data stored by the model, periodic checks should be done to ensure that the data is indeed meeting the established goals in the planning phases.
Final thoughts
Post project deployment a review must be undertaken to document the learnings from the whole project cycle. These learnings must be documented so that subsequent AI projects can be executed with lesser problems and more success.
Brilliant.please share some usecases too.
Hey Deepika, thanks for stopping by my blog.
Do you want the use cases from a student’s perspective or a professional’s perspective?
Thanks,
Shweta.
Very well written and comptehensively explained. Thanks.
Glad you found it useful.
Thanks,
Shweta.
It’s excellently explained. Can you share some more projects like this. I need them for my students.
Thanks for your kind words Manoj. Dropping you an email on this.