Hi guys!
I am so glad to be back.
Today, I will take you through what a typical AI project team looks like and the different roles that members of such a team must play to achieve a successful project. I will also discuss how to build the skills needed to fulfil those roles as AI professionals.
While working on substantial projects, it is necessary to form a team of members who collaborate and divide the work among themselves according to their specialties.
For example, a website development project team usually consists of a UI/UX designer, a front-end web developer, and a back-end developer, who each work on different sections of the website. In the same way, an AI project team is made up of different members who must each perform their respective roles to deliver a successful project.
Before we move forward, there is an important guideline that we must understand and follow. The theory called ‘Maslow’s hierarchy of human needs’ states that a person’s basic needs must be met before they pursue secondary needs. This means that humans must have the bare necessities for survival, like food and shelter, before they can go in search of higher-level needs like love, a sense of belonging, or self-actualization.
In the field of AI, the data science hierarchy of needs runs the show.
Similar to Maslow’s hierarchy, the ‘Data Science Hierarchy of Needs’ is a pyramid of needs that must be satisfied in an artificial intelligence organization or project team. There are different levels in the pyramid with the lowest level representing the first category of tasks an AI team must focus on to succeed, and the highest level representing the very last category of tasks to be executed.
In other words, an AI team cannot perform movement and storage duties without first collecting data. And without stored data, exploration and transformation cannot successfully take place.
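To make the "one level at a time" rule concrete, here is a minimal sketch of the pyramid as an ordered list. Only the three lowest level names come from this post; the names of the three upper levels follow the commonly circulated version of the pyramid and are an assumption here, as are the helper names `can_start` and `next_level`.

```python
# Levels from the bottom of the pyramid to the top.
# The first three names come from the text above; the last three are
# assumed from the commonly circulated version of the pyramid.
PYRAMID = [
    "collect",
    "move/store",
    "explore/transform",
    "aggregate/label",
    "learn/optimize",
    "AI/deep learning",
]


def can_start(level, completed):
    """A level can start only when every level below it is in place."""
    below = PYRAMID[: PYRAMID.index(level)]
    return all(name in completed for name in below)


def next_level(completed):
    """Return the lowest level that is not yet in place, or None if all are."""
    for name in PYRAMID:
        if name not in completed:
            return name
    return None


done = {"collect"}
print(can_start("move/store", done))         # storage can start once data is collected
print(can_start("explore/transform", done))  # exploration must wait for storage
print(next_level(done))
```

A team could use a check like this as a reminder of which kind of work, and therefore which hire, comes next.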
Unfortunately, many AI startups fail to realize the importance of this hierarchy, and so they go on to hire the talent for the right roles at the wrong time. These companies rush forward to hire ML engineers who find themselves trying to wear several hats simultaneously in a bid to handle all the tasks originally meant for different roles.
We need to treat the data science hierarchy of needs as the godfather who dictates whom we invite to the AI team.
Data Engineer
The first person invited to work on an AI project as a part of an AI team is a data engineer.
The data engineer is the man or woman who is responsible for dealing with the lowest three levels on the pyramid.
They are in charge of collecting data from all the necessary sources, then transporting and storing it efficiently for future use. They also prepare datasets at the request of other members of the team.
To illustrate the job of a data engineer, I will use a farm analogy.
On a farm, there are farmers who plant crops and take care of the farm so that the plants grow and produce fruit by harvest time.
Then another group of workers, called the kitchen staff, visit the farm during harvest to gather the fruits and transport them to the pantry, where they are sorted and stored. They also cut and clean the fruits at the request of the cooking staff to prepare them for a meal.
The farmer can be likened to a developer who builds an application, deploys it on a platform like the Google Play Store, and expects the app to return data. The kitchen staff represent data engineers, who gather this data, then store, arrange, and prepare it for AI tasks. The fruits from the farm represent the data generated by applications and other sources.
Since data engineers specialize in handling data, they must be comfortable working with databases. Databases are digital facilities for storing data in a structured manner.
They need to know how to use SQL to interact with these databases and query information from them.
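As a small illustration of what querying a database with SQL looks like, here is a sketch using Python's built-in `sqlite3` module and an in-memory database. The `events` table, its columns, and the sample rows are all invented for this example.

```python
import sqlite3

# In-memory database with a hypothetical table of app events.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, action TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO events (user_id, action, amount) VALUES (?, ?, ?)",
    [
        (1, "purchase", 20.0),
        (1, "purchase", 5.0),
        (2, "purchase", 12.5),
        (2, "refund", 12.5),
    ],
)

# Total purchase amount per user, largest total first.
rows = conn.execute(
    """
    SELECT user_id, SUM(amount) AS total
    FROM events
    WHERE action = 'purchase'
    GROUP BY user_id
    ORDER BY total DESC
    """
).fetchall()
conn.close()

print(rows)
```

A production setup would normally use a database server rather than SQLite, but the SQL itself (`SELECT`, `WHERE`, `GROUP BY`, `ORDER BY`) carries over.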
Other skills include Python (or another good object-oriented language like Java), database design, and cloud services (e.g. AWS).
An important concept that data engineers need to know is ‘distributed systems’. Distributed systems frameworks like Hadoop and Spark were developed as solutions to manage the exponential growth of data effectively.
Data Analyst
The next team member invited to the data party is the Data Analyst.
This person is responsible for the fourth level in the pyramid. They sift through collections of data to reveal patterns and find valuable insights.
Going back to the farm analogy, we can compare the data analysts to the pantry chefs at a 5-star restaurant who prepare simple meals that do not require elaborate preparation like salads and sandwiches.
Just as the pantry chefs must work with the food items labelled and stored in the pantry, data analysts have to make use of the data prepared for their task by the Data Engineers.
Skills required by a data analyst include database management skills, SQL, and spreadsheet analysis skills.
Basic machine learning skills are excellent to have, as analysts use those techniques alongside statistics to discover trends in data.
Finally, they need good communication and data visualization skills to communicate their findings to management.
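To give a feel for "discovering a trend", here is a minimal sketch that fits a straight line to a few monthly figures using only the standard library. The sales numbers and the `trend_slope` helper are made up for illustration; a real analysis would use far more data and usually a dedicated library.

```python
from statistics import mean

# Hypothetical monthly sales figures, invented for illustration.
monthly_sales = [100.0, 110.0, 125.0, 130.0, 150.0, 160.0]
months = list(range(1, len(monthly_sales) + 1))


def trend_slope(xs, ys):
    """Least-squares slope of ys against xs (average change in y per unit of x)."""
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


slope = trend_slope(months, monthly_sales)
direction = "upward" if slope > 0 else "downward or flat"
print(f"Average change per month: {slope:.2f} ({direction} trend)")
```

The slope is the kind of single number an analyst might then turn into a chart and a one-line takeaway for management.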
Data Scientist / Machine Learning Engineer
The Data Scientist or Machine Learning Engineer is the next guest on the list.
In charge of the topmost levels of the pyramid, these people build on the foundation laid by the team members who joined before them.
Without the work of the Data Engineer, an ML Engineer will lack the proper dataset needed for the project, leading the progress of the work to plateau at the prototype stage.
Since they use fancy machine learning algorithms to create models that perform amazing tasks like language translation and stock price prediction, they can be likened to an Executive chef who uses sophisticated cooking techniques to create masterpiece dishes.
The people who work in this role focus on learning machine learning algorithms and how to implement working solutions with them in a language such as Python or R. They need to be familiar with libraries developed for ML and deep learning, such as scikit-learn and PyTorch.
Most online courses focus on teaching the technical skills required for this role, since it covers the fanciest parts of an AI project. However, it is important that beginners are aware of the importance of other roles in AI.
One more thing I would advise potential Data Scientists to learn is patience and perseverance. While these are not technical skills, they are every bit as important, because machine learning projects require several trials before the desired result is achieved.
Honorable mention #1 - ML DevOps Engineer
ML DevOps engineers integrate the AI models into existing and new software, test the applications, and monitor any changes to the system during and after deployment. They function as packaging managers who wrap up the prepared dishes in containers for seamless delivery to the customers.
The skills they require include knowledge of CI/CD, Docker, Kubernetes, and other containerization and orchestration tools.
Honorable mention #2 - Domain Specialist
Last but not least is the Domain Specialist. He or she is an expert in the subject area the project deals with. If the AI team is working on a project for a financial institution, the domain specialist is the person with expert experience in the finance industry. If the project is for sports, a sports expert becomes the domain specialist. A team member who already has the required expertise can fulfil that role as well.
Without a domain specialist, the AI project will be unproductive. Picture a bunch of painters working on a sketched art piece in the darkness.
Yes!
That is exactly what starting an AI project without a domain specialist looks like: a mess.
Therefore, beginners from other fields outside computer science are often encouraged to pick up AI and analysis skills. Their previous experience becomes an invaluable asset that sets them apart from the crowd. They are empowered to apply AI to push past barriers in their first area of expertise and function as the light needed to guide technical teams working on solutions in their discipline.
In case the importance of a domain specialist is not yet clear to you, let me drive it home. Have you ever tried forex trading without proper training and guidance? If you have, you know that it is a risky endeavor that most likely leads to great losses. So always try to learn about the problem space before applying machine learning to it.
Now that you have learned about all the roles, I will present you with four steps to follow towards building the AI career of your dreams.
- Identify the skills required and take courses
There are numerous online courses that teach these skills. To find the best resources for a particular topic, ask on Quora and check the ratings and reviews of courses. Some resources are free, others are not. While paid courses often provide richer content, some free tutorials do justice to the material.
- Work on projects
If it feels daunting at first, which it will, follow guided projects from courses, YouTube channels, and blogs. Participate in competitions on Kaggle and Zindi. Create a GitHub account, and visit other people’s GitHub accounts to see what sorts of projects they have worked on and learn from them.
- Join communities and network
This is important.
While taking courses, introduce yourself in the forums and interact with other learners. Do not be afraid to ask questions; most coding forums are filled with nice and helpful people from all over the world. Form study groups with new friends and check up on everyone’s progress periodically. LinkedIn and Twitter are amazing platforms for networking. Connect with people in your area of interest, engage with their content, and ask questions to build strong professional relationships. These are the people who will put out postings for your dream job, refer you, and give you insider information about opportunities.
- Document your learning journey
Create content.
When you learn something, teach others in your own words and share your lessons. You can do this through blog posts, Medium, YouTube videos, LinkedIn, Twitter, and even Instagram. This way you get to solidify your knowledge and expose yourself to amazing opportunities.
Thank you so much for reading, and I will see you in the next written piece, where I will discuss the need for diversity in AI.
If you have any questions or suggestions, please leave a comment below or contact me by email.
Meanwhile, I would like to recommend a couple of videos that will help solidify your understanding of this post and expose you to the experiences of people who are currently leading successful careers in AI. The videos are listed below.
Data Analyst vs Data Scientist vs Data Engineer by Damsel in Data
How to learn data science in 2021 (the minimize effort maximize outcome way) by Tina Huang
How I Became a Data Analyst (without a related degree) by Stefanovic