
The real problem behind the data science skills gap is not what you might think



The data science skills gap is not caused by a shortage of people who can train and analyze data models. There are many talented data modelers who understand conceptual data modeling, logical data models, and more. The real challenge is finding people who can collect the data, prepare it, clean it, and put their models into production.

I’m referring to professionals who know how to connect to and query databases, understand how to implement object storage, and can containerize models, turn them into APIs, and embed them in edge devices. In short, people who can apply their data sets to real-world applications.
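To make that skill set concrete, here is a minimal sketch of the path from a database query to a model that answers JSON requests. It is an illustration only, not a method described in this article: the `readings` table, the function names, and the simple straight-line model are all hypothetical, and it uses only Python's standard library. In a real deployment a web framework would wrap the handler and the whole thing would be packaged in a container.

```python
import json
import sqlite3


def load_training_rows(conn):
    """Collect and clean: query (x, y) pairs, skipping rows with missing values."""
    cur = conn.execute(
        "SELECT x, y FROM readings WHERE x IS NOT NULL AND y IS NOT NULL"
    )
    return cur.fetchall()


def fit_line(rows):
    """Model: ordinary least squares fit of y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def handle_predict(model, request_body):
    """Serve: JSON in, JSON out, the shape an HTTP endpoint would expose."""
    payload = json.loads(request_body)
    x = float(payload["x"])
    prediction = model["slope"] * x + model["intercept"]
    return json.dumps({"prediction": prediction})


# Hypothetical demo data in an in-memory database (one row is incomplete).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (x REAL, y REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(1, 3), (2, 5), (3, 7), (None, 9), (4, 9)],
)

model = fit_line(load_training_rows(conn))
response = handle_predict(model, '{"x": 10}')
print(response)
```

The modeling step is the shortest part of the sketch. Most of the code deals with getting clean data in and getting predictions out, which is where the article argues the shortage lies.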

Here’s what’s causing the shortage: Data scientists are rarely as skilled in software engineering as they are in data modeling. Businesses need people who know how to productionize their output so that it can be used in real-life use cases, not just people who can build an effective model. That’s why Gartner identified AI engineering, in which IT professionals focus on operationalizing AI models, as one of its top strategic technology trends for 2022.

Fortunately, colleges and universities have the tools needed to provide a great environment for learning the technical side of data science, and they hold the key to mitigating today’s data science skills shortage.


It’s time they used that key to open the door to the next generation of data science professionals.

Play catch-up

So far, they’ve only managed to open that door a crack.

Too many professors still focus heavily on the theoretical and mathematical aspects of data science and too little on the practical expertise needed to apply it. Perhaps they feel their role is to advance science, not necessarily to train people for a profession. While advancing science is important, there needs to be a balance between the two. To be fair, things are getting better, and many colleges and universities are starting to offer a limited number of courses on how to apply data science and modeling to applications.

But they need to develop their curricula faster to meet demand. That’s hard, as it can sometimes take several years to create and approve a new course. That pace is untenable when the technology changes every few months. The disconnect between what is taught and what is needed persists.

Meanwhile, companies with the appropriate resources and knowledge are trying to compensate. Many are hiring experienced database administrators and recent college graduates and training them in real-world model implementation and data engineering.

There are limitations to this approach. First, an organization that lacks the skills to implement real-world models will not have the expertise needed to train a group of incoming data scientists in those skills. After all, it can’t teach what it doesn’t know. Second, training can be time-consuming and resource-intensive, and it can undermine an organization’s efforts to become faster and more efficient.

This is unsustainable or infeasible for most companies, especially smaller organizations that may not have the means to train their employees appropriately. It also isn’t fair to students, who enter the workforce at a disadvantage.

But colleges and universities don’t have to spend years creating new courses. Instead, they can use the open source tools they already have available to incorporate hands-on learning into their existing computer science courses.

Create a data engineer

Higher education institutions have invested heavily in open source technology for several years and are using this software to creatively solve many challenges. They are attracted by interoperability, security, and cost-effectiveness, among other benefits.

But they also understand that more companies are leveraging open source than ever before. In fact, 95% of respondents to a recent Red Hat survey say that open source is critical to their organization’s overall enterprise infrastructure. Open source is the new normal for IT, which makes teaching and using open source technology extremely important.

We’ve seen a number of colleges and universities offer courses on topics like learning how to use Python or Jupyter Notebooks. Some have even incorporated these tools into their everyday classroom settings. Now, it’s time to go even further by creating a framework that brings together these and other tools and ties the theoretical aspects of model training to the more practical aspects of software development.

That’s not hard to do, thanks to the open and flexible nature of open source software. Different technologies can easily come together to create a cohesive whole and give students a more complete view of how their work can be used to real effect in an application.

For example, a university that teaches Python and uses Jupyter Notebooks could combine these tools in a single classroom setting. Professors can create a specialized section of the course that shows students not only how to work with Jupyter Notebooks, but also how to hand that work off to developers. They can also show how an application developer using Python can incorporate data models into an application. Students can even be taught the basics of how Python works without training to become application developers.
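As one hedged illustration of what such a classroom hand-off might look like, the sketch below splits the work into a "notebook side" that exports trained parameters to a file and an "application side" that loads them and scores new inputs. Everything here is an assumption made for teaching purposes: the `export_model` function, the `Scorer` class, the JSON artifact format, and the toy weights are hypothetical, and real projects often use dedicated model formats instead.

```python
import json
import os
import tempfile


# --- Notebook side (data scientist) ---
def export_model(weights, bias, path):
    """Write the trained parameters to a plain JSON artifact."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"weights": weights, "bias": bias}, f)


# --- Application side (developer) ---
class Scorer:
    """Loads the artifact once and exposes predict() to the application."""

    def __init__(self, path):
        with open(path, encoding="utf-8") as f:
            params = json.load(f)
        self.weights = params["weights"]
        self.bias = params["bias"]

    def predict(self, features):
        # Reject malformed input early instead of returning a wrong score.
        if len(features) != len(self.weights):
            raise ValueError("expected %d features" % len(self.weights))
        return sum(w * x for w, x in zip(self.weights, features)) + self.bias


# Hypothetical end-to-end run: export in one place, load and score in another.
artifact = os.path.join(tempfile.mkdtemp(), "model.json")
export_model([0.5, 2.0], 1.0, artifact)

scorer = Scorer(artifact)
score = scorer.predict([4.0, 3.0])  # 0.5 * 4.0 + 2.0 * 3.0 + 1.0
print(score)
```

The point of the exercise is the boundary between the two halves: the data scientist and the developer only need to agree on the artifact's format, which is a conversation students rarely get to practice in a theory-only course.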

Essentially, colleges and universities can apply the principles of both science and engineering in a single class. Students can learn how to experiment with their models and how to put those models into production, taking them from concept to implementation.

Fill the skills gap

Competition among businesses to find talented data scientists shows no sign of slowing down. According to EY, organizations are still struggling to fill data-centric roles due to ineffective upskilling programs, talent shortages and more. Even powerhouse organizations like NASA are struggling to find the right people for the right data science roles.

The easiest and fastest way to fill this growing skills gap is for colleges and universities to broaden the scope of some of their current courses. They should consider combining their software engineering and operations lectures with their existing data science offerings. This will give students a more holistic and useful perspective that better prepares them for what lies ahead, while also giving businesses the talent they are looking for.

Guillaume Moutier is a senior principal data engineering architect in Red Hat’s Cloud Storage and Data Services group, focusing his work on data services, AI/ML workloads, and data science platforms. Having served as project manager, architect and CTO for large organizations, he is constantly seeking and driving new and innovative solutions, always with a focus on usability and business alignment, informed by 20 years of IT architecture and management experience.


