Data Science Assessment: how to analyse a project’s viability

5 SECOND SUMMARY:
  • Data science is a trendy field, since it applies new techniques, methods and technologies to old problems, unlocking the full potential of the data and uncovering hidden patterns
  • To ensure the success of Data Science (DS) projects, it is necessary to carry out a viability analysis at the beginning. Each company has its own requirements, timelines, methods of accessing data, specifications, etc. For that reason, we cannot start a project without understanding the organisation’s current state and the direction it wants to follow. Discover how to analyse a project’s viability through our Data Science Assessment.

The value of data is recognised across all sectors, and exploiting it is a differentiating factor that can deliver serious competitive advantage or even change the nature of a business. Data science is a trendy field, since it applies new techniques, methods and technologies to old problems, unlocking the full potential of data and uncovering hidden patterns.

To ensure the success of Data Science (DS) projects, a viability analysis must first be carried out (a Data Science Assessment). Each company has specific requirements, timelines, methods of accessing data, data specifications, etc. For this reason, we cannot start a project without understanding the current state of the organisation and the direction it wants to follow. This implies aligning with the current state of our customers’ platforms, as well as with their needs, expectations and requirements, keeping problem-solving as our main focus.

Data Science Assessment

For this reason, it will be necessary to assess (i) the degree of maturity of data collection and processing, (ii) data volumes and scale needs, and (iii) the technologies used in the organisation:

Maturity of data collection and processing – Understanding whether the organisation has well-developed data collection and storage capabilities to generate the intended knowledge from the data. Generally, three levels can be identified: (i) the entry level, where data collection and processing still need to be improved before DS tools can be developed; (ii) the intermediate level, where the client already has mechanisms for data collection, storage and processing that allow developments in the DS area; and (iii) the advanced level, where the organisation already has some DS processes in place in the respective business area.

Volume and scale needs – When assessing scale, we determine where the data is stored and how it needs to scale (e.g., will we work with distributed computing or locally?).
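
As a rough, hypothetical illustration of this question, the sketch below checks the on-disk size of a raw data folder and routes the work either to pandas on a single machine or to a distributed engine such as PySpark. The folder path, the 5 GB threshold and the Parquet format are assumptions made for the example, not recommendations.

    # Illustrative only: decide between local (pandas) and distributed (PySpark)
    # processing based on the size of the raw data. The path and the 5 GB
    # threshold are hypothetical assumptions.
    import os

    RAW_DATA_DIR = "/data/raw/events"       # hypothetical location of the raw files
    LOCAL_LIMIT_BYTES = 5 * 1024 ** 3       # assume ~5 GB still fits comfortably in memory

    def total_size_bytes(path: str) -> int:
        """Sum the size of every file under `path`."""
        return sum(
            os.path.getsize(os.path.join(root, name))
            for root, _, names in os.walk(path)
            for name in names
        )

    if total_size_bytes(RAW_DATA_DIR) <= LOCAL_LIMIT_BYTES:
        # Small enough: a single machine with pandas is simpler and cheaper.
        import pandas as pd
        frames = [
            pd.read_parquet(os.path.join(RAW_DATA_DIR, f))
            for f in os.listdir(RAW_DATA_DIR) if f.endswith(".parquet")
        ]
        df = pd.concat(frames, ignore_index=True)
        print(f"Loaded {len(df)} rows locally with pandas")
    else:
        # Larger volumes: fall back to a distributed engine (PySpark here).
        from pyspark.sql import SparkSession
        spark = SparkSession.builder.appName("scale-assessment").getOrCreate()
        df = spark.read.parquet(RAW_DATA_DIR)
        print(f"Loaded {df.count()} rows with Spark")

In a real assessment, the threshold depends on the available memory, the transformations planned and any cluster already in place, which is exactly what this phase of the analysis tries to clarify.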

Technologies used in the organisation – It is important to understand each organisation’s technological background so that, together with our customers, we can decide what the best solution for each situation is likely to be, taking advantage of the pros and minimising the cons of those technologies.

In addition to the technological component, the business component is also a key driver of success. It is typically the connector between the current state (AS IS) and the vision of the future (TO BE). Understanding the business context is of major importance at different stages of this process. In the initial phase, it means understanding the challenges and the elements that directly or indirectly affect the results; depending on the type of project, these can include legal concerns, process flows, sector specificities, technical language, etc. In the data analysis phase, this business context can make it easier to identify relationships between features or to evaluate whether patterns exist. The correct alignment of technology and business is essential, not only for the reasons above but also to make the following phases easier to understand.

To better understand the next point on our journey, it is important to consider characteristics such as:

Quality and quantity of data – The available data will be critical in defining not only the problems that can be solved, but also the processes and effort required for the developments each client needs. We should also ask whether additional data can be collected, stored and processed beyond what is currently available.
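
To make “quality and quantity” more concrete, a first pass over a dataset often looks like the minimal pandas profile sketched below; the file name and the idea of a single CSV are assumptions made purely for illustration.

    # Minimal data-quality and data-quantity profile with pandas (illustrative only;
    # "customers.csv" is a hypothetical file).
    import pandas as pd

    df = pd.read_csv("customers.csv")

    profile = {
        "rows": len(df),
        "columns": df.shape[1],
        "duplicate_rows": int(df.duplicated().sum()),
        "missing_per_column": df.isna().sum().to_dict(),
        "overall_missing_ratio": float(df.isna().mean().mean()),  # share of missing cells
        "memory_mb": round(df.memory_usage(deep=True).sum() / 1024 ** 2, 1),
    }

    for key, value in profile.items():
        print(f"{key}: {value}")

Figures such as the share of missing cells or the number of duplicate rows feed directly into the estimate of the processes and effort mentioned above.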

How the model will be “fed” – Understanding the origin of the data required for training, as well as the triggers and queries made to the model (e.g., do such queries come from the front end, the back end or both?).
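
As one hypothetical example of a feeding pattern in which queries arrive from a back-end service, the sketch below wraps a previously trained model in a small FastAPI endpoint; the model file, the feature names and the /predict route are all made up for the example.

    # Illustrative sketch of a model queried over HTTP (e.g. by a back-end service).
    # "model.joblib", the feature names and the /predict route are hypothetical.
    import joblib
    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()
    model = joblib.load("model.joblib")  # trained offline on historical data

    class Features(BaseModel):
        # Input schema the caller must respect; fields are made up for the example.
        tenure_months: float
        monthly_spend: float

    @app.post("/predict")
    def predict(features: Features) -> dict:
        """Score one record per request (a real-time feeding pattern)."""
        score = model.predict([[features.tenure_months, features.monthly_spend]])[0]
        return {"prediction": float(score)}

    # Run with: uvicorn service:app --reload  (assuming this file is saved as service.py)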

Context of model usage – How the model will later be used is of great importance for the definition of the project (e.g., what is the frequency of requests, and are queries made in real time or in batch? Is a five-minute wait acceptable to get the result, or will it have to be delivered in real time?).
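
To contrast with a real-time endpoint, a batch usage context might look like the hypothetical sketch below: a scheduled job that scores a whole table in one run, where waiting a few minutes for the result is perfectly acceptable. File names and column names are again assumptions.

    # Illustrative batch-scoring job: score all pending records in one run
    # (e.g. scheduled nightly). File and column names are hypothetical.
    import joblib
    import pandas as pd

    model = joblib.load("model.joblib")

    # Load everything that still needs a prediction.
    batch = pd.read_parquet("pending_customers.parquet")

    # Score the whole table at once; per-record latency is irrelevant here.
    batch["prediction"] = model.predict(batch[["tenure_months", "monthly_spend"]])

    batch.to_parquet("scored_customers.parquet", index=False)
    print(f"Scored {len(batch)} records in batch")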

Nature of the problem – The challenges requested may require the development of a model, the preparation of a monitoring and maintenance process, or the extraction of relevant information (data mining), such as relationships between variables or patterns in the data.

All these specificities, in particular the structure and content of the data, make each challenge unique. For this reason, uncertainty is a constant factor in every project, especially in the initial phase, in which the characteristics of the data and the reality of the organisation are not yet known in depth.

Our Data Science process

With all this in mind, our team is ready to help solve any complex challenge. We have defined a process, supported by agile methodologies, to mitigate the natural uncertainty of Data Science projects while ensuring that we move forward within the scope of the project and add value for the company we are working with.

  1. We analyse project viability from business and technical perspectives and define the success criteria.
  2. Then we build and compare different models, finding the one that best meets these criteria.
  3. Finally, we use all the insights we have collected to plan the deployment and monitoring of our solution in production, because we know that a DS solution needs constant monitoring and care.
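
As a small, hypothetical illustration of the monitoring mentioned in step 3, the sketch below compares recent prediction scores against the scores seen at training time using the population stability index (PSI); the score samples are simulated and the thresholds in the comment are a common rule of thumb, not a fixed rule.

    # Hypothetical monitoring check: compare recent prediction scores against the
    # training-time distribution using the population stability index (PSI).
    import numpy as np

    def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
        """PSI = sum((p - q) * ln(p / q)) over shared bins of the two samples."""
        edges = np.histogram_bin_edges(expected, bins=bins)
        p, _ = np.histogram(expected, bins=edges)
        q, _ = np.histogram(actual, bins=edges)
        # Convert counts to proportions, avoiding zeros before taking the log.
        p = np.clip(p / p.sum(), 1e-6, None)
        q = np.clip(q / q.sum(), 1e-6, None)
        return float(np.sum((p - q) * np.log(p / q)))

    # Simulated scores standing in for training-time and recent production traffic.
    train_scores = np.random.default_rng(0).beta(2, 5, size=10_000)
    recent_scores = np.random.default_rng(1).beta(2, 4, size=2_000)

    drift = psi(train_scores, recent_scores)
    # Common rule of thumb: < 0.1 stable, 0.1–0.25 investigate, > 0.25 act.
    print(f"PSI = {drift:.3f}")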

This consistent process of improvement led Xpand IT to receive the prestigious “Microsoft Partner of the Year Award” for the second consecutive year in 2022, positioning us as verified experts in Microsoft tools.
