Data Science Assessment: how to analyse a project’s viability

  • Data science is a trendy field, since it applies new techniques, methods and technologies to old problems, to unlock your data’s full potential and uncover hidden patterns.
  • To ensure the success of any data science project, you must carry out a viability analysis right at the start. Each company has its requirements, timelines, ways of accessing data, specifications, etc. For this reason, you can’t start a project without understanding your organisation’s current status and the direction it wants to take.

The significance of data is widely acknowledged across various sectors, and leveraging it represents a pivotal factor that can confer a substantial competitive advantage or even transform the fundamental nature of a business. The field of data science, being contemporary in its approach, employs innovative techniques, methods, and technologies to address long-standing challenges, unlocking the complete potential of data and revealing latent patterns.

To ensure the success of data science projects, a thorough viability analysis must precede any other stage. Each company has unique requirements, timelines, methods for accessing data, and data specifications. Consequently, starting a project requires a comprehensive understanding of the current state of the organisation and its intended direction. This involves aligning with the existing state of our customers' platforms and addressing their needs, expectations, and requirements, with a primary focus on effective problem-solving.

Data Science Assessment

For this purpose, you must first evaluate (i) the level of maturity in data collection and processing, (ii) the volumes of data and scaling requirements and (iii) the technologies utilised within the organisation:

  • Maturity of data collection and processing – This involves assessing whether the organisation possesses well-established capabilities for collecting and storing data to gain the intended insights. Generally, there are three levels here: (i) the entry level, where data collection and processing enhancements are required to develop suitable data science tools; (ii) the intermediate level, where the client already has mechanisms for data collection, storage and processing ready to progress towards the data science domain; and (iii) the advanced level, where the organisation already incorporates some data science processes within its respective business areas.
  • Volume and scale needs – Here we ascertain where the data is stored, how much of it there is, and how well that storage can scale.
  • Technologies used in the organisation – Understanding the organisation’s technological infrastructure is crucial for working out the best solution for each situation in collaboration with our client. This involves capitalising on the advantages and mitigating the disadvantages of the technologies involved, depending on the circumstances.
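
As an illustration only, the three maturity levels described above could be recorded during an assessment roughly as follows. This is a minimal sketch, not a description of our actual tooling: the names `Maturity`, `Assessment` and `maturity_level`, and the three yes/no questions, are hypothetical simplifications of what is in practice a qualitative judgement.

```python
from dataclasses import dataclass
from enum import Enum


class Maturity(Enum):
    ENTRY = 1         # collection and processing need improving first
    INTERMEDIATE = 2  # collection, storage and processing are in place
    ADVANCED = 3      # data science already runs in some business areas


@dataclass
class Assessment:
    # Hypothetical yes/no answers gathered during the assessment.
    collects_and_stores_data: bool
    processes_data: bool
    uses_data_science: bool


def maturity_level(a: Assessment) -> Maturity:
    """Map the answers to one of the three levels described above."""
    if a.collects_and_stores_data and a.processes_data:
        if a.uses_data_science:
            return Maturity.ADVANCED
        return Maturity.INTERMEDIATE
    # Without reliable collection and processing, start at entry level.
    return Maturity.ENTRY


level = maturity_level(Assessment(True, True, False))
print(level.name)
```

A real assessment would weigh many more factors, including the volume, scale and technology points listed above; the sketch only shows how the three levels relate to each other.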

Along with the technological component, the business component is a pivotal driver of success. It typically acts as the bridge between the current state (AS IS) and the envisioned future state (TO BE). Understanding the business context matters at various stages of the process. During the initial phase, it means knowing the challenges and factors that directly or indirectly affect the outcomes; depending on the project type, these may involve legal considerations, process flows, sector-specific nuances, technical terminology, and more. In the data analysis phase, business context can help with identifying relationships between features or evaluating patterns. The correct alignment of technology and business is paramount, not only for the aforementioned reasons but also to streamline comprehension in subsequent phases.

To gain a deeper insight into the next phase of our journey, we must consider key attributes, including:

  • Quality and quantity of data – The data available plays a pivotal role in delineating not only the challenges that can be addressed but also the processes and efforts required for the necessary developments tailored to each client. Additionally, it prompts the question of whether it is feasible to collect, store and process additional data beyond the current scope.
  • How the model will be “fed” – It is vital to know the origin of data needed for training, along with the triggers and queries directed towards the model. For instance, determining whether queries originate from the front end, back end or both.
  • Context of model usage – The intended use for the model holds significant sway in project definition. Factors such as the frequency of requests, real-time versus batch queries, and the acceptability of a potential five-minute waiting time for results become crucial considerations.
  • Nature of the problem – The challenges to overcome may require the development of a model, the establishment of a monitoring and maintenance process, or the extraction of relevant information using data mining techniques, such as discovering relationships between variables or patterns in the data.
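
A first look at the quality and quantity of the data can be as simple as counting records and measuring how often each field is empty. The snippet below is a hedged sketch using only the Python standard library; the function name `profile`, the sample records and the choice to treat `None` and empty strings as missing are assumptions made for illustration.

```python
def profile(rows: list[dict]) -> dict:
    """Return the number of records and the share of missing values per field."""
    total = len(rows)
    # Collect every field name that appears in at least one record.
    fields = sorted({key for row in rows for key in row})
    missing_ratio = {}
    for field in fields:
        # A value counts as missing if it is absent, None or an empty string.
        missing = sum(1 for row in rows if row.get(field) in (None, ""))
        missing_ratio[field] = missing / total
    return {"rows": total, "missing_ratio": missing_ratio}


# Hypothetical sample records, for illustration only.
sample = [
    {"age": 34, "city": "Lisbon"},
    {"age": None, "city": "Porto"},
    {"age": 41},
]
report = profile(sample)
print(report["rows"], report["missing_ratio"])
```

Numbers like these do not settle viability on their own, but they help frame the questions above: whether there is enough data, and whether collecting or cleaning more is feasible.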

Each of these specifics, particularly the structure and respective content of the data, imparts a unique character to every challenge. Consequently, uncertainty emerges as a constant factor in every project, notably in the initial phase where the intricacies of the data and the existing organisational reality are not yet fully understood.

Our Data Science process

Our team is well-prepared to address the most intricate challenges. We have developed a method to combat the inherent uncertainty of data science projects, while ensuring progress within the defined scope, and delivering substantial value to our client, underpinned by agile methodologies.

  1. We make a comprehensive analysis of project viability, examining it from business and technical perspectives, and meticulously defining success criteria.
  2. We construct and compare diverse models, identifying the one that aligns most closely with the established criteria.
  3. We use the insights gathered to meticulously plan and execute the deployment and monitoring of our solution in a production environment, recognising the ongoing need for constant monitoring and care in maintaining a data science solution.
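
To make the second step concrete, comparing candidate models against agreed success criteria might look like the sketch below. It is an assumption-laden illustration rather than our actual process: the `Candidate` fields, the two criteria (a minimum score and a maximum waiting time) and all the numbers are hypothetical, with the 300-second limit echoing the five-minute waiting time mentioned earlier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    name: str
    score: float   # evaluation score on held-out data, higher is better
    wait_s: float  # time to return a result, in seconds, lower is better


def select_model(candidates: list[Candidate],
                 min_score: float,
                 max_wait_s: float) -> Optional[Candidate]:
    """Return the best-scoring candidate that meets both criteria, if any."""
    eligible = [
        c for c in candidates
        if c.score >= min_score and c.wait_s <= max_wait_s
    ]
    # default=None covers the case where no candidate meets the criteria.
    return max(eligible, key=lambda c: c.score, default=None)


# Hypothetical candidates and thresholds, for illustration only.
candidates = [
    Candidate("baseline", score=0.71, wait_s=2.0),
    Candidate("gradient_boosting", score=0.84, wait_s=45.0),
    Candidate("large_ensemble", score=0.88, wait_s=900.0),
]
chosen = select_model(candidates, min_score=0.80, max_wait_s=300.0)
print(chosen.name if chosen else "no candidate meets the criteria")
```

The point of the design is that the highest score does not win automatically: a model that breaks an agreed constraint is excluded before scores are compared.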

Our unwavering commitment to continuous improvement has propelled Xpand IT to receive the esteemed “Microsoft Partner of the Year Award” for two years. This accomplishment solidifies our status as recognised experts in Microsoft tools.

Tiago Monteiro

