Data science skill areas and competencies

Skill Areas

Skill Competencies

Data Privacy and Stewardship

  1. Assess risks and enact data protection policies and procedures.
  2. Ensure safe and secure management of sensitive data, models and infrastructures.
  3. Apply appropriate data controls, such as encryption, (pseudo) anonymisation, and synthetic data.
  4. Manage risks around environment and infrastructure.
  1. Act with integrity, giving due regard to legal and regulatory requirements.
  2. Be aware of the actions that should be taken to respond to potential data loss in line with organisational, legal and regulatory procedures.
  1. Incorporate the FAIR Guiding Principles for scientific data management and stewardship into practices, where appropriate and practicable.
  2. Identify opportunities for efficient and creative reuse of data including facilitation.
  3. Understand the relationship between technical standards and regulation/governance, and their benefits for interoperability and knowledge sharing.
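
As an illustration of the data controls named above (pseudonymisation in particular), the following is a minimal sketch, assuming a keyed hash (HMAC-SHA256) is an acceptable way to replace direct identifiers. The record layout, the field names `patient_id` and `patient_ref`, and the key value are hypothetical; in practice the key would be held in a secrets manager and handled under the organisation's own policies.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Return a stable pseudonym for an identifier using HMAC-SHA256."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical key and records, for illustration only.
SECRET_KEY = b"replace-with-a-key-from-a-secrets-manager"
records = [
    {"patient_id": "A123", "age": 42},
    {"patient_id": "B456", "age": 37},
    {"patient_id": "A123", "age": 43},
]

# Replace the direct identifier with its pseudonym; other fields are kept.
pseudonymised = [
    {"patient_ref": pseudonymise(r["patient_id"], SECRET_KEY), "age": r["age"]}
    for r in records
]
```

The same identifier always maps to the same pseudonym, so records can still be linked, while reversing the mapping requires the key. Pseudonymised data may still count as personal data under some regulations, so this is one control among several rather than full anonymisation.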

Definition, Acquisition, Engineering, Architecture, Storage and Curation

  1. Source and access data appropriate for the problem.
  2. Critically analyse the availability of appropriate data and resources to meet project requirements.
  3. Critically evaluate and synthesise data.
  4. Ensure data governance processes are followed.
  5. Identify data characteristics (volume, velocity and variety).
  6. Identify infrastructure requirements for data storage and analysis.
  7. Show familiarity or experience with tabular and non-tabular data (e.g. unstructured and streaming data).
  1. Source and access data appropriate for the problem.
  2. Construct data sets, potentially drawing from multiple disparate sources using data linkage.
  3. Perform data profiling and characterisation to understand the surface properties of the data.
  4. Handle missing data, through principled inclusion/exclusion criteria and/or imputation methods.
  5. Take a systematic approach to data curation and the application of data quality controls.
  6. Identify the most appropriate solutions (e.g. cloud vs on-premise) in response to business and project needs.
  1. Plan the deployment of data products with their end-users.
  2. Develop monitoring and maintenance processes.
  3. Deliver secure, stable and scalable data products to meet the needs of the organisation, such as Application Programming Interfaces (APIs), derivative datasets, dashboards and reports, following modern software development best practices.
  4. Design and deliver data products that meet appropriate accessibility standards for their users.
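
The data profiling and missing-data competencies in this skill area can be sketched as follows. This is a minimal example using only the Python standard library, assuming missing values are represented as `None`; the rows, the column names `site` and `reading`, and the choice of median imputation are hypothetical and stand in for whatever inclusion/exclusion criteria or imputation method a project actually justifies.

```python
from statistics import median

# Hypothetical rows; None marks a missing value.
rows = [
    {"site": "north", "reading": 4.0},
    {"site": "south", "reading": None},
    {"site": "north", "reading": 6.0},
    {"site": None, "reading": 5.0},
]


def profile_missing(rows, columns):
    """Count missing (None) values per column."""
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}


def drop_missing(rows, column):
    """Exclusion criterion: keep only rows where `column` is present."""
    return [row for row in rows if row[column] is not None]


def impute_median(rows, column):
    """Return new rows with missing values in `column` set to the observed median."""
    observed = [row[column] for row in rows if row[column] is not None]
    fill = median(observed)
    return [
        {**row, column: fill if row[column] is None else row[column]}
        for row in rows
    ]


# Profile first, then exclude rows without a site and impute missing readings.
missing_counts = profile_missing(rows, ["site", "reading"])
cleaned = impute_median(drop_missing(rows, "site"), "reading")
```

Profiling before cleaning keeps a record of how much data was missing, which helps when documenting the effect of the exclusion and imputation choices.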

Problem Definition and Communication with Stakeholders

  1. Identify and elicit project requirements.
  2. Determine success criteria and frame these in the context of the business.
  3. Clearly articulate the problem statement.
  4. Identify and critically evaluate assumptions.
  5. Recognise and quantify biases and identify solutions to manage and mitigate these.
  6. Assess risk.
  7. Demonstrate sector/domain knowledge and/or knowledge of how data science can deliver value to these sectors/domains.
  1. Communicate in an effective manner for diverse audiences, including technical colleagues, subject matter experts and leadership.
  2. Effectively manage the expectations of diverse stakeholders with conflicting priorities to mediate equitable solutions.
  3. Use relevant communication techniques (written, oral or visual), appropriate for the audience.
  4. Build appropriate and effective business relationships.
  5. Show experience in human factors considerations with respect to data-driven solutions.

Problem Solving, Analysis, Statistical Modelling, Visualisation

  1. Identify viable solutions based on requirements and data available.
  2. Identify and provide guidance to technical and non-technical stakeholders on the most appropriate solution.
  3. Apply technical and project management methodologies appropriate for the organisation and project.
  1. Identify appropriate solutions, including statistical and machine learning approaches.
  2. Identify and evaluate appropriate evaluation metrics, including computational performance and accuracy.
  3. Manipulate data with due regard for differences in characteristics.
  4. Create and evaluate new data features.
  1. Apply appropriate solutions, including statistical and machine learning approaches. Demonstrate competence in a modern programming language.
  2. Use appropriate analysis platforms and tools.
  3. Adopt a systematic approach to exploratory data analysis to embrace and manage ambiguity and uncertainty.
  4. Critically analyse data and analytical results.
  5. Adopt appropriate methods to visualise data and communicate complex findings.

Evaluation and Reflection

  1. Monitor project performance and outcomes on an ongoing basis.
  2. Identify and feed forward lessons learned.
  3. Participate in and lead collaborative project evaluations, e.g. retrospectives.
  1. Identify and manage the risks of erroneous and biased data.
  2. Act with integrity with respect to legal and regulatory requirements.
  3. Uphold principles of ethical and safe use of data and AI technologies.
  4. Implement data use procedures to ensure sensitive data is only used for its agreed purpose.
  5. Implement data retention strategies in line with regulatory and legal requirements.
  1. Incorporate the principles of open science and/or reproducible research within the organisation, and perhaps beyond.
  2. Demonstrate competence in programmatic approaches to undertaking data science work.
  3. Apply the scientific method in delivering solutions.
  4. Ensure high technical standards, in line with software development best practices; for example, software testing, version control, Continuous Integration and Continuous Delivery.
  5. Apply automation to promote reproducible analyses.
  1. Learn from experience through self-assessment of one’s own responses to practice situations.
  2. Identify learning opportunities to maintain knowledge and skills in the relevant area of data science.
  3. Take ownership of ongoing professional development.
  4. Contribute to knowledge-sharing across their organisation and/or the wider community.
  5. Contribute to the management and empowerment of the broader team.
  6. Engage with the latest developments across industry and academia and incorporate these into solutions.

Recommended learner pathway

No previous knowledge
Data Custodian & Domain Researcher
Data Scientist
Domain Data Research Scientist
Senior Domain Data Research Scientist
