Glassdoor, one of the largest job advertisement websites, has identified Data Scientist as the top job in its listings for three consecutive years. However, while the demand for Data Scientists is on the rise, there is still a noticeable gap in understanding Data Science as a profession, let alone in having comprehensive training schemes for data scientists. To address this gap, the EDISON Project set out to contribute to understanding and building the data science profession by creating the EDISON Data Science Framework. In this interview, we share first-hand insights from Yuri Demchenko, Project Director and Senior Researcher at the System and Network Engineering Group, University of Amsterdam.

Can you briefly summarize the ideas behind the EDISON project? What are its main outcomes and its relevance to a data-fueled economy?

The EDISON project (2015-2017) focused on coordinating and supporting activities to foster the creation of the Data Science profession in Europe (and beyond), which involved interaction with multiple stakeholders from academia, universities, standardisation bodies and professional organisations. The main outcome of the EDISON project is the EDISON Data Science Framework (EDSF), which includes the following components:

  • Data Science Competence Framework (CF-DS),
  • Data Science Body of Knowledge (DS-BoK),
  • Data Science Model Curriculum (MC-DS), and
  • Data Science Professional Profiles (DSPP).

The EDSF provides a conceptual basis for the definition of the Data Science profession, targeted education and training, professional certification, organisational capacity building, organisational and individual skills management, and career transferability.

The definition of the Data Science Competence Framework (CF-DS) is the cornerstone of the whole EDISON framework. CF-DS provides a basis for the Data Science Body of Knowledge (DS-BoK) and Model Curriculum (MC-DS) definitions, and further for the definition and certification of the Data Science Professional Profiles.

The CF-DS incorporates many of the underpinning principles of the European e-Competence Framework (e-CF3.0) and provides suggestions for extending e-CF3.0 with Data Science related competences and skills. The CF-DS and DSPP have also adopted, and intend to comply with, the structure of the European ICT Professional Profiles and the European Skills, Competences, Occupations (ESCO) Framework. The corresponding information is provided in both the CF-DS and DSPP documents.

The presented Data Science Competence Framework definition is based on an analysis of existing frameworks for Data Science and ICT competences and skills, and is supported by an analysis of the demand for the Data Scientist profession in industry and research. The presented CF-DS Release 3 is extended with the skills and knowledge subjects/units related to competence groups. The document also refines the definition of Data Science workplace skills, which includes the Data Science professional skills (acting and thinking like a Data Scientist) and the general “soft” skills often referred to as 21st Century skills.

Currently, EDSF is maintained by the EDISON Community initiative (coordinated by the University of Amsterdam), with a working area on GitHub.

What are the target professional groups for the EDISON project implementation? Is there a variety of professional roles, domains and uses for which EDISON is applicable?

The Data Science Professional Profiles (DSPP) document defines the whole set of professional profiles related to Data Science, Data Management and Governance, and Data Stewardship. DSPP defines 22 profiles, ranging from desk and support workers who enter data, through Big Data infrastructure management and Data Science and Analytics professionals, to organisational management profiles (e.g. Chief Data Scientist, Chief Data Officer). The EDSF also provides a set of tools for defining other Data Science and Analytics (DSA) enabled professions in other science, industry and business domains and sectors.

How can EDISON be extended or adapted for particular or specific uses?

EDSF has a modular organisation, and all documents are extensible, with continuous work in progress and regular releases. Extensibility points are defined for each document:

• Data Science Competence Framework (CF-DS),
• Data Science Body of Knowledge (DS-BoK),
• Data Science Model Curriculum (MC-DS), and
• Data Science Professional Profiles (DSPP).

We are currently running a call for contributions to the next release, Release 4, with a deadline of 30 September 2019. Visit the EDISON community to read more.

Could you imagine EDISON (or a subset of it) as a basis for training entrepreneurs who seek to start up in the domains of Big Data and business analytics?

One of the tasks in the ongoing and future EDSF development is to define DSA (Data Science and Analytics) training profiles for managers of data-driven companies. Recent research and development has produced tools and methodologies for creating tailored curricula based on the required professional profiles and on the competence and skills gaps identified through individual or team benchmarking.

The EDSF also contains definitions of the Data Science workplace skills (also called transversal skills) and 21st Century skills, which are widely applicable to data-driven companies, Industry 4.0 and digital transformation.

Are there competences and skills in EDISON that are essential for the business aspects of data-intensive companies? For instance, are there skills that are essential for managers to understand their own capabilities, costs and business implications?

There are examples of EDSF being used in different domains. A number of currently running projects use EDSF for different research and business domains:

• ELIXIR, RItrain, CORBEL – Bioinformatics Research Infrastructure
• MATES – Digitalisation of Maritime Industry
• FAIRsFAIR – definition of the Data Stewardship curriculum

Many other projects are influenced by the EDISON methodology and the EDSF conceptual model. [In conclusion, the skills and competences from the EDISON Framework are applicable in a wide range of fields and are relevant for entrepreneurs who are starting out].

Can you briefly tell us about the future roadmap of EDISON?

The University of Amsterdam (UvA) team, the initial EDSF developer, will work as an interim coordinator and facilitator, with a view to creating a community-delegated coordination group that will oversee wider EDSF development and implementation.

Participation in the EDISON/EDSF initiative and Open Source project is open to any party that can contribute to the framework's development, implementation, promotion, and sponsoring or funding.

The GitHub project serves as a hub for all future activities on EDSF development, calls for contributions, and the search for new funding and/or sponsorship.

The content of the wiki will grow over time and will integrate the EDISON project legacy, including the DataSciencePro community portal.

Interviewed by: Miguel-Ángel Sicilia Urbán, The University of Alcala