Short Term Consultant - Data Scientist

Digital Entrepreneurship Program, World Bank Group Innovation & Entrepreneurship Unit





The World Bank Group's Innovation and Entrepreneurship Unit is hiring one or two data scientists with experience and strong skills in data analysis, data presentation, data management, and development of data-driven applications.

Appointment Type: Short Term Consultant (STC)
Specialization: Data Scientist
Unit: Trade and Competitiveness, World Bank Group
Location: Washington, D.C.
Application Deadline: April 11, 2016
Appointment End Date: June 30, 2016 (unless extended further)

To apply, please send a cover note and your CV to Alberto Sanchez Rodelgo at


Recent decades have seen a massive expansion of global trade and investment, generating new opportunities for private sector job creation and growth, and transforming the development landscape for World Bank Group clients. The joint Trade and Competitiveness (T&C) Global Practice — combining teams from the World Bank, IFC, and MIGA — provides a comprehensive array of solutions to support the efforts of client governments to integrate into the global economy and develop competitive private sectors. The Global Practice works to increase international trade and investment and boost competitiveness not as an end in itself, but as a means of achieving the World Bank Group goals of ending extreme poverty and increasing shared prosperity. This involves working closely with public and private sectors, a wide range of global partners, and across the World Bank Group.

T&C brings together more than 500 leading technical experts in the fields of trade, investment, innovation, and private sector development, with extensive policy expertise, sector-specific knowledge, and practical experience in implementation. With half of its staff global specialists, and the other half serving in the regions, T&C has a wide footprint across the globe. Offering an integrated package of solutions (policy advice, technical assistance, financing, and capacity building), the practice brings global knowledge to designing and implementing projects tailored to the specific needs of client countries, whether fast-growing emerging economies, middle-income countries, or fragile or conflict-affected states.

A newly formulated T&C Roadmap recognizes the crucial role of knowledge and integrated solutions in meeting client needs as well as the twin goals of the WBG. To make it easier for internal and external users to access, use, and share trade- and competitiveness-related data and indicators, T&C proposes to create different services that will aggregate and visualize data from multiple sources and present it in combination with other knowledge and social resources.

Duties and Responsibilities

T&C is looking to hire 1 or 2 data scientists with experience and strong skills in data analysis, data presentation, data management, and development of data-driven applications. The candidate must have strong experience developing practical applications and in analyzing data from various sources that vary in scale and complexity.

There are currently 2 primary assignments, which may be carried out by one consultant or split between 2 consultants. The hired consultant(s) may also be assigned additional tasks.

Assignment 1 (Machine Learnt Investment Reform Database)

The Investment Climate unit of the World Bank Group’s Trade and Competitiveness Global Practice (GTCIC) works with developing country governments on improving their business environments for domestic and foreign investors. As part of its research program, the unit is seeking to track and benchmark changes in countries' laws, regulations and administrative procedures concerning business activity and investment. It is seeking to update selected aspects of an online database, Investing Across Borders, through machine learning and web scraping methods.

The objective of the assignment is to utilize web scraping and machine learning to identify and track reforms in the client countries' investment climates. This would include identification of new legislation posted online, availability of new business support programs, or institutional reforms communicated through local media.

The initial assignment will focus on 3-5 pilot countries. Specific tasks could include:

  • Identification of tax and non-tax incentives awarded to investors
  • Detection of investor disputes and associated resolutions through mediation/arbitration
  • Identification of new foreign investment projects, especially in least developed countries where such data is not tracked by available global databases
  • Monitoring of new laws / regulations / directives concerning investment climate
  • Others

Assignment 2 (Digital Entrepreneurship Ecosystem Diagnostics Toolkit)

The Innovation & Entrepreneurship (I&E) Unit within the World Bank Group’s Trade & Competitiveness Global Practice focuses on improving the capabilities and growth prospects of firms, developing the ecosystems that directly support them, and developing the broader innovation and entrepreneurship systems.

I&E is developing a customizable Digital Entrepreneurship Ecosystem Diagnostics Toolkit to assist clients with strengthening their digital entrepreneurship ecosystems. The Toolkit will identify:

  • The opportunity for the startup and growth of innovative digital technology enterprises; and
  • Activities and policies that could support digital technology entrepreneurs in a particular location.

The objective of the assignment is to build a customizable, intuitive online tool (available initially to internal staff only), and to manage all the data that will be published/surfaced through that tool, based on an existing inventory of indicators. In addition, the data scientist will work on related, similar projects, help identify new data publishing opportunities, manage data related sub-projects, and other tasks as needed.

Scope of work/list of deliverables

The consultant(s) will produce the following:

For the Machine Learnt Investment Reform Database

The initial assignment is to use machine learning techniques in 3-5 pilot countries to identify relevant legal, regulatory and administrative information on countries' investment climates. The tasks will include:

  • Create tools to pull data from multiple web sources (some, but not all, data may be available through APIs)
  • Read multiple documents with legal or regulatory information
  • Scan for certain keywords such as investment, restrictions, and incentives, and identify the section header, the related text, and any other sections that the particular section refers to
  • In the identified sections search for country names and specific pre-identified terms
  • Present the information in a template form
  • In some cases the documents may be a scanned image and OCR techniques will have to be used to convert them into text, before analysis
  • Organize, curate, and harmonize the data in a database
  • Develop tools to query the data and publish the outcome through a web based interface
  • Write data validation routines to validate data
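
For illustration only, the keyword-scanning tasks above could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not a prescribed approach: the heading pattern ("Section N" or "Article N" at the start of a line), the country list, the sample text, and the output fields are all hypothetical, and real legal documents will need per-source rules.

```python
import re

# Example keywords named in the task list; the country list is hypothetical.
KEYWORDS = ("investment", "restrictions", "incentives")
COUNTRIES = ("Kenya", "Vietnam")

# Assumed heading convention: a line starting with "Section N" or "Article N".
HEADER_RE = re.compile(r"^(?:Section|Article)\s+\d+\b.*$", re.MULTILINE)
# Assumed cross-reference convention inside the text, e.g. "subject to Section 3".
XREF_RE = re.compile(r"\b(?:Section|Article)\s+\d+\b")


def split_sections(text):
    """Return (header, body) pairs, one per heading found in the text."""
    matches = list(HEADER_RE.finditer(text))
    sections = []
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections.append((match.group().strip(), text[match.end():end].strip()))
    return sections


def scan(text):
    """Build one template record for each section that mentions a keyword."""
    records = []
    for header, body in split_sections(text):
        haystack = header + " " + body
        found = [k for k in KEYWORDS
                 if re.search(rf"\b{re.escape(k)}\b", haystack, re.IGNORECASE)]
        if not found:
            continue  # section is not relevant to the investment climate
        records.append({
            "header": header,
            "keywords": found,
            "countries": [c for c in COUNTRIES if c in body],
            "references": sorted(set(XREF_RE.findall(body))),
            "text": body,
        })
    return records


# Invented sample text, used only to exercise the sketch.
SAMPLE = """Section 1 General provisions
This law applies to all enterprises.
Section 2 Investment incentives
Investors from Kenya receive tax incentives, subject to Section 3.
Section 3 Restrictions
Foreign ownership restrictions apply in mining.
"""

records = scan(SAMPLE)
for record in records:
    print(record["header"], record["keywords"], record["references"])
```

Scanned documents would first need an OCR step to produce the text, and the resulting records would then be loaded into the database and checked by the validation routines.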

For the Digital Entrepreneurship Ecosystem Diagnostics Toolkit

  • Indicators inventory
  • Data sources inventory
  • Relevant data visualization
  • Data management/maintenance
  • Relevant scripts to gather/maintain/publish data
  • Data validation scripts
  • Data standardization practices
  • Database documentation
  • Indicators website

Selection Criteria

  • Master’s degree in a relevant discipline, plus a minimum of 3 years relevant work experience; or a Bachelor's degree in a relevant discipline and 5 years of work experience
  • Experience in various aspects of data science: a creative problem solver with a strong understanding of statistics and research design
  • Good knowledge of the concepts of, and experience with, Natural Language Processing techniques such as word2vec and LDA
  • Familiarity with digital technology and entrepreneurship data a plus
  • Excellent working knowledge of either R or Python; knowledge of Java and C/C++ advantageous
  • Familiarity with Shiny desirable – we currently use Shiny to develop similar applications
  • Working knowledge of SQL and NoSQL systems and related approaches to large-scale computing using cloud-based technologies
  • Working knowledge of web-based interactive visualization tools such as D3 and Tableau
  • Ability to work with unstructured data and perform data and text mining desirable
  • Experience and proficiency working with public statistics and data management. Prior involvement in open data and open government communities is an advantage
  • Expertise with the use of APIs and web development standards
  • Experience in Hadoop, Pig and Hive is an advantage
  • Excellent writing and editing skills, with a strong command of English and an ability to convey complex ideas in a clear, direct, and lively style
  • Ability to use strong interpersonal and teamwork skills to cultivate effective, productive client relationships and partnerships and generate excitement around data
  • Ability to solve problems, with strong investigative and research skills
  • Sensitivity to a multicultural environment


The work will be conducted from World Bank HQ in Washington, D.C.


The initial assignment is through June 30, with the possibility of extension afterwards.
