JOB OPPORTUNITY: Data Scientist
Tanzania Data Lab (dLab) was established strategically to sustain the work and impact of the Data Collaboratives for Local Impact (DCLI) program in Tanzania. We envision an Africa where data is frequently and effectively used to inform policy and decision-making at all levels. Our organization’s mission is to strengthen data ecosystems and data usage in innovation, policy and decision-making on health, economic growth, and gender in Tanzania and Africa.
Reports to: Project Manager
Duty station: DLab/TCC
To use analytical, statistical, and programming skills to collect, analyse, and interpret the large data sets that will be collected throughout the project.
Duties and Responsibilities
- To support the project leader in the implementation of the project
- Work with stakeholders throughout the organization to identify opportunities for leveraging company data to drive useful business solutions.
- Mine and analyse data from council databases to drive optimization and improvement of decision-making.
- Assess the effectiveness and accuracy of new data sources and data gathering techniques.
- Develop custom data models and algorithms to apply to data sets
- Use predictive modelling to increase and optimize customer experiences, revenue generation, ad targeting and other business outcomes.
- Develop company A/B testing framework and test model quality.
- Coordinate with different functional teams to implement models and monitor outcomes.
- Develop processes and tools to monitor and analyse model performance and data accuracy for the council
Qualification and Experience
A bachelor’s degree in statistics, math, computer science, or economics. 2-4 years of experience manipulating data sets and building statistical models, and familiarity with the following software/tools:
- Coding knowledge and experience with several languages: C, C++, Java, etc.
- Knowledge and experience in statistical and data mining techniques: GLM/Regression, Random Forest, Boosting, Trees, text mining, social network analysis, etc.
- Experience querying databases and using statistical computer languages: R, Python, SQL, etc.
- Experience using web services: Redshift, S3, Spark, DigitalOcean, etc.
- Experience creating and using advanced machine learning algorithms and statistics: regression, simulation, scenario analysis, modeling, clustering, decision trees, neural networks, etc.
- Experience analyzing data from 3rd party providers: Google Analytics, Site Catalyst, Coremetrics, Adwords, Crimson Hexagon, Facebook Insights, etc.
- Experience with distributed data/computing tools: Map/Reduce, Hadoop, Hive, Spark, Gurobi, MySQL, etc.
- Experience visualizing/presenting data for stakeholders using: Periscope, Business Objects, D3, ggplot, etc.
Skills and Competences
- Statistics, machine learning, coding languages, databases, and reporting technologies.
- Excellent written and verbal communication skills for coordinating across teams.
- Strong problem solving skills with an emphasis on product development.
- Experience using statistical computer languages (R, Python, SQL, etc.) to manipulate data and draw insights from large data sets.
- Experience working with and creating data architectures.
- Knowledge of a variety of machine learning techniques (clustering, decision tree learning, artificial neural networks, etc.) and their real-world advantages/drawbacks.
- Knowledge of advanced statistical techniques and concepts (regression, properties of distributions, statistical tests and proper usage, etc.) and experience with applications.
- A drive to learn and master new technologies and techniques.
Send your resume and portfolio to firstname.lastname@example.org before 1 March 2021.