About The Position
Zebra holds a peta-scale collection of rich medical data, including images and many other types of structured, semi-structured and unstructured data. These data assets fuel Zebra’s game-changing algorithmic products and are therefore the company’s most treasured asset.
We are looking for an outstanding senior analytical data scientist to take ownership of this data: exploring, understanding and documenting it, ensuring its quality, mapping it into a useful data model, and helping our algorithms learn from it. The role also includes developing insights from the data that can become the basis of new medical research or commercial products.
You’ll be working with some of the best algorithm researchers, data engineers and medical doctors in a highly multi-disciplinary environment that offers endless learning and growth opportunities - from learning to code in a new language, using new types of databases and developing machine learning models to, of course, learning about the human body and how it works.
Responsibilities
- Research different data assets (structured, semi-structured and unstructured) and understand their value, distribution and quality.
- Own data quality - from detecting issues to fixing those which can be fixed automatically using scripts and other tools
- Develop the internal logical data model, and co-write the code which populates the data model from raw data.
- Build the training data sets on which our algorithms will train, based on a thorough understanding of the medical phenomena in question, as well as of how algorithms learn from examples
- Analyze the effectiveness of our algorithms, visualize their results, and detect blind spots using statistical, ML and heuristic methods
- Develop machine learning and NLP-based products which generate insights from the data - ranging from understanding doctors’ free-text reports to detecting important organs in CT scans.
Requirements
- At least a BSc / BA in Industrial Engineering / Statistics / Mathematics / Bioinformatics; MSc - an advantage
- 3+ years of experience researching, querying, cleaning, and visualizing large, complex datasets containing unstructured data
- Background in bioinformatics, bioengineering, biology or the medical field - very strong advantage
- Experience with machine learning, computer vision, and/or NLP
- Python or other scripting languages
- Good grasp of the field of statistics
- Highly skilled in SQL
- Highly skilled in at least one visualization/BI tool - QlikView, Sisense, etc.
- Experience with NoSQL databases and big data frameworks (MongoDB, Elasticsearch, Hadoop, Spark)
- Solid grasp of how machine learning works; experience working alongside other data scientists or algorithm developers
- Deep passion for working with data as well as people
- Track record of excellence and drive for results
- Organized, detail-oriented
- Very curious, a fast learner, with a hands-on approach
- Independent, communicative - able to articulate findings and insights clearly, both verbally and in writing