Bachelor's, Master's, or PhD degree in a discipline such as computer science, machine learning, applied statistics, mathematics, engineering, or artificial intelligence
5+ years of deep technical experience in distributed computing, machine learning, and statistics-related work
Programming experience in languages such as Python, R, Scala, and SQL
Proven application of advanced analytical, data science and statistical methods in the commercial world
Knowledge of distributed computing or NoSQL technologies is a bonus
Client-facing skills, e.g. working in close-knit teams on topics such as data warehousing and machine learning
While we advocate for using the right tech for the right task, we often leverage the following technologies: Python, PySpark, the PyData stack, SQL, Airflow, Databricks, our own open-source data pipelining framework called Kedro, Dask/RAPIDS, container technologies such as Docker and Kubernetes, cloud solutions such as AWS, GCP, and Azure, and more.
Exceptional time management to meet your responsibilities in a complex and largely autonomous work environment
Demonstrated leadership (thought leadership or people leadership, e.g. having managed project teams or direct reports)
Willingness to travel
Strong presentation and communication skills, both verbal and written, in English and Portuguese, with the ability to adjust your style to suit different perspectives and seniority levels