We are looking for a data engineer, preferably with some audio processing experience, to join our ML team. You will transform raw data into consumable formats for machine learning. The role involves building infrastructure to harness the data streams that flow into our servers and collating them accordingly. As an extension of the role, you'll also work with our data scientists to explore statistical methods. In all, you'll own the data collation platform.
To succeed in this data engineering position, you should have a strong ability to build platforms and automate services that collate and organize data from different sources. You are a strong programmer with attention to detail and good analytical skills. Knowledge of audio processing would be a huge bonus.
About Viva
Viva Translate is seeking passionate individuals to join our team of ML researchers & engineers from top institutions such as Google and Stanford. We are creating a world where language is no longer a barrier to work and opportunity, and we are starting across Latin America.
Viva is building an AI tool that helps people read, write, and speak better across English, Spanish, and Portuguese. If you share our vision for a borderless future and are a motivated builder, idealist, and explorer, we would love to hear from you.
- Analyze and organize raw audio
- Build data systems and pipelines
- Prepare data for predictive modeling
- Explore ways to enhance data quality, reliability, and security
- Develop analytical tools
- Collaborate with data scientists & architects, and human transcribers & translators
- Our ML tech stack includes Python/Django, AWS, CI/CD, Terraform, BERT, spaCy
- 1+ years of experience as a data engineer or in a similar role
- Knowledge of programming languages (e.g., Python)
- Hands-on experience with SQL database design (e.g. Postgres)
- Effective communication with team members of diverse technical backgrounds
- Degree in Computer Science, IT, or similar fields
- Experience using project management tools (e.g., GitHub, Asana)
- Experience in handling audio streaming data
- Prior experience building data platforms
- Experience with cloud providers (e.g. AWS, GCP, Azure)
- Experience with distributed/streaming data-processing technologies and frameworks (e.g. Scala, Apache Spark, Databricks, Apache Kafka, Redpanda, CockroachDB)
- Fluent in Spanish and/or Portuguese
- Continuous growth - we take pride in setting new standards and taking ownership of our work
- Data-driven - we make decisions together based on data and logic
- Open integrity - we promote a low ego environment that treasures transparency, empathy, and feedback
- Play - we champion diversity and creativity, and want everyone to be unafraid to fail & fail quickly
- Fully remote team
- 3+ in-person retreats annually (past locations include Mexico, Colombia & Ecuador)
- Join an early-stage startup (12 people & growing)
- Home office stipend
- Health & fitness benefits
- Learning stipend - we are here to support your personal & professional development journey