Linguistic Engineer
Description
Meta's Reality Labs Wearables division is developing future products in augmented reality and virtual reality. Our work in conversational AI, computer vision, advanced optics, eye tracking, machine learning and reasoning will enable and empower consumers and businesses in new ways.
As a Linguistic Engineer (LE), you will play a pivotal role in delivering datasets, models, and knowledge that power machine learning systems for the multimodal assistant product. Your responsibilities will include creating high-quality datasets, ensuring consistent models and representations across languages and domains, and developing data infrastructure to support AI assistant experiences. You will collaborate closely with ML teams working on components such as ASR/TTS, NLU, NLG, Dialog, LLMs, and Knowledge Graph, contributing to the advancement of cutting-edge AI technologies.
The ideal candidate brings technical, analytical, and collaboration skills, with proven experience in building datasets for ML applications. While experience as an ML practitioner is beneficial, the primary focus is on dataset development and partnership with ML teams. Proficiency with data tools, pipelines, and analytics is essential. Multilingual abilities and a keen interest in NLP and/or conversational AI systems are highly valued, as is enthusiasm for working at the forefront of their development. Experience in any subfield of computational linguistics is a plus, though not a strict requirement.
Responsibilities
Build datasets, pipelines, and models to support machine learning applications
Directly contribute to product development by creating rules and prompts and implementing data patches
Evaluate model and product quality, closing the feedback loop through continuous assessment and iteration
Collaborate effectively with project stakeholders and cross-functional teams
Identify best practices and drive improvements across data systems and processes
Lead projects from conceptualization through launch, ensuring ongoing enhancement and support
Design and execute product experiments to inform development and innovation
Tackle complex problems and navigate ambiguity to deliver impactful, innovative solutions
Manage and prioritize multiple work streams to meet project goals and deadlines
Qualifications
Minimum 2 years of professional experience as a data scientist, software engineer, computational linguist, or in a similar technical role
Proficient in programming and data analysis using languages and platforms such as Python, SQL, and PHP/Hack
Hands-on experience with text analysis, scripting, and working with both relational and NoSQL databases
Demonstrated success in shipping multiple products across various platforms
Bachelor’s degree in Linguistics, Computational Linguistics, Computer Science, Data Science, Information Systems, or a related field, or equivalent practical experience
At least 5 years of professional experience as a data scientist, software engineer, or computational linguist, including hands-on work with machine learning and integrating knowledge graphs with lexicons or ontologies
Understanding of how data interacts with and influences machine learning models
Proven experience managing large scripting projects, such as combining language data from multiple sources and computing complex metrics over extensive datasets
Skilled in designing and executing data experiments to inform product and model development
Familiarity with programming best practices, including version control, unit testing, and code quality standards
Fluency in one or more Indian/Asian languages
Advanced coursework or research in Linguistics, Computer Science, Data Science, Computational Linguistics, Information Systems, or a related discipline