at Squad
<h4>Team Summary</h4> <p>Our distributed team is looking for an experienced Applied Scientist with a strong background in Large Language Models (LLMs) to develop high-performance Generative AI features across Cloud and Edge environments.</p> <h4>Job Summary</h4> <p>In this role, you will drive the transition from research to production. On the Edge, you will optimize local inference through model compression and quantization for private, real-time performance; in the Cloud, you will engineer scalable RAG architectures and multi-agent systems. Your daily responsibilities span the full research lifecycle: formulating hypotheses, generating synthetic datasets, fine-tuning LLMs, validating safety and alignment, and documenting the results in technical reports.</p> <h4>Responsibilities and Duties</h4> <ul> <li>Design and implement advanced methods in prompt orchestration, fine-tuning (SFT/RLHF/DPO), and autonomous agentic workflows</li> <li>Curate high-quality training data from large-scale text and multi-modal sources</li> <li>Identify patterns in model hallucinations and visualize evaluation metrics for clear interpretation</li> <li>Tune hyperparameters and improve inference speed and accuracy through PEFT (LoRA/QLoRA) and advanced prompt engineering</li> <li>Collaborate with Product and Data Engineering teams to integrate LLM features into the broader ecosystem</li> <li>Track and report progress using industry-standard benchmarks (MMLU, HumanEval, etc.) and custom i