Dropbox Dash brings your files, messages, and team’s knowledge together in one place, so you can ask questions and get useful answers that are actually grounded in your company’s context. Under the hood, that experience relies heavily on one deceptively simple capability: reliably judging which results are relevant to a query at scale. Relevance judges are used across multiple pipelines like ranki...
This case study highlights a critical challenge in AI deployment: balancing performance, cost, and reliability as models evolve. Dropbox’s approach demonstrates how systematic optimization frameworks like DSPy can mitigate the brittleness of manual prompt engineering, turning a fragile process into a repeatable workflow. The narrative is strongest where it emphasizes measurable outcomes—reducing NMSE by 45%, cutting malformed outputs by 97%, and enabling orders-of-magnitude scaling—w...
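To make the two headline metrics concrete, here is a minimal sketch of how they could be computed. It rests on assumptions the text does not confirm: NMSE is taken to be the mean squared error between judge scores and human reference labels, divided by the variance of the reference labels (one common normalization), and a "malformed" output is taken to be one that does not parse as JSON with a numeric `score` in a fixed range. The function names, the `score` field, and the 1 to 5 scale are hypothetical.

```python
import json
from statistics import fmean, pvariance


def nmse(predicted, reference):
    """MSE divided by the variance of the reference labels.

    Assumption: this normalization is illustrative; the source does not
    state which NMSE definition was used.
    """
    if not reference or len(predicted) != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")
    variance = pvariance(reference)
    if variance == 0:
        raise ValueError("reference labels have zero variance")
    mse = fmean((p - r) ** 2 for p, r in zip(predicted, reference))
    return mse / variance


def parse_score(raw, lo=1, hi=5):
    """Return the score from a judge output, or None if it is malformed.

    Assumption: the judge is expected to emit JSON like {"score": 3}.
    """
    try:
        payload = json.loads(raw)
    except (ValueError, TypeError):
        return None
    if not isinstance(payload, dict):
        return None
    score = payload.get("score")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        return None
    return score if lo <= score <= hi else None


def malformed_rate(raw_outputs):
    """Fraction of judge outputs that fail parse_score."""
    if not raw_outputs:
        raise ValueError("no outputs to evaluate")
    bad = sum(1 for raw in raw_outputs if parse_score(raw) is None)
    return bad / len(raw_outputs)


def relative_reduction(before, after):
    """Fractional improvement of a lower-is-better metric."""
    return (before - after) / before
```

Under this reading, a statement such as "reducing NMSE by 45%" corresponds to `relative_reduction(nmse_before, nmse_after)` returning roughly 0.45, where both values are measured on the same labeled evaluation set.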
