Weights & Biases Weave
W&B Weave is a platform for evaluating, monitoring, and iterating on AI agents and applications, advertised as needing only "one line of code" to get started. Weave Evaluations provide visual comparison of runs, automatic versioning of datasets and scorers, an interactive playground, and leaderboards. Scorers come in four kinds: pre-built scorers (for example toxicity and hallucination), custom Python scoring functions, human feedback collection, and third-party scorers from providers such as RAGAS and LangChain.
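To make "custom Python scoring function" concrete, here is a minimal sketch in plain Python. It does not import Weave, and the names `exact_match`, `run_eval`, `toy_model`, and the dataset rows are all hypothetical, made up for illustration. The comments describe how such a function is assumed to plug into Weave; the exact signatures should be checked against the current Weave documentation.

```python
# Sketch of a custom scorer, written as plain Python so it runs without Weave.
# Assumed Weave wiring (not executed here): call weave.init("<project>") once,
# decorate the scorer with @weave.op, and pass it in the scorers list of a
# weave.Evaluation. Verify these details against the Weave docs.

def exact_match(expected: str, output: str) -> dict:
    """Hypothetical scorer: case-insensitive, whitespace-insensitive match."""
    return {"correct": expected.strip().lower() == output.strip().lower()}


def run_eval(model, dataset, scorer) -> float:
    """Local stand-in for an evaluation loop: score every row, return accuracy."""
    results = [scorer(row["expected"], model(row["question"])) for row in dataset]
    return sum(r["correct"] for r in results) / len(results)


# Illustrative dataset; the rows are invented for this example.
dataset = [
    {"question": "2 + 2", "expected": "4"},
    {"question": "capital of France", "expected": "Paris"},
]


def toy_model(question: str) -> str:
    """Hypothetical model under test: a fixed lookup table."""
    answers = {"2 + 2": "4", "capital of France": "paris "}
    return answers.get(question, "")


accuracy = run_eval(toy_model, dataset, exact_match)
print(f"accuracy = {accuracy:.2f}")
```

Returning a dictionary rather than a bare boolean lets one scorer report several named metrics per row, which suits side-by-side comparison of runs.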