
Intelligence Layer Release 3.0.0

· 3 min read
Johannes Wesch
Sebastian Niehus

What's new with version 3.0.0

Dear developers, we’re thrilled to share a host of updates and improvements across our tracing and evaluation frameworks with the release of Intelligence Layer 3.0! These changes are designed to enhance functionality and streamline your processes. To help you navigate everything that has changed since release 1.0, we’ve organized the updates by topic, giving a clearer view of what’s new in each functional area. For the full list of changes, please refer to the changelog on our GitHub release page.

Python 3.12 Support

The Intelligence Layer now fully supports Python 3.12!

Tracer

We introduced an improved tracing format based on the OpenTelemetry format, while being more minimal and easier to read. It is mainly used for communication with the TraceViewer and maintains backward compatibility. We also simplified the management of Span and TaskSpan and removed some unused tracing features. The old format will be gradually deprecated in future releases.
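To illustrate the core idea behind span-based tracing, here is a minimal, self-contained sketch of a span tree in the spirit of OpenTelemetry. The `Tracer` and `Span` classes below are invented for illustration and are not the Intelligence Layer's actual implementation; they only show how nested spans with start and end timestamps form a tree.

```python
import time
import uuid
from contextlib import contextmanager

# Hypothetical sketch of span-based tracing. The class names echo the
# concepts in the text, but this is NOT the library's real API.
class Span:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.span_id = uuid.uuid4().hex[:8]  # short id, OpenTelemetry-style
        self.children = []
        self.start = None
        self.end = None

class Tracer:
    def __init__(self):
        self.root_spans = []
        self._stack = []  # currently open spans, innermost last

    @contextmanager
    def span(self, name):
        parent = self._stack[-1] if self._stack else None
        s = Span(name, parent)
        (parent.children if parent else self.root_spans).append(s)
        self._stack.append(s)
        s.start = time.time()
        try:
            yield s
        finally:
            s.end = time.time()
            self._stack.pop()

tracer = Tracer()
with tracer.span("task") as task:
    with tracer.span("subtask"):
        pass

root = tracer.root_spans[0]
print(root.name, [c.name for c in root.children])  # task ['subtask']
```

The context-manager shape guarantees that every span is closed (its end time recorded) even if the traced code raises, which is what makes the resulting trace tree reliable to read.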

Evaluation

Better Support for Parameter Optimization

To make it more convenient to compare workflow configurations, such as combinations of different models and prompts, and to enable better parameter optimization, we added the aggregation_overviews_to_pandas method. This method converts multiple aggregation overviews into a pandas DataFrame, ready for analysis and visualization. The new parameter_optimization.ipynb notebook demonstrates its usage.
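The idea can be sketched in a few lines: once each configuration's aggregated metrics land in a DataFrame, picking the best parameters is a one-liner. The dictionaries and field names below are invented stand-ins for the library's overview objects, which `aggregation_overviews_to_pandas` flattens for you.

```python
import pandas as pd

# Hypothetical aggregation results: one record per (model, prompt)
# configuration. Field names are illustrative, not the library's schema.
overviews = [
    {"model": "model-a", "prompt": "v1", "accuracy": 0.72},
    {"model": "model-a", "prompt": "v2", "accuracy": 0.81},
    {"model": "model-b", "prompt": "v1", "accuracy": 0.77},
]

df = pd.DataFrame(overviews)
best = df.loc[df["accuracy"].idxmax()]  # row with the highest accuracy
print(best["model"], best["prompt"])  # model-a v2
```

With the data in a DataFrame, the usual pandas toolbox (grouping, pivoting, plotting) applies directly to your parameter sweep.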

New Incremental Evaluator

There are use cases where you want to add more models or runs to an already existing evaluation. Prior to this update, this meant you had to re-evaluate all the previous runs, potentially wasting time and money. With the new IncrementalEvaluator and IncrementalEvaluationLogic, it is now easy to keep the old evaluations and add new runs to them without performing costly re-evaluations. We added a how-to guide to showcase the implementation and usage.
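The underlying idea can be sketched as a cache over run IDs: results for runs that were already evaluated are kept, and only the new runs trigger an evaluation. The function and variable names below are invented for illustration; the real IncrementalEvaluator handles this bookkeeping internally.

```python
# Conceptual sketch of incremental evaluation (not the library's API).
def evaluate(run_id):
    # Stand-in for a costly evaluation call (e.g. model inference + grading).
    return {"run": run_id, "score": len(run_id)}

def incremental_evaluate(run_ids, previous_results):
    """Evaluate only the runs not present in previous_results."""
    results = dict(previous_results)  # keep existing evaluations
    newly_evaluated = []
    for run_id in run_ids:
        if run_id not in results:
            results[run_id] = evaluate(run_id)
            newly_evaluated.append(run_id)
    return results, newly_evaluated

previous = {"run-1": {"run": "run-1", "score": 5}}
results, new_runs = incremental_evaluate(["run-1", "run-2"], previous)
print(new_runs)  # ['run-2']
```

Only `run-2` is evaluated here; the stored result for `run-1` is reused unchanged, which is exactly the cost saving the incremental evaluator provides.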

New Elo Evaluation

We added the EloEvaluationLogic for implementing your own Elo evaluations using the Intelligence Layer! Elo evaluations are useful if you want to compare different models or configurations by letting them compete directly against each other on the evaluation datasets. To get you started, we also added a ready-to-use implementation of the EloQaEvaluationLogic, a how-to guide for implementing your own Elo evaluations, and a detailed tutorial notebook on Elo evaluation of QA tasks.
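For intuition, here is the standard Elo update that this style of evaluation builds on: each pairwise comparison shifts both competitors' ratings by an amount proportional to how surprising the outcome was. The k-factor and starting ratings below are illustrative choices, not the library's defaults.

```python
# Standard Elo rating update (the general scheme, not the library's code).
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def elo_update(rating_a, rating_b, score_a, k=32):
    """score_a is 1.0 if A wins, 0.0 if A loses, 0.5 for a draw."""
    e_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - e_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - e_a))
    return new_a, new_b

# A beats B starting from equal ratings: the winner gains what the loser loses.
a, b = elo_update(1500, 1500, 1.0)
print(round(a), round(b))  # 1516 1484
```

Because the update is zero-sum, running many pairwise comparisons between models produces a single ranking in which upsets against strong opponents move ratings more than expected wins.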

Argilla Rework

We did a major revamp of the ArgillaEvaluator to separate an AsyncEvaluator from the normal evaluation scenario. This comes with easier-to-understand interfaces, more information in the EvaluationOverview, and a simplified aggregation step for Argilla that no longer depends on specific Argilla types. See the how-to guide for details.

Breaking Changes

For a detailed list see our GitHub release page.

  • Changes related to Tracers.
  • Moved away from nltk-package for graders.
  • Changes related to Argilla Repositories and ArgillaEvaluators.
  • Refactored internals of Evaluator. This is only relevant if you subclass from it.

We hope these updates integrate smoothly into your workflows. As always, we are committed to improving your experience and supporting your AI development needs. Please refer to our updated documentation and the how-to guides linked throughout this post for detailed instructions and further information. Happy coding!