Evaluate

With the evaluate endpoint, you can score the likelihood of predefined completions. This is useful if you already know the output you expect, or if you want to test how likely a given output is. The major advantage is that the evaluate endpoint is significantly faster than the complete endpoint.

Code Example

import os
import math
from aleph_alpha_client import Client, Prompt, EvaluationRequest
# If you are using a Windows machine, you must install the python-dotenv package and also run the two lines below.
# from dotenv import load_dotenv
# load_dotenv()

client = Client(token=os.getenv("AA_TOKEN"))

prompt_text = "An apple a day"
possible_completions = [
    " keeps the doctor away",
    " is healthy",
]

responses = []

for possible_completion in possible_completions:
    params = {
        "prompt": Prompt.from_text(prompt_text),
        "completion_expected": possible_completion,
    }
    request = EvaluationRequest(**params)
    response = client.evaluate(request=request, model="luminous-extended")
    responses.append(response)

print("""The probability for possible_completion_1 is: {score_1}
The probability for possible_completion_2 is: {score_2}""".format(
    score_1=math.exp(responses[0].result["log_probability"]),
    score_2=math.exp(responses[1].result["log_probability"]),
))
# prints:
# The probability for possible_completion_1 is: 0.344773947962664
# The probability for possible_completion_2 is: 0.00030276034681883494
# Note: 34% can be considered a very high probability.
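
With more than two candidates, it can help to rank them by probability. The following is a minimal sketch that works on log probabilities the endpoint has already returned, so it makes no API call. `rank_completions` is a hypothetical helper, not part of aleph_alpha_client, and the example log probabilities are recomputed from the values printed above.

```python
import math


def rank_completions(completions, log_probabilities):
    """Pair each completion with its probability, most likely first.

    Hypothetical helper for illustration; not part of aleph_alpha_client.
    """
    scored = [
        (completion, math.exp(log_probability))
        for completion, log_probability in zip(completions, log_probabilities)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Log probabilities recomputed from the probabilities printed above.
example_log_probabilities = [
    math.log(0.344773947962664),
    math.log(0.00030276034681883494),
]

ranking = rank_completions(
    [" keeps the doctor away", " is healthy"],
    example_log_probabilities,
)
best_completion, best_probability = ranking[0]
print(f"Most likely completion: {best_completion!r} ({best_probability:.4f})")
```

In a real run, you would pass `[r.result["log_probability"] for r in responses]` as the second argument instead of the recomputed values.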

Code Example: Attention Manipulation

You can also score the likelihood of pre-defined completions given a prompt with manipulated attention. In the following example, we want to suppress the word "German" to understand the influence it has on the possible completions. First, let's look at the example without attention manipulation.

from aleph_alpha_client import Client, Prompt, EvaluationRequest
import math
import os

# If you are using a Windows machine, you must install the python-dotenv package and also run the two lines below.
# from dotenv import load_dotenv
# load_dotenv()

client = Client(token=os.getenv("AA_TOKEN"))

prompt_text = """Q: Is there a speedlimit on the German highways?
A:"""

possible_completions = [" Yes", " No"]

responses = []

for possible_completion in possible_completions:
    params = {
        "prompt": Prompt.from_text(prompt_text),
        "completion_expected": possible_completion,
    }
    request = EvaluationRequest(**params)
    response = client.evaluate(request=request, model="luminous-extended")
    responses.append(response)

print(
    """The probability for possible_completion_1 is: {score_1}
The probability for possible_completion_2 is: {score_2}""".format(
        score_1=math.exp(responses[0].result["log_probability"]),
        score_2=math.exp(responses[1].result["log_probability"]),
    )
)

# prints:
# The probability for possible_completion_1 is: 0.27199709306616365
# The probability for possible_completion_2 is: 0.08830453620868192

If we now suppress the word "German", the completion " Yes" becomes more likely, while the probability of " No" stays almost unchanged.

from aleph_alpha_client import Client, Prompt, TextControl, EvaluationRequest
import math
import re
import os

# If you are using a Windows machine, you must install the python-dotenv package and also run the two lines below.
# from dotenv import load_dotenv
# load_dotenv()

client = Client(token=os.getenv("AA_TOKEN"))

prompt_text = """Q: Is there a speedlimit on the German highways?
A:"""

possible_completions = [" Yes", " No"]

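# Locate the word to suppress and read its character offsets from the match object.
matching_string = re.search("German", prompt_text)
begin_match = matching_string.start()
end_match = matching_string.end()

# A factor below 1 suppresses attention on the given span of the prompt.
control = TextControl(start=begin_match, length=end_match - begin_match, factor=0.01)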

responses = []

for possible_completion in possible_completions:
    params = {
        "prompt": Prompt.from_text(prompt_text, controls=[control]),
        "completion_expected": possible_completion,
    }
    request = EvaluationRequest(**params)
    response = client.evaluate(request=request, model="luminous-extended")
    responses.append(response)

print(
    """The probability for possible_completion_1 is: {score_1}
The probability for possible_completion_2 is: {score_2}""".format(
        score_1=math.exp(responses[0].result["log_probability"]),
        score_2=math.exp(responses[1].result["log_probability"]),
    )
)
# prints:
# The probability for possible_completion_1 is: 0.3093283638911958
# The probability for possible_completion_2 is: 0.08862406892214847
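
The two probabilities in each run do not sum to 1, because the model could also continue with other tokens. One way to compare the two runs is to renormalise over just the two candidates. The sketch below does this with the values printed in the two examples; `normalise` is a hypothetical helper written for this illustration, not part of aleph_alpha_client.

```python
def normalise(probabilities):
    """Rescale probabilities so they sum to 1 over the given candidates only.

    Hypothetical helper for illustration; not part of aleph_alpha_client.
    """
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("probabilities must have a positive sum")
    return [p / total for p in probabilities]


# Probabilities for " Yes" and " No" as printed in the two examples.
without_control = [0.27199709306616365, 0.08830453620868192]
with_control = [0.3093283638911958, 0.08862406892214847]

yes_share_before = normalise(without_control)[0]
yes_share_after = normalise(with_control)[0]

print(f"Share of ' Yes' without suppression: {yes_share_before:.3f}")
print(f"Share of ' Yes' with 'German' suppressed: {yes_share_after:.3f}")
```

With these numbers, the share of " Yes" among the two candidates rises from roughly 0.75 to roughly 0.78 once "German" is suppressed.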

If you need more information on the parameters you can use, please check out our HTTP API documentation.