Retrieval Augmented Generation (RAG)

Large language models, like our Luminous model family, have impressive capabilities out of the box. To make the most of this technology, however, you can connect your own documents to the generative model. In this example, we will show how to do this in five steps:

  1. Split the documents into smaller chunks that preserve the semantic meaning.
  2. Use the semantic-embedding endpoint to make your documents searchable.
  3. Store the embeddings to avoid recalculation for each user request.
  4. Query the database and return the retrieved documents.
  5. Use the retrieved document for downstream tasks.

You can bring as many documents as you like. Here we will use the Preamble of the EU Charter and the Acquis of the Schengen Treaty.

Make Your Documents Searchable

In order to effectively search through your own documents, it is important to ensure that they can be easily compared to each other. Our asymmetric embeddings are designed to help find the pieces of your documents that are most relevant to a query shorter than the documents in the database. Here we will use short queries and longer splits of law texts.

In the first part of this tutorial, we will focus on the lower part of the graphic. First, we need to split the documents, because each embedding request is limited to 2048 tokens and longer texts cannot be embedded. In general, we recommend a chunk size between 200 and 1000 tokens.

As we are working with semantic embeddings, it is very important to preserve most of the semantic structure of the text. We found that the semantics of a longer text are often found in paragraphs and other text structures like lists or bullet points. Therefore, splitting is highly dependent on the structure of the documents you want to embed. In this tutorial we will use the Python library text-splitter.

import os 
import itertools
from aleph_alpha_client import Client
from semantic_text_splitter import HuggingFaceTextSplitter

# If you are using a Windows machine, you must install the python-dotenv package and run the two lines below as well.
# from dotenv import load_dotenv
# load_dotenv()

TOKEN = os.getenv("AA_TOKEN")

# the documents we will embed (Preamble of the EU Charter, Acquis of the Schengen Treaty)
docs = [
"The peoples of Europe, in creating an ever closer union among them, are resolved to share a peaceful future based on common values.\n\nConscious of its spiritual and moral heritage, the Union is founded on the indivisible, universal values of human dignity, freedom, equality and solidarity; it is based on the principles of democracy and the rule of law. It places the individual at the heart of its activities, by establishing the citizenship of the Union and by creating an area of freedom, security and justice.\n\nThe Union contributes to the preservation and to the development of these common values while respecting the diversity of the cultures and traditions of the peoples of Europe as well as the national identities of the Member States and the organisation of their public authorities at national, regional and local levels; it seeks to promote balanced and sustainable development and ensures free movement of persons, goods, services and capital, and the freedom of establishment.\n\nTo this end, it is necessary to strengthen the protection of fundamental rights in the light of changes in society, social progress and scientific and technological developments by making those rights more visible in a Charter.\n\nThis Charter reaffirms, with due regard for the powers and tasks of the Community and the Union and the principle of subsidiarity, the rights as they result, in particular, from the constitutional traditions and international obligations common to the Member States, the Treaty on European Union, the Community Treaties, the European Convention for the Protection of Human Rights and Fundamental Freedoms, the Social Charters adopted by the Community and by the Council of Europe and the case-law of the Court of Justice of the European Communities and of the European Court of Human Rights.\n\nEnjoyment of these rights entails responsibilities and duties with regard to other persons, to the human community and to future generations.\nThe Union therefore 
recognises the rights, freedoms and principles set out hereafter.",
"AGREEMENT\nbetween the Governments of the States of the Benelux Economic Union, the Federal Republic of Germany and the French Republic on the gradual abolition of checks at their common borders\nThe Governments of the KINGDOM OF BELGIUM, the FEDERAL REPUBLIC OF GERMANY, the FRENCH REPUBLIC, the GRAND DUCHY OF LUXEMBOURG and the KINGDOM OF THE NETHERLANDS,\nhereinafter referred to as \"the Parties\",\n\nAWARE that the ever closer union of the peoples of the Member States of the European Communities should find its expression in the freedom to cross internal borders for all nationals of the Member States and in the free movement of goods and services,\n\nANXIOUS to strengthen the solidarity between their peoples by removing the obstacles to free movement at the common borders between the States of the Benelux Economic Union, the Federal Republic of Germany and the French Republic,\n\nCONSIDERING the progress already achieved within the European Communities with a view to ensuring the free movement of persons, goods and services,\n\nPROMPTED by the resolve to achieve the abolition of checks at their common borders on the movement of nationals of the Member States of the European Communities and to facilitate the movement of goods and services at those borders,\n\nCONSIDERING that application of this Agreement may require legislative measures which will have to be submitted to the parliaments of the Signatory States in accordance with their constitutions,\n\nHAVING REGARD to the statement by the Fontainebleau European Council on 25 and 26 June 1984 on the abolition of police and customs formalities for people and goods crossing intra-Community frontiers,\n\nHAVING REGARD to the Agreement concluded at Saarbrücken on 13 July 1984 between the Federal Republic of Germany and the French Republic,\n\nHAVING REGARD to the Conclusions adopted on 31 May 1984 following the meeting of the Transport Ministers of the Benelux States and the Federal Republic of Germany at Neustadt 
an der Aisch,\n\nHAVING REGARD to the Memorandum of the Governments of the Benelux Economic Union of 12 December 1984 forwarded to the Governments of the Federal Republic of Germany and the French Republic,\nHAVE AGREED AS FOLLOWS:\n\nTITLE I\nMEASURES APPLICABLE IN THE SHORT TERM\n\nArticle 1\nAs soon as this Agreement enters into force and until all checks are abolished completely, the formalities for nationals of the Member States of the European Communities at the common borders between the States of the Benelux Economic Union, the Federal Republic of Germany and the French Republic shall be carried out in accordance with the conditions laid down below.\n\nArticle 2\nWith regard to the movement of persons, from 15 June 1985 the police and customs authorities shall as a general rule carry out simple visual surveillance of private vehicles crossing the common border at reduced speed, without requiring such vehicles to stop.\nHowever, they may carry out more thorough controls by means of spot checks. These shall be performed where possible off the main road, so as not to interrupt the flow of other vehicles crossing the border.\n\nArticle 3\nTo facilitate visual surveillance, nationals of the Member States of the European Communities wishing to cross the common border in a motor vehicle may affix to the windscreen a green disc measuring at least eight centimetres in diameter. 
This disc shall indicate that they have complied with border police rules, are carrying only goods permitted under the duty-free arrangements and have complied with exchange regulations.\n\nArticle 4\nThe Parties shall endeavour to keep to a minimum the time spent at common borders in connection with checks on the carriage of passengers by road for hire or reward.\nThe Parties shall seek solutions enabling them by 1 January 1986 to waive systematic checks at their common borders on passenger waybills and licences for the carriage of passengers by road for hire or reward.\n\nArticle 5\nBy 1 January 1986 common checks shall be put in place at adjacent national control posts in so far as that is not already the case and in so far as physical conditions so permit. Consideration shall subsequently be given to the possibility of introducing common checks at other border crossing points, taking account of local conditions.\n\nArticle 6\nWithout prejudice to the application of more favourable arrangements between the Parties, the latter shall take the measures required to facilitate the movement of nationals of the Member States of the European Communities resident in the local administrative areas along their common borders with a view to allowing them to cross those borders at places other than authorised crossing points and outside checkpoint opening hours.\nThe persons concerned may benefit from these advantages provided that they transport only goods permitted under the duty-free arrangements and comply with exchange regulations.\n\nArticle 7\nThe Parties shall endeavour to approximate their visa policies as soon as possible in order to avoid the adverse consequences in the field of immigration and security that may result from easing checks at the common borders. 
They shall take, if possible by 1 January 1986, the necessary steps in order to apply their procedures for the issue of visas and admission to their territories, taking into account the need to ensure the protection of the entire territory of the five States against illegal immigration and activities which could jeopardise security.\n\nArticle 8\nWith a view to easing checks at their common borders and taking into account the significant differences in the laws of the States of the Benelux Economic Union, the Federal Republic of Germany and the French Republic, the Parties undertake to combat vigorously illicit drug trafficking on their territories and to coordinate their action effectively in this area.\n\nArticle 9\nThe Parties shall reinforce cooperation between their customs and police authorities, notably in combating crime, particularly illicit trafficking in narcotic drugs and arms, the unauthorised entry and residence of persons, customs and tax fraud and smuggling. To that end and in accordance with their national laws, the Parties shall endeavour to improve the exchange of information and to reinforce that exchange where information which could be useful to the other Parties in combating crime is concerned.\nWithin the framework of their national laws the Parties shall reinforce mutual assistance in respect of unauthorised movements of capital.\n\nArticle 10\nWith a view to ensuring the cooperation provided for in Articles 6 to 9, meetings between the Parties' competent authorities shall be held at regular intervals.\n\nArticle 11\nWith regard to the cross-border carriage of goods by road, the Parties shall waive, as from 1 July 1985, systematic performance of the following checks at their common borders:\n- control of driving and rest periods (Council Regulation (EEC) No 543/69 of 25 March 1969 on the harmonisation of certain social legislation relating to roard transport and AETR),\n- control of the weights and dimensions of commercial vehicles; this 
provision shall not prevent the introduction of automatic weighing systems for spot checks on weight,\n- controls on the vehicles' technical state.\nMeasures shall be taken to avoid checks being duplicated within the territories of the Parties.\n\nArticle 12\nFrom 1 July 1985 checks on documents detailing transport operations not carried out under licence or quota pursuant to Community or bilateral rules shall be replaced at the common borders by spot checks. Vehicles carrying out transport operations under such arrangements shall display a visual symbol to that effect when crossing the border.\nThe Parties' competent authorities shall determine the features of this symbol by common agreement.\n\nArticle 13\nThe Parties shall endeavour to harmonise by 1 January 1986 the systems applying among them to the licensing of commercial road transport with regard to cross-border traffic, with the aim of simplifying, easing and possibly replacing licences for journeys by licences for a period of time, with a visual check when vehicles cross common borders.\nThe procedures for converting licences for journeys into licences for periods of time shall be agreed on a bilateral basis, account being taken of the road haulage requirements in the different countries concerned.\n\nArticle 14\nThe Parties shall seek solutions to reduce the waiting times of rail transport at the common borders caused by the completion of border formalities.\n\nArticle 15\nThe Parties shall recommend to their respective rail companies:\n- to adapt technical procedures in order to minimise stopping times at the common borders,\n- to do their utmost to apply to certain types of carriage of goods by rail, to be defined by the rail companies, a special routing system whereby the common borders can be crossed rapidly without any appreciable stops (goods trains with reduced stopping times at borders).\n\nArticle 16\nThe Parties shall harmonise the opening dates and opening hours of customs posts for inland 
waterway traffic at the common borders.\n\nTITLE II\nMEASURES APPLICABLE IN THE LONG TERM\n\nArticle 17\nWith regard to the movement of persons, the Parties shall endeavour to abolish checks at common borders and transfer them to their external borders. To that end they shall endeavour first to harmonise, where necessary, the laws, regulations and administrative provisions concerning the prohibitions and restrictions on which the checks are based and to take complementary measures to safeguard internal security and prevent illegal immigration by nationals of States that are not members of the European Communities.\n\nArticle 18\nThe Parties shall open discussions, in particular on the following matters, account being taken of the results of the short-term measures:\n(a) drawing up arrangements for police cooperation on crime prevention and investigation;\n(b) examining any difficulties that may arise in applying agreements on international judicial assistance and extradition, in order to determine the most appropriate solutions for improving cooperation between the Parties in those fields;\n(c) seeking means to combat crime jointly, inter alia, by studying the possibility of introducing a right of hot pursuit for police officers, taking into account existing means of communication and international judicial assistance.\n\nArticle 19\nThe Parties shall seek to harmonise laws and regulations, in particular on:\n- narcotic drugs,\n- arms and explosives,\n- the registration of travellers in hotels.\n\nArticle 20\nThe Parties shall endeavour to harmonise their visa policies and the conditions for entry to their territories. 
In so far as is necessary, they shall also prepare the harmonisation of their rules governing certain aspects of the law on aliens in regard to nationals of States that are not members of the European Communities.\n\nArticle 21\nThe Parties shall take common initiatives within the European Communities:\n(a) to achieve an increase in the duty-free allowances granted to travellers;\n(b) in the context of Community allowances to remove any remaining restrictions on entry to the Member States of goods possession of which is not prohibited for their nationals.\nThe Parties shall take initiatives within the European Communities so that VAT on tourist transport services within the European Communities is collected in the country of depature on a harmonised basis.\n\nArticle 22\nThe Parties shall endeavour both among themselves and within the European Communities:\n- to increase the duty-free allowance for fuel in order to bring it into line with the normal contents of bus and coach fuel tanks (600 litres),\n- to approximate the tax rates on diesel fuel and to increase the duty-free allowances for the normal contents of lorry fuel tanks.\n\nArticle 23\nIn the field of goods transport the Parties shall also endeavour to reduce stopping times and the number of stopping points at adjacent national control posts.\n\nArticle 24\nWith regard to the movement of goods, the Parties shall seek means of transferring the checks currently carried out at the common borders to the external borders or to within their own territories.\nTo that end they shall take, where necessary, common initiatives among themselves and within the European Communities to harmonise the provisions on which checks on goods at the common borders are based. 
They shall ensure that these measures do not adversely affect the necessary protection of the health of humans, animals and plants.\n\nArticle 25\nThe Parties shall develop their cooperation with a view to facilitating customs clearance of goods crossing a common border, through a systematic, automatic exchange of the necessary data collected by means of the single document.\n\nArticle 26\nThe Parties shall examine how indirect taxes (VAT and excise duties) may be harmonised in the framework of the European Communities. To that end they shall support the initiatives undertaken by the European Communities.\n\nArticle 27\nThe Parties shall examine whether, on a reciprocal basis, the limits on the duty-free allowances granted at the common borders to frontier-zone residents, as authorised under Community law, may be abolished.\n\nArticle 28\nBefore the conclusion of any bilateral or multilateral arrangements similar to this Agreement with States that are not parties thereto, the Parties shall consult among themselves.\n\nArticle 29\nThis Agreement shall also apply to Berlin, unless a declaration to the contrary is made by the Government of the Federal Republic of Germany to the Governments of the States of the Benelux Economic Union and the Government of the French Republic within three months of entry into force of this Agreement.\n\nArticle 30\nThe measures provided for in this Agreement which are not applicable as soon as it enters into force shall be applied by 1 January 1986 as regards the measures provided for in Title I and if possible by 1 January 1990 as regards the measures provided for in Title II, unless other deadlines are laid down in this Agreement.\n\nArticle 31\nThis Agreement shall apply subject to the provisions of Articles 5, 6 and 8 to 16 of the Agreement concluded at Saarbrücken on 13 July 1984 between the Federal Republic of Germany and the French Republic.\n\nArticle 32\nThis Agreement shall be signed without being subject to ratification or 
approval, or subject to ratification or approval, followed by ratification or approval.\nThis Agreement shall apply provisionally from the day following that of its signature.\nThis Agreement shall enter into force 30 days after deposit of the last instrument of ratification or approval.\n\nArticle 33\nThis Agreement shall be deposited with the Government of the Grand Duchy of Luxembourg, which shall transmit a certified copy to each of the Governments of the other Signatory States."
]

client = Client(token=TOKEN)
tokenizer = client.tokenizer("luminous-base")
text_splitter = HuggingFaceTextSplitter(tokenizer)

splitted_docs = [text_splitter.chunks(doc, 300) for doc in docs]
splitted_docs = list(itertools.chain.from_iterable(splitted_docs))
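
To illustrate what structure-preserving splitting means, the sketch below greedily packs whole paragraphs into chunks without exceeding a size budget. It is not how text-splitter works internally: `split_by_paragraph` is a hypothetical helper, and it counts whitespace-separated words as a rough stand-in for tokens.

```python
def split_by_paragraph(text: str, max_words: int) -> list[str]:
    """Greedily pack whole paragraphs into chunks of at most max_words words.

    Hypothetical helper: words are a rough stand-in for tokens, and a single
    paragraph longer than max_words is kept whole rather than cut mid-sentence.
    """
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        n_words = len(paragraph.split())
        # Start a new chunk if adding this paragraph would exceed the budget.
        if current and current_len + n_words > max_words:
            chunks.append("\n\n".join(current))
            current, current_len = [], 0
        current.append(paragraph)
        current_len += n_words
    if current:
        chunks.append("\n\n".join(current))
    return chunks


sample = "Article 1\nFirst rule of the text.\n\nArticle 2\nSecond rule.\n\nArticle 3\nThird rule."
chunks = split_by_paragraph(sample, max_words=10)
for chunk in chunks:
    print(repr(chunk))
```

Because paragraphs are never cut, each chunk keeps its article heading together with the article body, which is the property that matters for semantic embeddings.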

In the second step, we will create embeddings for all your documents. Obtaining these embeddings makes your documents comparable and allows you to find the most similar text pieces to the user's query. If you want to embed larger document collections, it makes sense to use our AsyncClient to speed up the embedding process. As we have "document-query"-pairs for this use case, we want to use the Document representation for the documents we want to search in.

There are also cases where the user's input is itself a document similar to the stored ones (e.g., when comparing rather standardized documents like RFPs). Then we have "document-document"-pairs. In this case, you should use the Symmetric representation. To find out more about our semantic embeddings, please refer to this part of our documentation.

import os
import asyncio
from aleph_alpha_client import AsyncClient, Prompt, SemanticRepresentation, SemanticEmbeddingRequest

# if you are using this code in a jupyter notebook, uncomment the two lines below
# import nest_asyncio
# nest_asyncio.apply()

# our async function to embed the documents quicker (especially useful for larger document bases)
async def embed_docs(requests):
    async with AsyncClient(token=os.getenv("AA_TOKEN")) as async_client:
        # send all requests concurrently; gather returns the responses in request order
        responses = await asyncio.gather(
            *(async_client.semantic_embed(req, model="luminous-base") for req in requests)
        )
        result = [response.embedding for response in responses]
    return result

# creating the requests to embed the documents
requests = []
for text in splitted_docs:
    embedding_params = {
        "prompt": Prompt.from_text(text),
        "representation": SemanticRepresentation.Document,
        "compress_to_size": 128,
    }
    req = SemanticEmbeddingRequest(**embedding_params)
    requests.append(req)

# if you are on a Windows machine, use the following code to turn off the warnings
# def turn_off_windows_async_warning():
#     if os.name == "nt":
#         asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
# turn_off_windows_async_warning()

embeddings = asyncio.run(embed_docs(requests))

# if you are using this code in a jupyter notebook, comment the line above and uncomment the line below
# embeddings = await embed_docs(requests)
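
For very large collections you may not want to send every request at once. One common pattern is to cap the number of in-flight requests with a semaphore. The sketch below shows that pattern with a stand-in coroutine, `fake_embed`, in place of the real API call; the function names and the concurrency limit are assumptions for illustration, not part of the client library.

```python
import asyncio


async def fake_embed(text: str) -> list[float]:
    """Stand-in for a real embedding call (hypothetical, no network access)."""
    await asyncio.sleep(0.01)  # simulate request latency
    return [float(len(text))]


async def embed_bounded(texts: list[str], max_concurrency: int = 5) -> list[list[float]]:
    """Embed all texts while keeping at most max_concurrency calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def embed_one(text: str) -> list[float]:
        async with semaphore:
            return await fake_embed(text)

    # gather preserves input order, so results line up with texts
    return await asyncio.gather(*(embed_one(t) for t in texts))


vectors = asyncio.run(embed_bounded(["a", "bb", "ccc"], max_concurrency=2))
print(vectors)  # [[1.0], [2.0], [3.0]]
```

To adapt this, the body of `embed_one` would call the real embedding coroutine instead of `fake_embed`.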

Once we have created the embeddings, we can search through them. However, it's inefficient to regenerate the embeddings for the documents every time a user submits a query. Therefore, we need to store the embeddings in a database for quick access. This can be done by using a database specifically designed to store vectors.

Storing the Embeddings

You can use any type of vector store you like and save your embeddings on your local machine, your own servers, or any cloud environment. For this example, we will use Qdrant and save the files to our disk. If you are not familiar with Qdrant, please refer to their documentation. In the code snippet below, we set up a collection and then pass the embeddings and some metadata to the database.

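from qdrant_client import QdrantClient
from qdrant_client.http.models import Distance, VectorParams, Batch

q_client = QdrantClient(path="your/db/path")

# "documents" is a placeholder collection name; use any name, but reuse it when querying
q_client.recreate_collection(
    collection_name="documents",
    # the vector size must match the compress_to_size value used for the embeddings
    vectors_config=VectorParams(size=128, distance=Distance.COSINE),
)

ids = list(range(len(splitted_docs)))
payloads = [{"text": text} for text in splitted_docs]

# store the embeddings together with the original text as payload
q_client.upsert(
    collection_name="documents",
    points=Batch(ids=ids, vectors=embeddings, payloads=payloads),
)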

Having created the vector database, we now want to query it for the documents most relevant to a user query.

Querying the Database

Now we will focus on the upper part of the first graphic at the beginning of this tutorial. We will embed the incoming user query and run it against the database. The result will be a list of documents ranked in order of relevance to the user's query.

In case the structure of the user query is similar to the documents you embedded in your database, you should use the Symmetric representation. To find out more about our semantic embeddings, please refer to this part of our documentation.

query_text = "Do I need to pass border controls under the Schengen treaty as a national of the European Union?"

def embed_query(query):
    embedding_params = {
        "prompt": Prompt.from_text(query),
        "representation": SemanticRepresentation.Query,
        "compress_to_size": 128,
    }
    req = SemanticEmbeddingRequest(**embedding_params)
    response = client.semantic_embed(req, model="luminous-base")
    return response.embedding

query_embedding = embed_query(query_text)

# retrieve the three most relevant documents
# "documents" is a placeholder collection name; it must match the collection the embeddings were stored in
search_result = q_client.search(
    collection_name="documents",
    query_vector=query_embedding,
    limit=3,
)

top_search_result_text = search_result[0].payload["text"]
top_search_result_score = search_result[0].score

print(f"""Most relevant text: {top_search_result_text}
Highest Score: {top_search_result_score}""")

# prints:
# Most relevant text: Article 2
# With regard to the movement of persons, from 15 June 1985 the police and customs authorities shall as a general rule carry out simple visual surveillance of private vehicles crossing the common border at reduced speed, without requiring such vehicles to stop.
# However, they may carry out more thorough controls by means of spot checks. These shall be performed where possible off the main road, so as not to interrupt the flow of other vehicles crossing the border.

# Article 3
# To facilitate visual surveillance, nationals of the Member States of the European Communities wishing to cross the common border in a motor vehicle may affix to the windscreen a green disc measuring at least eight centimetres in diameter. This disc shall indicate that they have complied with border police rules, are carrying only goods permitted under the duty-free arrangements and have complied with exchange regulations.

# Article 4
# The Parties shall endeavour to keep to a minimum the time spent at common borders in connection with checks on the carriage of passengers by road for hire or reward.
# The Parties shall seek solutions enabling them by 1 January 1986 to waive systematic checks at their common borders on passenger waybills and licences for the carriage of passengers by road for hire or reward.
# Highest Score: 0.5908173093381568
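
Conceptually, a search with cosine distance scores every stored vector against the query vector and returns the highest-scoring entries. The sketch below reproduces that ranking in plain Python on toy three-dimensional vectors. It is an illustration only, not how Qdrant is implemented, and the helper names are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query: list[float], vectors: list[list[float]], k: int) -> list[tuple[int, float]]:
    """Return (index, score) pairs of the k vectors most similar to query."""
    scored = [(i, cosine_similarity(query, v)) for i, v in enumerate(vectors)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy embeddings; the ones in this tutorial have 128 dimensions.
stored = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
ranking = top_k([1.0, 0.0, 0.0], stored, k=2)
print(ranking)
```

The vector pointing in the same direction as the query ranks first, followed by the one at a 45-degree angle; the orthogonal vector scores zero and is dropped.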


Now that we have retrieved the most relevant document, we can use it to actually answer the question. To do so, we will use the complete-endpoint.

from aleph_alpha_client import CompletionRequest

prompt_text = f"""{query_text}

Context: {top_search_result_text}

### Response:"""

params = {
    "prompt": Prompt.from_text(prompt_text),
    "maximum_tokens": 100,
    "stop_sequences": ["\n"],
}
request = CompletionRequest(**params)
response = client.complete(request=request, model="luminous-supreme-control")
completion = response.completions[0].completion

print(f"""Prompt: {prompt_text}
Completion: {completion.strip()}""")
# Prompt: Do I need to pass border controls under the Schengen treaty as a national of the European Union?

# Context: Article 2
# With regard to the movement of persons, from 15 June 1985 the police and customs authorities shall as a general rule carry out simple visual surveillance of private vehicles crossing the common border at reduced speed, without requiring such vehicles to stop.
# However, they may carry out more thorough controls by means of spot checks. These shall be performed where possible off the main road, so as not to interrupt the flow of other vehicles crossing the border.

# Article 3
# To facilitate visual surveillance, nationals of the Member States of the European Communities wishing to cross the common border in a motor vehicle may affix to the windscreen a green disc measuring at least eight centimetres in diameter. This disc shall indicate that they have complied with border police rules, are carrying only goods permitted under the duty-free arrangements and have complied with exchange regulations.

# Article 4
# The Parties shall endeavour to keep to a minimum the time spent at common borders in connection with checks on the carriage of passengers by road for hire or reward.
# The Parties shall seek solutions enabling them by 1 January 1986 to waive systematic checks at their common borders on passenger waybills and licences for the carriage of passengers by road for hire or reward.

# ### Response:
# Completion: No, as a national of the European Union, you do not need to pass border controls under the Schengen treaty.
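
The search returns three documents, but only the top one is placed in the prompt above. If you want to pass several retrieved chunks as context, one option is to join them before building the prompt. `build_prompt` below is a hypothetical helper that follows the same prompt layout; whether more context improves the answer depends on your data and model.

```python
def build_prompt(question: str, contexts: list[str]) -> str:
    """Assemble a prompt in the tutorial's layout from one or more context chunks."""
    joined = "\n\n".join(context.strip() for context in contexts)
    return f"{question}\n\nContext: {joined}\n\n### Response:"


# Toy inputs; in the tutorial these would be query_text and the retrieved payload texts.
prompt = build_prompt(
    "What does the text say?",
    ["first retrieved chunk", "second retrieved chunk"],
)
print(prompt)
```

With the objects from this tutorial, the contexts could be collected as `[hit.payload["text"] for hit in search_result]`.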

Explaining the Output

All of this is pretty cool, but to make the outputs of this system traceable and trustworthy we need to go a step further. Therefore, we will use our explain-endpoint to investigate which part of the retrieved document influenced the completion the most.

To do so we need to define two things:

  1. Prompt: The text piece that should show the explanations.
  2. Target: The text piece that we want explanations on.

In this RAG setup we are mostly interested in the influence the retrieved documents had on the output, so we omit the structural elements of the prompt. Therefore, we will only put the retrieved text into the prompt. For the target, we will simply use the completion from the model.

from aleph_alpha_client import ExplanationRequest
import numpy as np

exp_req = ExplanationRequest(Prompt.from_text(top_search_result_text), completion, prompt_granularity="sentence")
response_explain = client.explain(exp_req, model="luminous-supreme-control")

explanations = response_explain.explanations[0].items[0].scores

for item in explanations:
    # each item covers one sentence of the prompt, given as a start offset and a length
    start = item.start
    end = item.start + item.length
    print(f"""EXPLAINED TEXT: {search_result[0].payload["text"][start:end]}
SCORE: {np.round(item.score, decimals=3)}""")

# With regard to the movement of persons, from 15 June 1985 the police and customs authorities shall as a general rule carry out simple visual surveillance of private vehicles crossing the common border at reduced speed, without requiring such vehicles to stop.
# SCORE: -0.908
# EXPLAINED TEXT: However, they may carry out more thorough controls by means of spot checks.
# SCORE: -0.26
# EXPLAINED TEXT: These shall be performed where possible off the main road, so as not to interrupt the flow of other vehicles crossing the border.
# SCORE: -0.138
# To facilitate visual surveillance, nationals of the Member States of the European Communities wishing to cross the common border in a motor vehicle may affix to the windscreen a green disc measuring at least eight centimetres in diameter.
# SCORE: 2.308
# EXPLAINED TEXT: This disc shall indicate that they have complied with border police rules, are carrying only goods permitted under the duty-free arrangements and have complied with exchange regulations.
# SCORE: 0.438
# The Parties shall endeavour to keep to a minimum the time spent at common borders in connection with checks on the carriage of passengers by road for hire or reward.
# SCORE: 2.014
# EXPLAINED TEXT: The Parties shall seek solutions enabling them by 1 January 1986 to waive systematic checks at their common borders on passenger waybills and licences for the carriage of passengers by road for hire or reward.
# SCORE: -0.589

The output shows the relevance of each text element at sentence granularity. The sentences most relevant to the model's completion are those with the highest positive scores.
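
To surface the most influential sentences programmatically, you can sort the explanation items by score. The sketch below uses plain (text, score) tuples copied from the printed output rather than the client's response objects, so it runs without an API call; `top_sentences` is a hypothetical helper.

```python
def top_sentences(scored: list[tuple[str, float]], n: int = 2) -> list[tuple[str, float]]:
    """Return the n sentences with the highest positive explanation score."""
    positive = [pair for pair in scored if pair[1] > 0]
    return sorted(positive, key=lambda pair: pair[1], reverse=True)[:n]


# Scores copied from the printed output; sentences abbreviated.
scored_sentences = [
    ("With regard to the movement of persons, ...", -0.908),
    ("However, they may carry out more thorough controls ...", -0.26),
    ("To facilitate visual surveillance, ...", 2.308),
    ("This disc shall indicate ...", 0.438),
    ("The Parties shall endeavour to keep to a minimum ...", 2.014),
]
result = top_sentences(scored_sentences)
for text, score in result:
    print(f"{score:6.3f}  {text}")
```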

To summarize this tutorial: we split the documents, embedded them, saved the embeddings in a database, ran a query against the database, and used the retrieved document to answer the user's question. Finally, we used our Explainability method to make the information in the completion traceable and trustworthy.