Welcome SAP-RPT-1! What is it? How can you try it out?

SAP announced SAP-RPT-1 at TechEd 2025 in Berlin. What is it good for, and how can you try it out? In this blog post, I’ll walk you through the announcement and show you how to get started, both in a no-code approach and in Python.

What is SAP-RPT-1?

RPT stands for Relational Pre-Trained Transformer, which SAP wants us to pronounce as ‘rapid one’. To make sure it is pronounced correctly, they even spell out the phonetic transcription, [ˈræpɪd] [wʌn], so no excuses here 😉.

If you are looking for a quick executive summary, check out Philipp Herzig’s keynote presentation, which explains that RPT-1 is a new type of AI model optimized for making predictions on tabular data. Unlike LLMs, which generate the next word (token) in a text sequence, RPT-1 predicts the next field in a table row. As practical examples, Philipp mentions that RPT-1 can predict delivery times or customer churn, essentially replacing traditional ML models like XGBoost or Random Forest with a single RPT-1 model.

For now, SAP-RPT-1 comes in three flavors:

SAP-RPT-1-Small: A small model optimized for super-fast predictions
SAP-RPT-1-Large: A larger model optimized for highest accuracy
SAP-RPT-1-OSS: An “open source” version for everybody to learn (GitHub / Hugging Face)

The way Philipp talked about the OSS version, I assume that this is also the version which is available in the RPT Playground, your starting point to try out RPT-1 in a no-code environment.

Trying out SAP-RPT-1 in the RPT Playground

The easiest way to get started with RPT-1 is the RPT Playground. SAP has prepared some data, but you can also upload your own tabular data (CSV files) and use RPT-1 to make predictions on it. The trick for making predictions with RPT-1 is to mask the field you want to predict with [PREDICT]. Once you click on the “Predict” button, RPT-1 will fill in the masked field with its prediction.

While SAP’s prepared use cases work well, you’ll get a better sense of RPT-1’s capabilities by trying it with your own data. To help you get started, I’ve prepared the example data from the coding tutorial (which lets you predict sales groups from Product, Price, Customer, and Country) as a CSV file for you to download. If you’d prefer to start simple, upload this file as is. Otherwise, modify it in Excel (or any other editor), save it as CSV, and try your own data in the RPT Playground.
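
If you would rather generate the upload file programmatically, the masking idea can be sketched in a few lines of Python. This is a minimal sketch using only the standard library: the helper name `to_playground_csv` and the two sample rows are my own illustrative assumptions, and the column names simply mirror the ones used in the coding tutorial. I have not checked which CSV dialects the Playground accepts, so treat the output format as a starting point.

```python
import csv
import io

PLACEHOLDER = "[PREDICT]"  # mask value described above


def to_playground_csv(rows, target, mask_ids, id_column="ID"):
    """Return CSV text in which `target` is masked for the given row IDs.

    Hypothetical helper for illustration; not part of any SAP tooling.
    """
    fieldnames = list(rows[0].keys())
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        out = dict(row)  # copy so the caller's data is left untouched
        if out[id_column] in mask_ids:
            out[target] = PLACEHOLDER
        writer.writerow(out)
    return buffer.getvalue()


rows = [
    {"ID": "1001", "PRODUCT": "Tablet", "PRICE": 599.00,
     "CUSTOMER": "TechStart Inc", "COUNTRY": "USA", "SALESGROUP": ""},
    {"ID": "1003", "PRODUCT": "Workstation", "PRICE": 1450.00,
     "CUSTOMER": "Enterprise Systems Ltd", "COUNTRY": "Canada",
     "SALESGROUP": "Enterprise Solutions"},
]

csv_text = to_playground_csv(rows, target="SALESGROUP", mask_ids={"1001"})
print(csv_text)
```

Writing to an in-memory buffer keeps the sketch side-effect free; to produce a file, write `csv_text` to a path of your choice.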

If you’re not a coder, simply skip the coding section and jump right to the conclusion.

Trying out SAP-RPT-1 in Python

If you want to get your hands dirty with code, I have prepared a Jupyter Notebook that shows in detail how to use RPT-1.

Prerequisites

Assuming you have access to the RPT Playground and that you have downloaded the Jupyter notebook, you need to fetch your access token from the bottom of the playground page.

To avoid storing your credentials in the notebook, copy & paste the token (RPT_TOKEN) into a .env file in the same folder where you have stored the Jupyter notebook. It should look like this:

RPT_TOKEN="eyJhbGciOiJIUzI1" # truncated, your token is a lot longer

Once you’ve done that, you’re ready to run the notebook.

Preparing Test Data

Here’s the test data in the format RPT-1 expects:


payload = {
    "prediction_config": {
        "target_columns": [
            {
                "name": "SALESGROUP",
                "prediction_placeholder": "[PREDICT]"
                # "task_type": "classification" or "regression" can be specified here if needed
            }
        ]
    },
    "index_column": "ID",
    "rows": [
        {
            "ID": "1001",
            "PRODUCT": "Tablet",
            "PRICE": 599.00,
            "CUSTOMER": "TechStart Inc",
            "COUNTRY": "USA",
            "SALESGROUP": "[PREDICT]"
        },
        {
            "ID": "1002",
            "PRODUCT": "Standing Desk",
            "PRICE": 325.50,
            "CUSTOMER": "Workspace Solutions",
            "COUNTRY": "Germany",
            "SALESGROUP": "[PREDICT]"
        },
        {
            "ID": "1003",
            "PRODUCT": "Workstation",
            "PRICE": 1450.00,
            "CUSTOMER": "Enterprise Systems Ltd",
            "COUNTRY": "Canada",
            "SALESGROUP": "Enterprise Solutions"
        },
        {
            "ID": "1004",
            "PRODUCT": "Laptop Pro",
            "PRICE": 1899.99,
            "CUSTOMER": "Business Corp",
            "COUNTRY": "UK",
            "SALESGROUP": "Enterprise Solutions"
        },
        {
            "ID": "1005",
            "PRODUCT": "Gaming Laptop",
            "PRICE": 1250.00,
            "CUSTOMER": "Digital Ventures",
            "COUNTRY": "USA",
            "SALESGROUP": "Enterprise Solutions"
        },
        {
            "ID": "1006",
            "PRODUCT": "Smart Watch",
            "PRICE": 299.99,
            "CUSTOMER": "Gadget Store",
            "COUNTRY": "Australia",
            "SALESGROUP": "Consumer Electronics"
        },
        {
            "ID": "1007",
            "PRODUCT": "Ergonomic Chair",
            "PRICE": 445.00,
            "CUSTOMER": "Office Outfitters",
            "COUNTRY": "France",
            "SALESGROUP": "Office Furniture"
        },
        {
            "ID": "1008",
            "PRODUCT": "Storage Array",
            "PRICE": 3500.00,
            "CUSTOMER": "CloudTech Systems",
            "COUNTRY": "Singapore",
            "SALESGROUP": "Data Infrastructure"
        },
        {
            "ID": "1009",
            "PRODUCT": "Network Switch",
            "PRICE": 175.50,
            "CUSTOMER": "ConnectIT",
            "COUNTRY": "Japan",
            "SALESGROUP": "Networking Devices"
        }
    ]
}

For better readability, here is the data in table format:


import pandas as pd

sample_data = pd.DataFrame(payload["rows"])
sample_data

	ID	PRODUCT	PRICE	CUSTOMER	COUNTRY	SALESGROUP
0	1001	Tablet	599.00	TechStart Inc	USA	[PREDICT]
1	1002	Standing Desk	325.50	Workspace Solutions	Germany	[PREDICT]
2	1003	Workstation	1450.00	Enterprise Systems Ltd	Canada	Enterprise Solutions
3	1004	Laptop Pro	1899.99	Business Corp	UK	Enterprise Solutions
4	1005	Gaming Laptop	1250.00	Digital Ventures	USA	Enterprise Solutions
5	1006	Smart Watch	299.99	Gadget Store	Australia	Consumer Electronics
6	1007	Ergonomic Chair	445.00	Office Outfitters	France	Office Furniture
7	1008	Storage Array	3500.00	CloudTech Systems	Singapore	Data Infrastructure
8	1009	Network Switch	175.50	ConnectIT	Japan	Networking Devices

As you can see, we have four attribute columns: Product, Price, Customer, and Country. Our target column is the Sales Group. The sales group is therefore masked with the value [PREDICT] in two rows (IDs 1001 and 1002), so that RPT-1 can demonstrate that it fills in the missing values.
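
If your data lives in a list of records rather than a hand-written dictionary, the same request body can be assembled with a small helper. The following is a minimal sketch, not an official client: the function name `build_payload` and the convention "a target of `None` means predict this row" are my own assumptions, while the payload layout is copied from the example above.

```python
PLACEHOLDER = "[PREDICT]"


def build_payload(records, target, index_column="ID", placeholder=PLACEHOLDER):
    """Build a request body in the layout shown above.

    Rows whose target value is None are masked with the placeholder.
    Hypothetical helper; only the payload layout comes from the example.
    """
    rows = []
    for record in records:
        row = dict(record)  # copy so the input records stay unchanged
        if row.get(target) is None:
            row[target] = placeholder
        rows.append(row)
    return {
        "prediction_config": {
            "target_columns": [
                {"name": target, "prediction_placeholder": placeholder}
            ]
        },
        "index_column": index_column,
        "rows": rows,
    }


records = [
    {"ID": "1001", "PRODUCT": "Tablet", "PRICE": 599.00,
     "CUSTOMER": "TechStart Inc", "COUNTRY": "USA", "SALESGROUP": None},
    {"ID": "1006", "PRODUCT": "Smart Watch", "PRICE": 299.99,
     "CUSTOMER": "Gadget Store", "COUNTRY": "Australia",
     "SALESGROUP": "Consumer Electronics"},
]

demo_payload = build_payload(records, target="SALESGROUP")
print(demo_payload["rows"][0]["SALESGROUP"])
```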

Running the predictions

To run the predictions, we will use the requests library to send an HTTP POST request to the RPT-1 API endpoint. The request will include our test data and the authorization token.


from dotenv import load_dotenv
import os

load_dotenv()

auth_token = os.getenv("RPT_TOKEN")
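
Note that `os.getenv` returns `None` when the variable is missing, and the request would then be sent with a useless `Bearer None` header. A small guard makes that failure obvious early. This is a minimal sketch; the helper name `require_token` is hypothetical, and the demo dictionary stands in for the real environment.

```python
import os


def require_token(env=os.environ, name="RPT_TOKEN"):
    """Return the token from `env`, or fail with a readable message.

    Hypothetical helper for illustration.
    """
    token = (env.get(name) or "").strip()
    if not token:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file before running the notebook."
        )
    return token


# Demo with an explicit mapping instead of the real environment.
demo_env = {"RPT_TOKEN": " eyJhbGciOiJIUzI1 "}
print(require_token(demo_env))
```

In the notebook you would call `require_token()` without arguments right after `load_dotenv()`.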


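import requests

# Prediction endpoint of the RPT Playground
url = "https://rpt.cloud.sap/api/predict"

# The token from the .env file is sent as a bearer token
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {auth_token}"
}

# json=payload serializes the dictionary to a JSON request body;
# the timeout (in seconds) keeps the call from hanging indefinitely
response = requests.post(url, json=payload, headers=headers, timeout=60)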
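
Before parsing the response, it is worth checking that the request actually succeeded, for example because the token has expired. The sketch below shows one way to do this; `ensure_ok` is a hypothetical helper, and `FakeResponse` is only a stand-in that mimics the `status_code` and `text` attributes of a `requests.Response` so that the example runs without network access. With a real response, the built-in `response.raise_for_status()` achieves much the same.

```python
class FakeResponse:
    """Stand-in with the attributes of requests.Response used below (demo only)."""

    def __init__(self, status_code, text=""):
        self.status_code = status_code
        self.text = text


def ensure_ok(response, max_chars=200):
    """Raise a readable error for HTTP 4xx/5xx responses, else return the response."""
    if response.status_code >= 400:
        snippet = response.text[:max_chars]  # keep error messages short
        raise RuntimeError(
            f"Request failed with HTTP {response.status_code}: {snippet}"
        )
    return response


ok = ensure_ok(FakeResponse(200, "{}"))
print(ok.status_code)

try:
    ensure_ok(FakeResponse(401, "Unauthorized"))
except RuntimeError as err:
    print(err)
```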

Analyzing the results

Let’s take a look at the results returned by RPT-1. Here are the predictions in the raw JSON format:


import json

data = response.json()
preds = data["prediction"]["predictions"]
print(json.dumps(preds, indent=2, ensure_ascii=False))

[
  {
    "ID": 1001,
    "SALESGROUP": [
      {
        "confidence": null,
        "prediction": "Enterprise Solutions"
      }
    ]
  },
  {
    "ID": 1002,
    "SALESGROUP": [
      {
        "confidence": null,
        "prediction": "Office Furniture"
      }
    ]
  }
]
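
Each target column holds a list of candidates, so a small lookup table from row ID to predicted value makes the result easier to work with. This is a minimal sketch that parses the JSON output shown above; the helper name `predictions_by_id` is my own, and taking the first list entry as "the" prediction is an assumption based on this single response.

```python
import json

# The raw predictions exactly as printed above
raw = """
[
  {"ID": 1001, "SALESGROUP": [{"confidence": null, "prediction": "Enterprise Solutions"}]},
  {"ID": 1002, "SALESGROUP": [{"confidence": null, "prediction": "Office Furniture"}]}
]
"""
example_preds = json.loads(raw)


def predictions_by_id(preds, target, index_column="ID"):
    """Map each row ID to the first predicted value for `target`."""
    result = {}
    for pred in preds:
        candidates = pred.get(target) or []
        if candidates:  # skip rows without a prediction for this column
            result[pred[index_column]] = candidates[0]["prediction"]
    return result


lookup = predictions_by_id(example_preds, "SALESGROUP")
print(lookup)
# → {1001: 'Enterprise Solutions', 1002: 'Office Furniture'}
```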

Let’s merge these predictions back into the tabular format to see the results more clearly.


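# Work on a copy so that sample_data keeps the masked values for comparison
preds_df = sample_data.copy(deep=True)

# The request used string IDs, but the response returns integers,
# so convert the column before matching
preds_df["ID"] = preds_df["ID"].astype(int)

# Overwrite each [PREDICT] placeholder with the predicted value
for pred in preds:
    row_idx = int(pred["ID"])
    predicted_value = pred["SALESGROUP"][0]["prediction"]
    preds_df.loc[preds_df["ID"] == row_idx, "SALESGROUP"] = predicted_value

preds_df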

	ID	PRODUCT	PRICE	CUSTOMER	COUNTRY	SALESGROUP
0	1001	Tablet	599.00	TechStart Inc	USA	Enterprise Solutions
1	1002	Standing Desk	325.50	Workspace Solutions	Germany	Office Furniture
2	1003	Workstation	1450.00	Enterprise Systems Ltd	Canada	Enterprise Solutions
3	1004	Laptop Pro	1899.99	Business Corp	UK	Enterprise Solutions
4	1005	Gaming Laptop	1250.00	Digital Ventures	USA	Enterprise Solutions
5	1006	Smart Watch	299.99	Gadget Store	Australia	Consumer Electronics
6	1007	Ergonomic Chair	445.00	Office Outfitters	France	Office Furniture
7	1008	Storage Array	3500.00	CloudTech Systems	Singapore	Data Infrastructure
8	1009	Network Switch	175.50	ConnectIT	Japan	Networking Devices

For comparison, here is the original sample data before predictions:


sample_data

	ID	PRODUCT	PRICE	CUSTOMER	COUNTRY	SALESGROUP
0	1001	Tablet	599.00	TechStart Inc	USA	[PREDICT]
1	1002	Standing Desk	325.50	Workspace Solutions	Germany	[PREDICT]
2	1003	Workstation	1450.00	Enterprise Systems Ltd	Canada	Enterprise Solutions
3	1004	Laptop Pro	1899.99	Business Corp	UK	Enterprise Solutions
4	1005	Gaming Laptop	1250.00	Digital Ventures	USA	Enterprise Solutions
5	1006	Smart Watch	299.99	Gadget Store	Australia	Consumer Electronics
6	1007	Ergonomic Chair	445.00	Office Outfitters	France	Office Furniture
7	1008	Storage Array	3500.00	CloudTech Systems	Singapore	Data Infrastructure
8	1009	Network Switch	175.50	ConnectIT	Japan	Networking Devices

Conclusion

As we can see, RPT-1 has filled in the missing sales groups based on the patterns contained in the context, i.e., the other entries. Essentially, the non-masked rows serve as few-shot examples that RPT-1 uses for in-context learning. This behavior is comparable to an LLM imitating examples you provide in a prompt. Here’s a simple example of such a prompt:

Write an executive summary for the Lord of the Rings.

Here is an example for Romeo and Juliet:
Boy and girl fall in love, in the end they both die.

If you try this prompt, you might get a result like this:

Guy inherits ring. Walks a lot. Destroys it.

Similarly, RPT-1 uses the non-masked rows as examples to predict the masked values. The key difference is that RPT-1 does not generate free-form text; instead, it predicts the most likely value for each masked field, based on the patterns it has learned during training.
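
To see which rows act as examples and which ones are being asked about, you can split the rows yourself before sending them. The sketch below is only a way to inspect your own data and says nothing about how RPT-1 works internally; the helper name `split_context_and_queries` and the shortened sample rows are illustrative assumptions.

```python
from collections import Counter

PLACEHOLDER = "[PREDICT]"


def split_context_and_queries(rows, target, placeholder=PLACEHOLDER):
    """Separate labeled example rows from rows that still need a prediction."""
    context = [row for row in rows if row[target] != placeholder]
    queries = [row for row in rows if row[target] == placeholder]
    return context, queries


rows = [
    {"ID": "1001", "PRODUCT": "Tablet", "SALESGROUP": "[PREDICT]"},
    {"ID": "1003", "PRODUCT": "Workstation", "SALESGROUP": "Enterprise Solutions"},
    {"ID": "1004", "PRODUCT": "Laptop Pro", "SALESGROUP": "Enterprise Solutions"},
    {"ID": "1007", "PRODUCT": "Ergonomic Chair", "SALESGROUP": "Office Furniture"},
]

context, queries = split_context_and_queries(rows, target="SALESGROUP")

# Count how often each label appears among the example rows
label_counts = Counter(row["SALESGROUP"] for row in context)
print(len(context), "example rows,", len(queries), "row(s) to predict")
print(label_counts.most_common())
```

A label that appears only once (or not at all) among the example rows gives the model very little to imitate, which is worth keeping in mind when you pick your own test data.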

Of course, the example we used is very simple, and it will be interesting to see how RPT-1 performs on more complex datasets. For now, I hope you are all set to try SAP-RPT-1 with your own data. Happy experimenting!

Reuse

CC BY 4.0