Using SAP-RPT-1 in the playground is a nice way to get a first impression, but you’ll soon realize there are some limitations. In this blog post, let’s explore how to use SAP-RPT-1 in a production-like environment. I’ll walk you through the steps to build a simple “Hello World” application using SAP-RPT-1 and AI Core.
Creating a Deployment for SAP-RPT-1
Throughout this blog post, I’ll assume that you have access to a BTP subaccount with instances of the AI Core service (extended plan) and the AI Launchpad service (standard plan). The first step is to create a configuration with the following values:
Configuration Name: For example, sap-rpt-1-large or sap-rpt-1-small (I recommend naming the configuration so that you can easily recognize the underlying model)
Scenario: foundation-models
Version: 0.0.1 (default value)
Executable: aicore-sap (meaning that this model is hosted by SAP; you can find this value in note 3437766)
Model Name: sap-rpt-1-large or sap-rpt-1-small (again, as per note 3437766)
Model Version: latest
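If you prefer to script this step instead of clicking through AI Launchpad, the same values can be expressed as a JSON body for the AI Core configuration API. The following is only a sketch: the field names (`executableId`, `scenarioId`, `versionId`, `parameterBindings`, `inputArtifactBindings`) and the binding keys `modelName` and `modelVersion` reflect my understanding of that API and should be checked against the AI Core documentation, and `build_configuration` is a hypothetical helper.

```python
import json


def build_configuration(model_name: str) -> dict:
    """Mirror the configuration form values as a JSON-style dict.

    Assumption: the field names below follow the AI Core configuration API;
    verify them against the official API reference before use.
    """
    return {
        "name": model_name,                  # Configuration Name
        "scenarioId": "foundation-models",   # Scenario
        "versionId": "0.0.1",                # Version (default value)
        "executableId": "aicore-sap",        # Executable
        "parameterBindings": [
            {"key": "modelName", "value": model_name},   # Model Name
            {"key": "modelVersion", "value": "latest"},  # Model Version
        ],
        "inputArtifactBindings": [],
    }


print(json.dumps(build_configuration("sap-rpt-1-large"), indent=2))
```

Using the model name as the configuration name follows the naming recommendation above.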
In the next step, you can turn this configuration into a deployment by clicking on the “Create Deployment” button in the configuration overview. You can leave all the default values as they are and create the deployment.
After a few minutes, the deployment should reach the status “Running” and you can start using it. Note that a deployment URL has also been created, something like https://api.ai.prod.eu-central-1.aws.ml.hana.ondemand.com/v2/inference/deployments/1234567890abcdef. The most important part is the deployment ID at the end of the URL (in this example 1234567890abcdef), which you’ll need later on.
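If you copy the full deployment URL, the ID can be cut out with a few lines of Python. This is a small sketch using only the standard library; `deployment_id_from_url` is a hypothetical helper name, and the URL is the example from above.

```python
from urllib.parse import urlparse


def deployment_id_from_url(deployment_url: str) -> str:
    """Return the last path segment of a deployment URL."""
    path = urlparse(deployment_url).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


example_url = (
    "https://api.ai.prod.eu-central-1.aws.ml.hana.ondemand.com"
    "/v2/inference/deployments/1234567890abcdef"
)
print(deployment_id_from_url(example_url))  # 1234567890abcdef
```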
Handling AI Core Authentication
At the time of writing, the SAP Cloud SDK for Python does not yet support the RPT-1 models. Following the documentation, we will therefore use the requests library to call the deployment endpoint directly.
Before we can do that, however, we need to authenticate, as also described in the documentation. What may look intimidating at first is actually quite straightforward. We only need a few parameters, which we’ll store securely in a .env file. You can collect this information from the service key of your AI Core instance.
AICORE_API_URL="https://api.ai.prod.eu-central-1.aws.ml.hana.ondemand.com/v2"
AICORE_AUTH_URL="https://a1b2c3d4e4f61234.authentication.eu10.hana.ondemand.com" # called "url" in the service key
AICORE_CLIENT_ID="sb-95d9f03d" # shortened, yours will be longer
AICORE_CLIENT_SECRET="c51ad55c" # shortened, yours will be longer
AICORE_RESOURCE_GROUP="default" # your actual resource group
Additionally, store the RPT deployment ID(s) you collected earlier:
RPT1L_DEPLOYMENT_ID="1234567890abcdef" # your actual deployment ID for RPT-1 large
RPT1S_DEPLOYMENT_ID="abcdef1234567890" # your actual deployment ID for RPT-1 small
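Before going further, it can save some debugging time to check that all of these settings actually made it into the environment. Here is a minimal sketch; it assumes the .env file has already been loaded (for example with `load_dotenv()` from the python-dotenv package, which is my assumption and not something the steps above prescribe). `REQUIRED_SETTINGS` and `missing_settings` are hypothetical names.

```python
import os

# Settings used in this walkthrough; AICORE_RESOURCE_GROUP is left out
# because it falls back to "default" later on.
REQUIRED_SETTINGS = [
    "AICORE_API_URL",
    "AICORE_AUTH_URL",
    "AICORE_CLIENT_ID",
    "AICORE_CLIENT_SECRET",
    "RPT1L_DEPLOYMENT_ID",
]


def missing_settings(env=None):
    """Return the names of required settings that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]


# Demonstration with a plain dict instead of the real environment
demo_env = {"AICORE_API_URL": "https://example.invalid/v2", "AICORE_CLIENT_ID": ""}
print(missing_settings(demo_env))
```

Calling `missing_settings()` without arguments checks the real process environment.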
Once the .env file is ready, we can use the following code to retrieve an access token from the authentication endpoint:
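A minimal sketch of such a token request, assuming the standard OAuth client-credentials flow against the `/oauth/token` path of the auth URL. The helper names `build_token_request` and `fetch_access_token` are hypothetical, and the .env values are expected to be in the environment already.

```python
def build_token_request(auth_url: str, client_id: str, client_secret: str) -> dict:
    """Assemble the pieces of the token request without sending it.

    Assumption: the token endpoint is the service key "url" plus /oauth/token.
    """
    return {
        "url": auth_url.rstrip("/") + "/oauth/token",
        "auth": (client_id, client_secret),  # sent as HTTP basic auth
        "data": {"grant_type": "client_credentials"},
    }


def fetch_access_token(auth_url: str, client_id: str, client_secret: str, timeout: int = 30) -> str:
    """Request a bearer token and return it as a string."""
    import requests  # third-party; imported here so the helper above works without it

    req = build_token_request(auth_url, client_id, client_secret)
    response = requests.post(req["url"], auth=req["auth"], data=req["data"], timeout=timeout)
    response.raise_for_status()
    return response.json()["access_token"]


# Usage (needs network access and real credentials):
# import os
# access_token = fetch_access_token(
#     os.environ["AICORE_AUTH_URL"],
#     os.environ["AICORE_CLIENT_ID"],
#     os.environ["AICORE_CLIENT_SECRET"],
# )
```

Keeping the request assembly separate from the network call makes it easy to inspect what will be sent before any secret leaves your machine.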
For testing the connection to the RPT-1 deployment, we’ll use the same test data we used when trying out RPT-1 in the playground. Here’s the test data in the format RPT-1 expects:
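As a sketch, the payload can be rebuilt from the table shown further down. The row values are transcribed from that table; the value types (integers for ID, floats for PRICE) and the request fields other than `rows` (`prediction_config`, `target_columns`, `prediction_placeholder`, `task_type`, `index_column`) are my assumptions about the RPT-1 request format and should be checked against the documentation.

```python
PLACEHOLDER = "[PREDICT]"  # marks the cells RPT-1 should fill in

COLUMNS = ["ID", "PRODUCT", "PRICE", "CUSTOMER", "COUNTRY", "SALESGROUP"]
RAW_ROWS = [
    (1001, "Tablet", 599.00, "TechStart Inc", "USA", PLACEHOLDER),
    (1002, "Standing Desk", 325.50, "Workspace Solutions", "Germany", PLACEHOLDER),
    (1003, "Workstation", 1450.00, "Enterprise Systems Ltd", "Canada", "Enterprise Solutions"),
    (1004, "Laptop Pro", 1899.99, "Business Corp", "UK", "Enterprise Solutions"),
    (1005, "Gaming Laptop", 1250.00, "Digital Ventures", "USA", "Enterprise Solutions"),
    (1006, "Smart Watch", 299.99, "Gadget Store", "Australia", "Consumer Electronics"),
    (1007, "Ergonomic Chair", 445.00, "Office Outfitters", "France", "Office Furniture"),
    (1008, "Storage Array", 3500.00, "CloudTech Systems", "Singapore", "Data Infrastructure"),
    (1009, "Network Switch", 175.50, "ConnectIT", "Japan", "Networking Devices"),
]

payload = {
    # Assumed request fields: which column to predict and how it is marked
    "prediction_config": {
        "target_columns": [
            {
                "name": "SALESGROUP",
                "prediction_placeholder": PLACEHOLDER,
                "task_type": "classification",
            }
        ]
    },
    "index_column": "ID",  # assumed: column that identifies each row
    "rows": [dict(zip(COLUMNS, values)) for values in RAW_ROWS],
}

to_predict = [row["ID"] for row in payload["rows"] if row["SALESGROUP"] == PLACEHOLDER]
print(to_predict)  # [1001, 1002]
```

The rows without a placeholder act as examples from which the model infers the missing SALESGROUP values.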
Note: I also created a Jupyter notebook for the next steps, which you can download to work through them interactively.
For better readability, here is the data in table format:
Code
import pandas as pd

sample_data = pd.DataFrame(payload["rows"])
sample_data
   ID    PRODUCT            PRICE  CUSTOMER                COUNTRY    SALESGROUP
0  1001  Tablet            599.00  TechStart Inc           USA        [PREDICT]
1  1002  Standing Desk     325.50  Workspace Solutions     Germany    [PREDICT]
2  1003  Workstation      1450.00  Enterprise Systems Ltd  Canada     Enterprise Solutions
3  1004  Laptop Pro       1899.99  Business Corp           UK         Enterprise Solutions
4  1005  Gaming Laptop    1250.00  Digital Ventures        USA        Enterprise Solutions
5  1006  Smart Watch       299.99  Gadget Store            Australia  Consumer Electronics
6  1007  Ergonomic Chair   445.00  Office Outfitters       France     Office Furniture
7  1008  Storage Array    3500.00  CloudTech Systems       Singapore  Data Infrastructure
8  1009  Network Switch    175.50  ConnectIT               Japan      Networking Devices
Running the Predictions
To run the predictions, we will use the requests library to send an HTTP POST request to the RPT-1 deployment endpoint. The request will include our test data and the authorization token.
Code
import os

import requests

# Connection settings from the .env file (already loaded into the environment)
AICORE_API_URL = os.environ["AICORE_API_URL"].rstrip("/")
AICORE_RESOURCE_GROUP = os.environ.get("AICORE_RESOURCE_GROUP", "default")
RPT1_DEPLOYMENT_ID = os.environ.get("RPT1L_DEPLOYMENT_ID")
if not RPT1_DEPLOYMENT_ID:
    raise ValueError("Missing RPT1L_DEPLOYMENT_ID in .env (deployment id from AI Launchpad).")

# Prediction endpoint of the RPT-1 deployment
url = f"{AICORE_API_URL}/inference/deployments/{RPT1_DEPLOYMENT_ID}/predict"
headers = {
    "Authorization": f"Bearer {access_token}",
    "AI-Resource-Group": AICORE_RESOURCE_GROUP,
    "Content-Type": "application/json",
    "Accept": "application/json",
}

# access_token and payload come from the earlier steps
response = requests.post(url, headers=headers, json=payload, timeout=120)
response.raise_for_status()
Analyzing the Results
Let’s take a look at the results returned by RPT-1. Here are the predictions in the raw JSON format:
Note that the result looks different from what we saw in the playground, which hosts the RPT-1-OSS model. RPT-1-Large also returns a confidence score for each prediction.
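The merge step that follows iterates over a list called `preds`, where each entry carries the row `ID` and a `SALESGROUP` list whose first element has a `prediction` key. One way to pull that list out of the response body without depending on the exact nesting is a small recursive search. This is a sketch under assumptions: the key name `predictions` and the wrapper structure in the example are guesses, `find_predictions` is a hypothetical helper, and the stand-in body only reuses values from the results table further down.

```python
def find_predictions(body):
    """Depth-first search for the first list stored under a 'predictions' key.

    Assumption: the response keeps its per-row results under that key name.
    Returns None if no such list exists.
    """
    if isinstance(body, dict):
        value = body.get("predictions")
        if isinstance(value, list):
            return value
        children = body.values()
    elif isinstance(body, list):
        children = body
    else:
        return None
    for child in children:
        found = find_predictions(child)
        if found is not None:
            return found
    return None


# Stand-in body: the outer wrapper is made up, the inner shape matches
# what the merge step expects.
example_body = {
    "prediction": {
        "predictions": [
            {"ID": 1001, "SALESGROUP": [{"prediction": "Enterprise Solutions"}]},
            {"ID": 1002, "SALESGROUP": [{"prediction": "Office Furniture"}]},
        ]
    }
}

preds = find_predictions(example_body)
print([p["SALESGROUP"][0]["prediction"] for p in preds])
```

With a real response, the call would be `preds = find_predictions(response.json())`.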
Let’s merge these predictions back into the tabular format to see the results more clearly:
Code
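# Work on a copy so that sample_data keeps the [PREDICT] placeholders
preds_df = sample_data.copy(deep=True)
preds_df["ID"] = preds_df["ID"].astype(int)

# Write each predicted value into the row with the matching ID
for pred in preds:
    row_idx = int(pred["ID"])
    predicted_value = pred["SALESGROUP"][0]["prediction"]
    preds_df.loc[preds_df["ID"] == row_idx, "SALESGROUP"] = predicted_value

preds_df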
   ID    PRODUCT            PRICE  CUSTOMER                COUNTRY    SALESGROUP
0  1001  Tablet            599.00  TechStart Inc           USA        Enterprise Solutions
1  1002  Standing Desk     325.50  Workspace Solutions     Germany    Office Furniture
2  1003  Workstation      1450.00  Enterprise Systems Ltd  Canada     Enterprise Solutions
3  1004  Laptop Pro       1899.99  Business Corp           UK         Enterprise Solutions
4  1005  Gaming Laptop    1250.00  Digital Ventures        USA        Enterprise Solutions
5  1006  Smart Watch       299.99  Gadget Store            Australia  Consumer Electronics
6  1007  Ergonomic Chair   445.00  Office Outfitters       France     Office Furniture
7  1008  Storage Array    3500.00  CloudTech Systems       Singapore  Data Infrastructure
8  1009  Network Switch    175.50  ConnectIT               Japan      Networking Devices
For comparison, here is the original sample data before predictions:
Code
sample_data
   ID    PRODUCT            PRICE  CUSTOMER                COUNTRY    SALESGROUP
0  1001  Tablet            599.00  TechStart Inc           USA        [PREDICT]
1  1002  Standing Desk     325.50  Workspace Solutions     Germany    [PREDICT]
2  1003  Workstation      1450.00  Enterprise Systems Ltd  Canada     Enterprise Solutions
3  1004  Laptop Pro       1899.99  Business Corp           UK         Enterprise Solutions
4  1005  Gaming Laptop    1250.00  Digital Ventures        USA        Enterprise Solutions
5  1006  Smart Watch       299.99  Gadget Store            Australia  Consumer Electronics
6  1007  Ergonomic Chair   445.00  Office Outfitters       France     Office Furniture
7  1008  Storage Array    3500.00  CloudTech Systems       Singapore  Data Infrastructure
8  1009  Network Switch    175.50  ConnectIT               Japan      Networking Devices
Conclusion
We’ve successfully transitioned from using SAP-RPT-1 in the playground to deploying and using it via AI Core. This setup allows you to use RPT-1 in a scalable, production-like environment. On the playground, for example, you quickly hit limits if you try more complex scenarios or larger datasets. Using a deployment via AI Core overcomes these limitations.
I already have quite a few ideas for building more complex applications using RPT-1 and AI Core. Which scenarios will you build first? Let me know in the comments!