Serving DeepSeek-R1 with Vertex AI Model Garden

HU, Pili

Deploy

Go to Model Garden and search for deepseek-ai/deepseek-r1. With one click, you can deploy a DeepSeek endpoint on Vertex AI. Deployment takes about 30 minutes, after which the endpoint information is available.

Find endpoints

On the Model Garden page, find "view my endpoints and models". Note that the page lists models and endpoints by region (selected from a dropdown).

The region matters because it is used in two places: 1) as part of the endpoint hostname; 2) as an argument to the library or as part of the resource path.

Note the project number (digits) and the endpoint_id (digits). Both are needed in the client code.
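To make the role of these three values concrete, here is a minimal sketch of how they combine. The helper names `regional_api_endpoint` and `endpoint_resource_path` are illustrative and not part of any library, and the numeric IDs are placeholders. The hostname pattern matches the `api_endpoint` default in the sample code used later in this article, and the path follows the `projects/.../locations/.../endpoints/...` resource name format.

```python
def regional_api_endpoint(location: str) -> str:
    """Build the regional API hostname from the region name."""
    return f"{location}-aiplatform.googleapis.com"


def endpoint_resource_path(project: str, location: str, endpoint_id: str) -> str:
    """Build the full resource name of an endpoint."""
    return f"projects/{project}/locations/{location}/endpoints/{endpoint_id}"


# Placeholder values; replace them with the ones shown on your endpoints page.
print(regional_api_endpoint("asia-southeast1"))
print(endpoint_resource_path("1234567890", "asia-southeast1", "9876543210"))
```

If the region in the hostname and the region in the path disagree, the request is unlikely to reach the endpoint, so it is safer to derive both from a single variable.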

Client code in one

Just in case you do not have the dependency:

pip install --upgrade --user --quiet "google-cloud-aiplatform[reasoningengine]"

Set the variables obtained in the previous section:

PROJECT_ID = "SSSS"
LOCATION = "asia-southeast1"
PROJECT_NUMBER = 'DDDD'
ENDPOINT_ID = 'DDDD'

Utility 1:

# Copied from: https://github.com/googleapis/python-aiplatform/blob/main/samples/snippets/prediction_service/predict_custom_trained_model_sample.py

from typing import Dict, List, Union
from google.cloud import aiplatform
from google.protobuf import json_format
from google.protobuf.struct_pb2 import Value

def predict_custom_trained_model_sample(
    project: str,
    endpoint_id: str,
    instances: Union[Dict, List[Dict]],
    location: str = "us-central1",
    api_endpoint: str = "us-central1-aiplatform.googleapis.com",
):
    """
    `instances` can be either single instance of type dict or a list
    of instances.
    """
    # The AI Platform services require regional API endpoints.
    client_options = {"api_endpoint": api_endpoint}
    # Initialize client that will be used to create and send requests.
    # This client only needs to be created once, and can be reused for multiple requests.
    client = aiplatform.gapic.PredictionServiceClient(client_options=client_options)
    # The format of each instance should conform to the deployed model's prediction input schema.
    instances = instances if isinstance(instances, list) else [instances]
    instances = [
        json_format.ParseDict(instance_dict, Value()) for instance_dict in instances
    ]
    parameters_dict = {}
    parameters = json_format.ParseDict(parameters_dict, Value())
    endpoint = client.endpoint_path(
        project=project, location=location, endpoint=endpoint_id
    )
    response = client.predict(
        endpoint=endpoint, instances=instances, parameters=parameters
    )
    print("response")
    print(" deployed_model_id:", response.deployed_model_id)
    # The predictions are a google.protobuf.Value representation of the model's predictions.
    predictions = response.predictions
    for prediction in predictions:
        print(" prediction:", prediction)
    return response

Utility 2:

import os
import vertexai

vertexai.init(project=PROJECT_ID, location=LOCATION) #, staging_bucket=BUCKET_URI)

def chat(query):
  prediction_request = {
      "instances": [
          {
              "@requestFormat": "chatCompletions",
              "messages": [
                  {
                      "role": "user",
                      "content": query,
                  }
              ],
              "max_tokens": 2048,
              "temperature": 0.7,
          }
      ]
  }
  api_endpoint = f"{LOCATION}-aiplatform.googleapis.com"
  response = predict_custom_trained_model_sample(
      project=PROJECT_NUMBER,
      endpoint_id=ENDPOINT_ID,
      location=LOCATION,
      instances=prediction_request['instances'],
      api_endpoint = api_endpoint,
  )
  # Take the first element of the first prediction and return its message text.
  m = response.predictions[0][0]
  return m['message']['content']

Now it is time to chat:

response_text = chat('what are the three numbers after 4, 9, 16, 25')
print('response_text:')
print(response_text)
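The text returned for this model contains the reasoning wrapped in `<think>...</think>` tags, followed by the final answer (see the output in the Tests section). If only the answer is needed, the two parts can be separated. The sketch below is one way to do it; `split_reasoning` is a hypothetical helper, and it assumes at most one `<think>` block per response.

```python
import re
from typing import Tuple

# Non-greedy match so only the first <think>...</think> block is captured.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> Tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no <think> block exists."""
    match = THINK_PATTERN.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the <think> block is treated as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


sample = "<think>\nDifferences are 5, 7, 9.\n</think>\n\nThe next numbers are 36, 49, 64."
reasoning, answer = split_reasoning(sample)
print("reasoning:", reasoning)
print("answer:", answer)
```

In practice this would be applied to the string returned by `chat(...)`.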

Tests

Derive perfect square

Request:

response_text = chat('what are the three numbers after 4, 9, 16, 25')
print('response_text:')
print(response_text)

Response:

<think>
To determine the next three numbers in the sequence 4, 9, 16, 25, I first observe the pattern between consecutive terms.

Calculating the differences between each pair of numbers:
- 9 - 4 = 5
- 16 - 9 = 7
- 25 - 16 = 9

I notice that the differences are increasing by 2 each time, which suggests a quadratic relationship.

Assuming the nth term of the sequence can be represented by a quadratic function, I set up the equation:
\[ a_n = an^2 + bn + c \]

Using the known terms to create a system of equations:
- For n=1: \( a(1)^2 + b(1) + c = 4 \) → \( a + b + c = 4 \)
- For n=2: \( a(2)^2 + b(2) + c = 9 \) → \( 4a + 2b + c = 9 \)
- For n=3: \( a(3)^2 + b(3) + c = 16 \) → \( 9a + 3b + c = 16 \)

Solving this system:
1. Subtract the first equation from the second:
   \( 3a + b = 5 \)
2. Subtract the second equation from the third:
   \( 5a + b = 7 \)
3. Subtract the new equations to eliminate b:
   \( 2a = 2 \) → \( a = 1 \)
4. Substitute a back to find b:
   \( 3(1) + b = 5 \) → \( b = 2 \)
5. Finally, substitute a and b to find c:
   \( 1 + 2 + c = 4 \) → \( c = 1 \)

Thus, the quadratic function representing the nth term is:
\[ a_n = n^2 + 2n + 1 \]

Using this formula to find the next three terms:
- For n=5: \( 5^2 + 2(5) + 1 = 36 \)
- For n=6: \( 6^2 + 2(6) + 1 = 49 \)
- For n=7: \( 7^2 + 2(7) + 1 = 64 \)

Therefore, the next three numbers in the sequence are 36, 49, and 64.
</think>

To identify the next three numbers in the sequence \(4, 9, 16, 25\), let's analyze the pattern.

### Step 1: Identify the Pattern

Looking at the given numbers:
- \(4 = 2^2\)
- \(9 = 3^2\)
- \(16 = 4^2\)
- \(25 = 5^2\)

It appears that each term in the sequence is a perfect square. Specifically, the \(n\)-th term corresponds to \((n+1)^2\), where \(n\) starts from 1.

### Step 2: Determine the Next Terms

To find the next three numbers, we'll continue the pattern:

1. **Next Term (5th term):**
   \[
   (5 + 1)^2 = 6^2 = 36
   \]

2. **Following Term (6th term):**
   \[
   (6 + 1)^2 = 7^2 = 49
   \]

3. **Final Term (7th term):**
   \[
   (7 + 1)^2 = 8^2 = 64
   \]

### Step 3: Conclusion

Therefore, the next three numbers in the sequence are:

\[
\boxed{36,\ 49,\ 64}
\]
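The model's two derivations (the quadratic fit in the thinking part and the perfect-square pattern in the final answer) can be checked against each other with a few lines of Python. This is only a sanity check written for this article; the function names are illustrative.

```python
def perfect_square_term(n: int) -> int:
    """n-th term according to the final answer: (n + 1)^2, with n starting at 1."""
    return (n + 1) ** 2


def quadratic_term(n: int) -> int:
    """n-th term according to the thinking part: n^2 + 2n + 1."""
    return n * n + 2 * n + 1


# The two formulas are algebraically identical; confirm it numerically as well.
assert all(perfect_square_term(n) == quadratic_term(n) for n in range(1, 100))

given = [perfect_square_term(n) for n in range(1, 5)]
next_three = [perfect_square_term(n) for n in range(5, 8)]
print(given)       # [4, 9, 16, 25]
print(next_three)  # [36, 49, 64]
```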

Reason about the impact of a presidential election

Query:

my client owns BYD, GOOD, HKEX, 0700. Please rank in order of impact level when the news comes out that Trump is reelected as president.

Response: (the long thinking and full output are omitted; only the key conclusion is shown)

1. **BYD**: As a Chinese electric vehicle and energy storage company, it is likely to face significant challenges due to Trump's trade policies, including tariffs and support for domestic manufacturing.

2. **0700 (Tencent)**: A major Chinese tech company, it could be impacted by Trump's restrictions on Chinese tech firms, including potential delisting threats and operational limitations in the U.S.

3. **HKEX**: As a key stock exchange in Hong Kong, it may face challenges due to U.S.-China tensions, including reduced IPO activity and geopolitical instability affecting its operations.

4. **GOOD (Gladstone Commercial)**: As a U.S.-based real estate investment trust, it is likely to be the least impacted and may even benefit from certain economic policies under Trump's administration.

Reframe and typo fix

Note that in the above example, the symbol GOOD was meant to be GOOG. We tested the same question with Gemini and GPT-4o. Both failed to correct GOOD. Gemini 2.0 Flash with Google Search grounding enabled, however, automatically detected the issue and wrote "GOOD (Alphabet Inc.)" in its response.

Then we ask DeepSeek to reframe:

The users original question was "my client owns BYD, GOOD, HKEX, 0700. Please rank in order of impact level when the news comes out that Trump is reelected as president." Please help to reframe the question and help to fix typos.

The response from DeepSeek is neither a reframed question nor a typo fix. Instead, it tried to answer the original question inside the double quotes. The long output is omitted.

We ask Gemini 1.5 Flash to do the reframing:
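For anyone who wants to repeat this test, the sketch below rebuilds the prompt exactly as it was sent, and adds a second variant that puts the instruction first and fences the original question in explicit delimiters. The second variant is an untested assumption: this article did not measure whether it changes DeepSeek's behavior. Both function names and the `<original_question>` tag are hypothetical.

```python
def reframe_prompt_as_tested(question: str) -> str:
    """Rebuild the reframing prompt in the form used in this article."""
    return (
        f'The users original question was "{question}" '
        "Please help to reframe the question and help to fix typos."
    )


def reframe_prompt_delimited(question: str) -> str:
    """Untested variant: instruction first, question fenced by delimiters."""
    return (
        "Do not answer the question below. Rewrite it so that it is clearer, "
        "and fix any typos.\n"
        "<original_question>\n"
        f"{question}\n"
        "</original_question>"
    )


original = (
    "my client owns BYD, GOOD, HKEX, 0700. Please rank in order of impact "
    "level when the news comes out that Trump is reelected as president."
)
print(reframe_prompt_as_tested(original))
print(reframe_prompt_delimited(original))
```

Either string can be passed to `chat(...)` to compare the responses side by side.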

The original question has a few issues: typos ("GOOD" is likely a ticker symbol needing clarification), ambiguity (what kind of "impact" is being measured – stock price movement, geopolitical consequences, etc.?), and an assumption (that Trump's reelection would have a significant impact on all listed entities).

Here are a few ways to reframe the question, each addressing different aspects of the ambiguity:

Option 1 (Focus on Stock Market Impact):

"My client holds shares in BYD, [Company Name corresponding to "GOOD"], HKEX, and Tencent Holdings (0700). Assuming a Trump reelection, how would you rank these holdings in terms of expected short-term stock price volatility? Please specify the reasoning behind your ranking."

... [Option 2 and Option 3 are omitted]...

Before answering any reframed question, it's crucial to identify the ticker symbol "GOOD" and replace it with the correct company name. Without this information, a complete and accurate ranking is impossible.

In this test, Gemini's instruction following is good: the task requires the model to understand the language structure and to treat the quoted content as material to rewrite rather than as a question to answer.

Cost

The one-click deployed endpoint costs USD 1,600 per day.
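To put the daily figure in perspective, the sketch below converts it to an hourly rate and a 30-day total. It assumes the endpoint is left running continuously and that cost scales linearly with time; actual billing depends on the machine type, accelerators, and region chosen at deployment.

```python
DAILY_COST_USD = 1600.0  # figure observed in this experiment


def cost_for(days: float, daily_cost: float = DAILY_COST_USD) -> float:
    """Estimated cost of keeping the endpoint deployed for `days` days."""
    return days * daily_cost


hourly = DAILY_COST_USD / 24
print(f"per hour: USD {hourly:.2f}")        # per hour: USD 66.67
print(f"30 days:  USD {cost_for(30):,.0f}")  # 30 days:  USD 48,000
```

At this rate it is worth undeploying the model as soon as the experiment is finished.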

References


  • AAIE notebook.
  • The "sample request" on Vertex AI Model Garden.
  • PredictionServiceClient sample request.
  • The version name used in this experiment is deepseek-ai/DeepSeek-R1-Distill-Llama-70B.