Chat Completions
Streaming
Stream chat completion responses token by token using server-sent events.
DEVUP AI supports streaming responses via server-sent events (SSE), the same protocol as OpenAI. Set stream: true in your request to enable it.
Examples
from openai import OpenAI

client = OpenAI(
    api_key="$DEVUP_API_KEY",
    base_url="https://api.devupai.com/v1",
)

stream = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
)

for event in stream:
    if event.choices[0].finish_reason:
        print(event.choices[0].finish_reason,
              event.usage.prompt_tokens,
              event.usage.completion_tokens)
    elif event.choices[0].delta.content:
        print(event.choices[0].delta.content, end="", flush=True)
SSE format
Each streamed chunk is a data: line containing a JSON object:
json
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"delta":{"role":"assistant","content":"Hello"}}]}
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"delta":{}}],"usage":{"prompt_tokens":10,"completion_tokens":1}}
data: [DONE]
The final chunk before [DONE] contains usage information.
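If you consume the stream without an SDK, each `data:` line can be decoded with a few lines of Python. This is a minimal sketch; the `parse_sse_chunk` helper below is illustrative, not part of any SDK:

```python
import json

def parse_sse_chunk(line: str):
    """Parse one SSE 'data:' line into a dict.

    Returns None for the [DONE] sentinel and for non-data lines
    (comments, blank keep-alive lines)."""
    if not line.startswith("data:"):
        return None
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        return None
    return json.loads(payload)

# Example: extract the delta content from a chunk line.
chunk = parse_sse_chunk(
    'data: {"id":"chatcmpl-1","object":"chat.completion.chunk",'
    '"choices":[{"delta":{"role":"assistant","content":"Hello"}}]}'
)
print(chunk["choices"][0]["delta"].get("content"))  # → Hello
```

The same loop works for any SSE transport (raw sockets, `httpx`, `requests` with `stream=True`), since the wire format is just newline-delimited `data:` lines.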
Notes
- Streaming works for all supported models.
- Usage stats are available in the last chunk (when finish_reason is set).
- The completion_tokens and prompt_tokens counts are the same as for a non-streaming request.
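Putting the notes above together, reassembling a streamed response means concatenating the delta contents and grabbing usage from the final chunk. A hedged sketch over parsed chunk dicts (the `collect_stream` helper is illustrative; it accepts any iterable of chunk objects, so it is shown here with the two example chunks from the SSE section rather than a live request):

```python
def collect_stream(chunks):
    """Accumulate delta contents and capture usage from a chunk stream.

    `chunks` is any iterable of parsed chat.completion.chunk dicts,
    so the same code works on a live SSE stream or a replayed log."""
    parts, usage = [], None
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        if delta.get("content"):
            parts.append(delta["content"])
        if chunk.get("usage"):          # only the final chunk carries usage
            usage = chunk["usage"]
    return "".join(parts), usage

# The two example chunks from the SSE format section:
chunks = [
    {"choices": [{"delta": {"role": "assistant", "content": "Hello"}}]},
    {"choices": [{"delta": {}}],
     "usage": {"prompt_tokens": 10, "completion_tokens": 1}},
]
print(collect_stream(chunks))
# → ('Hello', {'prompt_tokens': 10, 'completion_tokens': 1})
```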