Chainlit supports streaming for both Message and Step. Here is an example using the OpenAI async client.

Streaming OpenAI response

from openai import AsyncOpenAI
import chainlit as cl

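# Pass your key here, or omit api_key to read OPENAI_API_KEY from the environment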
client = AsyncOpenAI(api_key="YOUR_OPENAI_API_KEY")


settings = {
    "model": "gpt-3.5-turbo",
    "temperature": 0.7,
    "max_tokens": 500,
    "top_p": 1,
    "frequency_penalty": 0,
    "presence_penalty": 0,
}


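# Initialize the conversation history for each new chat session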
@cl.on_chat_start
def start_chat():
    cl.user_session.set(
        "message_history",
        [{"role": "system", "content": "You are a helpful assistant."}],
    )


@cl.on_message
async def main(message: cl.Message):
    message_history = cl.user_session.get("message_history")
    message_history.append({"role": "user", "content": message.content})

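    # Create an empty message to stream tokens into as they arrive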
    msg = cl.Message(content="")

    stream = await client.chat.completions.create(
        messages=message_history, stream=True, **settings
    )

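    # Forward each token to the UI as soon as it arrives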
    async for part in stream:
        if token := part.choices[0].delta.content or "":
            await msg.stream_token(token)

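    # Save the full response to the history and finalize the streamed message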
    message_history.append({"role": "assistant", "content": msg.content})
    await msg.update()
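
Streaming into a Step works the same way through stream_token. Here is a minimal sketch, assuming the same client and settings as above; the handler name main_with_step is illustrative:

Streaming into a Step

@cl.on_message
async def main_with_step(message: cl.Message):
    message_history = cl.user_session.get("message_history")
    message_history.append({"role": "user", "content": message.content})

    # Tokens accumulate in the step output instead of a message
    async with cl.Step(name="gpt-3.5-turbo", type="llm") as step:
        stream = await client.chat.completions.create(
            messages=message_history, stream=True, **settings
        )
        async for part in stream:
            if token := part.choices[0].delta.content or "":
                await step.stream_token(token)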

Integrations

Streaming is also supported at a higher level for some integrations.

For example, to use streaming with LangChain, just pass streaming=True when instantiating the LLM:

from langchain.llms import OpenAI

llm = OpenAI(temperature=0, streaming=True)

Also make sure to pass a Chainlit callback handler to your chain or agent run, as shown in the sketch below.
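
For instance, here is a minimal sketch assuming an LLMChain built with the llm above was stored in the user session under "chain" at chat start; the handler name run_chain is illustrative, and cl.AsyncLangchainCallbackHandler forwards the streamed tokens to the UI:

@cl.on_message
async def run_chain(message: cl.Message):
    chain = cl.user_session.get("chain")

    # The Chainlit callback handler streams tokens to the UI during the run
    res = await chain.acall(
        message.content, callbacks=[cl.AsyncLangchainCallbackHandler()]
    )

    await cl.Message(content=res["text"]).send()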

See the LangChain integration documentation for final answer streaming.