Helicone Community Page

Updated last year

The unexpected behavior with Helicone and streaming responses

At a glance

The community member is having an issue with Helicone, where the response they are logging is different from what Helicone is showing. They are using Vercel AI to stream the response to the frontend, and they cannot use a gateway since their inference server is hosted on an IP address without a domain. The community member has verified that they are logging a simple object with id and completion properties, but Helicone is showing a completely different response. They are frustrated and want to know what can be done about this unexpected behavior.

In the comments, a Helicone representative explains that Helicone reconstructs the body in order to calculate the usage tokens. They recommend omitting the stream:true field to prevent the async logger from attempting to reconstruct the body. If the community member wants to include this field, they are asked to let the Helicone representative know so they can figure out a solution.

Another community member mentions that the tokens are not being counted for the gpt-4-1106-preview and gpt-4-0125-preview models, indicating that the issue is still ongoing.

The Helicone representative then asks the community member to add them to their organization and provide the organization name so they can debug the issue.

I can't figure out why Helicone invents a different response than what I am logging when stream=true. I've verified that I log a simple object with id and completion properties, and yet in Helicone I get a completely different one. Why??

This is blocking me from releasing a new feature, and I must admit it's pretty frustrating. The request is logged just fine. I use async logging and pass this object as providerResponse.json. It's been working for everything else.

It also works if I omit the stream:true property. I cannot use a gateway since it's my own inference server hosted on an IP address (no domain).

What can be done about this rather unexpected behaviour?

For context: I'm using Vercel AI to stream the response to the frontend, and thus I don't have access to the direct streaming response, only the full string upon completion.
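The setup described above, streaming chunks to the frontend while only having the full string once the stream ends, can be sketched roughly as follows. This is an illustrative sketch, not Vercel AI's or Helicone's actual API; the accumulator and the completion callback are assumptions standing in for the SDK's internal wiring:

```typescript
// Sketch: accumulate streamed chunks, then hand the completed string
// to a completion callback (where one would log to Helicone).
type OnCompletion = (fullText: string) => void;

function makeAccumulator(onCompletion: OnCompletion) {
  const chunks: string[] = [];
  return {
    // Called for each streamed chunk as it is forwarded to the frontend.
    push(chunk: string) {
      chunks.push(chunk);
    },
    // Called once the stream ends; only now is the full response available.
    finish() {
      onCompletion(chunks.join(""));
    },
  };
}

// Usage: the full string only exists upon completion, as described above.
let logged = "";
const acc = makeAccumulator((text) => {
  // Hypothetical point where the { id, completion } object would be logged.
  logged = text;
});
acc.push("Hello, ");
acc.push("world");
acc.finish();
```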
Attachments: image.png, image.png
3 comments
Hi Luka!

We do this to reconstruct the body and calculate the usage tokens.

I would recommend omitting the stream:true field for now to prevent the async logger from attempting to reconstruct the body for you.

If you want to include this field for some reason, please let us know and we can figure out a solution that makes sense.
Hey @Justin, this is not working anymore. Tokens are not counted for the gpt-4-1106-preview and gpt-4-0125-preview models.
Hi! Sorry for the issue, Luka. Would you mind adding me (justin@helicone.ai) to your org and letting me know your org name so we can debug? @Luka