Helicone Community Page

Updated 4 months ago

Helicone's impact on latency in LLM calls

Am I crazy or does Helicone add a bunch of latency to every single LLM call? I'm using the Vercel AI SDK
18 comments
Hi David! It should not, unless you are using Cache or Rate Limiting
ah okay, my understanding is that we need to hit your server first and then you forward to whatever LLM provider. But maybe I don't get what the baseUrl param does
how can I check whether I'm using cache or rate limiting? feel free to share a link and happy to do some reading
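For reference, the proxy-style setup being discussed here looks roughly like this with the Vercel AI SDK: the `baseUrl` points the SDK at Helicone's gateway, which then forwards the request to the provider. The gateway URL and `Helicone-Auth` header below follow Helicone's documented OpenAI proxy integration, but treat this as a sketch to verify against the current docs.

```ts
import { createOpenAI } from "@ai-sdk/openai";
import { generateText } from "ai";

// Proxy integration: traffic goes through Helicone's gateway (one extra
// network hop), which logs the request and forwards it to OpenAI.
// URL and header names are taken from Helicone's OpenAI proxy docs.
const openai = createOpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://oai.helicone.ai/v1",
  headers: {
    "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
  },
});

const { text } = await generateText({
  model: openai("gpt-4o-mini"),
  prompt: "Say hello",
});
console.log(text);
```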
If you want to use our async integration instead of passing traffic through Helicone, you can use our OpenLLMetry integration

https://docs.helicone.ai/getting-started/integration-method/openllmetry
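A rough sketch of that async setup, assuming the OpenLLMetry (Traceloop) Node SDK: the app still calls the provider directly, and traces are exported to Helicone out of band, so nothing extra sits on the request path. The trace endpoint and option names here are assumptions; check the page linked above for the exact values.

```ts
import * as traceloop from "@traceloop/node-server-sdk";

// Async logging via OpenLLMetry: LLM calls go straight to the provider,
// while traces are shipped to Helicone in the background.
// baseUrl below is an assumed Helicone trace endpoint -- confirm in the docs.
traceloop.initialize({
  appName: "my-app",
  baseUrl: "https://api.helicone.ai/v1/trace",
  apiKey: process.env.HELICONE_API_KEY,
  disableBatch: true,
});
```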
this link is broken i think
[Attachment: image.png]
in the last page you sent
does this sdk require a node environment or can it run on edge runtimes?
is this intuition correct?
Hey @David Alonso, great question! I am not sure, I am double checking with the OpenLLMetry team
nice, but then there is added network latency right? so the async method would lead to faster inference iiuc
well faster response time for the user i mean
That's correct, but it should really be marginal.
I assume they haven’t replied, but super interested to hear back!
Thanks for following up, their answer was not very helpful lol
What edge runtime environment are you looking to run this in, @David Alonso?