
Conversation

@anotherjesse
Contributor

@anotherjesse anotherjesse commented Sep 5, 2024

cache to json on disk instead of in-memory

  • cache lives between restarts
  • facilitate debugging by viewing the json files
  • purge request(s) from cache by deleting file(s)
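A minimal sketch of what the bullets above describe: a disk-backed cache where each request maps to one JSON file, so entries survive restarts, can be inspected by opening the file, and can be purged by deleting it. All names here (`CACHE_DIR`, `cache_key`, `cache_get`, `cache_set`) are hypothetical, not the PR's actual identifiers.

```python
import hashlib
import json
from pathlib import Path

CACHE_DIR = Path("cache")  # hypothetical location for the JSON files


def cache_key(request: dict) -> str:
    """Derive a stable filename from the request payload."""
    raw = json.dumps(request, sort_keys=True).encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


def cache_get(request: dict):
    """Return the cached response, or None on a miss."""
    path = CACHE_DIR / f"{cache_key(request)}.json"
    if path.exists():
        return json.loads(path.read_text())
    return None


def cache_set(request: dict, response: dict) -> None:
    """Write the response as a pretty-printed JSON file.

    The cache lives between restarts, the files are human-readable
    for debugging, and deleting a file purges that entry.
    """
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    path = CACHE_DIR / f"{cache_key(request)}.json"
    path.write_text(json.dumps(response, indent=2))
```

Serializing with `sort_keys=True` before hashing keeps the key stable regardless of dict insertion order.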

Contributor

@bfollington bfollington left a comment


Nice addition, I haven't had time to revisit this repo but lots of the model around Threads is probably unnecessary bloat from my current POV. Feel free to rip it up as needed.

Also, we might also benefit from looking at Anthropic's prompt caching https://www.anthropic.com/news/prompt-caching

@anotherjesse
Contributor Author

@bfollington good callout on the prompt caching - although I think it might require us to be semi-stable / additive in prompt generation (meaning: as we know more, can we just append the new content, versus inserting content earlier in the prompt, which would break the prompt cache?)

I'm guessing we will want to have hints from the caller .. ("don't worry about caching this" vs. "this content will be used a bunch in the next 5 minutes" ...)

Hmm.. given that prompt caching has a limited lifetime, we could do something like: the second time we see a request with the same first 2k tokens of context within 5 minutes, we switch to cache mode..

https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching#cache-limitations
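That heuristic could be sketched roughly like this: hash the first 2k tokens of the prompt and flip on caching the second time the same prefix shows up inside the cache lifetime window. `WINDOW_SECONDS`, `PREFIX_TOKENS`, and `should_enable_prompt_cache` are made-up names for illustration, and the 5-minute window mirrors the cache lifetime mentioned above.

```python
import hashlib
import time

WINDOW_SECONDS = 300   # ~5 minute prompt-cache lifetime
PREFIX_TOKENS = 2000   # the "first 2k tokens" from the heuristic

_last_seen: dict = {}  # prefix hash -> timestamp of last sighting


def should_enable_prompt_cache(prompt_tokens, now=None):
    """Return True the second time the same prefix appears within the window."""
    now = time.time() if now is None else now
    prefix = " ".join(prompt_tokens[:PREFIX_TOKENS])
    key = hashlib.sha256(prefix.encode("utf-8")).hexdigest()
    last = _last_seen.get(key)
    _last_seen[key] = now
    # A sighting outside the window doesn't count: the cache entry
    # would already have expired, so treat it as a fresh first sighting.
    return last is not None and (now - last) <= WINDOW_SECONDS
```

A repeat after the window lapses is treated as a first sighting again, since the cached prefix would have expired anyway.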

or ...

Prompt Caching introduces a new pricing structure where cache writes cost 25% more than base input tokens, while cache hits cost only 10% of the base input token price.

perhaps we just default to it, and monitor the cache hit rate ... emitting warnings when hit rate is low
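One way to sketch that default-on approach, under a simplifying assumption that every miss writes the full prompt (1.25x base price) and every hit reads it (0.10x): caching then breaks even at a hit rate of 0.25 / 1.15, roughly 22%, which makes a natural warning threshold. The `CacheStats` class and thresholds here are hypothetical, not from the PR.

```python
import logging

logger = logging.getLogger("prompt_cache")

# Pricing multipliers relative to base input tokens, per the quote above.
WRITE_MULT = 1.25
HIT_MULT = 0.10

# Break-even hit rate h, assuming every miss is a cache write:
# (1 - h) * 1.25 + h * 0.10 <= 1.0  =>  h >= 0.25 / 1.15 (about 22%)
BREAK_EVEN_HIT_RATE = (WRITE_MULT - 1.0) / (WRITE_MULT - HIT_MULT)


class CacheStats:
    """Track the prompt-cache hit rate and warn when it falls below break-even."""

    def __init__(self, warn_threshold: float = BREAK_EVEN_HIT_RATE):
        self.hits = 0
        self.misses = 0
        self.warn_threshold = warn_threshold

    def record(self, hit: bool) -> None:
        if hit:
            self.hits += 1
        else:
            self.misses += 1

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

    def check(self, min_samples: int = 50) -> None:
        """Emit a warning once enough samples exist and the rate is too low."""
        if self.hits + self.misses >= min_samples and self.hit_rate < self.warn_threshold:
            logger.warning(
                "prompt cache hit rate %.1f%% is below break-even %.1f%%",
                100 * self.hit_rate, 100 * self.warn_threshold,
            )
```

The break-even figure is only a first-order estimate: real prompts mix cached and uncached tokens, so production numbers would shift the threshold.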

@anotherjesse anotherjesse merged commit e19e0c8 into main Sep 5, 2024
