@alexalbert__ I think a cool API feature would be "prospective caching", e.g. having the input + output cached, so that if I make a call planning to append the model output before making another call, I can save ~18% on that part of the conversation.
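
Rough sketch of the workflow I mean, using today's `cache_control` prompt caching in the Anthropic Python SDK; the "prospective caching" bit at the end is purely hypothetical (that parameter doesn't exist) and just illustrates the ask:

```python
import anthropic

client = anthropic.Anthropic()

long_context = "..."  # some large shared prefix worth caching

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": long_context + "\n\nDraft a summary.",
                # Existing prompt caching: everything up to this breakpoint is
                # cached, so re-sending it later is billed at the cache-read rate.
                # (Older SDK versions may need the prompt-caching beta header.)
                "cache_control": {"type": "ephemeral"},
            }
        ],
    }
]

first = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=messages,
)

# Planned next step: append the model's output and ask a follow-up. Today those
# output tokens come back as fresh, uncached input on the second call.
messages.append({"role": "assistant", "content": first.content})
messages.append({"role": "user", "content": "Now turn that into bullet points."})

second = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=messages,
)

# Hypothetical "prospective caching": some way to tell the first call that its
# output will be appended and re-sent, so input + output both land in the cache
# and the second call's whole prefix is a cache read. Invented flag, NOT a real
# API parameter:
#
# first = client.messages.create(
#     model="claude-3-5-sonnet-20241022",
#     max_tokens=1024,
#     messages=messages,
#     extra_body={"prospective_cache": True},  # hypothetical
# )
```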