Thread
Tweet 1
The best benchmarks are the ones that confirm what you always knew in your heart: Sonnet 3.7 is the best model and a big jump over 3.6
---
Tweet 2
With 3.6 it was so important to cutoff your agent and start over at fairly few tokens. Now it feels like it can just go and go
---