
With the recent release of Grok 4, supposedly the most intelligent AI model, there's a significant question about how well this model performs in c...
In all honesty, in terms of pure coding ability, Claude Sonnet 3.7 still wins (it often produces cleaner, more reliable code and excels at complex reasoning). Grok has its benefits, but it falls short on instruction adherence and hits rate limits really fast under heavy usage. As for Opus, it edges ahead in raw performance, which is noticeable on extra-long or intricate tasks. I guess whatever suits the use case is best.
I've moved on from 3.7 Sonnet, though. Currently, I'm sticking with Sonnet 4 and Opus. I don't have a great use case for Grok; it's not so good at coding, after all. Once they launch the code-tuned model next month, we'll see if things change for me. It's true, you go with whatever suits the situation best.
Makes perfect sense, Sonnet 4 and Opus are both solid choices! I'm also curious to see how Grok's code-tuned model performs. I'd even go as far as to add that if you truly want to get the most out of your tokens, switching between models usually generates better results (e.g., one for writing, one for code, one for bugs), provided you keep each model within the proper context/scope.
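To make the "one model per job" idea concrete, here is a minimal sketch of a task-to-model lookup. Everything in it is an assumption for illustration: the `ROUTES` table, the `pick_model` helper, and the placeholder labels are hypothetical and are not real API model identifiers.

```python
# Hypothetical routing table: one model per kind of task.
# The labels are placeholders, not real API model identifiers.
ROUTES = {
    "writing": "model-for-writing",
    "code": "model-for-code",
    "bugs": "model-for-bugs",
}


def pick_model(task: str, default: str = "model-for-code") -> str:
    """Return the label assigned to a task type, falling back to the default."""
    return ROUTES.get(task.strip().lower(), default)


print(pick_model("Bugs"))
```

Keeping the mapping in one place makes it easy to swap a model for a single task type without touching the rest of the workflow.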
I'm so looking forward to their code model release. It's always a good idea to use different models based on the use case. I'll most likely test their code model this August. Will ping you then.
I'm just now trying out Grok 4 and, TBH, I'm not happy.
Is it? Where did it fall short? I guess it's in coding? Opus all the way.
Great comparison. Didn't Grok say they were releasing a coding-optimised version in the coming months? It would be interesting to redo the comparison when they do.
Sure. It's coming next month in August. Will do a quick coding test at that time with that coding-tuned model.
I don't plan to use this on a regular basis. What's the pricing?
This is free to use. The terminal app costs a little, but you can run it on your local machine or the server.
It depends on what coding agent we're using and the model. Otherwise, the MCP server is free.
For this entire test, I've used Cursor. It comes with some free requests, but it's not completely free, and to use any other custom models, you need to have the Pro plan.
Folks, let me know how your experience with Grok 4 has been so far! ✌️
Great comparison indeed. Thanks for sharing this, Shrijal! Love your work. ☺️
Thanks for checking, Aavash! ✌️