Reduce flow-credit usage by automatically using Cascade Base to run tools that pull in context
closed
Bala Bosch
Using a model like Claude 3.7 Sonnet uses a lot of flow credits on tasks (like pulling in context) that can be done much more cheaply using free models like Cascade Base or DeepSeek V3 (once its tool usage gets better).
Build in a feature to allow-list certain types of tool calls to automatically use Cascade Base instead of the selected, expensive, model.
This would vastly reduce flow credit consumption by models like Claude 3.7 Sonnet
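A minimal sketch of how such an allow-list could work, assuming each tool call has a name and that the editor can pick a model per call. The tool names, model identifiers, and the `pick_model` function are all hypothetical; none of them come from Windsurf or Cascade.

```python
# Hypothetical sketch: these names are illustrative, not a real Windsurf/Cascade API.
CHEAP_MODEL = "cascade-base"

# Tool calls that only pull in context and could run on the cheaper model.
ALLOW_LISTED_TOOLS = {"read_file", "list_directory", "search_code"}


def pick_model(tool_name: str, selected_model: str) -> str:
    """Return the model that should run a given tool call."""
    if tool_name in ALLOW_LISTED_TOOLS:
        return CHEAP_MODEL
    # Anything not on the allow-list stays on the model the user selected.
    return selected_model


if __name__ == "__main__":
    for tool in ["read_file", "edit_file", "search_code"]:
        print(tool, "->", pick_model(tool, "claude-3.7-sonnet"))
```

With this shape, only the calls a user explicitly allow-lists are downgraded, so edits and reasoning would still run on the selected model.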
Rohan Phadte
closed
Closing here as we made a major change to pricing and removed the flow action credits. We only have user prompt credits (charged when you send a message to Cascade).
This should allow your existing plans to go much, much farther. See more at our pricing:
https://windsurf.com/pricing
Marlen Brunner
Rohan Phadte Thank you!! The new pricing looks _way_ more trustworthy and better tied to value for cost!
Hari
I exhausted my flow credits very quickly. Could Windsurf ship a built-in MCP that optimizes this, and let developers choose that option? People like me could then use that option freely and confidently.
Mustapha Abdulhameed Bolaji
Used up 300 flex credits on just flows in a few hours, the same issue as everyone else: Cascade re-reading the same files multiple times.
Ivan Nite
Mustapha Abdulhameed Bolaji I pretty much use up all my credits in a couple of days, working a few hours per day.
Thinking of cancelling today and switching to Cursor. They at least have unlimited slow prompts.
Mustapha Abdulhameed Bolaji
Ivan Nite Windsurf is great; if these issues are solved, I think we'll be good. The team just needs to act on it ASAP.
Ivan Nite
Mustapha Abdulhameed Bolaji I cancelled today. I'll try Cursor for a month and compare.
Mike-Dev
Hi,
Something must have changed here. Since the last 2 updates, credit consumption has gone up: the system reads the same documents several times within a chat. It suffers from Alzheimer's: after a few minutes, the system no longer recognizes what was available at the beginning of the chat. You very often have to point out that we have already discussed this. Did the AI drink beer? It was much better before.
Symbion
This is indeed reaching a state where it's worth looking around for alternatives. The credit consumption has grown over the past weeks. Never before did I consume so many credits, but my work hasn't changed.
I agree it should use the base model for simple tasks that do not require Claude. And it should stop stealing credits by using multiple runs to read one file. See also https://codeium.canny.io/feature-requests/p/file-read-in-multiple-runs-consumes-multiple-credits
It's a shame that this is happening and I feel like I'm being ripped off.
If there are any alternatives to Windsurf/Cascade, please let me know in the comments. I'd be glad to try them out. Thank you.
Marlen Brunner
Symbion I have a co-worker who swears by the Zed editor, https://zed.dev/.
I think you need to supply your own API key for the appropriate AI provider, though (and pay that company for credits instead).
Symbion
I'm also going to be looking at Augment: https://www.augmentcode.com/
Antoine
I've spent $150 this weekend. This is unsustainable. At this point, manually copy-pasting the errors into Cursor doesn't sound too bad. I'm switching next month if this is not fixed.
Jeremy Schoemaker
Antoine hi
Marvin
100%, this must be solved. My usage has remained relatively constant, and my paid plan always got me through the month. But after an update in the last few weeks, credit usage has skyrocketed so much that I had to purchase additional credits after just one week.
I simply stopped paying and I advise anyone else here to do the same until this stops. US Companies only understand money...
Update:
I looked into viable alternatives and, after some trial and error, Cline / Roo Code with OpenRouter seems to be the next best alternative for now. Given how long this issue has lingered without being addressed, I suspect selling flex credits is their actual business model, with the implication that the problem will never be resolved. Asking for better pricing is a bit like asking Tesla to make trucks that aren't glued together and fall apart. Yes, you can ask, but nobody will change anything because that's how they make money.
Mateus Junior
Marvin if Cline and Roo Code were valid alternatives, everyone would be using them. The fact is that these two are much more expensive and consume tons more tokens.
Marvin
Mateus Junior I rarely comment on those kinds of statements without evidence, but this time I feel compelled to lay out an important difference. Windsurf imposes a hard limit on how much you can do before being forced to top up using their excessively expensive flow credits. Cline and Roo don't do that but instead let you choose how far you want to go. I used Roo on paid models and on free models, and in both cases it got the job done and the result delivered within the same day. In Windsurf, I would have spent a week debugging rubbish hallucinated by Claude. When a new and better model comes out, in Roo I get it the same day, so I'm not sure what is not viable. Besides, if you were to check the OpenRouter stats, you would see that Roo and Cline have the highest token usage by a wide margin.
To your point on cost: I would be okay to pay up to $200 to get the limits lifted. I have a job and no mood to waste time. Windsurf does not offer that, and their $60 plan is a joke.
With Roo, you can choose, and assuming you are burning through 800 million tokens a month (30 million a day, 25 days), the most recent models stack up as follows:
### Cost Rankings (from cheapest to most expensive)
- Gemini Flash 2.5: $201
- OpenAI GPT-4.1: $2,100 (10.45× more expensive than Gemini)
- Claude 3.7: $3,375 (16.79× more expensive than Gemini)
- OpenAI O3: $10,500 (52.24× more expensive than Gemini)
### Effective Rate Per Million Tokens (blended)
- Gemini Flash 2.5: $0.24
- OpenAI GPT-4.1: $2.55
- Claude 3.7: $4.09
- OpenAI O3: $12.73
### Monthly Savings Potential (compared to O3)
- Switching from O3 to Gemini Flash 2.5: $10,299 (98.09% savings)
- Switching from O3 to GPT-4.1: $8,400 (80.00% savings)
- Switching from O3 to Claude 3.7: $7,125 (67.86% savings)
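As a rough cross-check, the multiples and savings above can be re-derived from the listed monthly totals alone. The sketch below copies the figures from the lists as-is; the function names are made up for illustration, and the per-token prices behind the totals are not stated here, so nothing in it re-computes the totals themselves. Dividing each total by its blended rate gives an implied volume of roughly 825 to 840 million tokens, a bit above the 800 million quoted, so treat the token count as approximate.

```python
# Monthly costs in USD and blended rates in USD per million tokens,
# copied from the lists above (not independently verified).
monthly_cost = {
    "Gemini Flash 2.5": 201,
    "OpenAI GPT-4.1": 2100,
    "Claude 3.7": 3375,
    "OpenAI O3": 10500,
}
blended_rate = {
    "Gemini Flash 2.5": 0.24,
    "OpenAI GPT-4.1": 2.55,
    "Claude 3.7": 4.09,
    "OpenAI O3": 12.73,
}


def cost_multiple(model: str, baseline: str = "Gemini Flash 2.5") -> float:
    """How many times more expensive a model is than the baseline."""
    return monthly_cost[model] / monthly_cost[baseline]


def savings_vs(model: str, baseline: str = "OpenAI O3") -> tuple[int, float]:
    """Absolute (USD) and percentage savings when switching from baseline to model."""
    saved = monthly_cost[baseline] - monthly_cost[model]
    return saved, 100 * saved / monthly_cost[baseline]


def implied_tokens_millions(model: str) -> float:
    """Token volume (in millions) implied by monthly cost / blended rate."""
    return monthly_cost[model] / blended_rate[model]


if __name__ == "__main__":
    for model in monthly_cost:
        saved, pct = savings_vs(model)
        print(
            f"{model}: {cost_multiple(model):.2f}x Gemini, "
            f"saves ${saved:,} ({pct:.2f}%) vs O3, "
            f"~{implied_tokens_millions(model):.0f}M tokens implied"
        )
```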
I worked with both Gemini 2.5 Pro (free and paid) as well as Gemini Flash 2.5, and on many tasks they perform the same, so you can safely use the cheapest Flash model.
I have already cancelled my Windsurf subscription and I'm not going back, because with Roo I get real products shipped in days.
Mateus Junior
Marvin I love that answer. It really sheds light on an emerging kind of usage pattern that was unheard of before the AI age: some people are ready to spend an absurd amount of money to progress.
Now don't misunderstand me, I can totally see lots of valid use cases for doing that.
And yes, I can totally see that Roo and similar tools currently support this kind of usage a lot better. If Windsurf wants to compete, they must have some kind of no-BS unlimited plan that does not care about token usage and just focuses on getting the task done.
Thanks for your detailed explanation.
Paulo Vargatt
That's right. Before (2 months ago), Windsurf didn't use so many flow credits because the model read each file only once. Now it reads 200 lines at a time, so the flow credits get used up much faster. It didn't happen before! Now we have a situation where we use half of the prompt credits (250) for 1500 flow action credit executions.
Mike Thomas
I don't understand why Cascade reads 100 lines of code one time and then 20 lines of code the next. I burn through so many credits just reading code. Ridiculous.
Matej Mulej
Just got an email saying that feedback posts like this exist and can be voted on. Seeing this one makes me hope more people will click on the link, because it's a supreme suggestion. I too had to make a rebuy to continue playing last month. And I don't consider myself in the top percentile of heavy users at all. How can this have only 29 votes?