Add Thinking Effort Configuration
Context
Keen Code currently gives users no way to control thinking/reasoning effort. This plan adds:
1. A StepThinking in the /model setup flow for models whose registry entry has non-empty thinking_efforts
2. A /thinking runtime command to change effort on the fly using the current model's supported values
3. Persistent thinking_effort in ~/.keen/configs.json
The registry stores provider-native effort values. The CLI adds an off choice on top of that:
- Anthropic UI: off + low|medium|high|max
- OpenAI UI: off + low|medium|high|xhigh
- Google AI UI: off + low|medium|high
Persist the provider-native value in config; represent off as "" in config/runtime state so request builders can omit provider-specific reasoning fields cleanly.
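The off-to-empty-string convention can be sketched as a pair of small helpers. This is only an illustration; the names uiOff, toStored, and toDisplay are hypothetical and do not appear in the codebase.

```go
package main

import "fmt"

// uiOff is the label the CLI shows for "no configured effort".
// The constant name is hypothetical.
const uiOff = "off"

// toStored converts a UI choice into the value persisted in config.
// "off" becomes the empty string; provider-native values pass through unchanged.
func toStored(choice string) string {
	if choice == uiOff {
		return ""
	}
	return choice
}

// toDisplay is the inverse mapping, for status lines and selection lists.
func toDisplay(stored string) string {
	if stored == "" {
		return uiOff
	}
	return stored
}

func main() {
	for _, choice := range []string{"off", "medium", "xhigh"} {
		stored := toStored(choice)
		fmt.Printf("ui=%q stored=%q display=%q\n", choice, stored, toDisplay(stored))
	}
}
```

Keeping the conversion at the UI boundary means request builders only ever see a provider-native value or an empty string.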
Step 1: Registry — add per-model thinking_efforts
File: providers/registry.yaml
Replace the boolean supports_thinking idea with explicit provider-native effort options:
- Anthropic:
  - claude-opus-4-6: thinking_efforts: ["low", "medium", "high", "max"]
  - claude-sonnet-4-6: thinking_efforts: ["low", "medium", "high", "max"]
  - claude-haiku-4-5: omit (extended thinking exists, but Anthropic's effort parameter is not supported on Haiku 4.5)
- OpenAI:
  - gpt-5.4: thinking_efforts: ["low", "medium", "high", "xhigh"]
  - gpt-5.4-mini: thinking_efforts: ["low", "medium", "high", "xhigh"]
  - gpt-5.3-codex: thinking_efforts: ["low", "medium", "high", "xhigh"]
- Google AI:
  - gemini-3.1-pro-preview: thinking_efforts: ["low", "medium", "high"]
  - gemini-3-flash-preview: thinking_efforts: ["low", "medium", "high"]
- DeepSeek / Moonshot: omit (always-on or not part of this effort-level work)
Keep the raw provider values in the registry. Do not normalize these to a shared off/low/medium/high set in the registry layer:
- Gemini does not get an off-equivalent registry value in this plan
- Anthropic uses max
- OpenAI uses xhigh; represent "no extra reasoning" by omitting params.Reasoning rather than sending "none"
File: providers/loader.go
// Add to Model struct:
ThinkingEfforts []string `yaml:"thinking_efforts"`
func (m Model) SupportsThinkingEffort() bool {
return len(m.ThinkingEfforts) > 0
}
// Add helper to *Registry:
func (r *Registry) GetModel(providerID, modelID string) (Model, bool)
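The plan only gives the GetModel signature. A minimal sketch of one possible body follows, assuming the registry holds a list of providers that each hold a list of models. The Provider type, the ID fields, the Providers and Models slices, and sampleRegistry are assumptions made for illustration; the real loader.go layout may differ.

```go
package main

import "fmt"

// Model carries the field this plan adds; ID is an assumed field name.
type Model struct {
	ID              string
	ThinkingEfforts []string
}

// SupportsThinkingEffort reports whether the registry lists any effort values.
func (m Model) SupportsThinkingEffort() bool {
	return len(m.ThinkingEfforts) > 0
}

// Provider and Registry are hypothetical shapes used only for this sketch.
type Provider struct {
	ID     string
	Models []Model
}

type Registry struct {
	Providers []Provider
}

// GetModel returns the registry entry for providerID/modelID.
// The bool is false when either the provider or the model is unknown.
func (r *Registry) GetModel(providerID, modelID string) (Model, bool) {
	for _, p := range r.Providers {
		if p.ID != providerID {
			continue
		}
		for _, m := range p.Models {
			if m.ID == modelID {
				return m, true
			}
		}
	}
	return Model{}, false
}

// sampleRegistry builds a tiny in-memory registry for the demo below.
func sampleRegistry() *Registry {
	return &Registry{Providers: []Provider{
		{ID: "anthropic", Models: []Model{
			{ID: "claude-sonnet-4-6", ThinkingEfforts: []string{"low", "medium", "high", "max"}},
			{ID: "claude-haiku-4-5"}, // thinking_efforts omitted
		}},
	}}
}

func main() {
	reg := sampleRegistry()
	for _, id := range []string{"claude-sonnet-4-6", "claude-haiku-4-5", "unknown-model"} {
		m, ok := reg.GetModel("anthropic", id)
		fmt.Printf("%s: found=%t configurable=%t\n", id, ok, m.SupportsThinkingEffort())
	}
}
```

Returning the zero Model on a miss lets callers chain SupportsThinkingEffort() safely, since a nil slice has length zero.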
Step 2: Config — add ThinkingEffort
File: internal/config/config.go
// GlobalConfig — add field:
ThinkingEffort string `json:"thinking_effort,omitempty"`
// ResolvedConfig — add field:
ThinkingEffort string
// Resolve() — copy into resolved:
resolved.ThinkingEffort = global.ThinkingEffort
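The effect of the omitempty tag on the persisted file can be checked in isolation. In this sketch the GlobalConfig struct is trimmed to three fields, and the JSON tags on ActiveProvider and ActiveModel are assumptions; only the thinking_effort tag comes from this plan.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// GlobalConfig is a trimmed stand-in for the real struct.
// The active_provider and active_model tags are assumed for illustration.
type GlobalConfig struct {
	ActiveProvider string `json:"active_provider"`
	ActiveModel    string `json:"active_model"`
	ThinkingEffort string `json:"thinking_effort,omitempty"`
}

// encode marshals the config the way a save would.
func encode(cfg GlobalConfig) string {
	b, err := json.Marshal(cfg)
	if err != nil {
		panic(err)
	}
	return string(b)
}

// decode unmarshals a saved config; a missing key leaves the field empty.
func decode(data string) GlobalConfig {
	var cfg GlobalConfig
	if err := json.Unmarshal([]byte(data), &cfg); err != nil {
		panic(err)
	}
	return cfg
}

func main() {
	on := GlobalConfig{ActiveProvider: "openai", ActiveModel: "gpt-5.4", ThinkingEffort: "xhigh"}
	off := on
	off.ThinkingEffort = "" // "off" in the UI

	fmt.Println(encode(on))                                // includes "thinking_effort":"xhigh"
	fmt.Println(encode(off))                               // key omitted entirely
	fmt.Printf("%q\n", decode(encode(off)).ThinkingEffort) // empty string after reload
}
```

Because the key is omitted when the effort is empty, config files written before this change load with ThinkingEffort == "", which the plan already treats as off.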
File: internal/cli/cmd/root.go
The manual ResolvedConfig construction (~line 44) bypasses Resolve(). Add:
resolvedCfg = &config.ResolvedConfig{
Provider: globalCfg.ActiveProvider,
Model: globalCfg.ActiveModel,
APIKey: providerCfg.APIKey,
ThinkingEffort: globalCfg.ThinkingEffort, // ADD
}
Step 3: LLM clients — thread thinking effort through
Architecture note
Anthropic is not routed through Genkit. It uses its own AnthropicClient backed
directly by anthropic-sdk-go. The routing in internal/llm/models.go is:
| Provider | Client |
|---|---|
| anthropic | AnthropicClient |
| googleai | GenkitClient |
| openai | OpenAIResponsesClient |
| deepseek / moonshotai | OpenAICompatibleClient (no-op) |
File: internal/llm/models.go
// ClientConfig — add field:
ThinkingEffort string
// NewClient — pass cfg.ThinkingEffort into each relevant constructor
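As a rough illustration of the routing described in the architecture note, the sketch below maps a provider ID to the client that handles it and records whether that client acts on ThinkingEffort. The route type and routeFor function are hypothetical and are not the real NewClient; the plan does not say how unknown providers are treated, so the ok=false branch is an assumption.

```go
package main

import "fmt"

// route names the client for a provider and whether it uses ThinkingEffort.
// The type and its fields are hypothetical.
type route struct {
	Client     string
	UsesEffort bool
}

// routeFor mirrors the provider/client routing in the architecture note.
// Unknown providers return ok=false (assumed behavior for this sketch).
func routeFor(provider string) (route, bool) {
	switch provider {
	case "anthropic":
		return route{Client: "AnthropicClient", UsesEffort: true}, true
	case "googleai":
		return route{Client: "GenkitClient", UsesEffort: true}, true
	case "openai":
		return route{Client: "OpenAIResponsesClient", UsesEffort: true}, true
	case "deepseek", "moonshotai":
		// The field is stored for symmetry, but request building ignores it.
		return route{Client: "OpenAICompatibleClient", UsesEffort: false}, true
	default:
		return route{}, false
	}
}

func main() {
	for _, p := range []string{"anthropic", "googleai", "openai", "deepseek", "moonshotai"} {
		r, _ := routeFor(p)
		fmt.Printf("%-10s -> %s (uses effort: %t)\n", p, r.Client, r.UsesEffort)
	}
}
```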
3a. Anthropic — internal/llm/anthropic.go
Add thinkingEffort string to AnthropicClient. In StreamChat, set
params.Thinking and params.OutputConfig / params.MaxTokens before each turn:
// Helper:
func anthropicThinkingParams(effort string) (anthropic.ThinkingConfigParamUnion, anthropic.OutputConfigParam, int64) {
switch effort {
case "low", "medium", "high", "max":
thinking := anthropic.ThinkingConfigParamUnion{
OfAdaptive: &anthropic.ThinkingConfigAdaptiveParam{},
}
outCfg := anthropic.OutputConfigParam{
Effort: anthropic.OutputConfigEffort(effort),
}
return thinking, outCfg, 32768
default: // off / unset
thinking := anthropic.ThinkingConfigParamUnion{
OfDisabled: &anthropic.ThinkingConfigDisabledParam{}, // zero value is enough; the Type constant marshals itself
}
return thinking, anthropic.OutputConfigParam{}, anthropicMaxTokens
}
}
// In StreamChat, when building params:
thinking, outCfg, maxTok := anthropicThinkingParams(c.thinkingEffort)
params := anthropic.MessageNewParams{
Model: c.model,
MaxTokens: maxTok,
Messages: msgParams,
Thinking: thinking,
OutputConfig: outCfg,
}
ThinkingConfigDisabledParam can be constructed directly as
&anthropic.ThinkingConfigDisabledParam{} (the Type field is a constant.Disabled
which zero-marshals correctly), or via anthropic.NewThinkingConfigDisabledParam().
The important change from Step 1 is that Anthropic should consume the raw registry
value directly; this layer should not collapse max back to high.
3b. Google AI — internal/llm/genkit.go
Add thinkingEffort string to GenkitClient. In StreamChat, append a
ai.WithConfig(&genai.GenerateContentConfig{...}) option when effort is set:
import "google.golang.org/genai"
if c.thinkingEffort != "" {
opts = append(opts, ai.WithConfig(&genai.GenerateContentConfig{
ThinkingConfig: &genai.ThinkingConfig{
IncludeThoughts: true,
ThinkingBudget: genai.Ptr(int32(budgetForEffort(c.thinkingEffort))), // field is *int32
},
}))
}
Check the google.golang.org/genai version pinned in go.mod (v1.41.0) for the
exact signatures. In the installed SDK the relevant fields are
ThinkingConfig.IncludeThoughts and ThinkingConfig.ThinkingBudget.
Gemini is the only provider in this plan where the registry value is still a label that Keen maps to an internal numeric budget. Keep that mapping local to the Google client; do not push budgets into the registry.
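The label-to-budget mapping is referenced but not defined in this plan. A minimal sketch of budgetForEffort follows. The token budgets are placeholder values chosen for illustration, not numbers from this plan or from Google's documentation, and falling back to the medium budget for an unknown label is also an assumption.

```go
package main

import "fmt"

// Placeholder budgets (in tokens). These numbers are assumptions for this
// sketch and should be tuned against the target Gemini models.
const (
	budgetLow    int32 = 1024
	budgetMedium int32 = 8192
	budgetHigh   int32 = 24576
)

// budgetForEffort maps a registry label to a numeric thinking budget.
// Callers are expected to skip the thinking config entirely when effort is "",
// so the default branch only covers unexpected labels (assumed fallback: medium).
func budgetForEffort(effort string) int32 {
	switch effort {
	case "low":
		return budgetLow
	case "high":
		return budgetHigh
	default:
		return budgetMedium
	}
}

func main() {
	for _, effort := range []string{"low", "medium", "high"} {
		fmt.Printf("%-6s -> %d tokens\n", effort, budgetForEffort(effort))
	}
}
```

Keeping the constants next to the Google client, rather than in the registry, matches the rule above that budgets stay local to this layer.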
3c. OpenAI — internal/llm/openai_responses.go
Add thinkingEffort string to OpenAIResponsesClient. In StreamChat, set
params.Reasoning when effort is non-empty:
import "github.com/openai/openai-go/shared"
if c.thinkingEffort != "" {
params.Reasoning = shared.ReasoningParam{
Effort: reasoningEffortForLevel(c.thinkingEffort),
}
}
func reasoningEffortForLevel(effort string) shared.ReasoningEffort {
switch effort {
case "low": return shared.ReasoningEffortLow
case "medium": return shared.ReasoningEffortMedium
case "high": return shared.ReasoningEffortHigh
case "xhigh": return shared.ReasoningEffort("xhigh")
default: return ""
}
}
This needs to stay aligned with Step 1's registry values. The currently installed
openai-go version exposes constants only for low|medium|high, so xhigh must
either be passed as a raw string cast or be handled after an SDK upgrade.
3d. OpenAI-compatible (DeepSeek / Moonshot) — internal/llm/openai.go
Add thinkingEffort string field to OpenAICompatibleClient for symmetry; no-op
in request building (these providers have always-on reasoning).
Step 4: Model selection — add StepThinking
File: internal/cli/repl/widgets/model_selection.go
Current: StepProvider → StepModel → StepAPIKey
New: StepProvider → StepModel → StepThinking (if the selected model has registry-defined thinking efforts) → StepAPIKey
Changes:
- Add StepThinking constant to the existing StepProvider / StepModel / StepAPIKey enum
- Add fields to widgets.Model: ThinkingCursor int, ThinkingOptions []string, SelectedThinking string
- Update handleKeyMsg():
  - after StepModel confirm, call registry.GetModel(provider, model)
  - if SupportsThinkingEffort() is true, build ThinkingOptions as append([]string{"off"}, model.ThinkingEfforts...) and advance to StepThinking
  - otherwise continue directly to StepAPIKey
- Initial selection logic:
  - if current saved resolvedCfg.ThinkingEffort == "", preselect off
  - else if current saved resolvedCfg.ThinkingEffort is in ThinkingOptions, preselect it
  - else if medium is supported, preselect medium
  - else preselect off
- Add renderThinkingSelection() and route it from ViewString()
- handlePasteMsg() remains API-key-only; no change needed for thinking selection
- In complete():
  - store the selected provider-native effort in GlobalConfig / ResolvedConfig
  - map UI choice off to ""
  - if the chosen model does not support configurable effort, clear ThinkingEffort to "" so stale incompatible values do not carry across model switches
  - keep the existing onComplete(...) callback pattern; repl.startModelSelection() already reinitializes the LLM client through updateLLMClient()
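The initial selection logic above can be sketched as a pure function that is easy to unit test. The names thinkingOptions and initialThinkingCursor are hypothetical; the widget may structure this differently.

```go
package main

import (
	"fmt"
	"slices"
)

// thinkingOptions prepends the UI-only "off" choice to the registry values.
func thinkingOptions(efforts []string) []string {
	return append([]string{"off"}, efforts...)
}

// initialThinkingCursor returns the index to preselect in options:
// the saved value (with "" meaning off) if present, else medium, else off.
func initialThinkingCursor(saved string, options []string) int {
	want := saved
	if want == "" {
		want = "off"
	}
	for _, candidate := range []string{want, "medium", "off"} {
		if i := slices.Index(options, candidate); i >= 0 {
			return i
		}
	}
	return 0 // options always starts with "off"; this is a defensive default
}

func main() {
	gemini := thinkingOptions([]string{"low", "medium", "high"})
	for _, saved := range []string{"", "high", "xhigh"} {
		i := initialThinkingCursor(saved, gemini)
		fmt.Printf("saved=%q -> preselect %q\n", saved, gemini[i])
	}
}
```

The third case shows the model-switch scenario: a saved xhigh from an OpenAI model is not valid for a Gemini model, so the cursor falls back to medium.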
Step 5: /thinking command
File: internal/cli/repl/commands/commands.go
{"/thinking", "Change thinking effort for the current model"},
This updates slash-command suggestions only. The help text is defined separately in internal/cli/repl/repl.go, so add /thinking to both places.
File: internal/cli/repl/repl.go
Add constant thinkingCommand = "/thinking".
In handleEnterKey, parse the argument from the input line:
if input == thinkingCommand || strings.HasPrefix(input, thinkingCommand+" ") {
modelMeta, ok := m.ctx.registry.GetModel(m.ctx.cfg.Provider, m.ctx.cfg.Model)
if !ok || !modelMeta.SupportsThinkingEffort() {
// show error: "Current model does not support configurable thinking"
...
}
effort := strings.TrimSpace(strings.TrimPrefix(input, thinkingCommand))
allowed := append([]string{"off"}, modelMeta.ThinkingEfforts...)
if !slices.Contains(allowed, effort) {
// show error: "Usage: /thinking " + strings.Join(allowed, "|")
...
}
storedEffort := effort
if effort == "off" {
storedEffort = ""
}
m.ctx.cfg.ThinkingEffort = storedEffort
m.ctx.globalCfg.ThinkingEffort = storedEffort
if err := m.ctx.loader.Save(m.ctx.globalCfg); err != nil { ... }
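// Reinitialize the LLM client so the next request uses the new effort
newClient, err := llm.NewClient(m.ctx.cfg)
if err != nil { ... }
// update appState.llmClient to newClient and show confirmation: "Thinking effort set to: EFFORT"
// (EFFORT is the UI value, so "off" is shown as off rather than an empty string)
}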
No new message type or UI selection flow required.
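The parsing and validation part of the handler can be pulled into a standalone function so it is testable without the REPL. This is a sketch only; parseThinking is a hypothetical name, and the error strings reuse the user-facing messages from the snippet above.

```go
package main

import (
	"errors"
	"fmt"
	"slices"
	"strings"
)

const thinkingCommand = "/thinking"

// parseThinking validates a "/thinking <effort>" line against the current
// model's registry efforts and returns the value to store ("" for off).
// The errors carry the user-facing text the REPL would display.
func parseThinking(input string, efforts []string) (string, error) {
	if len(efforts) == 0 {
		return "", errors.New("Current model does not support configurable thinking")
	}
	effort := strings.TrimSpace(strings.TrimPrefix(input, thinkingCommand))
	allowed := append([]string{"off"}, efforts...)
	if !slices.Contains(allowed, effort) {
		return "", fmt.Errorf("Usage: %s %s", thinkingCommand, strings.Join(allowed, "|"))
	}
	if effort == "off" {
		return "", nil
	}
	return effort, nil
}

func main() {
	openai := []string{"low", "medium", "high", "xhigh"}
	for _, in := range []string{"/thinking xhigh", "/thinking off", "/thinking bad", "/thinking"} {
		stored, err := parseThinking(in, openai)
		fmt.Printf("%q: stored=%q err=%v\n", in, stored, err)
	}
}
```

A bare /thinking with no argument yields an empty effort string, which is not in the allowed list, so it falls through to the usage message built from the current model's values.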
Step 6: Display thinking status
File: internal/cli/repl/repl.go
inputMetaView(): append · thinking:EFFORT to the status line when cfg.ThinkingEffort != "" and the current model has SupportsThinkingEffort().
buildInitialScreen(): add a Thinking: EFFORT line (next to Model/Provider) when cfg.ThinkingEffort != "" and the current model supports configurable effort.
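Both display rules share the same guard, which can be sketched as two small formatting helpers. The function names thinkingStatusSuffix and thinkingScreenLine are hypothetical; the exact spacing and styling in inputMetaView() and buildInitialScreen() are assumptions.

```go
package main

import "fmt"

// thinkingStatusSuffix returns the fragment appended to the status line,
// or "" when no effort is set or the model has no configurable effort.
func thinkingStatusSuffix(effort string, supportsEffort bool) string {
	if effort == "" || !supportsEffort {
		return ""
	}
	return " · thinking:" + effort
}

// thinkingScreenLine returns the initial-screen line under the same guard.
func thinkingScreenLine(effort string, supportsEffort bool) string {
	if effort == "" || !supportsEffort {
		return ""
	}
	return "Thinking: " + effort
}

func main() {
	fmt.Println("claude-sonnet-4-6" + thinkingStatusSuffix("max", true))
	fmt.Println("claude-haiku-4-5" + thinkingStatusSuffix("max", false)) // stale value is hidden
	fmt.Println(thinkingScreenLine("xhigh", true))
}
```

Checking model support as well as the stored value keeps a stale effort from appearing if the config was edited by hand or the model changed outside the /model flow.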
Critical Files
| File | Change |
|---|---|
| providers/registry.yaml | Add per-model thinking_efforts using provider-native values |
| providers/loader.go | ThinkingEfforts []string; SupportsThinkingEffort(); GetModel() helper |
| internal/config/config.go | ThinkingEffort in GlobalConfig + ResolvedConfig; copy in Resolve() |
| internal/cli/cmd/root.go | Set ThinkingEffort in manual ResolvedConfig construction |
| internal/llm/models.go | ThinkingEffort in ClientConfig + NewClient pass-through |
| internal/llm/anthropic.go | Enable/disable thinking based on non-empty effort; pass through low\|medium\|high\|max directly |
| internal/llm/genkit.go | Map low\|medium\|high to Gemini ThinkingBudget; omit config when effort is empty |
| internal/llm/openai_responses.go | Set params.Reasoning.Effort from low\|medium\|high\|xhigh; omit when effort is empty |
| internal/llm/openai.go | Store field; no-op |
| internal/cli/repl/widgets/model_selection.go | StepThinking in full /model flow using per-model option lists |
| internal/cli/repl/commands/commands.go | Register /thinking for slash-command suggestions |
| internal/cli/repl/repl.go | Handle /thinking <effort> against current model's option list; update help text and status display |
Test Updates
Because the REPL package is now split into subpackages, the test work should be split the same way instead of being treated as a single monolithic repl change:
- providers/loader_test.go
  - verify thinking_efforts load from YAML
  - verify GetModel(provider, model) and SupportsThinkingEffort()
- internal/config/config_test.go and internal/config/loader_test.go
  - verify ThinkingEffort survives save/load
  - verify Resolve() copies ThinkingEffort
- internal/llm/models_test.go
  - verify ThinkingEffort is threaded into the constructed client structs
- internal/llm/anthropic_test.go
  - verify off/empty disables thinking and restores anthropicMaxTokens
  - verify low|medium|high|max set the expected Anthropic params
- internal/llm/openai_responses_test.go
  - verify non-empty effort populates params.Reasoning
  - verify xhigh is passed as the raw string-backed enum value
- internal/llm/openai_test.go
  - verify OpenAI-compatible clients ignore ThinkingEffort
- internal/llm/genkit_test.go or existing Genkit tests
  - verify the Google client maps labels to budgets only when effort is non-empty
- internal/cli/repl/widgets/model_selection_test.go
  - add focused widget tests for StepThinking transitions, defaults, and persistence into resolvedCfg / globalCfg
- internal/cli/repl/repl_test.go
  - add /thinking command tests
  - add inputMetaView() / buildInitialScreen() coverage for thinking display
- internal/cli/repl/widgets/suggestion_test.go
  - update command-count and command-order expectations for the extra /thinking slash command
SDK API Quick Reference
anthropic-sdk-go v1.37.0
// Enable adaptive thinking with effort level:
params.Thinking = anthropic.ThinkingConfigParamUnion{
OfAdaptive: &anthropic.ThinkingConfigAdaptiveParam{},
}
params.OutputConfig = anthropic.OutputConfigParam{
Effort: anthropic.OutputConfigEffortMax, // Low / Medium / High / Xhigh / Max
}
params.MaxTokens = 32768
// Disable thinking:
params.Thinking = anthropic.ThinkingConfigParamUnion{
OfDisabled: &anthropic.ThinkingConfigDisabledParam{},
}
params.MaxTokens = 16192 // anthropicMaxTokens
openai-go (Responses API)
params.Reasoning = shared.ReasoningParam{
Effort: shared.ReasoningEffortHigh, // Low / Medium / High
}
The installed openai-go (v1.8.2) models ReasoningEffort as a string type but
only exposes constants for low|medium|high. If Step 1 keeps xhigh in the
registry, use shared.ReasoningEffort("xhigh") or upgrade the SDK before coding.
google.golang.org/genai v1.41.0 (via Genkit's googlegenai plugin)
Installed SDK fields:
ai.WithConfig(&genai.GenerateContentConfig{
ThinkingConfig: &genai.ThinkingConfig{
IncludeThoughts: true,
ThinkingBudget: ptr(int32(budgetForEffort(effort))),
},
})
Verification
1. go mod tidy && go test ./... (all tests pass)
2. /model → claude-sonnet-4-6 → thinking step appears with off|low|medium|high|max; if the saved value is non-empty but not supported by the model, the preselection is medium (an empty saved value preselects off, per Step 4)
3. /model → deepseek-chat → thinking step skipped
4. /model → gpt-5.4 → thinking step appears with off|low|medium|high|xhigh
5. /thinking max on claude-sonnet-4-6 and /thinking xhigh on gpt-5.4 both persist in session + ~/.keen/configs.json
6. /thinking bad shows usage with the current model's valid values, not a hard-coded global list
7. /thinking off clears ThinkingEffort in runtime/config and omits provider reasoning config on the next request
8. Switching from a model with xhigh or max to one that does not support that value preselects a compatible fallback (medium if available, else off)
9. Cold start: quit and relaunch keen → ThinkingEffort still applied (proves root.go fix)
10. Status line shows thinking:max, thinking:xhigh, etc. using the actual stored provider-native value
11. claude-haiku-4-5: /model skips thinking step; /thinking shows error
12. Slash suggestions (/) and /help both show /thinking