Keen Code Roadmap - April 2026¶
Purpose¶
This roadmap turns the current product discussion into a concrete direction for making Keen Code realistically useful for software engineers and competitive with modern CLI-based coding agents.
The product is intentionally minimal today. That is a strength as long as the next features are chosen carefully. The roadmap below keeps that constraint in mind:
- Preserve a lightweight, terminal-first experience
- Add capabilities that improve daily engineering workflows
- Avoid feature bloat that can be handled through extensibility
- Prefer strong infrastructure over shipping lots of bundled behavior
Current Baseline¶
Keen Code currently provides:
- A REPL-based coding agent UI
- A small built-in toolset:
read_file,glob,grep,write_file,edit_file,bash - Filesystem guard and interactive permission prompts
- Model/provider selection
- Context compaction
- Basic project-instruction loading from files like
AGENTS.md
This is enough for focused single-agent tasks, but not yet enough for long, high-trust, team-oriented engineering workflows.
Product Principles¶
These principles should shape roadmap decisions:
- Keep the core small and composable
- Prefer infrastructure over hardcoded workflows
- Make risky behavior explicit and reviewable
- Optimize for real software engineering tasks, not toy demos
- Support team adoption through repo-local configuration and conventions
Tier Definitions¶
- Tier 1: Required for daily usability and trust
- Tier 2: High-leverage features that expand capability significantly
- Tier 3: Advanced workflow features that improve competitiveness and polish
Tier 1: Core Usability And Trust¶
1. Persistent Sessions And Resume ✅¶
What it should do
Allow users to persist conversation state and later resume it from disk. A session should capture enough context to continue work without starting over.
Why it matters
Serious engineering tasks often span hours or days. Losing state when the process exits makes the tool less practical than competing agents.
Key behaviors
- Save session history locally
- Reopen the most recent session or choose from prior sessions
- Preserve compacted state and relevant metadata
- Support explicit session naming for longer tasks
Considerations
- Storage format should be stable and inspectable
- Session history may contain sensitive code or prompts, so local storage behavior must be clear
- Large sessions need compaction-aware persistence
2. Autopilot Mode¶
What it should do
Add an explicit autopilot mode that bypasses interactive permission prompts
for non-bash tools while keeping the existing permission behavior as the
default mode.
Why it matters
Interactive prompts are useful by default, but they create friction during trusted in-repo editing workflows. An explicit autopilot mode reduces that friction without making the permission model ambiguous.
Key behaviors
- Keep the current permission system unchanged in normal mode
- Add an explicit
autopilotmode users can opt into - In
autopilot, bypass interactive permission prompts for non-bashtools - Keep
bashpermission handling separate from autopilot - Surface the active mode clearly in the UI
Considerations
- Autopilot should bypass prompts, not the core filesystem guard
- Blocked paths should remain blocked
- The behavior must be easy to explain: normal mode prompts, autopilot
auto-allows non-
bashactions within existing guardrails - The UI should make autopilot highly visible so users do not forget it is on
3. Web Search And Fetch¶
What it should do
Allow the agent to retrieve external documentation and web content when the task requires current or external information.
Why it matters
Modern software work regularly depends on upstream docs, release notes, libraries, API references, and issue trackers.
Key behaviors
- Search the web for relevant sources
- Fetch pages or documentation content
- Present citations or source references in responses
- Respect user approval and clear network access rules
Considerations
- Network access should remain explicit and policy-controlled
- Source quality matters; official docs should be preferred when possible
- Output should distinguish retrieved facts from model inference
Tier 2: Capability Expansion¶
4. Native Code Review Mode¶
What it should do
Provide a first-class way to review local changes, diffs, or specific files in review mode instead of implementation mode.
Why it matters
Code review is one of the most common and highest-value software engineering workflows for coding agents.
Key behaviors
- Review unstaged changes, staged changes, or arbitrary diffs
- Report findings ordered by severity
- Focus on bugs, regressions, risks, and missing tests
- Optionally generate fix suggestions without applying them automatically
Considerations
- The output format should be concise and trustworthy
- The review mode should avoid drifting into summary-heavy responses
- Good review quality may require diff-aware prompting and git-aware helpers
5. MCP Support¶
What it should do
Add Model Context Protocol support so Keen Code can connect to external tools and systems through a standard integration layer.
Why it matters
MCP is the cleanest path to extensibility without turning Keen Code into a collection of one-off integrations.
Key behaviors
- Discover and register MCP servers
- Expose MCP-provided tools and resources to the model
- Respect existing permission and approval flows
- Surface connection and failure status in the UI
Considerations
- Tool trust and provenance must be visible to the user
- Authentication and local configuration should be straightforward
- MCP should complement built-in tools, not create overlapping confusion
6. Subagents And Delegation¶
What it should do
Let the agent delegate bounded subtasks to additional agents when parallel or role-specific work would improve throughput.
Why it matters
Large engineering tasks benefit from splitting exploration, implementation, testing, and review instead of forcing one linear agent thread.
Key behaviors
- Spawn subagents for scoped tasks
- Keep ownership boundaries clear
- Allow result collection and integration into the main thread
- Make delegated work visible in the interface
Considerations
- Delegation should not become chaotic or opaque
- Approval and filesystem boundaries still apply
- The main thread should remain understandable to the user
7. LSP And Diagnostics Integration¶
What it should do
Integrate with language servers or equivalent diagnostics sources to provide semantic navigation and actionable feedback.
Why it matters
Text search is useful, but semantic code understanding is much stronger for non-trivial codebases.
Key behaviors
- Retrieve diagnostics
- Support definitions, references, and symbols
- Support rename previews and other semantic actions where safe
- Feed compiler and editor-quality feedback into the model loop
Considerations
- Language support should degrade gracefully when no server is available
- Diagnostics should be presented compactly
- Integration should avoid making startup heavy or fragile
8. Custom Commands And Reusable Workflows¶
What it should do
Allow users and teams to define reusable commands that package common prompts, instructions, or workflow entry points.
Why it matters
Teams often repeat the same tasks: fixing tests, preparing releases, reviewing PRs, or auditing changes. Commands reduce friction.
Key behaviors
- Support project-local and user-global custom commands
- Expose command descriptions and discoverability in the UI
- Allow commands to compose with built-in product features
- Keep commands easy to version in a repository
Considerations
- Commands should remain inspectable and not hide dangerous behavior
- Naming and precedence rules should be simple
- Commands should complement core features instead of replacing them
- Can be implemented using agent skills
9. Persistent Memory Beyond Project Instructions¶
What it should do
Provide a way to retain useful learned context across sessions beyond static
instruction files like AGENTS.md.
Why it matters
Teams accumulate local knowledge that is too specific for the base prompt but too valuable to rediscover repeatedly.
Key behaviors
- Store durable notes or structured memory
- Separate project memory from user-global memory
- Make stored memory visible and editable
- Allow explicit opt-in and cleanup
Considerations
- Memory should be reviewable and not silently mutate behavior
- Stored context must not become stale or misleading
- Repo-local memory should be easy to version or ignore intentionally
Tier 3: Competitiveness And Workflow Depth¶
10. Hooks And Automation¶
What it should do
Provide hook points around agent and tool activity so users can run validation, formatting, notifications, or policy checks automatically.
Why it matters
Hooks make the agent fit established engineering workflows instead of requiring teams to work around it manually.
Key behaviors
- Support pre-tool and post-tool hooks
- Support validation hooks after edits
- Support notifications or logging hooks
- Keep hook execution visible and debuggable
Considerations
- Hook failures must be understandable
- Hooks should not silently introduce unsafe behavior
- Execution order and retry behavior need clear rules
11. Richer Built-In Tools¶
What it should do
Expand the built-in toolset only where the capability is foundational and too common to leave entirely to external integrations.
Why it matters
A small toolset is good, but some missing primitives create unnecessary model friction and poor reliability.
Candidate additions
- Directory listing
- Move or rename files
- Better diff inspection
- Task or todo tracking
- Structured patch application
- Test runner helpers
Considerations
- Every new built-in tool adds maintenance and prompt complexity
- Avoid adding tools that are better handled through MCP
- Prefer tools that materially improve reliability over convenience-only tools
12. Worktree And Branch Isolation¶
What it should do
Support isolated work environments for tasks that should not modify the current working tree directly.
Why it matters
Isolation improves safety for larger autonomous tasks, experiments, and review flows.
Key behaviors
- Create or target dedicated worktrees or branches
- Keep task state associated with the isolated workspace
- Make isolation explicit in the UI
Considerations
- Git interactions become more complex quickly
- Users need clear visibility into where edits are happening
- This feature becomes more valuable after sessions and delegation exist
13. Session Sharing And Export¶
What it should do
Allow users to export or share session artifacts for collaboration, debugging, or reproducibility.
Why it matters
Shared sessions help teams understand agent behavior, review outputs, and file useful bug reports.
Key behaviors
- Export session transcripts and metadata
- Share compacted context and decisions
- Support local export first, remote sharing later
Considerations
- Shared sessions may include sensitive data
- Export format should be easy to inspect
- Sharing is lower priority than local usability and trust features
14. Image Input And Multimodal Workflows¶
What it should do
Allow the agent to use screenshots or image inputs when supported by the active model and workflow.
Why it matters
This is especially useful for frontend work, design implementation, and certain debugging tasks.
Key behaviors
- Attach local images to prompts
- Surface model capability constraints clearly
- Keep image use optional and lightweight
Considerations
- Not all providers or models will support this equally
- The UI and prompt pipeline should degrade gracefully
- This is useful, but not as foundational as the higher-priority items
Proposed Implementation Sequence¶
The roadmap should not be built strictly by tier order alone. Dependencies and infrastructure matter. A practical sequence is:
- Persistent sessions and resume
- Autopilot mode
- Web search and fetch
- Native code review mode
- MCP support
- Persistent memory
- LSP and diagnostics
- Custom commands and reusable workflows
- Subagents and delegation
- Hooks and automation
- Richer built-in tools
- Worktree and branch isolation
- Session sharing and export
- Image input
Out Of Scope For The Near Term¶
These ideas may become useful later, but should not distract from the roadmap above:
- Heavy cloud task orchestration
- Complex auto-triggered workflows
- Provider-specific feature divergence where it harms consistency
- Large numbers of built-in niche tools
Success Criteria¶
Keen Code should feel competitive when a software engineer can:
- Start work in a repo and safely let the agent explore and edit
- Pause and resume multi-hour tasks without losing context
- Review code changes as easily as generating them
- Pull in external documentation when needed
- Reuse local team workflows through product features and custom commands
- Extend the tool through MCP instead of waiting on core changes
That should remain the bar for roadmap decisions: improve real engineering utility without compromising clarity, control, or product simplicity.