Kiro

By Arjen Schwarz (17 minutes read)

In early June, I received an email inviting me to an early look at Kiro. The only thing I knew at that time was that it was a new agentic IDE and created by AWS. Obviously, I was very interested in this, and today Kiro is available to everyone in public preview so that I can talk about it.

Like many others I’ve become more and more interested in agentic coding over the past months, and I’m usually using several different tools at the same time. Whether it’s a VSCode plugin like Cline or Copilot, a terminal app like Claude Code, a web-based tool like OpenAI’s Codex, or a standalone VSCode fork. Kiro falls in that last category, which has its pros and cons. The introduction blog post and official documentation covers the basics like installing, on-boarding, etc. So I’m not going to go into that. Instead I’ll focus on what I like about it, what can be improved, and what I learned about how to make efficient use of it.

What does Kiro do?

In many ways Kiro is an extension of VSCode, with the aim to make it a single experience focused on agentic coding. This, of course, means that it has the now familiar sidebar with a chat agent, but the strength of Kiro actually lies in the toolbar on the other side. This is where you find the MCP config (as expected), but also 3 other sections:

Specs
Agent hooks
Agent steering

The Kiro sidebar, waiting for you to start your project

These 3 are the core of what makes Kiro, well, Kiro. The ideas aren’t unique, but they are well implemented in a nice package. Let’s start with the steering files.

Steering

If you’ve done any agentic coding, you’re likely familiar with steering files. They have different names in different tools, but in general they are the files that guide/steer your agent into making the best use of your environment. This means things like understanding the structure of your code, what it does, and what it uses to achieve this result. When you first open a project in Kiro, it will show you a button to generate these files, but then it creates for you a product, structure, and tech Markdown file. These are fairly succint¹, but mostly show how it all works. The structure of these files has changed a bit over the releases, but I can’t say right now if that’s because of tweaks or that my original ones were generated by Claude Sonnet 3.7, while the later ones were with Claude Sonnet 4.

Aside from these generated ones however, you can still add your own steering files. This means you can leave the generated ones alone (and potentially regenerate) and create new ones specifically for your purposes. Fun fact, when dealing with an early version of a tool like this, you can end up using these a lot to work around issues, some examples I’ve had to use:

Never include cd . when running terminal commands
You can’t read the terminal output, so pipe all commands to output.log and read the results from there²

So, these steering files are really nice. They allow you to easily have a good start for your agent to work with, and add custom rules you want to include. My biggest complaint with them, though, is that there is currently no option for a global steering file (or multiple files). And that is something I really miss, as there are plenty of things I’d like to have consistent across projects without needing to copy them.

So, what’s the best way to use these? Well, you generate them, create your own to put in your own additions, and whenever you’ve done a major update, you should ask for them to be regenerated. That way the steering files will remain up to date. And of course, there are all the usual things you should do for an agent file, like make it easy for the agent to find tools (tell it to use your make commands or other scripts if you have them), include your own preferences around tool use and even things like spelling or comment structure. Do note that when you ask Kiro to regenerate your steering files, it will also do that for the ones you manually made.

Getting hooked

The hooks are a useful integration. It’s basically a set of pre-defined prompts that you can have automatically trigger, or manually. Again, Kiro is not the first to implement something like this (although the automatic part isn’t very common yet), but they did a good job here with the creation bit. Basically the flow is that you will provide a prompt of what you want to achieve, and then based on the steering files, and its knowledge of the project it will provide you with an efficient code. For example, I gave it the prompt

Every time a task is finished, I want to check the code for efficiency. If there is a better way to get the same result, this should be noted in a file called IMPROVEMENTS.md.

Kiro went to work and it created a hook with the description as shown below.

Interestingly is that as it can’t trigger on the agent chat finishing a task, it figured a way around it by monitoring the task files (I’ll come back to these in a moment), but it’s nice to see it find a solution to my request. Now, these files will be edited for other reasons as well, and that’s presumably where the first line comes in. If I were to rewrite it, I would probably want to say something like “Verify that a task has been created or updated by running a git diff, if another change was made to the file, you will stop here.”

Because that brings me to one thing I’ve found about these automated hooks, they can block you from continuing. Kiro can only have a single agent running in a window (using it simultaneously in multiple projects seems to be fine), so if you have a hook that will run after the update of a file it can block the execution of your main thread. Now, it will usually be nice and wait until the main thread is finished, but the moment that has a break (whether because it asks for permission or because it believes it’s finished), the hook will jump in and run.

I’ve had times where I couldn’t figure out why it wouldn’t continue when I prompt it, and usually this would be because Kiro was busy running its hooks. And maybe it asked for permission to run a command, or it was trying to fix some unit tests. You can’t know until you go there, which you can do via the Task list. Alternatively, you can also see which hook is running in the Kiro sidepanel by way of the spinning icon, but you’ll still need to go to the task list to interact with it.

3 states of a hook, running, waiting for manual action, and disabled

All in all, the hooks are a good addition, and I miss tools that don’t have them. Now, what are some things I’ve found work well? My usual “Update changelog and create commit” task is one that works well, because it’s a manual trigger. This particular one is one that I’ve got configured in pretty much all of my agentic coding tools, and my only gripe here is that I have to redefine it in each project instead of having it as a global command.

Other than that, because the hooks are blocking, you want the automated ones to be fast. That means something like “Go format and test” which runs all of the tests and linting, takes too long. Instead, I’ve since updated this to just run “go format”. Testing and linting are instead included in the agents steering file as part of all tasks. Another option to speed things up is to ask it to write its findings to a file, which you can then include in your work later on.

Specs

And now we come to the specs. Specs are basically a sort of information gathering phase. It works by providing your requirements, after which it will turn this into an EARS (Easy Approach to Requirements Syntax) style requirements document. An example of this is the below that was generated as part of one of my project improvements (specifically, an improvement to the automated AWS profile generation for Identity Center logins that I added recently).

### Requirement 1

**User Story:** As an AWS CLI user, I want to specify how existing profiles should be handled during profile generation, so that I can control whether they are replaced or preserved.

#### Acceptance Criteria

 1. WHEN a user runs the profile generator THEN the system SHALL provide a flag to control existing profile behavior
 2. IF the user specifies `--replace-existing` flag THEN the system SHALL replace existing profiles with new names based on the pattern
 3. IF the user specifies `--skip-existing` flag THEN the system SHALL skip generating profiles for roles that already have profiles
 4. WHEN no existing profile handling flag is provided THEN the system SHALL default to prompting the user for each conflict
 5. IF both flags are provided THEN the system SHALL reject the command with a validation error

One downside I’ve found is that this is not an interactive process in itself. Kiro will generate the requirements document, and then ask you if it’s ok. It doesn’t ask questions in between about things that are unclear, but will make assumptions. Often this is fine, and you can make it adjust it (or even do it manually), but this is the part where input is good. That said, you can keep iterating on it during this phase.

It’s the same thing with the design document that will be generated based on the requirements document. While this goes through your code, and looks at your requirements, it doesn’t actually seem to use the tools it has at its disposal unless you explicitly tell it to. That said, I’ve mostly had it work on tooling around AWS, so it’s possible it had that knowledge already available, but it might be worth telling it to use your MCPs etc. for getting up-to-date information. Interestingly, a lot of the decisions that seem to be made by Kiro in the requirements phase, actually show up with their rationale in the design document, along with any other decisions it makes for you. ALWAYS pay careful attention to this section, as it’s the most likely to bite you.

The task list is the next thing to be created. This is a simple list of tasks, usually grouped and they will always have subtasks. The bottom item on the subtasks is actually a reference to the specific requirement from the requirements.md file, which is a bit confusing when you first see it as you may think it reflects dependencies between the tasks.

An overview of tasks, showing the start task triggers

A very nice thing that you can see in the image above, is that you can trigger the tasks (and look at their run) directly from the tasks file. This is quite handy if you want to do a phased implementation and trigger each task manually and then review the work. Of course, you can also tell it to execute all tasks in the chat.

One thing I would really like with the specs is a way to archive them, or to somehow mark them as done. Yes, I can move them out of the .kiro/specs directory manually, but I’d love to have a way within the kiro interface to see a difference between implemented specs and once that are still open. The list of my specs in some projects is starting to get large, and I fear that will only get bigger.

Personally, I think the idea of these specs is great and have, ehm, been inspired by them when using other tools. While the plan mode in Cline or Claude Code is similar in idea, I really like how structured this works. Yes, I would probably like more back and forth with questions, but that can be replicated after the first time a document is generated. The one thing I haven’t really managed to find a good way to do is make additions to the requirements of a spec that I’ve already partially implemented and have that turned into good updates. To clarify, I have done so, but it makes it look quite messy, which is a shame as it makes them less nice for documentation purposes.

As for what works best with these, it’s simply to follow the flow, but pay attention to the output and interact. Make sure that you agree with any assumptions it made, and otherwise change them. Tell it to tweak or even retry the generation of the design document or requirements. These are literally the most important parts of your work, because once the requirements and design are done well, you can easily tell the agent to implement the tasks without needing to look over its shoulder.

And speaking of the agent…

The agent

Unlike most agentic tools, Kiro’s agent comes with a simple question to start with: Do you want to go through the whole process of building a Spec, or just Vibe. With the Spec option you’ll be going through the process I described above, and with Vibe you’ll bypass that and have it act like a standard agent. To be clear, that seems to be the only difference between the two. And once you’ve chosen, there are a couple of options left: run in Autopilot or Supervised mode, and which model do you want to use: Claude Sonnet 4, or Claude Sonnet 3.7. Personally, I’m not entirely sure why 3.7 is still being offered as an option (to be fair, Sonnet 4 is still quite new), but if you prefer it over 4 then you can do so.

There is also the option to add images, or a whole slew of context options, to your query. This works as expected and can be useful to provide the extra information the agent would want. Actually, I should probably call the agent by its name, Kiro, as that how it shows up in the chat interface.

As for how well Kiro performs, it depends. Obviously, the better input you give it, the more you’ll get out of it, which is where the specs come in. Using the specs makes it a lot easier to get higher quality input, as you’ve already spent time working out the requirements and design. That said, I’ve had some great results, and I’ve had it generate a 2300 line bash script to serve as a GitHub Action. Below I’ve got a comparison though that will let you have a look yourself.

Tools

An agent doesn’t work alone; it needs to interact with its tools, and that’s where Kiro isn’t quite there yet. On my main MacBook, I’ve been having some terminal integration issues. I don’t have this issue on my Mac Studio, so I fully expect that it’s something specific to this machine, but it’s frustrating as I don’t have the same problem in VSCode with either Cline or Copilot. This has improved, though, as most of the time it works fine, but I feel that having it be inconsistent is worse.

The bigger issue is the MCP integration. Again, it’s improved a lot since my first look at Kiro, but it still doesn’t seem to support the newer features, such as remote or streamable MCP servers. This is a bit disappointing. Yes, as referenced in the linked issue there are workarounds, but we shouldn’t need hacky ways to do a standard action. I’m pretty confident that this will be resolved by the time the public preview is over, but it might be best if you (the person/AI reading this) upvotes that ~~bugreport~~ feature request. Some other things I’d like to see in the MCP integration is the ability to disable not just the whole MCP server, but specific commands in there.

Some other things to note: trusting commands is turning into a nice experience. As per the below picture, it asks you which of three levels of trust you’re happy to give it; the full command, the partial command (usually the base command and first argument), or the base command itself. This is a nice interface, but I’d like to see the ability to disallow specific commands. As an example, I might like to allow git * to run automatically, but not git commit. Right now, I would need to allow each subcommand before getting that result. Still, it’s actually a good interface for this.

But, the actual result of trusting commands is a bit unclear. Sometimes it works perfectly, other times it will keep asking you to trust the command. This is especially true when piping output to a file or when chaining commands. Even something like timeout 30 go test ./... can ignore the fact that I’ve whitelisted all go commands (and go test specifically) and ask for confirmation. As it happens, I’m not a fan of whitelisting a command like timeout * when I don’t know how that’s going to be checked. That said, it is possible to modify the whitelist manually so you can make that more suitable to your needs.

The inevitable comparison

One thing I did yesterday (early today? it’s 1pm right now, so who knows), after Kiro was updated to the public release version, was ask both Kiro and Claude Code to implement a feature so I can compare the results. Would it have been a better comparison if I used something like Copilot or Cline? Probably, but with Copilot, that meant I’d have to approve commands every couple of minutes, and Cline would’ve used my actual money.

Anyway, using the Claude chat interface, I asked Claude Opus 4 to create a requirements and design document for implementing proper error handling and validation in my go-output library. It dutifully did so, and then I had both Kiro and Claude Code work on it in separate branches. For Kiro I went through the spec generation process (“Read the requirements and design documents found here and generate the requirements for the spec based on that” and follow this through). For Claude Code I simply asked it to generate the task list. And then I asked both to execute the generated tasks. To be clear, for this test I used minimal direction. I didn’t interact with the requirements part of the spec at all, and for the design document, the only change I requested was to include all the code snippets from the original design document.

I’ll let you be the judge regarding the respective PRs (found here for Claude and here the incomplete one for Kiro), but the experience is quite different. Claude Code was a lot more hands-off. I gave it the task, had to approve a couple of times for things being done in the session, and off it went. It was a lot faster too³. Kiro on the other hand, was having some terminal integration issues where I had to tell it to put the output of tests into a file. Kiro also spent a lot more time going around in circles trying to fix issues with the tests (and got stuck in a loop once), meaning that it was still working on the code when Claude Code had already refilled its credits and finished (and the public preview went up).

In conclusion

There are many things to like about Kiro. The specs and hooks, especially, are very useful in having a more organised way of working with your agent. As I said before, I’ve happily incorporated the specs into my agentic flows, so obviously I like the way that works. Does that mean you should use Kiro? I’d say that’s up to you, but I highly recommend at least checking it out. Yes, it will still be rough around the edges, especially for MCP use, but that’s going to improve. And during the public preview it’s also free, which is of course the best price.

Or I just haven’t used it on a complex enough project ↩︎
You have no idea how happy I was when this got resolved ↩︎
Until I ran out of credit, obviously. The downside of being on a measly Pro subscription is that you frequently look askance at those upgrade prices and wonder if maybe it’s worth it. As I write this, I’ve got 10 minutes left until it resets. Not that I’m counting. ↩︎