Agentic AI

PyRIT-Ship

Max Corbridge

Cofounder

May 29, 2025

Last week we took a look at the AI hacking game called Gandalf, in which you try to convince an AI wizard to reveal his secrets with increasing difficulty. For the final level we brought in some AI firepower in the form of PyRIT, Microsoft's Python Risk Identification Tool for generative AI. For those that read last week's update, you will have seen that I was grumbling about PyRIT not being exactly 'plug and play'. Ultimately, PyRIT was designed to solve many different problems and has some really cool features to allow it to do this, but sometimes you don't need to boil the ocean and just want something quick and easy. Queue PyRIT-Ship.

Developed by engineers at Microsoft this is a toolkit which allows you to run PyRIT locally without much setup needed, and then query the tool over an HTTP API. This way, we've confined our operating space to just integrating this HTTP API with our target. You can see why this is appealing, as for many AI testing scenarios we will be accessing it over HTTP already. Therefore, when I saw they had made a version of PyRIT which works inside Burp I was very excited to try it out.

In the BlueHat2024 talk linked above they mention that all of what you can do in PyRIT-Ship is do-able in PyRIT by expanding the Python codebase and creating custom targets. Doing this through Burp though allows us to interact with the LLM as it was intentionally designed (through a browser) and capture all of the necessary application context, and any dependencies, before adding the testing toolkit in-situ. I can already tell this is going to be a go-to when fiddling with Python to get PyRIT running isn't going well.

So, let's get started. Sadly there were no nice video guides to follow on setting up PyRIT-Ship - perhaps I should make one? Taking this newsletter beyond just written form is something I've considered for a while as I feel being able to record videos and talk over some of this stuff would make it appealing for many different audiences. If that is something you'd be interested in then let me know! Anyway, as there was nothing to follow and the PyRIT-Ship documentation was pretty skinny (and mentioned hardcoding for Azure OpenAI which was going to be a problem) I knew this was also going to have a bit of fiddling required to get it setup.

The README says that there are 2 elements included in the repo right now: /pyritship which is a Python Flax server in which certain PyRIT features are exposed over an API, and /burp_extension which is the Java extension for Burpsuite which works with Intruder. I wasn't sure at this point if I needed to run the Python server locally in addition to the extension or if that was somehow covered in the extensions functionality, so I got started with the extension first. There was a single page README on the Burpsuite extension setup, so let's hope there is enough info to get going.

When doing tasks like this I like to try to get some AI-assistance to speed up the process of getting things working. Short of allowing something like Windsurf's Cascade to try to just do it for me, I saved the README.md locally and gave it to o3 to try break down the steps, expand on them where necessary and give me some more info.

I got started by installing the 3 extensions mentioned, and downloading the Java runtime and SDK 21. With than unpacked I can add it to my PATH and test that it's all setup fine.

Now we need to tell VSCode to use that JDK too, which they have nice instructions for on the README. With the below file updated and VSCode restarted we are ready to start building the extension.

Now we need to clone the repo and open it in VScode. As they said it would opening this folder with the aforementioned extensions installed kicks VSCode into gear and it starts leveraging the Java SDK to create the build files automatically (yay for README's being accurate). Once it did that, I was able to run the build task.

This took just 2 seconds to run, and apparently we are now ready to add the extension to Burp! Sure enough, we now had a JAR file ready to go so it was time to head over to Burp.

Adding it as a Burp extensions gives us 0 errors and the PyRIT-Ship tab appears - so far so good!

It was around this point that I started to doubt that the extension alone was going to provide the actual PyRIT-Ship API service too, and sure enough when watching the demo on their conf talk they run the API locally too - good to know as this isn't explicitly mentioned in the README.

So, next is getting the PyRIT-Ship API up and running. In their video demo they just run pyritship.py script which hosts it locally, but I only have an app.py which I'm going to assume is what is needed here. Firstly, I make a new virtual env and then install 'pyrit' and 'flask' through pip.

The tool is currently hardcoded for Azure OpenAI but I didn't want to use that and had previously used OpenAI directly with PyRIT, so I tried to override this by copying the setup in my .env for PyRIT into PyRIT-Ship and see if it was going to work. Sadly, this didn't work and it appears we'll have to use Azure OpenAI for this unless we want to dig through the code and repurpose it to use OpenAI directly.

I headed over to Azure OpenAI (for the first time actually) and setup an LLM there for testing purposes. With this setup, I could now get our server running locally:

Let's test it out by heading over to Gandalf again and see if everything worked. I capture the request to Gandalf Level 1 which is an example of how we might be interacting with a LLM in a real-world scenario.

Skipping on to level 3 so that we had some actual challenge to achieve we then send this request over to Intruder and select our input as the position. We select 'Extension Generated' payload type and choose Pyrit-Ship, then we can begin the attack! I had a few issues here, so I've decided to make that video guide after all.

After some troubleshooting, we got it working :) the local server showed Burp hitting it's API

And Burp also knew when the attack was successful

So we had managed to get PyRIT-Ship up and running successfully in Burp! Very nice. This certainly felt a lot more applicable to my sort of use cases than the original PyRIT implementation did. However, I still have some concerns. Firstly, using the tool to its full potential as you can see in the screenshot above relies on us having a nice and simple single endpoint to hit for chat requests, and for the LLM responses to be sent straight back in the response.

However, this is often too simplistic for how modern AI systems are built. For example, I was recently testing an LLM which split requests over HTTP and responses from the LLM over websocket's on entirely different endpoints, with several additional requests in between. I did notice that there are HTTP and Websocket settings in PyRIT-Ship Burp Extension menu, so perhaps there is something that could be done here.

PyrRIT-Ship Settings, clearly setup for Gandalf by default

One thing I do like is the ability to integrate LLM payloads into Burp Intruder AND also provide the underlying hacking LLM with a 'system prompt' via the Goal description. Even if the highlighting of a successful response in green wasn't going to be possible, this could be used to generate nice and potentially bespoke payloads to test. On the flip side, there doesn't seem to be a great deal of control over the types of payloads or techniques I can use here.

Having recently used Spikee on an engagement I enjoyed the fact I could select the types of attacks and base documents I wanted to test, and then when I found a successful bypass I could use just that single technique moving forward. This sort of customisation doesn't appear to be possible in PyRIT-Ship, but perhaps on more complex targets its iterative approach would yield successful results.

For now though it's nice to know that there are other toolkits that we can turn to when securing AI features in apps, and I will catch you all next week!

Thanks.

blogs

Our Latest Thoughts

Interviews, tips, guides, industry best practices, and news.