AI Security Testing - Rabbit R1

KZ
Jan 17, 2025

The aim of this article is to detail our security testing of a well-known AI product, the Rabbit R1, and to see how it stacks up out of the box. We also look at whether the lessons it can be taught could be leveraged by threat actors to threaten security, either at home or in a corporate environment.

We will also explore whether the device can benefit your own security posture, if the results support that conclusion of course.

This article and related videos are not sponsored; we procured the device privately for testing and get no money or kickbacks for this. All opinions are garnered from hands-on testing, carried out using both Wi-Fi and an installed SIM card to provide full functional testing.

Link to device website: rabbit r1 - AI in action - just $199, no subscription required

Link to review video: UNDER CONSTRUCTION - CHECK BACK SOON.

Format:

We will go through some initial findings from when we first took it out of the box. This is part of our learning and familiarization before carrying out testing, as we had not handled the device before beginning this article.

Then we will go through the testing for the Primary and Secondary objectives below, and provide an outcome, recommendations and summary.

Primary Objectives:

Out of the box:

Test recognition of people from recorded media (photos and videos) or live in person (results will be anonymous)
Test how this can be used to identify security weaknesses in products
Test whether the device can recognize a security vulnerability on a building
Run through standard questioning to see if it helps with security advice

After 'learning' process carried out:

See if the device can be trained from hacker/security websites to identify issues
Test if the training has changed the answers to original questions - and how.

Secondary Objectives:

These are simply functional tests done on the R1 out of curiosity. They will include general questions and follow-ups, as well as specific questions on emergencies and preparedness advice.

PHASE 1 - Out of the box

Initial observations:

Note: We will not run through the unboxing or basic 'how to' features as there are plenty of these on YouTube; we will try to stay focused on the mission.

After taking the device out of the box and performing the updates, I noticed that the main integrations seem to be retiring soon, which sadly removes part of the testing plan. The only integrations that seem to be continuing are Apple Music and Spotify, so the integrations that rely on your location and payment information will soon be gone. I will therefore skip the testing on these items, but may revisit it in future if the integrations are re-introduced.

Connection window from the Rabbit Hole — Connections page as of 19/12/2024

The voice recognition seems to be pretty accurate, and the initial testing of the conversational side and the identification of my dogs was impressive. (I do accept that getting exact breeds from one photo can be tricky, so it didn't identify that one of them is a Lab X with a Great Dane.) The ability to take a conversational approach with the device also helps when asking questions and following up, without having to craft the longer and more elaborate questions that home assistants require.

The above images and questions are taken directly from the 'Rabbit hole', which is a web-based log of your questions and the images that have been taken. This article will not go into the security of the web-based nature of the queries; however, it should be noted that whatever you ask or take photos of is immediately uploaded to the internet, so privacy is not guaranteed. For corporations it is also worth noting that if you have security requirements that prohibit information going internationally, there is no guarantee that this device complies with them, so you should do your due diligence and ensure you have policies in place to deal with this type of technology.

I tested the basic question and follow-up flow with a query about the current fire weather conditions in New Zealand. The answer is below:

This pulled recent information, which was impressive, so I asked a follow-up question for my area (local area blanked for privacy).

It is interesting that it pulled up the relevant information, while the follow-up answer simply followed a general conversation flow about these risks. It is good to know that it does seem to provide relevant information. I may test this through the rest of the fire season and provide an additional update as fire conditions change, so keep an eye out!

So with the initial testing done, the testing of security features was next!

Security based testing:

Identification and Recognition of people:

For this I decided to try a couple of well-known political figures that anyone would recognize. The sources were news reports on YouTube, and the results were interesting as it did not seem to recognize them at first (it also seemed to pick up some of Trump's words, which I did not realize initially):

Due to the answers given it seemed like it may have just been bad luck, so I persisted a bit with other actors and then got this answer instead:

This seems to indicate that it may be switched off by default, so out of the box it cannot identify people from images. It will be interesting to see if this can be activated, but it does bode well for the privacy of everyday people like you and me.

I also performed the facial recognition search against people who are on Facebook (and who post photos of themselves more candidly than I do), and this did not turn up results either. For privacy reasons I will obviously not be posting their photos here or anything to do with their Facebook accounts! However, as functionality grows for the R1, I will revisit these tests to see if there are any concerns for privacy.

We will be testing this again when we try to 'teach' the R1 to use certain resources, but straight out of the box the device does seem to resist searching against images.

NOTE: Once a question is asked about a person, the images will appear in the 'Rabbit Hole', so they can be accessed again later on a computer where more tools are available. The concern is very similar to that of cameras on phones; however, there is the added 'new gadget curiosity' factor that can be leveraged by malicious people to get photos they may normally struggle to get covertly.

Security Weaknesses in products:

To test this out I used several items that are pretty common. I asked about threats, which produced a generic answer. After the generic answer I tried a few different ways to get specific vulnerabilities in the products that a malicious actor could exploit. This was done without the perspective of a professional, as the threats posed by trained people are obvious - I wanted to see how this could be used by your everyday criminal who is trying to use this type of tool to 'up their game'.

Smart lock:

First up was a smart lock. I got the R1 to identify the type of lock first.

It identified the type with no real issue, but nothing more specific from the outside view (the perspective of the external threat). Then I asked about security vulnerabilities:

As you can see, it started with generic answers. I decided to try some basic 'social' engineering (which feels odd when talking to an 'AI') to get further information, but all it did was provide the same generic advice.

TBC: Further tests coming

Security Threats on a building:

For this I tried a fire station and a residential home (both results will be anonymized for security). I tried from multiple angles and with various methods of questioning. Instead of screenshots here, I have included the text output for each question asked so you can refer to the answers.

TBC: Further tests to be carried out

Can this help with security information and advice?

This is probably the section where the R1 shone compared with the others. It simply provided a lot of information quickly while I was making coffee(s), and it certainly helps you make the most of your time.

World News:

Cyber Security News:

Cyber Security Briefings/Bulletins:

Cyber Security Advice (collected summary for personal and business):

Physical Security Advice (collected summary for personal and business):

PHASE 2 - Going beyond 'Out of the box'

Learning Process:

For the learning process, I went into the "Rabbit Hole" and used teach mode to train the R1 to use images taken from the camera for searching against multiple options (people search, generic search, vulnerability search etc.) but this functionality does not exist out of the box.

I attempted to both trigger the camera on the Rabbit R1 and use the last recorded image to run more detailed searches, however due to current limitations as of 06/01/2025 this seems to not be available.

I am currently waiting for further information from the R1 team to see if there is a way to get this enabled. We will provide further updates as information comes.

R3ptAr here! I have collaborated with KZ on this blog post to assist with the experimentation and training of the Rabbit R1 AI agent. The following is my write-up of the experimentation I've completed so far with the LAM Playground and Teach Mode functions found in the "Rabbit Hole". This is, hopefully, the beginning of a longer-running series of write-ups detailing KZ's and my attempts at teaching the R1 AI agent to complete a variety of useful, repeatable functions to assist cybersecurity professionals in their never-ending quest to secure networks, organisations and people!

LAM Playground: Further to the initial experimentation in the "Rabbit Hole" web portal, the R1 provides functionality to test web-based interactions through the R1 program. This enables the user to log in to any website via the web browser interface and request that the R1 run a sequence of actions for a desired outcome.

As I attempted to test if the R1 would be suitable in a cybersecurity environment, I experimented with several requests to gauge the R1's proficiency of action and accuracy in the results provided.

The first scenario that came to mind was to utilise Shodan.io to discover openly broadcast vulnerabilities for IT devices. For the uninitiated, Shodan.io is a leading OSINT search engine that indexes devices, servers and services publicly broadcasting to the internet. This service includes the indexing of service banners which provide details regarding the server software, the options the service supports, welcome messages or anything else the client can find out before directly interacting with the server/service. In the world of cybersecurity, this information provides threat actors and defenders alike insight into devices and services across the globe that have potential security vulnerabilities and misconfigurations ripe for abuse or that must be remediated as soon as possible.

When it comes to utilising the Rabbit R1 to assist a cybersecurity professional in their day-to-day capacities, I envision a future where you can pull out the R1, aim it at a network device and utilise built-in lessons (which we will cover in detail in the following segment) to search for known security vulnerabilities or advise the user of best practice security configurations.

LAM Test 1:

The first round of testing with the LAM playground consisted of logging in to Shodan.io with my account credentials to ensure full access to the search engine's functionality. Once logged in, I provided the Rabbit R1 with the following prompt via the web interface pictured below: go to shodan.io, search for "IoT", click on "View Report" and summarise the results

I hit Enter and let the R1 magic begin!

As you can see from the screenshot above, the R1 effectively controlled the web browser interface and navigated to the Search bar, typed in my query "IoT" and hit "Search". Once the results had loaded, the R1 managed to navigate the website to produce a report and summarise the findings in text format.

The findings:

Based on the Shodan Report for IoT devices:

Total Results: 12,260 IoT devices found

Key findings:

1. Geographic Distribution:
- China appears to have the highest concentration (shown in dark red on the map)
- Other significant presences in North America and Europe
- Top countries include Korea Republic (3,712 devices), United States (2,359 devices)
2. Common Ports:
- Port 54321: 5,948 devices
- Port 20002: 1,080 devices
- Port 161: 1,036 devices
- Port 443: 551 devices
- Port 47808: 359 devices
3. Major Organizations:
- Korea Telecom
- SK Telecom
- Comcast Cable Communications, LLC
- Bouygues Telecom SA
- Aliyun Computing Co., LTD
4. Notable Vulnerabilities:
- SMBv3 Remote Code Execution
- Ticketbleed
- BlueKeep

This indicates a significant global IoT presence with particular concentration in Asia, varied port usage patterns, and several critical vulnerabilities that need attention.
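For readers who want to cross-check a summary like this without the browser automation, the same kind of breakdown can be requested from Shodan directly. The following is a minimal sketch, not the method the R1 uses: it assumes the official `shodan` Python package is installed and that you have your own API key, and the function names `fetch_facet_report` and `summarise` are made up for this illustration. The sample dictionary reuses the total and port figures reported above so the formatting can be tried offline.

```python
from typing import Any


def fetch_facet_report(api_key: str, query: str) -> dict[str, Any]:
    """Ask Shodan for result totals broken down by country, port and org."""
    import shodan  # third-party package: pip install shodan

    api = shodan.Shodan(api_key)
    # count() returns the total and the facet breakdowns, but no host records
    return api.count(query, facets=[("country", 5), ("port", 5), ("org", 5)])


def summarise(report: dict[str, Any], top: int = 3) -> str:
    """Format a count() style result as a short plain-text summary."""
    lines = [f"Total results: {report['total']:,}"]
    for facet, buckets in report.get("facets", {}).items():
        lines.append(f"{facet}:")
        for bucket in buckets[:top]:
            lines.append(f"  - {bucket['value']}: {bucket['count']:,}")
    return "\n".join(lines)


# Offline sample shaped like a count() result, using figures from the report above
sample = {
    "total": 12260,
    "facets": {
        "port": [
            {"value": 54321, "count": 5948},
            {"value": 20002, "count": 1080},
            {"value": 161, "count": 1036},
        ],
    },
}
print(summarise(sample))
```

With a real key, `summarise(fetch_facet_report(key, "IoT"))` would produce the same layout from live data, although the numbers will differ from those captured on the day of testing.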

Although the results of this experiment may seem basic, I believe they provide valuable insight into the potential of the Rabbit R1 platform. When combined with the appropriate resources and ongoing development of features, the R1 could be customized to become a practical tool for both cybersecurity professionals and enthusiasts.

Further to the first experiment, I utilised the LAM Playground to test a secondary use case scenario wherein a user could use Shodan.io to discover services/servers publicly broadcasting to the internet that are configured with default passwords. As we all know in the world of cybersecurity, default passwords are an open invitation for malicious actors to access and potentially disrupt services critical to operational availability. Checking your organisation or even third-party suppliers for this potential security misconfiguration could be particularly helpful for cybersecurity professionals attempting to identify holes in their posture.

LAM Test 2:

As per the previous test scenario, I initially logged into the Shodan.io service with my credentials to ensure full functionality during testing. Once logged in, I provided the R1 with the following text prompt: go to shodan.io, search for "default password", click on "View Report" and summarise the results

I hit Enter once again and let the R1 magic begin!

As you can see from the screenshot above, the R1 effectively controlled the web browser interface and navigated to the Search bar, typed in my query "default password" and hit "Search". Once the results had loaded, the R1 managed to navigate the website to produce a report and summarise the findings in text format.

The findings:

Based on the Shodan report for "default password" search:

Total Results: 51,146

Key findings:

1. Geographic Distribution:
- United States leads with 14,370 instances
- Korea Republic follows with 4,401 instances
- China has 3,893 instances
2. Top Organizations:
- Korea Telecom: 3,439 instances
- Amazon Data Services NoVa: 2,334 instances
- Viettel Group: 1,928 instances
3. Common Ports:
- Port 443 (HTTPS): 8,698 instances
- Port 8080: 6,011 instances
- Port 80 (HTTP): 5,312 instances
4. Notable Vulnerabilities:
- HTTP.sys Denial of Service
- HTTP.sys Remote Code Execution
- CVE-2021-41277

This data suggests a significant number of systems potentially exposed due to default passwords, with the highest concentration in the United States and East Asian countries.
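Because the R1 hands back its findings as free text, anyone wanting to track these numbers over time would first need to pull them out of the summary. The sketch below shows one way that might be done for the port figures. It assumes the summary keeps the "Port N (LABEL): count" wording seen above, which is only what we observed in these two runs and not a documented output format; the name `port_counts` is made up for this example.

```python
import re

# Matches "Port 443 (HTTPS): 8,698" as well as "Port 8080: 6,011"
PORT_LINE = re.compile(r"Port (\d+)(?: \([^)]*\))?: ([\d,]+)")


def port_counts(summary: str) -> dict[int, int]:
    """Extract {port: count} pairs from an R1 style text summary."""
    return {
        int(port): int(count.replace(",", ""))
        for port, count in PORT_LINE.findall(summary)
    }


# Text copied from the "Common Ports" part of the summary above
summary = (
    "3. Common Ports: - Port 443 (HTTPS): 8,698 instances "
    "- Port 8080: 6,011 instances - Port 80 (HTTP): 5,312 instances"
)
counts = port_counts(summary)
print(counts)
```

If the wording of the summary changes between runs, the pattern will silently miss entries, so treat this as a convenience for quick comparisons and not as a reliable data feed.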

The findings, while somewhat basic, provide a valuable overview of the numerous insecurely configured servers, devices, and services that exist. This serves as a strong warning for all cybersecurity professionals: it is essential to ensure that all of your assets are configured securely. Additionally, the results clearly illustrate the extensive attack surface present in today's digital landscape. Without proper education, guidance, and implementation, many more internet-connected devices, servers, and services could be vulnerable to exploitation.

Once again, whilst this is early days for the R1, there is great potential for this device to produce quick, convenient insight into the current cybersecurity landscape and help inform cybersecurity professionals of common vulnerabilities and security issues. All with the click of a button or a simple text prompt.

Teach Mode:

While the LAM Playground scenarios are useful and fascinating to watch in action, the real meat and potatoes of the device can be seen when utilising the "Teach Mode" functionality of the "Rabbit Hole". Teach mode allows you to train the R1 AI agent in a specific sequence of actions to produce a desired outcome. Once this initial learning process is complete, the idea is that the R1 can reproduce similar processes with different input prompts.

For example, the first experiment run with this module was to replicate the process completed in the LAM playground. That is, going to Shodan.io and searching for certain devices/servers/services and their associated vulnerabilities. Utilising the "Teach Mode" I was able to produce a lesson that enabled the R1 to access Shodan.io and follow a fixed sequence of searching for a specified query, accessing the results in a report and then summarising the findings.

I will provide a step-by-step breakdown of how to use "Teach Mode" to create the process for a desired repeatable query. Similar to the LAM Playground, to be able to utilise the services we need, we must first log in with our account credentials.

Once logged in successfully, I can move on to the actual teaching process.

Note: The Rabbit Hole enables the saving of cookies for your services and websites which is particularly handy for the websites you will be interacting with regularly.

The "teach" section of this module requires two inputs: a lesson name that indicates what the lesson should achieve, and a task description that clearly describes the task you want the R1 to complete.

For this particular instance, I named the lesson "Shodan Vulnerabilities Search" and the task description is set as "Search Shodan.io for IoT Vulnerabilities".

Once complete, I was able to begin the teaching process by clicking the "teach" button.

As you can see from the screenshot above, once you begin the "teach" process the R1 AI agent captures the mouse and keyboard inputs provided in the web browser interface on the right of the screen.

One of the notable features of this teach mode is that the "recording log" is accessible, which means that if you have a good grasp of the coding syntax it uses, there is plenty of scope to customise the working process and logic structure.

Once the recording process is completed, clicking the "process" button will move this lesson to the final stage.

The screenshot above provides an example of the processing of the lesson in "Teach Mode". The R1 AI agent breaks down the chronological order of your inputs step-by-step. This can be helpful to review as you may notice some steps missing or realise a point in the process that could be improved or augmented to produce your final desired result.

Once the lesson has been processed by the R1 AI agent, the final stage of the module is "replay" which enables you to replay the lesson created with the added input of a new search query. As you can see in the screenshot above, I substituted "IoT" with the search term "default passwords".

This prompt produced a total of 374 results and the R1 agent produced the Shodan report quickly and effectively based on one simple text prompt.
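The record, process and replay flow described above can be pictured as a fixed list of steps with one slot that is filled in at replay time. The sketch below is purely a conceptual model written for this article: it is not Rabbit's recording log format or API, and the `Step` and `Lesson` classes, the action names and the `{query}` placeholder are all assumptions made for illustration.

```python
from dataclasses import dataclass

PLACEHOLDER = "{query}"  # hypothetical marker for the part that changes per replay


@dataclass(frozen=True)
class Step:
    action: str  # e.g. "goto", "type", "click", "summarise"
    target: str  # URL, typed text or button label; may contain PLACEHOLDER


@dataclass(frozen=True)
class Lesson:
    name: str
    steps: tuple[Step, ...]

    def replay(self, query: str) -> list[str]:
        """Return the recorded steps, in order, with the new query filled in."""
        concrete = []
        for step in self.steps:
            target = step.target.replace(PLACEHOLDER, query)
            concrete.append(f"{step.action}: {target}")
        return concrete


lesson = Lesson(
    name="Shodan Vulnerabilities Search",
    steps=(
        Step("goto", "shodan.io"),
        Step("type", PLACEHOLDER),
        Step("click", "View Report"),
        Step("summarise", "results"),
    ),
)

for line in lesson.replay("default passwords"):
    print(line)
```

Thinking of a lesson this way makes it easier to spot which recorded steps are fixed and which one needs to vary when you review the processed lesson before replaying it.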

Finally, after completing your lesson, you can find all of the lessons kept in memory of the Rabbit R1 on the teach mode landing page. The most compelling and, frankly, exciting thing about this functionality of the Rabbit R1 is that once this lesson is in place, you can retrieve these lessons with your voice directly on the R1. KZ will be demonstrating the use of voice prompts and lessons in his summary YouTube video. If you'd like to see this in action, please make sure you check that out!

Overall, the functionality currently available for the Rabbit R1 opens the doors to many potential use cases, whether applicable to cybersecurity or otherwise! Some other use case scenarios I have tested but have yet to fully document here are tasks such as using the R1 to invoke AlienVault OTX to run an endpoint scan utilising subscribed pulses and up-to-date threat intelligence, checking hashes and files with VirusTotal, and many more. I will endeavour to continue experimenting with these functions and to document them in detail for all users interested in the evolution of this tool.

We hope that the R1 developers will eventually be able to marry the image search functionality (as described by KZ earlier in this post) with the teach mode module so that one day we can aim the Rabbit R1 at a device and run any lesson required to gain insight into said device's security vulnerabilities, known issues and best practice configurations.
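As a point of comparison for the VirusTotal use case, checking a file hash does not need a browser at all. The sketch below hashes some bytes with SHA-256 and shows how a lookup against the VirusTotal v3 file endpoint could be made with only the standard library. It assumes you have your own API key; the helper names `sha256_of` and `lookup` are made up for this example, and the lookup is defined but not called so the snippet runs offline.

```python
import hashlib
import json
import urllib.request

VT_FILES_URL = "https://www.virustotal.com/api/v3/files/"


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def lookup(file_hash: str, api_key: str) -> dict:
    """Fetch the VirusTotal report for a hash (needs network and a valid key)."""
    request = urllib.request.Request(
        VT_FILES_URL + file_hash,
        headers={"x-apikey": api_key},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Offline demonstration: the digest of empty input is a well-known constant
digest = sha256_of(b"")
print(digest)
```

A taught lesson that pastes a hash into the VirusTotal website should arrive at the same report that `lookup(digest, key)` returns, which makes the API a handy way to sanity-check what the R1 summarises.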

Until the next post, I'll hand this one back over to KZ! - R3ptAr

Further testing and training here

Secondary Objectives:

For the secondary objectives, I did not have a solid plan and just decided to try things out as they occurred to me and pass along some interesting use cases.

One of these came up while I was tinkering on the Rabbit R1 website: you can make a custom voice for your Rabbit R1, and all it takes is a short description of what you are looking for. For the purposes of testing the deep fake possibilities, I thought I would try a few well-known celebrity voices - for your enjoyment below:

Robin Williams: https://youtu.be/VA-SzC1oHVE
Donald Trump: https://youtu.be/Lf0bpASe-Wo
Cortana (Jen Taylor): https://youtu.be/C7QmmWxtt88
Jim Carrey: https://youtu.be/kqRzeHUGS9g
Terry Crews: https://youtu.be/vbfMKvtTFX4
- Terry Crews 2: https://youtu.be/Qw_HgnqJdwE
- (Specified to sound more like his Brooklyn 99 character)
Elon Musk: https://youtu.be/1yHSsTT4uGw

I think after seeing the examples, you will agree that the deep fake possibilities are rather limited through the R1! Although, this may improve with time and there are lots of possibilities out there! We will also continue testing and provide further short videos should we get any good examples.

As this post got very long very fast, I will give brief information on further tests carried out:

Plant identification
- I am not a gardener (my partner frequently asks me about plants, but I have no idea). The R1's plant identification was surprisingly accurate, which led me to wonder if it could narrow down my location through the plants it identified. It did a pretty good job, working out that I lived along a coastal area and identifying the region. With a little more tweaking this could in theory be used to narrow down a location (good in the right hands, bad in the wrong hands, just like all tech).
Summarising and translating
- There are many information boards for tourists around my location, and I tested asking the R1 to read and summarise these boards with great success. Then I tried asking it to translate that summary into French, which it also did successfully. This is interesting as you could get a summary and translation of anything in the few seconds it takes to snap a photo. Again, this is not far removed from other technology, but it was quite impressive how quickly it summarised and translated the information.
HAM Radio uses
- As I am super cool and a HAM radio guy (stop rolling your eyes please), I posed some questions around antenna lengths that would have taken me a little research to come up with the answers for, and it very quickly came up with antenna lengths for a linear antenna at all bands at 1/2 wave, 1/4 wave and 1/8 wave. In a matter of minutes all the information was available to me. I may do another article on this at a later date as this seems useful for new HAMs.
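For anyone wanting to sanity-check the antenna figures the R1 gives, the underlying arithmetic is just wavelength = speed of light / frequency, multiplied by the fraction you want. The sketch below computes free-space lengths only. Real wire antennas are usually cut a few percent shorter, so the `shortening` parameter is included as an assumption you can set yourself (a value around 0.95 is a common rule of thumb); the three frequencies are simply illustrative picks from the 40 m, 20 m and 2 m bands.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458


def element_length_m(freq_mhz: float, fraction: float, shortening: float = 1.0) -> float:
    """Length in metres of `fraction` of a wavelength at `freq_mhz`."""
    wavelength_m = SPEED_OF_LIGHT_M_PER_S / (freq_mhz * 1_000_000)
    return wavelength_m * fraction * shortening


# Illustrative frequencies only; pick the ones that match your own bands
for freq_mhz in (7.1, 14.2, 146.0):
    lengths = [element_length_m(freq_mhz, f) for f in (1 / 2, 1 / 4, 1 / 8)]
    print(f"{freq_mhz} MHz:", ", ".join(f"{x:.2f} m" for x in lengths))
```

Comparing a couple of these values against the R1's answers is a quick way to confirm whether it is quoting free-space lengths or already applying a shortening factor.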

Summary:

To summarise this experiment, I would say the device is quite impressive for the abilities it has out of the box, with a great deal of promise for the future. While it is not easily used for malicious behavior right now, the possibility cannot be ruled out, as future upgrades may be misused by criminals who come across it and realise it may take the thinking out of their 'job'.

That said, like most tools that can be used for crime, it can also assist the everyday person or security worker to find holes in their own security and patch them - this is always a good thing. If you talk to cops, there will be many who agree that to catch criminals you have to think like them, and that's not an option for most people. With a device like this and publicly available scripts that can be run by anyone, the opportunity may exist for this or similar devices to be a real tool in keeping your home/company safe and spotting those holes that may be exploited. (Watch this space as we will be working on this type of approach with the R1)

The deep fake possibilities are certainly limited right now, however we will be testing this regularly and seeing if there are ways to get the R1 to mimic voices more effectively (if you would like to see a celebrity voice we have not covered, let us know and we will see what we can do).

The ability to use this device to keep yourself up to date on threats and have one or many follow up questions that retain the context is very impressive, more so than Alexa or Google Home which are very spotty and cannot follow a conversation most of the time.

Conclusion:

We know the R1 had a rough launch, but the emergence of the LAM and the ability to teach the device make it worth close attention. While the R1 itself cannot easily be misused out of the box, the potential is there with the integrations and lessons you can create. This will require the criminal to have a certain level of technical knowledge, so it is unlikely to be widely employed by your average criminal.

This can also be used to help you get ahead of the malicious actors in the world, but it will likely require the work of people who are invested in keeping you secure to make it more accessible. Much like when Police, Fire and Ambulance teams perform prevention and awareness work, the Cyber Security or Developer communities would be well placed to make this device into a real tool for pros and the general public.

The R1 certainly has potential for both good and bad, much like any other tech device out there. The lessons we learned could certainly be applied to similar products being launched right now (Humane AI Pin, Smart Glasses of all types and many other 'AI' products), so while we focused on one device we may expand our research as new items catch our eye.

Despite the potential for misuse, I think there is also real potential to use this as a tool for under-resourced IT departments, the non-tech-savvy general public, time-strapped professionals who just want updates while making coffee, and anyone wanting quick answers to technical questions when out and about. We will continue to test and provide less lengthy reports (unless we really need to make another one this size!), as well as shorts on interesting developments with the R1 on security/safety/HAM subjects.

SecuriKiwi -
