Hacking a Toy Drone to Put Artificial Intelligence on It, Part I : The Hack!

I've been working with some friends on a little academic project aiming to create a business model for a high-tech company related to data and artificial intelligence. The assignment itself was f'ing boring, so from time to time we found ourselves just chatting and fooling around. One interesting chitchat (at least for me) was about the capabilities that A.I. would bring to connected objects, vehicles and specifically drones, during which I said that I would love to do some experimentation with a drone if I had one within reach. Two very lovely friends took note of that and offered me an FPV (first-person view) drone for my birthday, a Chinese toy drone to be more specific.

Now that I have a drone, it will all be fun: I will enjoy my time crafting a convolutional neural network or real-time object recognition for random applications to put this drone on steroids. The only thing I have to do now is use its API...you said API?! I was a bit naive to believe that this toy drone (a Snaptain S5C) would have an interface to communicate with. I spent a lot of time googling various keywords, looking in forums, Reddit, random websites...there was no API, you fool!

This quick conclusion saddened me A LOT...until I realized something a bit interesting: to use this drone, you have to install an app on your phone, turn on the drone using a switch and then connect to a sweet non-secured Wifi hotspot. Yep, you've heard it (I mean read it): a non-secured Wifi that anyone and anything can connect to without any authentication or security layer! Or as Michael Robinson put it in one of the most entertaining DEFCON talks: "The thing is its own flying router with DHCP enabled...AWESOME!". This is great news: the project I had in mind went from fun to exciting. Hacking something in order to use it feels like being a caveman who will cook his own quarry for dinner.

Having direct access to the Wifi is neither enough to call this a hack nor sufficient to command the drone. I need to figure out the communication protocol so I can control the drone from my computer and, more importantly, retrieve the video stream. For that I have a strategy based on one of the oldest tricks in the book of hacking: sniffing packets! The following diagram depicts my approach for this hack.

So, I will use my smartphone (1) to communicate with the drone (3) through its official app. This communication between them is represented at (2) as a hexadecimal stream (a more convenient way to represent binary data). The idea is to connect to the same hotspot with a third-party machine or agent (4) and listen (sniffing is the proper term) to the packets exchanged between the two. Having this stream, I will try to imitate it, this time using my laptop to communicate directly with the drone, so it will look like this representation :
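To make the "hexadecimal stream" idea concrete, here is a tiny sketch of going back and forth between a hex string and the raw bytes that actually travel over the network. The sample value is the 8-byte control packet quoted later in this article; nothing else is assumed.

```python
# Hex string as it appears in a packet capture (spaces are only for readability)
hex_stream = "66 80 80 7E 80 00 FE 99"

# bytes.fromhex skips ASCII whitespace between byte pairs (Python 3.7+)
raw = bytes.fromhex(hex_stream)

print(len(raw))   # number of bytes that go on the wire
print(raw.hex())  # back to a compact, lowercase hex string
```

The rest of the article uses codecs.decode(..., 'hex') and codecs.encode(..., 'hex') for the same round trip; both approaches produce identical bytes.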

Ok, now that the plan is set, I will connect from my laptop to the drone's Wifi hotspot and do the most basic action: check my IP address. It seems that the IP address follows this structure: 172.17.186.146, which means that I need to have a closer look at IP addresses starting with 172.17 when sniffing the packets.
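If you want to apply that "starts with 172.17" rule to a list of captured addresses, a small sketch with Python's ipaddress module could look like this. The /16 prefix length is my assumption (it simply encodes "first two numbers match"), and on_drone_network and 192.168.1.20 are made up for illustration.

```python
import ipaddress

# ASSUMPTION: a /16 network, i.e. every address that starts with 172.17
DRONE_NET = ipaddress.ip_network("172.17.0.0/16")


def on_drone_network(addr: str) -> bool:
    """Return True if addr belongs to the (assumed) drone hotspot network."""
    return ipaddress.ip_address(addr) in DRONE_NET


# the laptop address from the article, the suspected drone, and an unrelated host
for addr in ["172.17.186.146", "172.17.10.1", "192.168.1.20"]:
    print(addr, on_drone_network(addr))
```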

I used Wireshark to get a first glance at the packets being transmitted through the hotspot, and this is what I got :

wireshark all

I can see my laptop's IP address, 172.17.181.146, and another one on the same network, 172.17.10.1. I'm pretty sure this is the drone's IP. A lot of packets have been sent to the drone from my laptop, which is not what I expected: I was hoping for another IP address with the same base representing my smartphone, but nothing shows any activity related to the drone/smartphone communication. It's like the main signal is drowned in all these DNS requests, and I don't have any clue how to properly isolate what interests me. I think sniffing is not enough...and a man-in-the-middle attack is necessary to relay the communication through the laptop to capture those packets.

Before using anything fancy, I think I will make a little change this time: instead of using my laptop to listen to the exchanged packets, I will use a third-party agent installed on the smartphone itself (a mobile app called Packet Capture) and only focus on the Snaptain app (the drone's official app).

After selecting the app, let's launch it, receive the video stream and put the drone in "ON Mode" (ready-to-fly). Redoing this a couple of times gives us the chance to see if any pattern emerges, which is in fact the case. Each time, 5 connections are made to the IP address 172.17.10.1, a.k.a. the drone: four of them use TCP on port 8888 and the one right after uses UDP on port 9125.

Looking at the total "weight" of the packets transmitted, the fourth one shows 22 MB (don't mind the Mo in the screenshot, it is the French equivalent of MB, since bytes in French are "octets"), a value that is constantly increasing. I can assume that this specific connection is linked to the video stream. So let us start by retrieving the video stream, and to do that I will take a closer look at the packets transmitted and received through this connection (blue is transmitted, red is received).

As you can see, there are three phases in this packet transmission :

  • Sending 106 bytes of data
  • Receiving 106 bytes of data
  • Open bar or streaming the video!

I can emulate this by using my laptop (socket programming), then I'll be using Wireshark to see what's happening exactly.

python
# importing the socket library and codecs to convert hexadecimals to bytes
import socket
import codecs


# defining the address to our drone
HOST = '172.17.10.1'
PORT = 8888

# defining a socket object for TCP protocol
sv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# connecting to the drone
sv.connect((HOST, PORT))

Wireshark shows us that we've succeeded in connecting to the drone, great!

connect drone

The only thing remaining now is to send the same "scavenged" data and voilà!

python
# same data sent from the app but now we will use the laptop to do it
hexa_message = '4954640000005d000000c2e2c6d1cb7992b3385faf6c6e4b49b992db3e6afc10502d79800ca1a5e5bad4aa2d951581b4ab822f3fdbd00738a62f8a3144a7322c11dc245de017f9144ccc251875e0131b1ba99e4086148dc8ad5fd6ff3ad3d849817dd853e1c9cda65606'

# sending the data to the drone
sv.send(codecs.decode(hexa_message, 'hex'))

# printing the response as hexadecimal; we specify that
# we're only interested in the first 106 bytes
print(codecs.encode(sv.recv(106), 'hex'))

Uh-oh...the reception of the response is hanging and Wireshark is really showing some nasty messages ...I've failed miserably...without really knowing why...

problem socket

After banging my head against multiple walls and ruthlessly googling, I finally found one little phrase hinting at what I have to do (thanks to Hermann Stamm-Wilbrandt and his repo):

Opening a connection against stealth port with connect() hangs because no "SYN ACK" gets returned; I use fork() to deal with that. sending a single (hand crafted) SYN packet (and do not deal with a response in case there is one) allows for single threaded operation of pull_video.

Ok, this is not really a direct answer to my problem, but it gave me a eureka moment. What I need to do is replicate, inch by inch, the packets sent before the video stream. If I guessed right, this is like a "handshake", a confirmation that the drone is dealing with the right and true client...so I did just that!

The code below (yes, it is a bit barbaric) emulates the first connection using TCP. Doing this, no reception hangs and all the hexadecimals received correspond perfectly to the ones seen during packet capture.

python
import socket
import codecs


HOST = '172.17.10.1'
PORT = 8888

s0 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s0.connect((HOST, PORT))

packets = [
    '49546400000052000000bbb2993925b2a4c3dc01d8b1b5115b9892db3e6afc10502d79800ca1a5e5bad4aa2d951581b4ab822f3fdbd00738a62f8a3144a7322c11dc245de017f9144ccca7967a1bb6dc2da95ed3faefba06f003a50b3260d3c0f800dbd5b77b51de8913',
    '4954640000005200000005a54f1856b34db5a441b68ab79ceda092db3e6afc10502d79800ca1a5e5bad4aa2d951581b4ab822f3fdbd00738a62f8a3144a7322c11dc245de017f9144ccca7967a1bb6dc2da95ed3faefba06f003a50b3260d3c0f800dbd5b77b51de8913',
    '49546400000056000000fdc1c96b0d382310d8dec4b8bc41deb392db3e6afc10502d79800ca1a5e5bad4aa2d951581b4ab822f3fdbd00738a62f8a3144a7322c11dc245de017f9144cccf021e8c16cbea0baccf51500f0f6037f3bf52412101c5026b023487e6f808e24',
    '4954640000005200000097b29b56ae56992f5637b7dbafee1d4c92db3e6afc10502d79800ca1a5e5bad4aa2d951581b4ab822f3fdbd00738a62f8a3144a7322c11dc245de017f9144ccca7967a1bb6dc2da95ed3faefba06f003a50b3260d3c0f800dbd5b77b51de8913',
    '4954640000005e00000072e15ff488174d2d4c1afb8ce9282e1192db3e6afc10502d79800ca1a5e5bad4aa2d951581b4ab822f3fdbd00738a62f8a3144a7322c11dc245de017f9144cccb13411060f0e5d34627af71ed31e3be8444e905fae73c0875e866f1d9064032e',
    '49547400000066000000a006285e6567863b79bb534a0c04024792db3e6afc10502d79800ca1a5e5bad4aa2d951581b4ab822f3fdbd00738a62f8a3144a7322c11dc245de017f9144ccca6d7f518e61c1478edea3c23a2941dd5f384d65615a44394f1f10a583e249ce0a50b3260d3c0f800dbd5b77b51de8913',
    '49546400000052000000ec38203549c433442df1b61a7d88a29a92db3e6afc10502d79800ca1a5e5bad4aa2d951581b4ab822f3fdbd00738a62f8a3144a7322c11dc245de017f9144ccca7967a1bb6dc2da95ed3faefba06f003a50b3260d3c0f800dbd5b77b51de8913',
    '4954640000005a0000009c0555c4dcda4b8bf43e037a8b5049bc92db3e6afc10502d79800ca1a5e5bad4aa2d951581b4ab822f3fdbd00738a62f8a3144a7322c11dc245de017f9144cccf1a14e872edb8fec7fa3ab290d38744049af6aeee42a6ea4c12db7b6dff2796c',
    '49546400000054000000a483512855cb3c2b1ac9068f107d041692db3e6afc10502d79800ca1a5e5bad4aa2d951581b4ab822f3fdbd00738a62f8a3144a7322c11dc245de017f9144ccc2700d69f52b63cb3fa430f8ec4a770200f11e142207501637bbe2a72017a0032',
    '4954640000005200000049071b70200f540cde85f09f25cf10b492db3e6afc10502d79800ca1a5e5bad4aa2d951581b4ab822f3fdbd00738a62f8a3144a7322c11dc245de017f9144ccca7967a1bb6dc2da95ed3faefba06f003a50b3260d3c0f800dbd5b77b51de8913'
]

# looping packets and printing the hexadecimal response
for packet in packets:
    s0.send(codecs.decode(packet, 'hex'))
    print(codecs.encode(s0.recv(1024), 'hex'))
    print('\n--------\n')
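One thing to keep in mind with the loop above: TCP is a byte stream, so recv(1024) may hand back only part of a reply. Below is a minimal sketch of a helper that keeps reading until an exact byte count has arrived. The names recv_exact and split_header are mine, not part of any library. split_header rests on a guess: the captured packets begin with 49 54 ("IT" in ASCII) followed by 64 00 00 00 or 74 00 00 00, and read as little-endian integers those give 100 and 116, which is exactly the packet size minus 6 for the 106-byte and 122-byte packets above. Treat that reading of the header as an assumption, not a documented protocol.

```python
import socket
import struct


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes from a TCP socket, or raise if the peer closes early."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError(f"socket closed with {remaining} bytes still expected")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def split_header(packet: bytes) -> tuple:
    """ASSUMPTION: a 2-byte tag followed by a little-endian uint32 length of the rest."""
    tag, length = struct.unpack_from("<2sI", packet, 0)
    return tag, length


# the first six bytes of the 106-byte and 122-byte packets from the capture
print(split_header(bytes.fromhex("495464000000")))  # tag b'IT', length 100 (100 + 6 = 106)
print(split_header(bytes.fromhex("495474000000")))  # tag b'IT', length 116 (116 + 6 = 122)
```

With a helper like this, recv_exact(sv, 106) could stand in for sv.recv(106) when skipping the fixed-size reply before the video starts.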

The codecs.encode(sv.recv(106), 'hex') call reads the first 106 bytes, so we can skip them, and everything that follows will be written to a file with an H.264 video extension. You'll ask me "how the hell do you know that this is the right format?", and my answer is :

  • Thanks to Google, I've discovered that a lot of toy drones and many IP-connected objects streaming video use this format, for the very good reason that it tolerates signal discontinuity.
  • If you noticed it before, the "Open bar" packet starts with these hexadecimals: 00 00 01 A1. The 00 00 01 prefix is the start code that marks the beginning of each unit in an H.264 byte stream.

python
BUFFER_SIZE = 1024

# write the raw stream to disk until the drone closes the connection
# or we interrupt the script with Ctrl+C
with open('drone_stream.h264', 'wb') as f:
    try:
        while True:
            data = sv.recv(BUFFER_SIZE)
            if not data:  # empty read: the drone closed the connection
                break
            f.write(data)
    except KeyboardInterrupt:
        pass
    finally:
        sv.close()
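A quick way to sanity-check the saved file is to look for that 00 00 01 prefix in it. Here is a minimal sketch that returns the offset of every start code in a chunk of bytes. find_start_codes is a name I made up, and the sample bytes are synthetic, not taken from the drone.

```python
def find_start_codes(data: bytes) -> list:
    """Return the offsets of every 00 00 01 start-code prefix found in data."""
    prefix = b"\x00\x00\x01"
    offsets = []
    pos = data.find(prefix)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(prefix, pos + len(prefix))
    return offsets


# synthetic sample: one 3-byte start code, then a 4-byte one (00 00 00 01)
sample = bytes.fromhex("000001a1aabb" "00000001a1ccdd")
print(find_start_codes(sample))  # [0, 7]
```

Running it on the contents of drone_stream.h264 (read in 'rb' mode) should give a long list of offsets if the capture really is an H.264 byte stream. The 4-byte variant 00 00 00 01 is caught too, since it ends with the 3-byte prefix.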

Did I retrieve the video stream now ? The answer is a f'ing YES!!! I've got the VIDEO STREAM!!!

One of the two objectives has now been achieved; what remains is sending a control signal to the drone and hoping it will react to it.

The first 4 TCP connections seem to be related to protocol confirmation and video streaming; the fifth one differs in both the protocol used (UDP) and the port number. With a closer look, the connection to the drone using UDP on port 9125 starts by sending empty bytes until a certain point in time, where it emits an 8-byte signal with this format : 66 80 80 7E 80 00 FE 99.

If you recall what I've written before, after starting the drone's app and receiving the stream I put it into "ON/ready-to-fly Mode". I'm fairly confident that the behavior described by the packets is correlated to this series of events and that the hexadecimal signal we're seeing here corresponds to the "ON" signal sent to the drone.

I did further examination by sending different commands to the drone through the app, and the control packets are consistent in terms of their format. They are always 8-byte signals, starting with 66 and finishing with 99; the bytes in between are the ones that change, so I will assume that changing their values gives different navigation commands to the drone.
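Here is a hedged sketch of how such a packet could be assembled. It relies on a pattern I can only check against the two packets quoted in this article (66 80 80 7E 80 00 FE 99 and 66 80 80 00 80 00 80 99): in both, the seventh byte equals the XOR of the five bytes that follow the leading 66. Whether that byte really is a checksum, and what each of the five middle bytes controls, is an assumption. build_command is a made-up name.

```python
def build_command(payload: bytes) -> bytes:
    """Wrap five payload bytes as 66 <payload> <xor of payload> 99.

    ASSUMPTION: the seventh byte is an XOR checksum of the payload. This
    matches the two packets quoted in the article but is not confirmed.
    """
    if len(payload) != 5:
        raise ValueError("expected exactly 5 payload bytes")
    checksum = 0
    for byte in payload:
        checksum ^= byte
    return bytes([0x66]) + payload + bytes([checksum, 0x99])


# rebuild both packets from the article out of their five middle bytes
print(build_command(bytes.fromhex("80807e8000")).hex())  # 6680807e8000fe99
print(build_command(bytes.fromhex("8080008000")).hex())  # 6680800080008099
```

If the guess holds for other commands too, changing one payload byte and recomputing the seventh byte should produce packets the drone accepts, but that remains to be tested on the real thing.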

Let's wake up the drone by sending what I think is the "ON/ready-to-fly" command (66 80 80 00 80 00 80 99). If the drone is receptive it will stop blinking its lights.

python
# sending "ON Mode" command, and adapting it to UDP by choosing socket.SOCK_DGRAM
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.connect((HOST, 9125))
cmd = '6680800080008099'
s.send(codecs.decode(cmd, 'hex'))

S#@t again...the drone seems stagnant and not receptive to my command as you can see it here...

So after rechecking the captured packets I realized my mistake: the drone's app doesn't send one lonely command, it sends the same command over and over and over again. It is like a keep-alive making sure that the drone receives a constant stream of live execution signals...so, knowing that, I changed my code to reflect this by resending the command with a 50 ms delay.

python
import time

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.connect((HOST, 9125))
cmd = '6680800080008099'

# resend the same command every 50 ms, like the app does
while True:
    s.send(codecs.decode(cmd, 'hex'))
    time.sleep(.05)

And holy moly I've guessed it right, now the drone is receptive and it stops blinking! Second objective achieved!
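The infinite loop does the job, but it can only be stopped by killing the script. As a sketch of a slightly friendlier variant, the sender below repeats the packet until a threading.Event is set. send_repeatedly is a hypothetical helper, and the demo talks to a local UDP socket standing in for the drone, so it runs without any hardware.

```python
import socket
import threading


def send_repeatedly(sock, payload, stop_event, interval=0.05):
    """Resend payload every `interval` seconds until stop_event is set.

    Returns the number of datagrams sent.
    """
    sent = 0
    while not stop_event.is_set():
        sock.send(payload)
        sent += 1
        stop_event.wait(interval)  # sleeps, but wakes up early when stopped
    return sent


# local stand-in for the drone: a UDP socket bound to a free loopback port
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(1.0)

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.connect(receiver.getsockname())

payload = bytes.fromhex("6680800080008099")  # the "ON" packet from the article
stop = threading.Event()
threading.Timer(0.2, stop.set).start()  # stop the sender after about 0.2 s

count = send_repeatedly(sender, payload, stop)
first = receiver.recv(8)
print(count, first.hex())

sender.close()
receiver.close()
```

Against the real drone, the sender socket would connect to (HOST, 9125) instead, and the stop event could be set from another thread once a different command needs to go out.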

After achieving the two "it is a hack" criteria, I can officially declare that I've succeeded in hacking this Chinese toy drone! Unfortunately drones for personal use are not allowed in Morocco, where I'm spending my vacation right now. So I will put this project aside until I'm back in Paris...the project is not finished yet, two parts still remain.

That's all folks! Don't hesitate to follow me on Twitter!