DEF CON 31 - Using SIM Tunneling to Travel at Light Speed - Adrian Dabrowski, Gabriel Gegenhuber
Good morning, everyone. Yeah, my name is Adrian Dabrowski. This is my colleague, Gabriel Gegenhuber. This is joint work with Wilfried Mayer and Edgar Weippl. We will be talking about decoupling the SIM card from the mobile phone or the modem to get all kinds of fancy measurements. And it's called Cellular Carriers Hate
This Trick. I will explain why. Using SIM Tunnels to Travel at Light Speed. For the motivational example, let's assume we have three medieval city-states, and they recently invested in technology, so they switched from landline to wireless, to cellular networks. And we have all these new mobile phone operators popping up like KnighT Mobile or Arthur and Excalibur with the sword in there and dragonphone, and all the others. And with that, we also had an explosion of new social networks, so we have something like RoyalTweet and BardBoard and PeasantPost. And all these people are super hooked on social networks,
so their mobile phone operators think, "Well, how do we target some demographics that like social networks?" So they come up with these new data plans where there is something already included like video streaming or messaging or social network apps. And we have... It's super transformative. It's very addictive. You have all these people spending hours and hours on end on the social networks. And I heard
that the king over there got actually so much hooked on it. He's considering buying RoyalTweet and renaming it to the Roman numeral of 10. And so this is Archibald. He recently switched from Alchemy to Cellular Network Research.
And he also has a few friends abroad, and he has already found a few inconsistencies and maybe vulnerabilities in his local network, but he is a bit of a prisoner of his own city-state. Well, when he travels, he can do the measurements abroad in another network, but when visiting his friends, he would actually like to spend time with them, and it's also getting very expensive. Why is roaming so complex, or why is it so interesting? The interesting thing with roaming is that your home network operator on one side and the visited mobile network operator on the other, using some interconnection in between, pretend to you as the customer that they're providing a single, consistent set of services, although they're using completely different configurations, hardware, software, manufacturers, whatnot. And so at a closer look, we see that this picture of a consistent service falls apart pretty fast. And you might wonder, "Well, when I'm abroad, how is my traffic actually routed?" And there are two ways. You can have either local breakout or you can have home routing. Interestingly, depending
on the service that you're using, it will either use one or the other. So, for example, if you're using data, that's usually home-routed. So even if you're abroad, all your data will exit with an IP address of your home network operator, so you can't use all the fancy geo-locked services in another country unless you connect to the Wi-Fi. For other services like voice, you have local breakout. Back in the 1980s, when GSM was specified, and even today, voice is considered a time-critical service, so you'd like to get it out into the public networks as fast as possible with a route as short as possible. So technically, for example, voice roaming works by the visited network operator issuing you a temporary phone number in the country that you're visiting, and your home network operator then reroutes all the incoming calls to this new temporary phone number.
So you have all the small differences in roaming. And so, while Archibald likes to travel, it's getting very expensive, and all this testing can be quite tedious. Let's look at an example. So whoever travels internationally for DEFCON might have received that SMS. That's because AT&T doesn't support voice roaming for most European carriers.
However, it does support data and SMS. So you might even get billed for voice traffic while in the US without being able to use voice. And so you have all these small oddities. And let's see: can we measure that in our toy example? Let's take one SIM card. We have one home operator plus three network operators in the other city-states. However, that's just one SIM card. Of course, there are multiple carriers within our home network or home country, so we'll have to buy multiple SIM cards. But there is also more than one data plan, which might be
important because they support different services. So we'll have to multiply by that. But then, of course, you have all the network operators in the other city-states as well, and we end up, just in our toy example, with 190 combinations. So clearly, that doesn't scale well. And what are the possibilities here for Archibald to continue his work? Well, he could buy a lot of SIM cards and a lot of modems and position them everywhere. Well, in the three countries. But, well, there are the hardware costs, the monthly costs, and soon after that bankruptcy, so that doesn't work. Well, you could have one modem in each country and then ship SIM cards around, but that has a large overhead with the shipping times and a lot of manual labor. Or what we did is try to decouple
the SIM card from the modem. So usually, the SIM card and the modem are like one unit and they communicate with each other. But what if we can extend this internal bus all around the globe? And so that's what MobileAtlas does. The academic name is Geographically Decoupling Cellular Measurements and Exploitation. And with that, we can now solid-state travel. We can connect SIM cards to our modems in different places all around the world and pretend to the network operators that we are in that country and do our measurements and tests there. What were our goals with the project? Well, of course, scalability, automatability, but also an important point is to control the background noise. If you use an off-the-shelf cellular phone,
then you have a full-blown operating system on it with all kinds of background tasks that might interfere with the measurements you are doing or with the exploits you're testing depending on what you're precisely doing. And we also would like to have the full-feature spectrum. For those who know RIPE Atlas, so our name Mobile Atlas basically is a homage to RIPE Atlas. RIPE Atlas is a probe system developed and maintained by the European internet authorities or administration and they have the probes, and you can do pings and trace routes between different autonomous systems. However, in a cellular world, we have more than just an internet or data connection. We also have phone calls. We have USSDs. We have text messaging. So we would like ideally to test or have a test system that works on all these features. Here, a short diagram. Traditional combinatorial
explosion. You put probes in different countries and replicate all the SIM cards, or you have something where you can tunnel the SIM cards to one place and save a lot on costs. Basically, that's what we did. We decoupled the SIM cards: we have a SIM card reader connected to a computer, and then we have a TCP tunnel to our measurement probes and replicate the SIM card there. And so, add a management layer on top of it, and you'll end up with a system like this. On the left side, you have the probes. On the right, you have the
SIM cards and you have a SIM provider, which is basically just a piece of software that connects to a PC/SC or serial reader or even an Android phone and routes the SIM card traffic to the probes. It needs to be an online connection because the SIM cards produce all the cryptographic material that we need on the network side to authenticate and to encrypt the traffic there. This has been a project, ongoing for five years now, so we have... You can see our left probe, which looks very crude. So we had a Raspberry Pi and a USB adapter and then an M.2 modem attached to this adapter. And because everything was very loose in the box,
we put this yellow piece of foam in there to hold everything in place. But the current version looks much more professional. It's a shield on top of the Raspberry Pi. And so then the only other thing you need is an ethernet connection for the uplink. And so on one side, we have the SIM provider, which is basically just this piece of software that works with all the usual SIM card readers. You can use the more expensive PC/SC readers or the very cheap Chinese SIM readers that only speak a very crude serial protocol, or you can even use an Android phone. Oh, we didn't have the picture. Or maybe it's later in our slides. So I briefly want to talk
about the challenges that we faced, and I have to pick two because of time. So let's talk about the SIM interface and the SIM protocol. So the SIM card protocol is basically a smart card protocol, but it's now 40 years old. So you have a lot of different options, voltages, speeds, and all kinds of that. And also, it was designed to work within one device with very low latencies.
When we stretch that over half of the globe, we need a few techniques to cope with the latency. You can see, for example, this is how we connected the SIM slot of the modem to the GPIO pins of the Raspberry Pi. We just made a very small, simple adapter that we slot in. Our first version was actually directly soldered in. And you can see just one component on it. And that's a Schottky diode. Why is that? Because the SIM card I/O pin is actually an open-collector bus. So on the Raspberry Pi side, we need a Schottky diode to split up the send and the receive channels for the UART. And the pull-up resistor is already provided by the modem. Luckily, this reduces a lot of complexity for us. We can negotiate
speeds and voltages and other parameters independently on the SIM provider side and on the modem side. We don't have to pass them through one-to-one, so that eases many of the problems. We can also add waiting time extensions, and we tested it for latencies up to a thousand milliseconds. So this should actually be good enough even for Starlink connections. As future work, we'd also like to locally emulate some of the files that are not necessary for the modem or for the measurements to work. The second problem that I'd like to mention is traffic metering. And we need a way to control the background traffic. Let's step back. So one of the tests we want to do is to test the accounting, the data counting of network operators. So we need
very precise measurements of what we are sending to the network and what is then accounted. And so the background traffic is something that will mess with these measurements. And the other thing is that the call data records are often shown on the operator website with a large delay. So domestically, this can be like hours. But internationally, this can be something around this. And also, there is no standardized way to check your account balance, so some operators use an app or web application. Some can use or support USSD codes or SMS inquiries. To eliminate background traffic, we basically use Linux network namespaces. That's the same thing that
Docker does. So our measurement process is put into a separate namespace that's then connected to the modem, and only that one talks to the modem. So all the traffic from that process group is routed through the modem. And all the other things like the management suite is all over the VPN.
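The namespace separation described here can be sketched with standard iproute2 commands. The following is a minimal illustration, not the project's actual setup: the namespace name `measure` and the interface name `wwan0` are assumptions, and the helpers only build the command lines rather than executing them (running them needs root).

```python
import shlex


def netns_commands(namespace: str, modem_iface: str) -> list[str]:
    """Build the iproute2 commands that move a modem interface into its own
    network namespace, so only processes started inside it can reach it."""
    ns = shlex.quote(namespace)
    dev = shlex.quote(modem_iface)
    return [
        f"ip netns add {ns}",                        # create the namespace
        f"ip link set {dev} netns {ns}",             # move the modem interface into it
        f"ip netns exec {ns} ip link set lo up",     # bring up loopback inside
        f"ip netns exec {ns} ip link set {dev} up",  # bring up the modem interface
    ]


def wrap_measurement(namespace: str, argv: list[str]) -> list[str]:
    """Prefix a measurement command so it runs inside the namespace."""
    return ["ip", "netns", "exec", namespace, *argv]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for cmd in netns_commands("measure", "wwan0"):
        print(cmd)
    print(" ".join(wrap_measurement("measure", ["curl", "https://example.org/"])))
```

Anything started outside the namespace, such as a management connection, never sees the modem interface, so it cannot add traffic to the metered link.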
They're completely separated. How do we deal with delayed traffic accounting? Well, we came up with binary encoding. So what we do is, for example, the first test is what, one megabyte in size. The second test is two megabytes. The third one is four. And the fourth test is eight, and so on. So one of these also will be a control group,
so that at the end, maybe a day later when it finally shows up on the accounting balance, we can then distinguish exactly which test was accounted and which wasn't, like which traffic group. You might ask, "Well, you do all this to tunnel physical SIM cards across the globe. What about eSIMs?" The problem with eSIMs is that, even though they're pretty relevant in the US, for example, they are not widely available everywhere. And also, they usually tend to only cover some data plans and not all. And they're not always easily transferable between devices. So what we actually can do is use the Bluetooth rSAP protocol to connect to an Android phone that then shares the SIM card over Bluetooth with our system. The SIM Access Profile was actually developed somewhere in the '90s to
allow cars to connect to your phone and then use the SIM card on your phone and the modem from the car. But today, this is rarely actually used. So the only thing you need to do on an Android is make the eSIM your primary SIM card, and then you get the screen and you allow the usage. In the title, we said, "Carriers hate this trick." That might be a little bit controversial. Why
do we think carriers hate this trick? Well, SIM tunneling isn't exactly new. It has already been used for over-the-top bypass fraud. So that's when you use batteries of SIMs or SIM banks to terminate international calls within the country because usually, local call rates are cheaper than international interconnect fees. However, we actually do the opposite. We are tunneling from domestic to abroad to test all the other networks. This might also hint at why such a system might be an opportunity for carriers. Because nowadays, carriers have to rely on their roaming
partners to deliver the services the way that they want them for their customers. But with a system like this, the carriers can actually verify the services, especially things like Voice over LTE roaming, which lacks a good auto-configuration protocol and causes a lot of trouble internationally. This might actually help to test the different configurations. What have we learned during the implementation of our system? A few, well, some surprising results. First, if you read almost any book on cellular networks, they will basically say something along the lines that the IMSI is basically the unique identifier of a SIM card. And it's also used
to find your home operator and stuff like that. However, even in our small tests, we found several examples of SIM cards that can actually update the IMSI over the air or change it dynamically. This is usually used for selecting a roaming network. So it's not that you're selecting the
network that you are using as a visitor. You're selecting someone that has all the contracts in place with all the operators in the different countries so that they have just one place to do their accounting and to work with. And the other thing that we've learned is that theoretically there is a 127-device limit on USB, but practically, it's hard to get over 20 or 30. That has to do with lousy hardware,
weak drivers, the power consumption, even if you use active hubs. And we've tried several things. Here are the pictures. So on the left, you can see we bought a box of a hundred SIM card readers on AliExpress. And then we tried to run them naively on all the single USB hubs. That turned out to be very error-prone. And then we also tried these professional USB hubs that are used by, or have been used by miners for these USB FPGA boards, but this also didn't work well.
Where do we stand today? So now, our system is deployed to 10 European countries and to North America. You can see Canada isn't fully covered. That has to do with the fact that, in Canada, not all the mobile operators are available in all the provinces. Our current probe in Canada is in the Yukon Territory, so that's why it's just half green. Ethical considerations. So there are some ethical considerations we have to talk about. You might,
for example, ask, "Why do we use modems and don't use software-defined radios?" I mean on the one side, software-defined radios give you much more capabilities on the radio side. On the other side, all the open-source implementations usually only focus on one access technology, so you can get a GSM implementation or you can get an LTE implementation, but you cannot get one implementation that covers all the access technologies. And the other thing is that it's a regulatory minefield. So we give out, or so far we gave out these probes to friends and family in different countries and we cannot subject them to the risk of having a software-defined radio and all the radio regulatory problems that might come with that. So we rather opt for
unmodified globally-certified modems that are safe to use in all the countries. The second thing that I want to mention is we do not enrich ourselves with our tests, for example, with the traffic accounting tests. So we made sure that at the end of the month, we let expire at least that amount of traffic that wasn't accounted in our tests. And with that, I'll switch over to Gabriel who will talk about the results and what you can actually do with our system.
Yeah, thank you. So I would say, "Let the games begin." So let's take a look at what we can do with this fancy platform. I'll walk you through a few showcases. Of course, our platform has very versatile capabilities. But the first one will be an internet-related measurement case, so it will be about zero-rating measurements. It will be billing measurements. And also, after presenting the measurements, we will show some proof of concept of how you could abuse these zero-rating offers, or how an attacker could abuse this to gain some free internet traffic.
What is zero-rating? Actually, there are some providers that offer these kinds of programs. Usually, they provide several groups of applications. So in this screenshot example, there is a group for messaging applications, for social media applications, and also for video. So for example, for Netflix. And yeah, they offer this. The customers can buy a package. And then by buying this package, they gain unmetered access to these kinds of applications. Of course, from a carrier perspective, all the data traffic that passes the provider needs to be classified, so it needs to be separated. It needs to be classified into billed traffic and zero-rated traffic. And let's take a look at which possibilities there are for
the cellular carrier. Which metrics could be used for the classification? A very old metric that has been used back in the day to classify and then block BitTorrent traffic was the TCP or UDP port. Nowadays, it's mainly used in conjunction with some other metrics because it's kind of vague. Most of the traffic is web traffic anyway, so it might be using port 443. And yeah, you could easily fake this port. Therefore, it's not that reliable. However, the IP address is kind of accurate. Especially for all those big services like WhatsApp, the IP address of the service is usually pretty static, so it's a reliable classification metric. Some applications might use some cloud hosting. But yeah, usually, if it's a big application, it's pretty stable.
So it's a good classification metric. Also, some operators use deep packet inspection. So this is when the classification mechanism doesn't only look at the packet's header, but also at the content of the packets. So the classifier needs to be protocol-aware. It needs to understand what the fields of the protocol mean. And yeah, this is also commonly used. Also,
for our measurements, we focused on IP address and on deep packet inspection classification. For deep packet inspection, we mainly focus on hostname-based classification. But also there are some other metrics. So some operators, for example, classify by the time to live, to detect, or to block mobile hotspots. And since this is kind of popular with any problem, there are
also some people throwing machine learning at it. Yeah, let's take a look at deep packet inspection. So this is an example of hostname-based classification. If the traffic is just HTTP, it's pretty straightforward because we do not have any encryption. So the classifier can simply take a look at the Host header of the protocol. If it's an encrypted connection like HTTPS or HTTP3,
the classifier actually has to take a look at the TLS handshake and thereby take a look at the ClientHello message that contains the Server Name Indication. But yeah, it's again pretty similar. So just the hostname is extracted and thereby the traffic is classified. Within our study, we've bought some SIM cards. We've measured seven operators from three different countries. And within those SIM cards, we've analyzed available zero-rating
applications. And we figured that WhatsApp, Snapchat, Facebook, and Facebook Messenger were the most popular applications. Like any Android or iOS application, they're heavily communicating via web APIs, via web endpoints. We got some traffic dumps from those applications and also reverse-engineered the applications. And yeah, we found some endpoints that we could use for our measurements to probe these kinds of web servers. And for the selected web endpoints,
we found that they support HTTP, HTTPS, and HTTP3. And also they are hosted via dual stack, so you could communicate with them via IPv4 or IPv6. Yeah, we've had two measurement campaigns. And yeah, we've executed measurements in the domestic case, but also during roaming conditions. Our basic methodology for these kinds of zero-rating measurements was to get the credit, then to execute some experiments, and then wait for the data to be billed. And again, get the credit and calculate the delta. Within the experiment, we had some payload that was potentially zero-rated. And afterwards, as Adrian already explained, we also had some
control traffic that was acting as a marker, so we knew when all the data units were billed. And for the zero-rating experiments, we had three experiments. The first one was to verify that the web endpoints that we selected are actually zero-rated. And the other two were to learn more about the classification, so to detect IP-based or hostname-based classification methods. This is a chart for the very first experiment. We have our MobileAtlas measurement probe on the left side and we have some web endpoints on the right side. So in this case we are probing WhatsApp.
So we are just repeatedly querying or retrieving some web endpoints until our specified data units were retrieved. And afterwards, we do the same with some control traffic that usually is bigger. And when the control traffic is billed, we know that the experiment is finished and we can say whether the first traffic was subtracted from our data quota or not. To detect an IP-based classification, the test case was pretty similar. So the actors didn't change,
but we just spoofed the host header. So the data packets were still going to the WhatsApp web server, but the host header didn't match the WhatsApp endpoint anymore. And so if, for this case, the packets still were zero-rated, we knew that most probably some IP-based classification is in place. And we did a similar thing to detect hostname-based classification, but we needed to introduce another actor. In this test case, we automatically spin up an AWS instance that just
forwards all the necessary ports to the WhatsApp application, so to the WhatsApp web server. Thereby, the host header didn't change because the content of the data packets was exactly the same, but the IP address changed. So when the traffic classifier is watching the data packets, is inspecting the data packets, the hostname is still WhatsApp, but the IP address is the IP address of our AWS instance. And thereby, if the traffic is still zero-rated in this case, we knew again that some hostname-based classification is used. Let's come to some results. As you can see, we
found that operators are using both IP-based and hostname-based classification. Sometimes they even combine them. So in these cases, they zero-rated the traffic when either one of the rules applied. And interestingly, for two operators, we were actually not able to verify zero-rating, so all the tested packets were fully billed, although these operators promoted and sold some zero-rating packages to
their customers. We were very surprised by this. And to make sure that this isn't just a quirk of our measurement methodology, we also simulated this on our smartphones, so we just verified it. We downloaded the Facebook application. We plugged in the corresponding SIM cards, and we again did some measurements. We could verify that a huge portion, so over 90% of the actual application traffic, was wrongfully billed. Yeah. Additionally, some operators
turned off zero-rating during roaming. So this already was challenged by the national regulators in some countries. And yeah, nevertheless, we found some operators still doing this. Additionally, we found one operator that was billing traffic when the endpoint was retrieved by IPv6. So when IPv6 was used, the packets again were wrongfully billed. Similarly
for HTTP3. So when we were accessing the relevant endpoints, the relevant application by HTTP3, again, the traffic was fully billed. And interestingly, as I said earlier, we had two measurement campaigns, and for one operator, the packets actually got billed in the first period, but then it got fixed, and in the second period it was fine. Yeah.
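The reasoning behind the three zero-rating experiments can be written down as a small decision function. This is a sketch of the inference logic as described in the talk, not the project's code: the class and function names are made up, and each boolean stands for "this experiment's payload did not show up in the billed delta".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observations:
    baseline_free: bool      # experiment 1: genuine request to the zero-rated endpoint
    spoofed_host_free: bool  # experiment 2: real server IP, but a non-matching hostname
    relayed_free: bool       # experiment 3: matching hostname, but sent via a relay IP


def infer_classifier(obs: Observations) -> str:
    """Map the outcome of the three experiments to the likely classification method."""
    if not obs.baseline_free:
        # Even the genuine traffic was billed, so nothing can be said
        # about the classifier.
        return "zero-rating not verified"
    ip_rule = obs.spoofed_host_free   # still free with a wrong hostname
    host_rule = obs.relayed_free      # still free with a foreign IP address
    if ip_rule and host_rule:
        return "IP-based or hostname-based (either rule matches)"
    if ip_rule:
        return "IP-based"
    if host_rule:
        return "hostname-based"
    return "both must match, or another metric is used"


if __name__ == "__main__":
    print(infer_classifier(Observations(True, True, False)))
    print(infer_classifier(Observations(True, False, True)))
    print(infer_classifier(Observations(False, False, False)))
```

The first branch corresponds to the operators where zero-rating could not be verified at all, and the "either rule matches" branch to the operators that combined both methods.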
Archie is a security researcher, so he's not only interested in how those things work, but also in how he or an attacker could exploit these kinds of things. Yeah. We have two cases. So the first one is if hostname-based classification was used. In this case for HTTP, it's pretty straightforward. You just would need to write some relaying script that fakes the Host header. Sometimes the provider, during the classification, even just uses a simple regex for the host string. If HTTPS was used, it's maybe a little more complex. But still, this is
just content of the packet. So you can spoof that. You can change that. And you could maybe implement something on the top of OpenVPN and spoof the SNI to pretend to be WhatsApp traffic, for example. So this is similar to domain-fronting, this technique, if it's for TLS connections. For IP-based classification, this however is a little more complex. And so you would need to
have a server where you could spoof IP addresses. Also, it would only work for the downlink because if the client sends some packets to some Spotify IP or some WhatsApp IP, obviously, the packets will just land at Spotify or at WhatsApp. But for the downlink, this is a feasible thing to do. So you could simply replace the source IP address at your relay point at your VPN maybe, and then pretend to be Spotify. And then the packets will be
classified as zero-rated packets and you can get some free internet or an attacker could do that. Yeah, for TCP, it might be a little more complex because we have this connection-based approach and we have the 3-way handshake. But for UDP, this is totally feasible. And this is what we did. We had a SIM card with free Spotify. We set up a VPN with WireGuard on the server where we could spoof the packets. And yeah, then we wrote a kernel module, a kernel extension that rewrites the IP address of the outgoing packets. And for the provider, these kinds of packets look like Spotify. And we already came up with a nice name
for this proof of concept, and I hope you like it as well. So we called it Spoofify. Okay. Now, let's continue with some other showcases. So the second one is a privacy-related showcase. It's location tracking with ringback tones. So what is the ringback tone? The ringback tone is the tone that you hear when you call somebody and when you basically wait for them to pick up. So it's this audio feedback that you get. The interesting thing is that this is issued by the terminating operator. So in case of roaming, this is issued by the roaming partner. And the interesting thing as well is that we have different ringback tones for different regions. So
for example, in the US, operators use this dual tone of 440 and 480 hertz. And in Europe, most operators use something around 425 hertz. So for these two cases, you could even hear the difference with your bare ears. But also we found that within Europe, when operators use very similar settings, it's totally feasible to record this tone and to differentiate between operators. I'll show you some examples. This is from Vodafone in Romania. We have a peak frequency of 430 hertz.
This is the spectrum. And also at the left, we see the amplitude. If we compare it to a German provider, we see that a different frequency is used, and also that the amplitude differs, so it's louder for the German provider. And if we compare to another German provider, we see that the frequency stays the same, but the amplitude changes. Also, the signal is less clear. There are some side lobes. Yeah. And we did this for all the available operators,
and then we printed a scatterplot with the amplitude and with the frequency. And as you see, this is kind of nicely scattered, nicely divided across the diagram, the figure. So yeah, you can easily take those two metrics and determine the operator that terminated the call. Yeah, you can use this to find out the country of the person that you just called. So you just need one test call and then you know the country where the person is in. Also, you could,
of course, use some other metrics like the overtones I just showed you, or the duty cycle. We also had differences in this. Or you could also use some other call progress tones to fingerprint. And this is also interesting from an attacker perspective for SIM swapping because you could also use this to find out the responsible home operator, and then you know whom you need to call to swap the SIM card maybe. Yeah. Now, let's come to the last showcase. So this is a proactive SIM communication showcase. Since we tunnel all this SIM communication, we have full access to the communication, to the payload that is sent between the modem and the SIM card. And SIM cards are kind of mighty and powerful
microcontrollers, so they can even run some Java. There is this instruction set of proactive SIM commands where the SIM basically can take over control and tell the smartphone what to do. So the SIM could tell the smartphone to send an SMS message, to display some text on the handset, and yeah, since we have all this communication in our measurements, we can analyze it. And among the measured SIM cards, we found two SIM cards that were phoning home, so they were covertly sending some binary SMS messages. And this is pretty scary, actually, because it happens totally in the background. The smartphone user doesn't know of it. Interestingly,
also, we had some cases where these kinds of binary SMS were also billed by the operator. So during roaming, these SMSs were even billed. That's pretty shit from a user perspective. Yeah. We tried to analyze the content of these binary SMS, and we found out that there is some information about the user equipment. So, for example, the IMEI, but also information from the SIM card, so the ICCID and the IMSI, were in there. If you're interested in getting some more
insights, we've published two papers. So the first one is about zero-rating measurements. It's called Zero-Rating, One Big Mess. And the second one is basically the white paper of our platform. It was just presented at the USENIX conference some days ago. In conclusion, Archibald can now, when he's traveling, spend more time with his friends because for all the measurements, like longitudinal measurements or exploit testing and development, he can now use a platform to do this from the comfort of his home.
We find roaming especially interesting because it's this special case where two operators with completely different setups have to cooperate and pretend to be one. And we showed you a few use cases. You'll find more in our paper from two days ago from USENIX Security about how to hide and dress up traffic as one of the free services so you don't have to pay for it, how you can locate other subscribers based on the ringback tone, and some internals such as proactive SIM communication. We'd like to thank all these institutions
like NLnet, University of Vienna, Technical University of Vienna, SBA Research, CISPA, and the SSL Laboratory from UCI for supporting us over these five years. You'll find the URL and our contact information up here. The whole project is open-sourced. If you are from a country that you think would be interesting for us to host a probe in, please get in contact with us.
We have actually brought some probes here to DEFCON. And yes, thank you a lot.
2023-09-23 05:08