Mar 11, 2015 12:21:00 PM

Skyhook Under The Hood: Determining Location with Wi-Fi Access Points

Posted by Kipp Jones

In the first installment of this series I talked about how we use geolocated Wi-Fi beacons to determine the location of a Wi-Fi enabled device. Then last week, we walked through how our location services are organized. Now I’ll go into how we determine the location of those Wi-Fi beacons, the first step in the process outlined above.

1. Determine the location of a Wi-Fi access point

In the described ‘bootstrap’ model, our very own hardware devices with dedicated Wi-Fi radios and multiple GPS units are wired together to produce very accurate and rich samples of the Wi-Fi (and other radio) environment as a vehicle is driven around a city. Because it is using our hardware and our drivers and our software, we know to a high degree of certainty that the data is of high quality. And because we can control the radios, we can ensure that we get adequate samples by increasing the scan rate for all of the radios. This method of large-scale data acquisition has some obvious benefits, but it also has some challenges — such as having to hire drivers, maintain our own hardware, and manage a fleet of vehicles. Not to mention that the refresh rate for this data is dependent on re-driving the very same area, something that has cost and resource implications.

Based on these samples, which contain such things as MAC address, signal strength, and GPS-based location and quality measures, we can create a model of the signal propagation for any given Wi-Fi access point. The more samples we get, the better we can characterize the positioning quality of each Wi-Fi access point.

For example, an access point is visible from many different locations around its actual position. Using these samples, which also include signal strength, we can compute the most likely location for each Wi-Fi access point that we detect. To do this well, we use something called a signal path loss formula. Radio signal strength decays with distance and with obstacles in the signal’s path. A simple version of the path loss formula looks like this:

$$L = 10\,n\,\log_{10}(d) + C$$

Where L is the path loss in decibels, n is the path loss exponent, d is the distance between the transmitter and the receiver, usually measured in meters, and C is a constant which accounts for system losses.
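
To make the symbols concrete, here is a minimal Python sketch that assumes the formula takes the common log-distance form, L = 10 · n · log10(d) + C. The function names and the values n = 3.0 and C = 40 dB are illustrative assumptions, not figures from our system.

```python
import math


def path_loss_db(distance_m, exponent, constant_db):
    """Path loss L = 10 * n * log10(d) + C, in decibels."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return 10.0 * exponent * math.log10(distance_m) + constant_db


def distance_from_path_loss(loss_db, exponent, constant_db):
    """Invert the formula: d = 10 ** ((L - C) / (10 * n)), in meters."""
    return 10.0 ** ((loss_db - constant_db) / (10.0 * exponent))


# Illustrative values only: n = 3.0 and C = 40 dB are assumptions.
loss = path_loss_db(10.0, exponent=3.0, constant_db=40.0)
distance = distance_from_path_loss(loss, exponent=3.0, constant_db=40.0)
```

Inverting the formula is what turns a measured loss back into a probable distance. In practice, n and C would have to be tuned per environment, which is exactly the hard part.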

Optimizing this formula to deal with a global system is not easy because environmental effects vary greatly depending on whether you are in a crowded urban environment, inside a building, or out on a farm.

The path loss formula is an important function that allows us to compute a weight for each observation that we make of a given Wi-Fi access point. Each observation comes with a particular signal strength. We use that information to compute the probable distance from the observation to the beacon, which is then converted to a weight that is applied to that observation. Having computed the weights for each observation, we can then compute the most likely location of the beacon using all of the available observations and their respective weights.
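
One simple way to picture this weighting step is a weighted centroid. The sketch below is an assumption-laden illustration, not our production algorithm: the `Observation` type, the assumed transmit power, the path loss parameters, and the inverse-square weighting are all hypothetical choices.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    lat: float        # GPS latitude of the sample, in degrees
    lon: float        # GPS longitude of the sample, in degrees
    rssi_dbm: float   # received signal strength of the beacon


def estimate_distance_m(rssi_dbm, tx_power_dbm=20.0, exponent=3.0, constant_db=40.0):
    """Probable distance from a log-distance path loss model (assumed parameters)."""
    loss_db = tx_power_dbm - rssi_dbm
    return 10.0 ** ((loss_db - constant_db) / (10.0 * exponent))


def estimate_beacon_location(observations):
    """Weighted centroid: observations that look closer get more weight."""
    total = lat_sum = lon_sum = 0.0
    for obs in observations:
        # Clamp to 1 m so a very strong reading cannot dominate completely.
        distance_m = max(estimate_distance_m(obs.rssi_dbm), 1.0)
        weight = 1.0 / (distance_m * distance_m)
        total += weight
        lat_sum += weight * obs.lat
        lon_sum += weight * obs.lon
    if total == 0.0:
        raise ValueError("need at least one observation")
    # Averaging raw degrees is only reasonable over a small area.
    return lat_sum / total, lon_sum / total
```

With two equally strong observations the estimate lands halfway between them; when one is much stronger, the estimate is pulled toward it.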

Note, this is much simplified as the full process takes into account a lot of other environmental information and spatial distribution of the samples. If you want to really dive deep, feel free to read through our patents which cover many aspects of our positioning system.

For each access point, we capture not only the computed location but also a number of ‘Confidence Factors’. These are elements of our Wi-Fi model (or Cell model, if we were to discuss that one) that provide additional information we use when computing the location of an end user device. Examples of these confidence factors include the recency of the last observation, the number of spatially diverse observations, and the computed signal propagation model.
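
As a rough illustration of two of those factors, the sketch below derives a recency value and a crude spatial-diversity count from a list of samples. The `Sample` type, the grid-cell approach, and the cell size are hypothetical; the real model takes into account far more than this.

```python
import math
from dataclasses import dataclass


@dataclass
class Sample:
    lat: float
    lon: float
    timestamp_s: float  # when the sample was collected, in seconds


def confidence_factors(samples, now_s, cell_deg=0.0005):
    """Summarize how recent and how spatially diverse the samples are."""
    if not samples:
        raise ValueError("need at least one sample")
    newest = max(s.timestamp_s for s in samples)
    # Bucket samples into grid cells; more distinct cells means more diversity.
    cells = {
        (math.floor(s.lat / cell_deg), math.floor(s.lon / cell_deg))
        for s in samples
    }
    return {
        "age_s": now_s - newest,
        "sample_count": len(samples),
        "distinct_cells": len(cells),
    }
```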

2. Gather on-device signals and information

In order to compute the location of a device, you need to gather signals. Whether you are looking for GPS signals, Wi-Fi signals, Cell signals or others, you need to be diligent in how you acquire and process this data. We have spent many years continuously improving our on-device data collection and processing algorithms to optimize across a number of constraints and performance metrics.

Because we spend a lot of time working with device manufacturers to embed our services into their firmware and operating systems, we end up getting fairly deep into the weeds regarding signal scanning. In particular, we have focused a lot of energy on optimizing signal gathering for Wi-Fi positioning. It’s actually rather amazing how difficult it can be to do this well.

First, you need to realize that at the base level, we have to deal with device drivers. This is the code that provides an interface to some piece of hardware; in our case, we are talking about the Wi-Fi chipset. And since the people writing Wi-Fi drivers generally optimize the interface and actions of the Wi-Fi chipset for connecting and managing data transmissions, we’ve had to help guide these drivers toward providing interfaces and options that allow faster scanning of all available Wi-Fi channels for purposes of positioning.

Minimizing Time To First Fix (TTFF) is one of the key goals of our system. This is the time it takes from getting a location request to returning a position back to the user. In the olden days when GPS was primarily used as a fixed component in your vehicle, waiting a minute or two for a ‘fix’ was generally acceptable. But on a mobile device, people expect an immediate response. When I want to know where I am, I want it now!  Thus, efficient radio scanning is an important aspect of any positioning system.

I mentioned ‘channels’ above. Wi-Fi provides a number of channels (think TV channels) for communicating back and forth between devices and access points. When looking for an access point to connect to (or in our case to simply acquire the Beacon Frame), the Wi-Fi radio on your device must tune to each of these channels and look for signals. In the US, at 2.4 GHz, we have 11 channels; other countries have 11, 13 or 14. So, to see all of the access points around you, you need to tune to a channel, listen for a bit to see if anybody is out there, then move on to the next channel. But that’s not all. You can also prompt access points to respond to your query. By sending out a Probe Request Frame, the device can ask any Wi-Fi access points listening on that channel to respond with a Probe Response Frame, which speeds up the scan and helps ensure that you see as many of the Wi-Fi access points within range as possible.
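
The tune, listen, and move-on loop can be sketched in a few lines of Python. This is only a toy model: `listen` is a hypothetical stand-in for whatever the driver exposes, and the dwell and switch times are assumed values, not measurements.

```python
def sweep_channels(channels, listen):
    """Visit each channel once and keep the strongest sighting per access point.

    `listen(channel)` is a hypothetical callable that returns an iterable of
    (bssid, rssi_dbm) pairs heard while tuned to that channel.
    """
    strongest = {}
    for channel in channels:
        for bssid, rssi_dbm in listen(channel):
            best = strongest.get(bssid)
            if best is None or rssi_dbm > best[1]:
                strongest[bssid] = (channel, rssi_dbm)
    return strongest


def sweep_duration_ms(num_channels, dwell_ms, switch_ms=0.0):
    """Total sweep time when every channel gets the same dwell time."""
    return num_channels * (dwell_ms + switch_ms)


# Assumed numbers: 11 channels, 100 ms dwell, 5 ms to retune the radio.
total_ms = sweep_duration_ms(11, dwell_ms=100.0, switch_ms=5.0)
```

The arithmetic shows why scanning matters for Time To First Fix: the sweep time grows with the number of channels and the dwell time per channel, so anything that safely shortens the dwell, such as prompting access points with a Probe Request, shortens the whole sweep.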

So, while doing this, you also need to consider what information is being cached by drivers and operating systems. Sometimes, to optimize power consumption and other resources, systems will hold onto old data and hand it back to you as if it were brand new. If systems don’t provide information about the staleness or age of the information that they return, you may think you are within range of access points that are no longer nearby.
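
When a platform does report when each result was seen, a simple age filter guards against this. The sketch below assumes a hypothetical `ScanResult` type with a `seen_at_s` timestamp and an arbitrary five-second cutoff; real drivers and operating systems expose this information in different ways, if at all.

```python
from dataclasses import dataclass


@dataclass
class ScanResult:
    bssid: str
    rssi_dbm: float
    seen_at_s: float  # when the frame was actually received, in seconds


def fresh_results(results, now_s, max_age_s=5.0):
    """Keep only results seen within the last `max_age_s` seconds.

    Results stamped in the future are dropped as well, since their age
    cannot be trusted.
    """
    return [r for r in results if 0.0 <= now_s - r.seen_at_s <= max_age_s]
```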

So, once you have scanned all of the Wi-Fi channels, acquired any cell information (another potentially complex task), possibly waited for a GPS fix, and gathered on-device sensor information (e.g. accelerometer), you are ready to figure out where the heck you are.

Tune in next week to read about computing the location of devices, as we walk through how to get signals to a server or beacon locations to a device.