Real Time Tracking with Oracle NoSQL Database
I recently got back from a short trip to Boston to watch my
daughter race in the Boston Marathon. As soon as my wife and I
found some spots to watch, about 1 mile from the finish line, we got out our
phones and fired up the tracking software. We were both disappointed by how
hard it was to get timely updates on our daughter's progress. Remember
that once you have a space to stand and watch, you basically don't move for
more than 4 hours while you try to figure out when your favorite runner will be
passing your spot. An effective and efficient tracking application is critical
for this.
I got to thinking about the application for tracking runners,
now that RFID tags are commonplace and so inexpensive. Each numbered bib that
the runner wears contains an RFID chip that is activated as the runner passes
over the activation mat. Here is what the sensor looks like from an
actual Boston Marathon bib.
During the race, at specific intervals, the activation time of
the sensor is captured and stored, and some simple computations are
then performed, such as the most recent minutes per mile and an extrapolation of
the expected finishing time. A NoSQL application for the timing of
runners would be quite straightforward to develop. Let's look at two of the
basics.
- Registration – when someone registers to run in
a race, the basic information must be acquired, including name, address, phone
and birthday. The birthday is actually quite important, as qualifying times are
based on the age at the time of the race, as well as how a participant places
within their respective age group.
For example, a JSON Document could be created at registration
time with the following information.
{
"RunnerId":12345,
"Runner" : {
"firstname":"John",
"lastname":"Doe",
"birthdayyear":"1985",
"birthdaymonth":"02",
"birthdayday":"15",
"email":"john.doe@example.net",
"payment":"Mastercard",
"paymentstatus":"paid",
"social":{
"twitter":"@jdoe",
"instagram":"jdoe_pics"
}
}
}
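Since age on race day drives both qualifying times and age-group placement, here is a minimal Python sketch of that calculation from the registration fields above. The function name `age_on_race_day` is hypothetical, and the race date is taken from the sample timestamps in this post rather than from the official calendar.

```python
from datetime import date


def age_on_race_day(runner: dict, race_day: date) -> int:
    """Return the runner's age in whole years on race day."""
    # The sample document stores the birthday parts as strings.
    born = date(
        int(runner["birthdayyear"]),
        int(runner["birthdaymonth"]),
        int(runner["birthdayday"]),
    )
    # Subtract one year if the birthday has not yet occurred in the race year.
    before_birthday = (race_day.month, race_day.day) < (born.month, born.day)
    return race_day.year - born.year - int(before_birthday)


runner = {"birthdayyear": "1985", "birthdaymonth": "02", "birthdayday": "15"}
print(age_on_race_day(runner, date(2017, 4, 12)))
```

Storing the birthday as a single ISO date string would make this simpler, but the sketch keeps the three separate fields used in the sample document.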
- As the race begins, each runner passes over a
mat on the ground, which activates the RFID chip and records the start time. As
the runner progresses along the race course, they cross more of these mats at
specified intervals, and the times are recorded. Simple math can then
determine the elapsed time for that specific runner and the minutes per
mile over the most recent interval, and extrapolate the expected finish time. The
JSON data sent as the race progresses might look like the example below. It is
quite small and can be transmitted to the servers quickly, or even batched up
(depending on the capability of the transmitting device) and sent every few
hundred runners or when there is a break in the runners crossing the mat.
{
"RunnerId":12345,
"milestone":"milestone_5k",
"timestamp":"2017-04-12T10:00:00"
}
Then, this information could be added to the race record for
the runner as they make progress.
"Marathon_Boston"
: {
"RunnerId":12345,
"start_time":"2017-04-12T10:00:00",
"milestone_5k":"2017-04-12T10:21:00",
"milestone_10k":"2017-04-12T10:44:00",
"milestone_15k":"2017-04-12T11:10:00",
"milestone_20k":"2017-04-12T11:25:00",
"milestone_25k":"2017-04-12T11:42:00",
"milestone_30k":"2017-04-12T11:56:00",
"milestone_35k":"2017-04-12T12:09:00",
"milestone_40k":"2017-04-12T12:28:00",
"milestone_41k":"2017-04-12T12:42:00",
"milestone_42k_end":"2017-04-12T12:45:00"
}
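The "simple math" mentioned above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `record_milestone` and `progress` and the `MILESTONE_KM` table are hypothetical, the runner id key is assumed to be spelled `RunnerId`, and the finish-time projection naively assumes the average pace so far holds to the end.

```python
from datetime import datetime

MARATHON_KM = 42.195      # official marathon distance
KM_PER_MILE = 1.609344

# Assumed distance (km) for each milestone key; key names follow the sample record.
MILESTONE_KM = {
    "milestone_5k": 5.0, "milestone_10k": 10.0, "milestone_15k": 15.0,
    "milestone_20k": 20.0, "milestone_25k": 25.0, "milestone_30k": 30.0,
    "milestone_35k": 35.0, "milestone_40k": 40.0, "milestone_41k": 41.0,
    "milestone_42k_end": MARATHON_KM,
}


def record_milestone(race_record: dict, event: dict) -> None:
    """Copy one mat-crossing event into the runner's race record."""
    # Compare as strings so the id may arrive as a number or as text.
    if str(event["RunnerId"]) != str(race_record["RunnerId"]):
        raise ValueError("event belongs to a different runner")
    race_record[event["milestone"]] = event["timestamp"]


def progress(race_record: dict) -> dict:
    """Return elapsed time, latest-interval pace, and a projected finish."""
    start = datetime.fromisoformat(race_record["start_time"])
    # Crossings so far, ordered by distance, with the start line as point zero.
    points = [(0.0, start)] + sorted(
        (MILESTONE_KM[key], datetime.fromisoformat(value))
        for key, value in race_record.items()
        if key in MILESTONE_KM
    )
    if len(points) < 2:
        raise ValueError("no milestone recorded yet")
    (prev_km, prev_time), (km, time) = points[-2], points[-1]
    interval_miles = (km - prev_km) / KM_PER_MILE
    elapsed = time - start
    return {
        "elapsed": elapsed,
        "pace_min_per_mile": (time - prev_time).total_seconds() / 60 / interval_miles,
        # Naive extrapolation: assume the average pace so far holds to the finish.
        "projected_finish": start + elapsed * (MARATHON_KM / km),
    }


record = {"RunnerId": 12345, "start_time": "2017-04-12T10:00:00",
          "milestone_5k": "2017-04-12T10:21:00"}
record_milestone(record, {"RunnerId": 12345, "milestone": "milestone_10k",
                          "timestamp": "2017-04-12T10:44:00"})
print(progress(record))
```

A real tracker would probably weight recent intervals more heavily when projecting the finish, but even this simple version only needs the handful of timestamps already held in the race record.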
Overall, this would be an ideal application for a
NoSQL system. The amount of data, even for a 35,000-person race, would not be
very much, and as the runners spread out, the data arrives at an even lower
rate than at the starting gates. If we assume that each runner's record would
consume about 1 KB of data, then the entire race would produce only about 35 MB
of raw data. If we then assume a replication factor of 3 and include some
overhead, the entire race would need about 225 MB of storage, which could
easily fit on a USB thumb drive. High-speed SSDs can store data in the
terabyte (TB) range, so the results of thousands of marathons could
be stored in a single Oracle NoSQL Database.
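The back-of-the-envelope sizing above can be checked with a few lines of Python. The variable names are hypothetical, decimal units are assumed (1 MB = 1,000 KB), and the overhead factor is simply whatever is implied by the 225 MB figure rather than a measured value.

```python
runners = 35_000
record_kb = 1                  # assumed size of one runner's record
replication_factor = 3

raw_mb = runners * record_kb / 1_000              # raw data for one race
replicated_mb = raw_mb * replication_factor       # after replication

quoted_total_mb = 225                             # estimate including overhead
implied_overhead = quoted_total_mb / replicated_mb

marathons_per_tb = 1_000_000 / quoted_total_mb    # races that fit in 1 TB

print(f"raw: {raw_mb:.0f} MB, replicated: {replicated_mb:.0f} MB")
print(f"implied overhead factor: {implied_overhead:.2f}")
print(f"marathons per TB: {marathons_per_tb:.0f}")
```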
This still doesn't answer the question of why the
updates were so slow. According to my source in the Boston area, downtown is
notorious for poor cell service; add many thousands of race watchers trying
to use their tracking apps at basically the same time, and you can start to
understand the delays. At least we know
that if a system were based on NoSQL, it would not be the culprit.