ROAD TRAFFIC PREDICTION USING BAYESIAN NETWORKS
Poo Kuan Hoong, Ian K. T. Tan, Ong Kok Chien, Choo-Yee Ting
Faculty of Computing and Informatics, Multimedia University, Cyberjaya, Malaysia.
{khpoo, ian, ong.kok.chien08, cyting}@mmu.edu.my
Keywords: Road Traffic Prediction, Context Aware,
Personalized, Bayesian Networks.
Abstract
Having prior knowledge of road conditions for planned or
unplanned journeys is beneficial not only in terms of time
but potentially also cost. Being able to obtain this
information in real time further enhances these benefits. Current
systems rely on huge infrastructure investments by
governments to install cameras, road sensors and billboards to
keep motorists informed. At best, these efforts cover only
pre-identified hotspots. Radio broadcasts are an
alternative, but they rely on reports from other motorists,
which are often delayed and not tailored to
individual motorists. Given the limitations of existing
approaches to obtaining real-time road conditions, this research
work leverages mobile devices that provide context-sensitive
information to propose a predictive analytics
framework based on a Bayesian Network for road condition
prediction. This paper aims to contribute (i) a set
of evidence variables that could potentially be utilized for
road condition prediction and (ii) a Bayesian
Network model to predict road conditions. In conclusion, we
present a novel approach that offers potentially unlimited
coverage of road traffic conditions with substantially reduced
infrastructure investment.
1 Introduction
Knowing the road traffic conditions while travelling is
advantageous when deciding on alternative routes or even
additional planned detours. This is evident in daily life,
where radio stations compete for listeners by
providing regular traffic reports, especially
during peak traffic hours. However, these reports are typically
delayed or incorrect, and they depend largely on third-party
sources that are difficult to verify, such as listeners calling in to the
radio stations to report on the traffic.
City councils and local governments invest
significant financial resources to deliver traffic reports
through the installation of webcams and traffic billboards [10]
[7]. However, not only do these efforts require significant
investment and expensive ongoing maintenance, users are also
required to actively interact with the system to obtain
localized information that is relevant to them.
With the proliferation of smart phones equipped with Global
Positioning System (GPS) receivers and Internet connectivity,
there is an opportunity to research Awareness
Information Environments in Mobile Computing [14] and
develop a framework that is suitable for road traffic reporting
in real time. In a comprehensive survey,
Anagnostopoulos et al. [1] described a 3-dimensional
model space for context-aware systems, in which the three
dimensions are context model, sensor-centric support, and
system behaviour. In this research, we propose to map our
context-sensitive predictive analytics approach onto this model,
where the context data comprise more than just
location [13] and also include vehicle direction, speed,
altitude, driver information and vehicle type. In our model,
the second dimension, sensor-centric support, takes the form
of weather, map and road design information. From these
context data and the sensor-centric support, the overall road
traffic system behaviour will be derived through real-time
predictive analytics.
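As an illustration of the first two dimensions, the context data and the sensor-centric support could be held in two small record types. This is a minimal sketch only: the class names, field names, units, sample values and the 20 km/h threshold are all assumptions made for this example and are not taken from the system described in the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ContextRecord:
    """Context data reported by one mobile device (hypothetical layout)."""
    latitude: float                  # GPS position in decimal degrees
    longitude: float
    heading_deg: float               # vehicle direction, 0 = north, clockwise
    speed_kmh: float
    altitude_m: float
    vehicle_type: str                # e.g. "car" or "motorcycle"
    driver_id: Optional[str] = None  # driver information, if shared


@dataclass(frozen=True)
class SensorSupport:
    """Sensor-centric support for the same location (hypothetical layout)."""
    weather: str                     # e.g. "rain" or "clear"
    road_class: str                  # from map data, e.g. "expressway"
    lanes: int                       # road design information


def is_slow(ctx: ContextRecord, threshold_kmh: float = 20.0) -> bool:
    """Flag low travel speed; the default threshold is illustrative only."""
    return ctx.speed_kmh < threshold_kmh


# Illustrative values, not measurements.
sample = ContextRecord(3.17, 101.67, 90.0, 12.0, 60.0, "car")
support = SensorSupport(weather="rain", road_class="expressway", lanes=3)
print(is_slow(sample), support.weather)
```

A flag such as `is_slow` is the kind of derived value that could later be entered as evidence for a "low travel speed" node.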
Research has shown that unexpected incidents
cause inaccuracy in forecasted traffic conditions
[21][16][11]. Such incidents include sudden changes in traffic
flow, weather conditions (e.g., rain or snow) [2], road
conditions, accidents [18][8][17], and road works or road
construction [18]. According to statistics provided by the Federal
Highway Administration, United States, 75% of weather-related
vehicle crashes occur on wet pavement and 47%
happen during rainfall. In addition, [9] states that rain can
increase the crash rate by 71% and the injury rate by 49%.
Besides unexpected incidents on the road, the inaccuracy
of a predictive model can also be due to insufficient data for
training the model [16]. Such data is often difficult to collect.
Various predictive analytics approaches have been proposed
by researchers to address the challenges in traffic flow
prediction. Prediction methods vary from very simple to
complex. Examples of simple methods are the random
walk (RW), which is based solely on information about
current traffic conditions; the historical average (HA), which
uses average past flow rates to predict current
flow rates; and the informed historical average (IHA), which
combines RW and HA to predict current traffic flow rates. These
methods have been shown to work in specific situations [19].
More complex methods include time series models such as
Autoregressive Integrated Moving Average (ARIMA) and
seasonal ARIMA [19]. In addition, data mining approaches
such as Artificial Neural Networks (ANN), simulation,
regression, fuzzy-neural and Markov chain models have been
employed for the prediction of traffic flow
[3][4][5][6][12][20][22]. These data mining methods not
only require a dataset for model training; their accuracy also drops
when incomplete evidence is fed into the model. Therefore,
this research work proposes Bayesian Networks (BN) as an
alternative to address incomplete information in road
traffic condition prediction.
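The three simple predictors mentioned above can be sketched in a few lines. This is a minimal sketch: the function names and sample flow values are invented for illustration, and because the text only says that IHA "combines" RW and HA, the convex weighting used here is an assumption rather than the definition used in [19].

```python
from statistics import mean
from typing import Sequence


def random_walk(current_flow: float) -> float:
    """RW: the forecast equals the most recent observation."""
    return current_flow


def historical_average(past_flows: Sequence[float]) -> float:
    """HA: the forecast is the mean of flows seen in the same time slot."""
    return mean(past_flows)


def informed_historical_average(
    current_flow: float, past_flows: Sequence[float], weight: float = 0.5
) -> float:
    """IHA: a blend of RW and HA. The convex weighting is an assumption."""
    return (weight * random_walk(current_flow)
            + (1.0 - weight) * historical_average(past_flows))


history = [400.0, 420.0, 380.0]  # illustrative flows (vehicles/hour)
now = 500.0

print(random_walk(now))                           # 500.0
print(historical_average(history))                # 400.0
print(informed_historical_average(now, history))  # 450.0
```

None of these baselines can use side information such as weather or incident reports, which is the gap the BN approach is meant to address.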
Recent studies have shown that BNs have been employed as
an alternative solution to the challenges faced when modelling
and predicting road traffic conditions [15][23]. For instance,
the work in [15] applied a BN to traffic flow forecasting in
the city of Beijing. In that study, the BN is mapped onto the city's
roads: the arcs of the network represent traffic flows, and
the weight of an arc represents the volume of traffic flow.
That is, the higher the volume, the stronger the arc.
Based on a dataset of Beijing traffic information,
the BN model was constructed using the Expectation
Maximization (EM) algorithm. The performance of the model
was then evaluated on another set of Beijing traffic
information to determine its accuracy. The originally proposed
model was later enhanced into a spatio-temporal BN [16],
in which the network incorporates background information such
as people's activities around shopping centres, car parks and
home communities.
In a study by Zheng et al. [23], a combination of a BN and a
Neural Network was employed. The model was trained and
tested using a dataset comprising 15-minute
interval traffic information gathered from Singapore's
Ayer Rajah Expressway. The evaluation results showed
that the combined model outperformed predictors that
consist merely of two Neural Networks. While the above two
studies employed BNs for short-term prediction of traffic flow,
the study in [11] employs Dynamic Bayesian Networks
(DBN) to predict traffic flow in real time. The
technique used was the multi-regression dynamic model (MDM),
which aims at preserving the conditional independences and
causal drives exhibited by the traffic flow series.
The advantage of employing a DBN lies in its ability to cater
for real-time changes rather than a one-time short-term
prediction. In that study, the model was trained and tested
using a dataset comprising traffic flow data collected in
London over six months.
2 Bayesian Network (BN) Model
Figure 1: Proposed Bayesian Network model for Road Condition Prediction (M1)
In this research work, road conditions can only be
predicted upon receiving information about the
traffic from various sources [21], with the ultimate aim of
predicting the likelihood of traffic jams. More often than not,
the information acquired from such events is
uncertain, and therefore the accuracy of forecasted traffic
conditions can vary [21][16][11]. Maximizing
the number of events used as evidence was therefore an
important phase when building the BN model.
Figure 1 depicts the proposed BN model for the prediction of
road conditions in this research work (hereafter referred to as
M1). It has two main sub-networks, namely the Event
sub-network and the Road Condition sub-network. The Event
sub-network consists of nodes that represent the events from which
information about road conditions can be obtained,
mainly from microblogging, in particular traffic tweets on Twitter. Such
events are referred to as evidential nodes in BNs. The
events (evidential nodes) identified in this research work
are weather web services, tweet for bad weather, tweet for
accident, low travel speed, tweet for road block,
announcement for road block, tweet for road construction,
announcement for road construction, and tweet for others.
The second sub-network is the Road Condition
sub-network. The nodes in this sub-network are Bad weather
condition, Accident, Road Block, Road Construction, and
Others. These nodes are referred to as intermediate nodes.
Arcs are directed from the intermediate nodes to the evidential
nodes so that the likelihood of traffic jams can be indirectly
inferred from the states instantiated to the evidential nodes.
For instance, the probability of a traffic jam increases if
the probability of an accident increases. The
probability of an accident, in turn, depends on whether or not the
inference engine receives a tweet about an accident.
Similarly, upon receiving a tweet about a road block
and an announcement about a road block, the probability of
Road Block increases, which in turn increases the
chance of a traffic jam. Information about road blocks
in Malaysia can be obtained from Twitter
accounts including @KLroadblock, @TrafficDotMy,
@kltrafficupdate, @kltraffic, @LLMinfotrafik,
@amptraffic, @LEKAStrafik and @plustrafik. Statistics
show that 17 out of 245 records (6.94%) of the tweets
about traffic jams attribute the jam to a road block at a particular
location. As shown in Figure 1, there is an arc directed
from traffic jam to low travel speed. This arc indicates that
a traffic jam directly influences travel speed, so an observation
of low travel speed is evidence for a traffic jam.
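The way evidence propagates through such a network can be illustrated on a four-node fragment (Accident, Tweet for accident, Traffic jam, Low travel speed). This is a minimal sketch using inference by enumeration: the arc from Accident to Traffic jam is an assumed reading of Figure 1, every probability below is a made-up placeholder rather than a value from the model M1, and the code does not use GeNIe or SMILE.

```python
from itertools import product

# Hypothetical conditional probabilities for a fragment of the network.
P_ACCIDENT = 0.1                                   # P(Accident)
P_TWEET_GIVEN_ACCIDENT = {True: 0.7, False: 0.05}  # P(Tweet | Accident)
P_JAM_GIVEN_ACCIDENT = {True: 0.8, False: 0.2}     # P(Jam | Accident)
P_SLOW_GIVEN_JAM = {True: 0.9, False: 0.1}         # P(LowSpeed | Jam)


def bernoulli(p_true, value):
    """Probability of a binary value given P(value is True)."""
    return p_true if value else 1.0 - p_true


def joint(accident, tweet, jam, slow):
    """Joint probability as the product of each node given its parents."""
    return (bernoulli(P_ACCIDENT, accident)
            * bernoulli(P_TWEET_GIVEN_ACCIDENT[accident], tweet)
            * bernoulli(P_JAM_GIVEN_ACCIDENT[accident], jam)
            * bernoulli(P_SLOW_GIVEN_JAM[jam], slow))


def prob_jam(evidence):
    """P(jam=True | evidence), summing the joint over consistent states."""
    names = ("accident", "tweet", "jam", "slow")
    numerator = denominator = 0.0
    for values in product((True, False), repeat=len(names)):
        state = dict(zip(names, values))
        if any(state[name] != value for name, value in evidence.items()):
            continue  # state contradicts the observed evidence
        p = joint(**state)
        denominator += p
        if state["jam"]:
            numerator += p
    return numerator / denominator


prior = prob_jam({})
with_tweet = prob_jam({"tweet": True})
with_tweet_and_slow = prob_jam({"tweet": True, "slow": True})
print(round(prior, 3), round(with_tweet, 3), round(with_tweet_and_slow, 3))
```

With these placeholder numbers, each additional piece of evidence raises the probability of a traffic jam, and the query still works when only some evidential nodes are observed.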
3 Implementation
This study began with the collection of events to serve as evidence
for the BN. The identified events include road traffic, road
construction, road blocks and weather, gathered from various online
resources such as website announcements, weather
web services and social networking updates, especially
Twitter tweets. The preliminary evaluation of the proposed
BN model was restricted to Twitter tweets and users'
reports as the main sources of evidence.
The Bayesian Network model was constructed
using GeNIe and SMILE. SMILE stands for
Structural Modelling, Inference, and Learning Engine. It
consists of C++ library classes implementing graphical
decision-theoretic methods, such as BNs and influence
diagrams, directly amenable to inclusion in intelligent
systems. GeNIe is a Windows-based graphical user interface
application for graphical decision-theoretic models. In
short, SMILE is the engine and GeNIe is the interface
that helps to construct BNs and influence diagrams. Both
were developed at the Decision Systems
Laboratory, University of Pittsburgh.
Preparing the dataset for evaluating the proposed BN
shown in Figure 1 was not a trivial task in this research
work, mainly because there is no
standard dataset that can be used directly to evaluate the
predictive accuracy of the network. The
evaluation phase therefore began with the preparation of a dataset, which
was generated by randomly assigning values to each of the
features. From the 1000 randomly generated records, human experts
handpicked the 200 records that resembled their beliefs
to form a dataset for training
and testing the proposed BN model for traffic prediction.
Figure 2: Sample Scenario of Road Information for Jalan Duta and Joint Probability Expression
The evaluation process began with the creation of two
variations of the BN, namely a Naive Bayesian Network (M2)
and a parameter-learning Bayesian Network (M3), as
benchmarks to measure the predictive accuracy of the
proposed Bayesian Network (M1), particularly when no
human intervention was involved in the creation process.
Before the models can be evaluated, the raw
data must be pre-processed and the models trained. Two C++
programs, namely FILTER and TESTER, were developed
to perform this work. FILTER pre-processes
the data by keeping the cases that match the filtering
rules of interest and dividing them into two
separate files, 20% for testing and 80% for training. The
filtering rules are: (1) if there is evidence of a particular
event or incident, then the event must occur; (2) if there is no
evidence of a particular event or incident, the event may still
occur; (3) if no event, including Others, occurs,
then the traffic jam does not occur.
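The three filtering rules and the 80/20 split can be sketched as follows. This is a minimal Python sketch, not the authors' C++ FILTER program: the field names, the uniform random generation, the fixed seeds and the shuffling before the split are all assumptions made for illustration.

```python
import random

# Hypothetical field names: each event maps to its evidential nodes.
EVIDENCE_FOR = {
    "bad_weather": ["weather_web_service", "tweet_bad_weather"],
    "accident": ["tweet_accident"],
    "road_block": ["tweet_road_block", "announcement_road_block"],
    "road_construction": ["tweet_road_construction",
                          "announcement_road_construction"],
    "others": ["tweet_others"],
}
FIELDS = sorted(
    set(EVIDENCE_FOR)
    | {name for names in EVIDENCE_FOR.values() for name in names}
    | {"low_travel_speed", "traffic_jam"}
)


def random_record(rng):
    """Assign a random YES/NO (True/False) value to every feature."""
    return {field: rng.random() < 0.5 for field in FIELDS}


def passes_filter(record):
    """Return True if the record satisfies the three filtering rules."""
    for event, evidence in EVIDENCE_FOR.items():
        # Rule 1: evidence of an event implies that the event occurred.
        if any(record[name] for name in evidence) and not record[event]:
            return False
        # Rule 2: without evidence the event may or may not occur,
        # so there is nothing to check.
    # Rule 3: if no event occurred at all, there is no traffic jam.
    if not any(record[event] for event in EVIDENCE_FOR) and record["traffic_jam"]:
        return False
    return True


def split(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and return (training, testing)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * test_fraction)
    return shuffled[cut:], shuffled[:cut]


rng = random.Random(42)
raw = [random_record(rng) for _ in range(1000)]
kept = [record for record in raw if passes_filter(record)]
training, testing = split(kept)
print(len(raw), len(kept), len(training), len(testing))
```

Rule 2 deliberately imposes no constraint, which is why it appears only as a comment: an event without evidence is allowed in either state.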
TESTER is the main program that evaluates the accuracy
of the models. It uses the SMILE library,
which allows us to perform BN operations such as
parameter learning on top of an existing model and
generating a Naive Bayes model from a dataset. In addition,
the program reads the dataset in .csv format
and sets the evidence in the model to obtain the final
result, such as whether a traffic jam is likely to occur.
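The workflow described for TESTER (read a .csv dataset, build a Naive Bayes model, set evidence, count matches) can be sketched without SMILE. This is a minimal stand-in written in plain Python, not the SMILE API or the authors' program: the column names, the six sample rows and the use of Laplace smoothing are all assumptions made for this example.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical dataset; a real run would read the file produced by FILTER.
CSV_TEXT = """tweet_accident,low_travel_speed,traffic_jam
YES,YES,YES
YES,YES,YES
NO,YES,YES
NO,NO,NO
NO,NO,NO
YES,NO,NO
"""
TARGET = "traffic_jam"


def load(text):
    """Parse CSV text into a list of {column: value} rows."""
    return list(csv.DictReader(io.StringIO(text)))


def train_naive_bayes(rows):
    """Count class frequencies and per-class feature value frequencies."""
    class_counts = Counter(row[TARGET] for row in rows)
    feature_counts = defaultdict(Counter)  # (feature, class) -> value counts
    for row in rows:
        for feature, value in row.items():
            if feature != TARGET:
                feature_counts[(feature, row[TARGET])][value] += 1
    return class_counts, feature_counts


def predict(model, evidence):
    """Return the most probable class given possibly partial evidence."""
    class_counts, feature_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total
        for feature, value in evidence.items():
            seen = feature_counts[(feature, label)]
            # Laplace smoothing over the two states YES and NO.
            score *= (seen[value] + 1) / (count + 2)
        scores[label] = score
    return max(scores, key=scores.get)


def accuracy(model, rows):
    """Return (number of matches, percentage of matches)."""
    matches = sum(
        predict(model, {k: v for k, v in row.items() if k != TARGET})
        == row[TARGET]
        for row in rows
    )
    return matches, 100.0 * matches / len(rows)


rows = load(CSV_TEXT)
model = train_naive_bayes(rows)
print(predict(model, {"low_travel_speed": "YES"}))  # partial evidence
print(accuracy(model, rows))
```

For brevity the sketch scores the model on its own training rows; the evaluation described in the paper uses a separate 20% test file.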
Sample Scenario
There is a tweet about heavy rain at Jalan Duta.
According to the weather web service, the weather condition is bad at Jalan Duta.
There is no tweet about an accident at Jalan Duta.
Vehicles are moving slowly at Jalan Duta.
There is no road block and no road construction.
There is no other tweet about Jalan Duta.

Joint Probability Expression:
P(TrafficJam=YES | BadWeatherCondition=YES) x
P(BadWeatherCondition=YES | WeatherWebService=YES, TweetForBadWeather=YES) x
P(TrafficJam=YES | Accident=NO) x
P(Accident=NO | TweetForAccident=NO) x
P(TrafficJam=YES | LowTravelSpeed=YES) x
P(TrafficJam=YES | RoadBlock=NO) x
P(RoadBlock=NO | TweetForRoadBlock=NO, AnnouncementForRoadBlock=NO) x
P(TrafficJam=YES | RoadConstruction=NO) x
P(RoadConstruction=NO | TweetForRoadConstruction=NO, AnnouncementForRoadConstruction=NO) x
P(TrafficJam=YES | Others=NO) x
P(Others=NO | TweetForOthers=NO)
In order to evaluate the accuracy of M1, three experiments
were conducted. In the first experiment, a comparison was
made between M1, M2, and M3 (with five iterations of
parameter learning). This experiment involved five
different datasets.
In the second experiment, M1 was compared with M2 using
10 different datasets. Lastly, the third experiment was
conducted to determine the influence of the size
of the dataset on the accuracy of the model.
4 Results and Discussion
Figure 3: Comparing M1, M2, and M3 with 5 datasets
Figures 3, 4 and 5 present the results of the three
experiments conducted. As shown in Figure 3, the accuracy of
M3 remains constant after the first iteration, as the
parameters had already reached their optimal level. Hence, there
is no need to perform iterative learning on top of M3 in
a real working environment.
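One possible explanation for this behaviour is that, when every record is fully observed, maximum-likelihood parameter learning reduces to counting, so a second pass over the same data reproduces the same conditional probabilities. The sketch below illustrates that for a single arc. It assumes complete data and plain counting, which may differ from the learning procedure SMILE applied in the experiments, and the records are invented for the example.

```python
from collections import Counter


def learn_cpt(records, child, parent):
    """Maximum-likelihood P(child=True | parent) from complete records."""
    totals, hits = Counter(), Counter()
    for record in records:
        totals[record[parent]] += 1
        hits[record[parent]] += record[child]  # True counts as 1
    return {state: hits[state] / totals[state] for state in totals}


# Hypothetical fully observed records for one arc, Accident -> Traffic jam.
records = (
    [{"accident": True, "traffic_jam": True}] * 8
    + [{"accident": True, "traffic_jam": False}] * 2
    + [{"accident": False, "traffic_jam": True}] * 3
    + [{"accident": False, "traffic_jam": False}] * 7
)

cpt = None
history = []
for iteration in range(1, 6):  # five iterations, as in the first experiment
    new_cpt = learn_cpt(records, "traffic_jam", "accident")
    history.append(new_cpt != cpt)  # did the parameters change?
    cpt = new_cpt

print(cpt)
print(history)
```

Only the first iteration changes the parameters; iterations two to five return the same table, which is consistent with a flat accuracy curve when the training data do not change between iterations.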
Figure 4: Performance comparison of M1 and M2 with 10 datasets
In terms of the accuracy of road traffic prediction, we
measured the number of matches between the test results and
the actual results, and the percentage of matches. M2 had
the lowest average accuracy (58.57%), followed by M1
(74.37%), while M3 scored the highest accuracy of 76.01%.
As depicted in Figure 4, M1 consistently outperforms M2
regardless of the dataset size. The average accuracy of
M1 was 72.10%, which is 5.13% higher than the average
accuracy of M2.
Figure 5: Accuracy Comparison between M1, M2, and M3 using 3 datasets
Figure 5 shows the result of the third experiment. From
our observation, the size of the dataset does not influence the
accuracy of the BN model (M3) for road traffic
prediction in any consistent way: the smallest dataset,
with 170 cases, scored the highest accuracy, while the medium
dataset, with 1,700 cases, scored the lowest
accuracy, below that of the largest dataset of 17,000
cases.
5 Conclusion
The research work presented in this paper had three
objectives: (1) to identify the variables that can be used to
infer traffic conditions, (2) to identify the relationships
between these variables and traffic jams, and (3) to construct a
BN model that represents the relationships among the
variables in order to infer traffic jams. The variables
(unexpected incidents) that we identified for inferring traffic
conditions are accidents, weather conditions (rain or snow),
road works (construction), road blocks, and the speed of
vehicles. We evaluated our constructed BN model (M1) against
a Naive Bayesian Network (M2) and a parameter-learning
Bayesian Network (M3). Based on our
initial results, our proposed BN model shows promising
results, even though its prediction accuracy is still
lower than that of the parameter-learning model. In future work, we intend to improve
the BN model in order to achieve more accurate road
traffic prediction.
References
[1] C. B. Anagnostopoulos, A. Tsounis, S.
Hadjiefthymiades, Context Awareness in Mobile
Computing Environments. Journal of Wireless
Personal Communication. Vol. 42 (3), Aug. 2007, pp.
445-464, (2007).
[2] R. Billot, Integrating The Effects Of Adverse
Weather Conditions On Traffic: Methodology,
Empirical Analysis And Bayesian Modelling.
[Retrieved from]
http://www.ectri.org/YRS09/Papiers/Session3/Billot_R_Session3_Traffic(2).pdf, (2009).
[3] R. Chrobok, J. Wahle, and M. Schreckenberg,
Traffic forecast using simulations of large scale
networks, in Proc. 4th IEEE Int. Conf. Intelligent
Transportation Systems, Oakland, CA, pp. 434-439,
(2001).
[4] M. Danech-Pajouh and M. Aron, ATHENA, a
method for short-term inter-urban traffic
forecasting, INRETS, Paris, France, Tech. Rep. 177,
(1991).
[5] G. A. Davis, Adaptive forecasting of freeway traffic
congestion, Transp. Res. Rec., no. 1287, pp. 29-33,
(1990).
[6] M. Der Voort, M. Dougherty, and S. Watson,
Combining Kohonen maps with ARIMA time series
models to forecast traffic flow, Transp. Res., Part C
Emerg. Technol., Vol. 4 (5), pp. 307-318, (1996).
[7] Dewan Bandaraya Kuala Lumpur (DBKL) Integrated
Transport Information System. [Online]
http://www.itis.com.my/atis/index.jsf
[8] S.O. John, E.B. Fabian, Distributed or Centralized
Traffic Advisory Systems - The Applications Take.
Proceedings of the 6th Annual IEEE communications
society conference on Sensor, Mesh and Ad Hoc
Communications and Networks (SECON09), pp. 709-718,
(2009).
[9] Q. Lin, W. Nixon, Effects of Adverse Weather on
Traffic Crashes: Systematic Review and Meta-Analysis.
In Proceedings of the 87th annual
meeting of the Transportation Research Board.
CDROM. Transportation Research Board of the
National Academies, Washington, D.C., pp. 139-146,
(2008).
[10] New Zealand Transport Agency: Auckland Traffic
Flow. [Online]
http://www.nzta.govt.nz/traffic/current-conditions/webcams/auckland/traffic.phtml
[11] C. Queen, C. Albers, Intervention and causality:
forecasting traffic flows using a dynamic Bayesian
network. Journal of the American Statistical
Association. Vol. 104 (486), pp. 669-681, (2009).
[12] B. L. Smith and M. Demetsky, Traffic flow
forecasting: Comparison of modelling approaches,
J. Transp. Eng., Vol. 123 (4), pp. 261-266, (1997).
[13] A. Schmidt, M. Beigl, Hans-W. Gellersen, There is
more to context than location, Computers &
Graphics, Vol. 23 (6), Dec., pp. 893-901, (1999).
[14] K. Stefanidis, E. Pitoura, Related Work on
Context-Aware Systems. Work in Progress Report,
Department of Computer Science, University of
Ioannina, Greece. [Retrieved from]
http://softsys.cs.uoi.gr/deca/deca-survey.pdf, (2001).
[15] S. Sun, C. Zhang, and G. Yu, A Bayesian network
approach to traffic flow forecasting, IEEE Trans.
Intell. Transp. Syst., Vol. 7 (1), pp. 124-131,
(2006).
[16] S. Sun, C. Zhang, and Y. Zhang, Traffic flow
forecasting using a spatio-temporal Bayesian
Network predictor. In Artificial Neural Networks:
Formal Models and Their Applications (ICANN), pp.
273-278, (2005).
[17] G.Z. Tan, Z.P. Liu, Y.D. Wang, The Determination
and Analysis of Traffic Congestion Evacuation
Priority. 2nd IITA International Conference on
Geoscience and Remote Sensing, pp. 484-487,
(2010).
[18] M. Wachs, Fighting traffic congestion with
information technology. Issues in Science and
Technology. National Academy of Sciences.
[Retrieved from]
http://www.highbeam.com/doc/1G1-93659945.html,
(2002).
[19] B. M. William, Modeling and forecasting vehicular
traffic flow as a seasonal stochastic time series
process, Ph.D. dissertation, Dept. Civil Eng., Univ.
Virginia, Charlottesville, VA, (1999).
[20] H. B. Yin, S. C. Wong, J. M. Xu, and C. K. Wong,
Urban traffic flow prediction using a fuzzy neural
approach, Transp. Res., Part C Emerg. Technol.,
Vol. 10 (2), pp. 85-98, Apr. (2002).
[21] J.Y. Young, M.G. Cho, A Short-Term Prediction
Model for Forecasting Traffic Information Using
Bayesian Network. Third 2008 International
Conference on Convergence and Hybrid Information
Technology (ICCIT), pp. 242-247, (2008).
[22] G. Q. Yu, J. M. Hu, C. S. Zhang, L. K. Zhuang, and
J. Y. Song, Short term traffic flow forecasting based
on Markov chain model, in Proc. IEEE Intelligent
Vehicles Symp., Columbus, OH, pp. 208-212,
(2003).
[23] W. Zheng, D. Lee, and Q. Shi, Short-term freeway
traffic flow prediction: Bayesian combined neural
network approach, J. Transp. Eng., Vol. 132 (2), pp.
114-121, (2006).
