
Issue August 2015 | presented by

www.jaxenter.com

#46

The digital magazine for enterprise developers

Microservices – Are they for everyone?

The future of the cloud

Databases are where it's at

Performance in an API-driven world – And why REST APIs are changing the game

The speed of Java 8 lambdas

... and way more tips, tricks and tutorials

Editorial

Looking beyond the hype

The hype shows no sign of subsiding. Microservices, DevOps, Continuous Delivery – the latest trends in IT are truly changing how businesses innovate. And most of all, they are putting the programmer at the heart of good business strategies. But there's a flipside. The gravity of hype can pull many organisations towards concepts they simply aren't ready for.

So as you're watching the herd of enterprises flocking towards DevOpsian IT, it's good to stand back and have a think. Are microservices really for everyone? How is it that Etsy is making such strides in continuous delivery with a monolithic system? And are there really no bumps on the road to continuous delivery?

These are exactly the kinds of questions we'll be finding answers to at the JAX London conference in October. And with JAX London season just around the corner, we've asked a selection of our conference speakers to give us a sneak preview of what we'll be learning at the JAX London. From database testing and smart benchmarking to microservices reality checks and continuous delivery tips, this is a special issue for anyone looking to update their IT approach. We're even going to learn what Parisian history can teach us about software architecture!

Coman Hamilton, Editor

Index

Microservices: Storm in a teacup, or teacups in a storm? – More hype, anyone?
Holly Cummins

A tale of two teams – Smoothing the continuous delivery path
Lyndsay Prewer

Fielding, Fowler and Haussmann – Network-based architectures: learning from Paris
Eric Horesnyi   7

Let's talk speed – Java performance tutorial: How fast are the Java 8 streams?
Angelika Langer   10

A world beyond Java – Coding for desktop and mobile with HTML5 and Java EE 7
Geertjan Wielenga   13

JEP 222 – JShell, the Java 9 REPL: What does it do?
Werner Keil   14

MySQL is a great NoSQL – Making the right database decisions
Aviran Mordo   16

Business intelligence must evolve – Rethinking how we think about self-service cloud BI
Chris Neumann   17

The future of cloud computing
Zigmars Raascevskis   22

Testing the Database Layer – Database solutions: Dos and Don'ts
Colin Vipurs   24

Private cloud trends – Financial services PaaS and private clouds: Managing and monitoring disparate environments
Patricia Hines   27

The future of traffic management technology – Intelligent traffic management in the modern application ecosystem
Kris Beevers   29

Trade-offs in benchmarking – Cost, scope and focus
Aysylu Greenberg   31

Common threats to your VoIP system – Five tips to stay secure
Sheldon Smith   32

Why reusable REST APIs are changing the game – No more custom API mazes
Ben Busse   34

Considering the performance factor in an API-driven world – Milliseconds matter
Per Buer   38

Hot or Not

Programming in Schwarzenegger quotes


Programming in C can be fun and all. But wouldnt you rather make your commands in the voice of the Terminator? Who wouldnt want to return a string with
Ill be back? Listen to me very carefully, Hasta la vista, baby and Do it
now are clearly way more effective than boring old DeclareMethod, EndMethodDeclaration and CallMethod. And frankly, this language is simply far cooler than
esoteric alternatives like brainfuck, Hodor-lang and Deadfish. It may not be the
youngest language on the scene anymore, but old age certainly hasnt stopped Arnie
from being a bad-ass Terminator.

Java without the Unsafe class


It's being referred to as a disaster, misery and even the 'Javapocalypse'. Oracle has announced the projected removal of the private API used by almost every tool, infrastructure software and high-performance library built using Java. Java 9 is showing the door to the sun.misc.Unsafe class. As far as ideas go, this one doesn't sound so good – at least on paper. It's particularly the library developers that are annoyed by this change. Numerous libraries like Netty will collapse when Oracle pulls out this Jenga block from the Java release. But then again, as the name of the class suggests, it is unsafe. Change is a bitch.

The Stack Overflow trolls


Earlier this year, Google was forced to shut down Google Code, its version of GitHub, because the product had become overrun with trolls. Meanwhile, the popular IT Q&A site Stack Overflow has been losing dedicated members as a result of its toxic moderator behaviour, over-embellished points system and systemic hatred for newbies. However, some users are debating whether the Stack Overflow problem extends beyond a minority of trolls, and encompasses a fundamental hostility at the heart of the website's culture of asking questions. Should novice programmers be afraid of being laughed at when asking beginner questions?

Developer grief

We don't want developers to be sad. Or angry. Or decaffeinated. But conference speaker and developer Derick Bailey has shown us the rocky road that many devs go down when it comes to the human side of software development. 'The five stages of developer grief' is the account of the journey that many programmers experience when writing code and marketing their product. Bailey explains how denial, anger, bargaining, depression and acceptance all leave their mark on initially energetic devs, which results in them feeling pretty e-motion sick from the ride. Feelings suck.

Oracle vs. Google

We know it's summer. But that doesn't stop there from being quite a lot of not-so-hot things happening in IT right now, like the final decision in the Oracle vs. Google debacle. Most of the sensible parts of the software industry have been hoping for a reduction in the destructive copyright crackdown on Java usage. But with President Obama himself coming down on Oracle's side, coupled with the US Supreme Court decision against a review of their ruling, there's little hope of a happy ending to the Android API saga.


Microservices

More hype, anyone?

Microservices: Storm in a teacup, or teacups in a storm?
Somehow, the buzz surrounding microservices has us believing that every single employee and enterprise must break up their monolith empires and follow the microservices trend. But it's not everyone's cup of tea, says JAX London speaker Holly Cummins.

by Holly Cummins
Folks, we have reached a new phase on the Microservices Hype Cycle. Discussion of the microservices hype has overtaken discussion of the actual microservices technology. We're all talking about microservices, and we're all talking about how we're all talking about microservices. This article is, of course, contributing to that cycle. Shall we call it the checkpoint of chatter?

Let's step back. I think we're all now agreed on some basic principles. Distributing congealed tea across lots of teacups doesn't make it any more drinkable; microservices are not a substitute for getting your codebase in order. Microservices aren't the right fit for everyone. On the other hand, microservices do encourage many good engineering practices, such as clean interfaces, loose coupling, and high cohesion. They also encourage practices that are a bit newer, but seem pretty sensible, such as scalability through statelessness and development quality through accountability (you write it, you make it work in the field).

Many of these architectural practices are just good software engineering. You'll get a benefit from adopting them – but if you haven't already adopted them, will you be able to do that along with a shift to microservices? A big part of the microservices debate now centres on the best way to transition to microservices. Should it be a big bang, or a gradual peeling of services off the edge, or are microservices something which should be reserved for greenfield projects?


I'm part of the team that writes WebSphere Liberty. As a super-lightweight application server, we're more of an enabler of microservices than a direct consumer. However, we have experience of a similar internal transformation. We had a legacy codebase that was awesome in many ways, but it was pretty monolithic and pretty big. We needed to break it up, without breaking it. We knew we could become more modular by rebasing on OSGi services, a technology which shares many characteristics (and sometimes even a name) with microservices. OSGi services allow radical decoupling, but their dynamism can cause headaches for the unwary. What worked for us was writing a brand new kernel, and adapting our existing libraries to the new kernel one by one.

Thinking about failure is critical. Imagine a little microservice teacup, bobbing along in those rough network waters, with occasional hardware lightning strikes. Not only is failure a possibility, it's practically a certainty. Tolerance for failure needs to be built in at every level, and it needs to be exercised at every stage of testing. Don't get too attached to any particular service instance. This was one of our biggest lessons along the way. We made sure our design ensured that code had the framework support to consume services in a robust way, even though they were liable to appear and disappear. Along the way, we discovered that many of our tests, and some of our code, made assumptions about the order in which things happened, or the timing of services becoming available. These assumptions were inevitably proved wrong, usually at 1 am when we were trying to get a green build.
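For illustration only – this is not code from the Liberty codebase – the sketch below shows one common way to consume an OSGi service defensively, using the standard ServiceTracker utility. The GreetingService interface and the fallback behaviour are invented for the example.

import org.osgi.framework.BundleContext;
import org.osgi.util.tracker.ServiceTracker;

/** Hypothetical consumer of a service that may come and go at runtime. */
public class GreetingClient {

    /** Stand-in for any OSGi service interface; invented for this sketch. */
    public interface GreetingService {
        String greet(String name);
    }

    private final ServiceTracker<GreetingService, GreetingService> tracker;

    public GreetingClient(BundleContext context) {
        // The tracker follows service registrations and unregistrations for us.
        tracker = new ServiceTracker<>(context, GreetingService.class, null);
        tracker.open();
    }

    public String greet(String name) {
        // Look the service up on every call instead of caching an instance,
        // and degrade gracefully if it is currently unavailable.
        GreetingService service = tracker.getService();
        return (service != null) ? service.greet(name) : "Hello, " + name + " (fallback)";
    }

    public void shutdown() {
        tracker.close();
    }
}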


A big part of the microservices debate now centres on the best way to transition to microservices.

What I find exciting about the microservices discussion is how it's making us think about architectural patterns, team organisation, fault tolerance, and the best way to write code and deliver services. That's got to be a good thing, even if microservices themselves don't end up being everyone's cup of tea.
Holly Cummins is a senior software engineer developing enterprise middleware with IBM WebSphere, and a committer on the Apache Aries project. She is a co-author of Enterprise OSGi in Action and has spoken at
Devoxx, JavaZone, The ServerSide Java Symposium, JAX London, GeeCon,
and the Great Indian Developer Summit, as well as a number of user groups.

The human implications of microservices


Although we tend to talk about the technological implications of microservices, it's important to think about the human implications, too. Not everyone is comfortable coding
to cope with services dropping in and out of existence, so
you may find you end up jettisoning some people along with
the monolith. Build in a period of adjustment and education, and remember to take time to develop new skills as
well as new code. By the time most of our team shifted to
service-oriented development, we had a beta which clearly
demonstrated how well the new model worked. The coolness of where we were going was a good compensation for
occasional 1am head-scratching over unexpected behaviour.

Microservices: From dream to reality in an hour
Hear Holly Cummins speak at the JAX London: Are microservices a
wonder-pattern for rescuing intractably complex applications? Or
are they just a restatement of the software engineering best practices we all should be following anyway? Or something in between?
How do they work? How should they be written? What are the
pitfalls? What are the underpinning technologies?


CD and CI

Smoothing the continuous delivery path

A tale of two teams


Continuous Delivery is gaining recognition as a best practice, but adopting it and iteratively improving it is challenging.

by Lyndsay Prewer
To paraphrase Wikipedia, Continuous Delivery is a software
engineering approach that produces valuable software in short
cycles and enables production releases to be made at any time.
Continuous Delivery is gaining recognition as a best practice,
but adopting it and iteratively improving it is challenging. Given the diversity of teams and architectures that do Continuous
Delivery well, it's clear that there is no single golden path.
This article explores how two very different teams successfully practiced and improved Continuous Delivery. Both teams
were sizeable and mature in their use of agile and lean practices. One team chose microservices, Scala, MongoDB and
Docker on a greenfield project. The other faced the constraints
of a monolithic architecture, legacy code, .NET, MySQL and
Windows.

Patterns for successful practice


From observing both teams, some common patterns were visible that contributed to their successful Continuous Delivery.
Continuous Integration that works: Continuous Integration
(CI) is the foundation that enables Continuous Delivery. To be
a truly solid foundation though, the CI system must maintain
good health, which only happens if the team exercise it and
care for it. Team members need to be integrating their changes
regularly (multiple times per day) and responding promptly
to red builds. The team should also be eliminating warnings
and addressing long running CI steps. These important behaviours ensure that release candidates can be created regularly,
efficiently and quickly. Once this process starts taking hours
instead of minutes, Continuous Delivery becomes a burden
instead of an enabler.
Automated tests: Managing the complexity of software is
extremely challenging. The right mix of automated tests helps
address the risk present when changing a complex system,
by identifying areas of high risk (e.g. lack of test coverage or
broken tests) that need further investigation. When practicing
automated testing, it's important to get the right distribution of unit, integration and end-to-end tests (the well-documented test pyramid).
Both teams I worked with moved towards a tear-drop distribution: a very small number of end-to-end tests, sitting on top of a high number of integration tests, with a moderate number of unit tests at the base. This provided the best balance between behavioural coverage and cost of change, which, in turn, allowed the risk present in a software increment to be more easily identified.
Low-cost deployment (and rollback): Once a release candidate has been produced by the CI system, and the team is happy with its level of risk, one or more deployments will take place, to a variety of environments (normally QA, Staging/Pre-Production, Production). When practicing Continuous Delivery, it's typical for these deployments to happen multiple times per week, if not per day. A key success factor is thus to minimise the time and effort of these deployments. The microservice team were able to reduce this overhead down to minutes, which enabled multiple deployments per day. The monolith team reduced it to hours, in order to achieve weekly deployments.

Regardless of how frequently production deployments happen, the cost and impact of rolling back must be tiny (seconds), to minimise service downtime. This makes rolling back pain-free and not a bad thing to do.
Monitoring and alerting: No matter how much testing (manual or automated) a release candidate has, there is always a risk that something will break when it goes into Production. Both teams were able to monitor the impact of a release in near real-time using tools such as Elasticsearch, Kibana, Papertrail, Splunk and New Relic. Having such tools easily available is great, but they're next to useless unless people look at them and they are coupled to automated alerting (such as PagerDuty). This required a culture of caring about Production, so that the whole team (not just Operations, QA or Development) knew what 'normal' looked like, and noticed when Production's vital signs took a turn for the worse.

Conclusion
This article has highlighted how different teams, with very different architectures, both successfully practiced Continuous Delivery. It's touched on some of the shared patterns that have enabled this. If you'd like to hear more about their Continuous Delivery journey, including the different blockers and accelerators they faced, and the ever-present impact of Conway's Law, then I'll be speaking on this topic at JAX London on 13–14th October 2015.

Lyndsay Prewer is an Agile Delivery Lead, currently consulting for Equal Experts. He focuses on helping people, teams and products become even more
awesome, through the application of agile, lean and systemic practices. A
former rocket scientist and software engineer, over the last two decades he's
helped ten companies in two hemispheres improve their delivery.

Architecture

Network-based architectures: learning from Paris

Fielding, Fowler and Haussmann
For all the great strides that IT is taking to bring us to better futures faster, it turns out that
everything we need to know about the development of the web can be learned from the history of urban Paris.

by Eric Horesnyi
As developers or architects, we often have to communicate fine concepts of network or system architecture to decision-makers. In my case, I have been using the smart city analogy for twenty years. And to celebrate the 25th birthday of the Web, I propose to draw an analogy in depth between a designed city, Paris, and the Web. Going through Fielding's thesis, we will compare Paris to the Web in terms of constraints, inherited features and architectural style choices, and finally assess whether these choices meet the objectives. All this with a focus on a transformational period of Paris: 1853–1869, under Haussmann as Architect, with an approach worth applying to the many large corporate information systems looking to adopt a microservice style, as proposed by Fowler and Newman.

Here are the first two episodes out of seven that we will cover during the session at the upcoming JAX London. Our audience is, by design, either software experts interested in city architecture to illustrate the beauty of HTTP REST and Continuous Delivery, or anybody wanting to understand the success of the Web and get acquainted with key web lingo and concepts.

EPISODE I: Who was Haussmann, and what was his challenge for Paris?
Eugène Haussmann was a civil servant chosen by Napoleon III to accelerate the modernisation of Paris in 1853. He reigned over Paris architecture for sixteen years. When he took office, Paris UX was awful, probably worth less than a star on the Play or Apple store: a high dropout rate (cholera struck every five years, with 10,000 dead each time), servers (houses) were congested (up to 1 person per square metre – US translation: 100 people per townhouse floor), and datacentres would work only intermittently: no power (gas, water), no cooling (sewage). No cache (cellar), no proxy (concierge), no streaming (subway) – a UX nightmare.


Outside was even worse: ridden with cybercrime (you get it) in obscure streets, slow with narrow access lines (streets) and non-existent backbones (boulevards), without a shared protocol for polling traffic (sidewalk for pedestrians) or streaming (subway), and no garbage collection (gotcha). Worse, when a user would go from one page to another, TLP was terrible because of these congested un-protocoled lines, but they would come home with viruses by lack of continuous delivery of patches to the servers (air circulation and sun down these narrow streets hidden by overly high buildings).

To top it off, service was fully interrupted and redesigned regularly (Revolutions in 1789, 1815, three Glorieuses in 1830, 1848), without backward compatibility. Although users would benefit from these changes in the long run, they definitely did not appreciate the long period of adaptation to the new UI, not to mention calls and escalations to a non-existent service desk (votes for poor and women). Well, actually, these small access lines made it easy to build DDOS attacks (barricades), a feature the business people did not like (Napoleon III).

Image captions: Eugène Haussmann; 19,000 deaths of cholera in Paris, 1832; congested servers – crowded apartments in Paris; a pre-Haussmannian street in Paris – unsafe, ridden with viruses, no cooling; Liberty Leading the People, July 28th, 1830, by Delacroix.

EPISODE II: What properties did Haussmann select for his system?

Image captions: first page of the Code Civil, 1804; the Paris numbering scheme, example on a Haussmannian building.

Some elements of style that Haussmann ultimately selected were actually imposed upon him by his boss, Napoleon III, nephew of Napoleon. My intention here is definitely not to write an apology for Napoleon, who spread war for years across Europe, just to point to one feature he introduced in the process: the Code Civil. The Code Civil details the way people can interact together, things they can do and can't. It came out of heated debates during the French Revolution. Its publication had an impact similar to the CERN initial paper on HTTP in 1990. The Code Civil rules citizens' lives the same way HTTP is the protocol governing our exchanges on the Web.

A fundamental principle in the Code Civil is the separation of concerns: it defines what a citizen is (Client), and separates its interests and interactions from companies, associations and the state (servers – .com, .org or .gov). Any mixing of interests is unlawful (misuse of social assets), pretty much like HTTP is by design Client-Server and Stateless. This also means that a server cannot treat two clients differently (equality), or two servers unequally (this links to the current debate on Net Neutrality):

Client-Server: the most fundamental concept allowing for a


network-based architecture: the system is considered as a set
of services provided by servers (shops, public infrastructure
services, associations) to clients (citizens). Once roles and
possible interactions are defined, each can evolve independently: a citizen can grow from student to shop-owner, a
grocery store to a juice provider. People living in a building
are not forced to buy their meat from the store in the same
building, and a store-owner may practice whatever price he
wants. This principle of separation of concerns allows for
citizens' freedom to choose, and for entrepreneurial spirit
(ability to create, invest, adapt) to service citizens (with food,
entertainment, social bonds through associations or culture).
Stateless: no session state, request from client to server
must contain all necessary information for server to
process. Whoever the citizen, the server must serve him
without knowing more about him than other citizens. No
mixing of genres is allowed between client and server: all
citizens are treated equal, and must be serviced the same
way for the same request. Citizens cannot be bound to a
single service by services creating dependencies or local
monopoly over their services. This separation of concerns
from one building to another, and one person to a company is a foundation of citizen life even today (usually).
Another feature Haussmann had to take into consideration
was the addressing scheme of Paris, defined in 1805, similar
to our DNS scheme used for HTTP requests:

• The main backbone, la Seine, defines the start of every street (our root)
• Each association (.org), state organization (.gov) and company (.com) can define its scheme within its own domain

Wanting to build on Code Civil/HTTP, Napoleon III's ego did not tolerate his capital city to be less civilized than London or emerging New York. In terms of performance, his network-based architecture needed to do the job on:

• Network (streets) performance: throughput (fast pedestrians and carriages), small overhead (no need for a citizen to walk the street with a policeman to protect himself), bandwidth (wide street)
• User-perceived performance: latency (ability to quickly reach the ground floor of buildings), and completion (get business done)
• Network efficiency: the best way to be efficient is to avoid using the street too much. Examples are homeworking (differential data) and news or music kiosks – avoiding music only in the Opera or getting news from Le Monde headquarters (cache)

For Fielding, he would also select his architectural style against the following metrics:

1. Scalable: make it possible for Paris to grow
2. Simple: citizens, civil servants and visitors would need to understand the way the city worked without a user manual
3. Modifiable: ability to evolve in the future through change
4. Extensible: add a new neighbourhood without impacting the system
5. Customizable: specialize a building without impacting others
6. Configurable: easily modify a building (component) after construction (post-deployment)
7. Reusable: a building hosting an accounting firm one day can serve as a creamery the next
8. Visible: to provide the best security and auditability of the system, interactions between components needed to be visible (people should see each other in the street)
9. Portable: the style should work well in other regions, with other materials and weather conditions
10. Reliable: how susceptible the system is to failure (no single event could stop water, gas or circulation for citizens)

Looking into the challenges he wanted to address in Paris through his architectural style, Haussmann weighted each of these properties for his evaluation criteria. The main objectives appeared to be:

• Low entry-barrier: citizens are not forced to live in Paris, and Haussmann wanted to provide them the best possible UX to increase adoption. A citizen needed to be able to simply find an address, and a builder to publish a new reference, allowing for the creation of an initial system very quickly.
• Extensibility: a low entry-barrier would help create the modern community Haussmann wanted, and in the long term, Paris needed to be ready for changes in its style to adapt to new technologies.
• Distributed hypermedia: Paris needed to provide citizens with life experience ranging from music (Opera and kiosk), films (actual theatres), ecommerce (food from Les Halles) and health (parks). All these experiences were rich in content and would attract many citizens, so much so that they needed to be distributed across the city.
• Anarchic scalability: once the first set of new neighbourhoods would be in place, the city could grow in one direction or another, at a very large scale, without the need for a centralized control (anarchy) to ensure integrity of the entire system. This required each building to ensure its own authentication, and be able to inspect incoming traffic through firewalls (double door, concierge).
• Independent deployment: each server (building) or application (neighbourhood) could be deployed independently from the others, without compromising the system. Legacy systems (older neighbourhoods that could/should not be changed, e.g. Notre Dame de Paris) needed to be easily encapsulated to interact and be part of the entire system.

Image captions: Napoleon III describing his mission to Haussmann, 1853; low entry-barrier for citizens – Montmartre, an example of a popular neighbourhood in Paris; Paris supporting hypermedia, e.g. Pigalle; Paris architecture extended to new technologies, e.g. the Metro streaming network, distributed in various neighbourhoods.

Fowler, Fielding, and Haussmann – Network-based Architectures

Hear Eric Horesnyi speak at the JAX London (Oct. 12–14, 2015). Why is Paris so beautiful, Netflix so scalable and REST now a standard? This is about analyzing the constraints leading to architecture styles in network-based software as well as buildings. Haussmann invented a scalable model for the city, Fielding established the principles of an internet-scale software architecture (REST), and Fowler described in detail how microservices can get an application to massively scale.


Eric Horesnyi was a founding team member at Internet Way (French B2B
ISP, sold to UUNET) then Radianz (Global Finance Cloud, sold to BT). He
is a High Frequency Trading infrastructure expert, passionate about Fintech and Cleantech. Eric looks after 3 bozons and has worked in San
Francisco, NYC, Mexico and now Paris.


Java

Java performance tutorial: How fast are the Java 8 streams?

Let's talk speed


Java 8 brought with it a major change to the collection framework in the form of
streams. But how well do they really perform?

by Angelika Langer
Java 8 came with a major addition to the JDK collection framework, namely the Stream API. Similar to collections, streams represent sequences of elements. Collections support operations such as add(), remove(), and contains() that work on a single element. Streams, in contrast, have bulk operations such as forEach(), filter(), map(), and reduce() that access all elements in a sequence. The notion of a Java stream is inspired by functional programming languages, where the corresponding abstraction is typically called a sequence, which also has filter-map-reduce operations. Due to this similarity, Java 8 – at least to some extent – permits a functional programming style in addition to the object-oriented paradigm that it supported all along.
Perhaps contrary to widespread belief, the designers of the Java programming language did not extend Java and its JDK to allow functional programming in Java or to turn Java into a hybrid object-oriented and functional programming language. The actual motivation for inventing streams for Java was performance – or, more precisely, making parallelism more accessible to software developers (see Brian Goetz, 'State of the Lambda'). This goal makes a lot of sense to me, considering the way in which hardware evolves. Our hardware has dozens of CPU cores today and will probably have hundreds some time in the future. In order to effectively utilize the hardware capabilities and thereby achieve state-of-the-art execution performance, we must parallelize. After all, what is the point of running a single thread on a multicore platform? At the same time, multithreaded programming is considered hard and error-prone, and rightly so. Streams, which come in two flavours (as sequential and parallel streams), are designed to hide the complexity of running multiple threads. Parallel streams make it extremely easy to execute bulk operations in parallel – magically, effortlessly, and in a way that is accessible to every Java developer.


Our hardware has dozens of CPU cores today and will probably have hundreds some time in the future.
So, let's talk about performance. How fast are the Java 8 streams? A common expectation is that parallel execution of stream operations is faster than sequential execution with only a single thread. Is it true? Do streams improve performance?

In order to answer questions regarding performance we must measure, that is, run a micro-benchmark. Benchmarking is hard and error-prone, too. You need to perform a proper warm-up, and watch out for all kinds of distorting effects, from optimizations applied by the virtual machine's JIT compiler (dead code elimination being a notorious one) up to hardware optimizations (such as increasing one core's CPU frequency if the other cores are idle). In general, benchmark results must be taken with a grain of salt. Every benchmark is an experiment. Its results are context-dependent. Never trust benchmark figures that you haven't produced yourself in your context on your hardware. This said, let us experiment.
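The article does not say which harness was used for these measurements. As one hedged example, a harness such as JMH takes care of warm-up, forking and dead-code pitfalls. A minimal sketch of the two int-array measurements could look like this (class name, iteration counts and the fixed random seed are illustrative, not the author's actual setup):

import java.util.Arrays;
import java.util.Random;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Warmup(iterations = 5)
@Measurement(iterations = 5)
@Fork(1)
public class MaxBenchmark {

    int[] ints;

    @Setup
    public void setup() {
        // 500,000 random values, as in the experiment described below
        ints = new Random(42).ints(500_000).toArray();
    }

    @Benchmark
    public int forLoop() {
        int m = Integer.MIN_VALUE;
        for (int i = 0; i < ints.length; i++)
            if (ints[i] > m) m = ints[i];
        return m;   // returning the result keeps dead-code elimination at bay
    }

    @Benchmark
    public int sequentialStream() {
        return Arrays.stream(ints).reduce(Integer.MIN_VALUE, Math::max);
    }
}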

Comparing streams to loops


First, we want to find out how a stream's bulk operation
compares to a regular, traditional for-loop. Is it worth using
streams in the first place (for performance reasons)?
The sequence which we will use for the benchmark is an
int-array filled with 500,000 random integral values. In this
array we will search for the maximum value. Here is the traditional solution with a for-loop:
int[] a = ints;
int e = ints.length;
int m = Integer.MIN_VALUE;
for (int i = 0; i < e; i++)
    if (a[i] > m) m = a[i];

Here is the solution with a sequential IntStream:


int m = Arrays.stream(ints)
.reduce(Integer.MIN_VALUE, Math::max);

Workshop: Lambdas and Streams in Java 8
This JAX London workshop led by Angelika Langer is devoted to
the stream framework, which is an extension to the JDK collection
framework. Streams offer an easy way to parallelize bulk operations on sequences of elements. The Stream API differs from the
classic collection API in many ways: it supports a fluent programming style and borrows elements from functional languages.


We measured on outdated hardware (dual core, no dynamic overclocking) with proper warm-up and all it takes to produce halfway reliable benchmark figures. This was the result in that particular context:
int-array, for-loop : 0.36 ms
int-array, seq. stream: 5.35 ms

The result is sobering: the good old for-loop is 15 times faster than the sequential stream. How disappointing! Years of development effort spent on building streams for Java 8 – and then this? But, wait! Before we conclude that streams are abysmally slow, let us see what happens if we replace the int-array by an ArrayList<Integer>. Here is the for-loop:
int m = Integer.MIN_VALUE;
for (int i : myList)
    if (i > m) m = i;

Here is the stream-based solution:


int m = myList.stream()
.reduce(Integer.MIN_VALUE, Math::max);

These are the results:


ArrayList, for-loop : 6.55 ms
ArrayList, seq. stream: 8.33 ms

Again, the for-loop is faster than the sequential stream operation, but the difference on an ArrayList is not nearly as significant as it was on an array. Let's think about it. Why do the results differ that much? There are several aspects to consider.
First, access to array elements is very fast. It is an index-based memory access with no overhead whatsoever. In other words, it is plain down-to-the-metal memory access. Elements in a collection such as ArrayList, on the other hand, are accessed via an iterator, and the iterator inevitably adds overhead. Plus, there is the overhead of boxing and unboxing collection elements, whereas int-arrays use plain primitive type ints. Essentially, the measurements for the ArrayList are dominated by the iteration and boxing overhead, whereas the figures for the int-array illustrate the advantage of for-loops.
Secondly, had we seriously expected that streams would
be faster than plain for-loops? Compilers have 40+ years of
experience optimizing loops, and the virtual machine's JIT
compiler is especially apt to optimize for-loops over arrays
with an equal stride like the one in our benchmark. Streams
on the other hand are a very recent addition to Java and the
JIT compiler does not (yet) perform any particularly sophisticated optimizations to them.
The point to take home is that sequential streams are no faster than loops.

Thirdly, we must keep in mind that we are not doing much with the sequence elements once we got hold of them. We spend a lot of effort trying to get access to an element and then we don't do much with it. We just compare two integers, which after JIT compilation is barely more than one assembly instruction. For this reason, our benchmarks illustrate the cost of element access, which need not necessarily be a typical situation. The performance figures change substantially if the functionality applied to each element in the sequence is CPU intensive. You will find that there is no measurable difference any more between for-loop and sequential stream if the functionality is heavily CPU bound.
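To make that point concrete, here is an illustrative (and deliberately unbenchmarked) comparison in which a CPU-heavy function dominates the per-element cost; the expensive() method is invented for the example:

import java.util.Arrays;
import java.util.Random;

public class CpuBoundComparison {

    static final int[] INTS = new Random(42).ints(500_000).toArray();

    // A deliberately CPU-heavy function applied to every element.
    static double expensive(int x) {
        double d = x;
        for (int i = 0; i < 100; i++) {
            d = Math.sin(d) + Math.sqrt(Math.abs(d));
        }
        return d;
    }

    static double loopSum() {
        double sum = 0;
        for (int i = 0; i < INTS.length; i++) {
            sum += expensive(INTS[i]);
        }
        return sum;
    }

    static double streamSum() {
        return Arrays.stream(INTS).mapToDouble(CpuBoundComparison::expensive).sum();
    }

    public static void main(String[] args) {
        // Both variants now spend almost all of their time inside expensive(),
        // so the loop-versus-stream overhead disappears into the noise.
        System.out.println(loopSum() + " " + streamSum());
    }
}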
The ultimate conclusion to draw from this benchmark experiment is NOT that streams are always slower than loops. Yes, streams are sometimes slower than loops, but they can also be equally fast; it depends on the circumstances. The point to take home is that sequential streams are no faster than loops. If you use sequential streams then you don't do it for performance reasons; you do it because you like the functional programming style.

So, where is the performance improvement streams were invented for? So far we have only compared loops to streams. How about parallelization? The point of streams is easy parallelization for better performance.

Comparing sequential streams to parallel streams


As a second experiment, we want to figure out how a sequential stream compares to a parallel stream performance-wise.
Are parallel stream operations faster than sequential ones?
We use the same int-array filled with 500,000 integral values. Here is the sequential stream operation:
int m = Arrays.stream(ints)
.reduce(Integer.MIN_VALUE, Math::max);

This is the parallel stream operation:


int m = Arrays.stream(ints).parallel()
.reduce(Integer.MIN_VALUE, Math::max);

Our expectation is that parallel execution should be faster


than sequential execution. Since the measurements were
made on a dual-core platform parallel execution can be at
most twice as fast as sequential execution. Ideally, the ratio
sequential/parallel performance should be 2.0. Naturally,
parallel execution does introduce some overhead for splitting the problem, creating subtasks, running them in multiple threads, gathering their partial results, and producing the
overall result. The ratio will be less than 2.0, but it should
come close. These are the actual benchmark results:
             sequential    parallel    seq./par.
int-array    5.35 ms       3.35 ms     1.60


The reality check via our benchmark yields a ratio (sequential/parallel) of only 1.6 instead of 2.0, which illustrates the
amount of overhead that is involved in going parallel and
how (well or poorly) it is overcompensated (on this particular
platform).
You might be tempted to generalise these figures and conclude that parallel streams are always faster than sequential streams – perhaps not twice as fast (on dual-core hardware), as one might hope for, but at least faster. However, this is not
true. Again, there are numerous aspects that contribute to the
performance of a parallel stream operation.
One of them is the splittability of the stream source. An
array splits nicely; it just takes an index calculation to figure
out the mid element and split the array into halves. There is
no overhead and thus barely any cost of splitting. How easily
do collections split compared to an array? What does it take
to split a binary tree or a linked list? In certain situations you
will observe vastly different performance results for different
types of collections.
Another aspect is statefulness. Some stream operations
maintain state. An example is the distinct() operation. It is
an intermediate operation that eliminates duplicates from
the input sequence, i.e., it returns an output sequence with
distinct elements. In order to decide whether the next element is a duplicate or not the operation must compare to
all elements it has already encountered. For this purpose it
maintains some sort of data structure as its state. If you call
distinct() on a parallel stream its state will be accessed concurrently by multiple worker threads, which requires some
form of coordination or synchronisation, which adds overhead, which slows down parallel execution, up to the extent
that parallel execution may be significantly slower than sequential execution.
With this in mind it is fair to say that the performance
model of streams is not a trivial one. Expecting that parallel
stream operations are always faster than sequential stream
operations is naive. The performance gain, if any, depends on
numerous factors, some of which I briefly mentioned above.
If you are familiar with the inner workings of streams you
will be capable of coming up with an informed guess regarding the performance of a parallel stream operation. Yet, you
need to benchmark a lot in order to find out for a given context whether going parallel is worth doing or not. There are
indeed situations in which parallel execution is slower than
sequential execution and blindly using parallel streams in all
cases can be downright counter-productive.
The realisation is: Yes, parallel stream operations are easy to use and often they run faster than sequential operations, but don't expect miracles. Also, don't guess; instead, benchmark a lot.

Angelika Langer works as a trainer and consultant with a course curriculum of Java and C++ seminars. She enjoys speaking at conferences,
among them JavaOne, JAX, JFokus, JavaZone and many more. She is the author of the online Java Generics FAQs and a Lambda Tutorial & Reference at www.AngelikaLanger.com.


Java

Coding for desktop and mobile with HTML5 and Java EE 7

A world beyond Java


Full-blown applications programmed for your browser – that's where it's at right now, says JAX London speaker Geertjan Wielenga. And this should be of some concern to Java developers out there.

by Geertjan Wielenga
We can no longer make assumptions about where and how the applications we develop will be used. Where originally HTML, CSS and JavaScript were primarily focused on presenting documents in a nice and friendly way, the utility of the browser has exploded beyond what could ever have been imagined. And, no, it's not all about multimedia – i.e., no, it's not all about video and audio and the like. It's all about full-blown applications that can now be programmed for the browser. Why the browser? Because the browser is everywhere: on your mobile device, on your tablet, on your laptop, and on your desktop computer.

Seen from the perspective of the Java ecosystem, this development is a bit of a blow. All along, we thought the JVM would be victorious, i.e., we thought the 'write once, run anywhere' mantra would be exclusively something that we as Java developers could claim to be our terrain. To various extents, of course, that's still true, especially if you see Android as Java for mobile. Then you could make the argument that on all devices, some semblance of Java is present. The arguments you'd need to make would be slightly complicated by the fact that most of your users don't actually have Java installed – i.e., they physically need to do so, or your application needs to somehow physically install Java on your users' device. Whether you're a Java enthusiast or not, you need to admit that the reach of the browser is far broader and more intuitively present than Java, at this point.
So, how do we deal with this reality? How can you make sure that your next application supports all these different devices, which each have their own specificities and eccentricities? On the simplest level, each device has its own screen size. On a more complex level, not every device needs to enable interaction with your application in the same way. Some of those devices have more problems with battery life than others. Responsive design via CSS may not be enough, simply because CSS hides DOM elements. It does not prevent the loading of resources, meaning that the heavy map technology that you intend for the tablet user is going to be downloaded all the same for the mobile user, even though it will not be shown, thanks to CSS.

Did you know?


Did you know there's something called responsive JavaScript, which is much more powerful than responsive CSS? Did you know that there are a number of techniques you can use when creating enterprise-scale JavaScript applications, including modularity via RequireJS? Did you know that AngularJS is not the only answer when it comes to JavaScript application frameworks?
And finally, are you aware of the meaningful roles that
Java, especially Java EE, can continue to play in the brave
new old world of JavaScript? These questions and concerns
will be addressed during my session at JAX London, via a
range of small code snippets and examples, i.e., you will certainly see as much code and technical tips and tricks, as you
will see slides. Undoubtedly, you will leave the session with
a lot of new insights and questions to consider when starting
your next enterprise-scale applications, whether in Java or in
JavaScript!

Geertjan Wielenga is Developer and author at Sun Microsystems and


Oracle, working on NetBeans IDE and the NetBeans Platform, speaker at
JavaOne, Devoxx, JAX London and other international software development conferences, Java and JavaScript enthusiast, JavaOne Rock Star.


Java

JShell, the Java 9 REPL: What does it do?

JEP 222
Among the few truly new features coming in Java 9 (alongside Project Jigsaw's modularity) is a Java Shell that has recently been confirmed. Java Executive Committee member Werner Keil explains how Java's new REPL got started and what it's good for.

by Werner Keil
As proposed in OpenJDK JEP 222 [1], the JShell offers a REPL (Read-Eval-Print Loop) to evaluate declarations, statements and expressions of the Java language, together with an API allowing other applications to leverage its functionality. The idea is not exactly new. BeanShell [2] has existed for over 15 years now, nearly as long as Java itself, not to mention JVM languages like Scala and Groovy, which also feature similar shells already.

BeanShell (just like Groovy, by the way) made an attempt at standardisation through the Java Community Process [3] in JSR 274 – a JSR that did not produce any notable output, in spite of the fact that (or perhaps because?) two major companies, Sun and Google, had joined the expert group. Under the JCP.next initiative this JSR was declared 'Dormant'.

An eyebrow-raising approach

Adding a new Java feature like this via JEP, rather than waking up the Dormant JSR (which anyone could, including Oracle, who now owns former EG member Sun), raised some eyebrows among JCP EC members. One concern was that, after the JCP had just merged its ME and SE/EE parts into a single body, developing more and more platform features not as JSRs but as JEPs under the OpenJDK would create another rift between ME/SE (JDK) and EE, where most remaining JSRs then resided.

Device I/O [4], derived from an Oracle proprietary predecessor under Java ME, was already developed as an OpenJDK project. Without a JEP, it seems Oracle at least can also ratify such projects without prior proposal. The farce around JSR 310 – which neither produced an actual Spec document, mandatory for pretty much all JSRs, nor (according to Co-Spec Lead Stephen Colebourne) comes with a real API similar to other SE platform JSRs like Collections – was another example of where the JSR should have been withdrawn or declared dormant when the JEP started. It was just meant to rubber-stamp some JDK part by the EC, without the actual result of the JSR outside of the OpenJDK. Every class has some Javadoc, so that doesn't really count. Given Oracle's strong involvement we are likely to see more JEPs under the OpenJDK. And having a transparent open-source effort behind these parts of the Java ecosystem is still better than a closed environment, so even if it may disenfranchise and weaken some of the JCP (and EC), it is better than no open development at all.

Figure 1: JShell arithmetic

Potential uses of the JShell


Having such a shell in Java is certainly not a bad idea. Regardless of its development under Java SE, future versions of
Java EE may find a standard shell even more appealing than
Java SE. The value for Java ME remains to be seen, especially if down-scaling like Device I/O is even possible. But at
the very least, IoT devices running Java SE Embedded should
clearly benefit.
Windows PowerShell [5] has become a strong preference for system administration or DevOps, at least on Windows and .NET. On the JVM, a similar shell is used by the Scala-based Play Framework for administrative tasks, while Groovy is used for similar purposes by the Spring Framework, or under the hood of the JBoss Admin Shell [6]. Meanwhile, WebLogic Scripting Tool (WLST)
emerged from Jython, a Python shell on the JVM. Java EE
Reference Implementation GlassFish has an admin shell
called asadmin. Being able to tap into a unified Java shell
in future versions could certainly make life easier for many
Java-based projects, as well as products, developers and ops
using them.
Other interesting fields of use are domain-specific extensions. Groovy, Scala or other shell-enabled languages (both
on the JVM and outside of it) are very popular for business
or scientific calculations. Based on early impressions with JShell [7], messages like 'assigned to temporary variable $3 of type int' can be quite misleading (Figure 1).
In particular, the financial domain tends to think of US dollars when they read '$', so that still has room for improvement. But almost natural-language queries – such as Google answering questions like 'what is 2 plus 2', or a pretty NoSQL DB of its time like Q&A [8] offering such features ten years before the Java language even started – have great potential. Instead of simply asking '2+2' questions, users may ask what the temperature in their living room is, when backed by a Smart Home solution. Or, using JSRs like 354, the recently finished Money API [9], questions like '2$ in CHF' or similar would make great sense too. That's where temporary variables quoting $ amounts would be a bit confusing, but maybe the JDK team behind JShell finds other ways to phrase that.
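As a rough sketch – not taken from the article – of what such a '2$ in CHF' question looks like with JSR 354 today, using the Moneta reference implementation and leaving out exchange-rate provider configuration and error handling:

import javax.money.MonetaryAmount;
import javax.money.convert.CurrencyConversion;
import javax.money.convert.MonetaryConversions;

import org.javamoney.moneta.Money;

public class MoneyQuestion {
    public static void main(String[] args) {
        // "2$ in CHF" expressed with the JSR 354 API; a future JShell session
        // could evaluate expressions like these line by line.
        MonetaryAmount twoDollars = Money.of(2, "USD");
        CurrencyConversion toFrancs = MonetaryConversions.getConversion("CHF");
        System.out.println(twoDollars.with(toFrancs));
    }
}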
Another great example of a Java-powered REPL and expression language for scientific and other arithmetic challenges is Frink [10], named after the weird scientist character in The Simpsons TV series. It answers all sorts of questions, starting from date/time or time zone conversions (which java.time aka JSR 310 could certainly be used for, too) or currency conversions like:

"600 baht -> USD"

Frink provides many more mathematical and physical formulas, including unit conversion. Based on JSR 363, the upcoming Java Units of Measurement standard [11], this will be possible in a similar way. Groovy co-founder Guillaume Laforge documented a DSL/REPL for Units of Measurement in Groovy using JSR 275 a while back [12]. Their solution was used in real-life medical research for Malaria treatments. Of course, being written in Java, someone might also simply expose the actual Frink language and system via JShell under Java 9!
Werner Keil is an Agile Coach, Java EE and IoT/Embedded/Real Time
expert. Helping Global 500 enterprises across industries and leading IT
vendors, he has worked for over 25 years as Program Manager, Coach,
SW architect and consultant for Finance, Mobile, Media, Transport and
Public sectors. Werner is an Eclipse and Apache Committer and JCP member in JSRs like 333 (JCR), 342 (Java EE 7), 354 (Money), 358/364 (JCP.next), Java
ME 8, 362 (Portlet 3), 363 (Units, also Spec Lead), 365 (CDI 2), 375 (Java EE Security) and in the Executive Committee.

References
[1] http://openjdk.java.net/jeps/222
[2] http://www.beanshell.org/
[3] http://jcp.org
[4] http://openjdk.java.net/projects/dio/
[5] https://en.wikipedia.org/wiki/Windows_PowerShell
[6] http://teiid.jboss.org/tools/adminshell/
[7] http://blog.takipi.com/java-9-early-access-a-hands-on-session-with-jshell-thejava-repl/
[8] https://en.wikipedia.org/wiki/Q%26A_(Symantec)
[9] http://www.javamoney.org
[10] https://futureboy.us/frinkdocs/
[11] http://unitsofmeasurement.github.io/
[12] https://dzone.com/articles/domain-specific-language-unit-


Databases

Making the right database decisions

MySQL is a great NoSQL


Nowhere else are business decisions as hype-oriented as in IT. And while NoSQL is all well
and good, MySQL is often the sensible choice in terms of operational cost and scalability.

by Aviran Mordo
NoSQL is a set of database technologies built to handle massive
amounts of data or specific data structures foreign to relational
databases. However, the choice to use a NoSQL database is
often based on hype, or a wrong assumption that relational
databases cannot perform as well as a NoSQL database. Operational cost is often overlooked by engineers when it comes
to selecting a database.
When building a scalable system, we found that an important factor is using proven technology so that we know how
to recover fast if there's a failure. Pre-existing knowledge and
experience with the system and its workings as well as being
able to Google for answers is critical for swift mitigation.
Relational databases have been around for over 40 years, and
there is a vast industry knowledge of how to use and maintain
them. This is one reason we usually default to using a MySQL
database instead of a NoSQL database, unless NoSQL is a
significantly better solution to the problem.
However, using MySQL in a large-scale system may have performance challenges. To get great performance from MySQL, we employ a few usage patterns. One of these is avoiding database-level transactions. Transactions require that the database maintains locks, which has an adverse effect on performance. Instead, we use logical application-level transactions, thus reducing the load and extracting high performance from the database. For example, let's think about an invoicing schema. If there's an invoice with multiple line items, instead of writing all the line items in a single transaction, we simply write line by line without any transaction. Once all the lines are written to the database, we write a header record, which has pointers to the line items' IDs. This way, if something fails while writing the individual lines to the database, and the header record was not written, then the whole transaction fails. A possible downside is that there may be orphan rows in the database. We don't see it as a significant issue though, as storage is cheap and these rows can be purged later if more space is needed.
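As a minimal sketch of that write pattern – plain JDBC, with invented table and column names (invoice_line, invoice_header) rather than Wix's actual schema – it might look like this:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.List;
import java.util.UUID;

public class InvoiceWriter {

    public void writeInvoice(Connection db, List<String> lineItemsJson) throws Exception {
        String[] lineIds = new String[lineItemsJson.size()];

        // 1. Write each line item on its own, with no database transaction.
        try (PreparedStatement insertLine =
                     db.prepareStatement("INSERT INTO invoice_line (id, data) VALUES (?, ?)")) {
            for (int i = 0; i < lineItemsJson.size(); i++) {
                lineIds[i] = UUID.randomUUID().toString();   // client-generated key
                insertLine.setString(1, lineIds[i]);
                insertLine.setString(2, lineItemsJson.get(i));
                insertLine.executeUpdate();
            }
        }

        // 2. Only once every line is safely stored do we write the header that
        //    points at the line IDs. If we crash before this insert, the lines
        //    are just orphan rows that can be purged later -- the logical
        //    transaction simply never happened.
        try (PreparedStatement insertHeader =
                     db.prepareStatement("INSERT INTO invoice_header (id, line_ids) VALUES (?, ?)")) {
            insertHeader.setString(1, UUID.randomUUID().toString());
            insertHeader.setString(2, String.join(",", lineIds));
            insertHeader.executeUpdate();
        }
    }
}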

High-performance MySQL usage patterns


Here are some of our other usage patterns to get great performance from MySQL:
• Do not have queries with joins; only query by primary key or index.
• Do not use sequential primary keys (auto-increment), because they introduce locks. Instead, use client-generated keys, such as GUIDs. Also, when you have master-master replication, auto-increment causes conflicts, so you will have to create key ranges for each instance.
• Any field that is not indexed has no right to exist. Instead, we fold such fields into a single text field (JSON is a good choice).

We often use MySQL simply as a key-value store. We store a JSON object in one of the columns, which allows us to extend the schema without making database schema changes. Accessing MySQL by primary key is extremely fast, and we get submillisecond read time by primary key, which is excellent for most use cases. So we found that MySQL is a great NoSQL that's ACID-compliant.
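A bare-bones illustration of that key-value pattern – again plain JDBC, with hypothetical table and column names and a client-generated GUID as the primary key:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.UUID;

public class KeyValueStore {

    // Assumed schema: CREATE TABLE entity (id CHAR(36) PRIMARY KEY, payload TEXT NOT NULL);

    public String put(Connection db, String json) throws SQLException {
        String id = UUID.randomUUID().toString();   // no auto-increment, no lock contention
        try (PreparedStatement ps =
                     db.prepareStatement("INSERT INTO entity (id, payload) VALUES (?, ?)")) {
            ps.setString(1, id);
            ps.setString(2, json);
            ps.executeUpdate();
        }
        return id;
    }

    public String get(Connection db, String id) throws SQLException {
        // Primary-key lookup only: no joins, no secondary-index scans.
        try (PreparedStatement ps =
                     db.prepareStatement("SELECT payload FROM entity WHERE id = ?")) {
            ps.setString(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        }
    }
}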
In terms of database size, we found that a single MySQL
instance can work perfectly well with hundreds of millions of
records. Most of our use cases do not have more than several
hundred million records in a single instance. One big advantage to using relational databases as opposed to NoSQL is that
you don't need to deal with the eventually consistent nature displayed by most NoSQL databases. Our developers all know relational databases very well, and it makes their lives easy. Don't get me wrong, there is a place for NoSQL; relational databases have their limits – single host size and strict data structures. Operational cost is often overlooked by engineers in favour of the cool new thing. If the two options are viable, we believe we need to really consider what it takes to maintain it in production and decide accordingly.
Aviran Mordo is the head of back-end engineering at Wix. He has over
twenty years of experience in the software industry and has filled many
engineering roles and leading positions, from designing and building the
US national Electronic Records Archives prototype to building search engine infrastructures.

From 0 to 60 Million Users: Scaling with Microservices and Multi-Cloud Architecture
Hear Aviran Mordo speak at the JAX London: Many small startups
build their systems on top of a traditional toolset. These systems
are used because they facilitate easy development and fast progress, but many of them are monolithic and have limited scalability. As a startup grows, the team is confronted with the problem of
how to evolve and scale the system.


Data

Rethinking how we think about self-service cloud BI

Business intelligence
must evolve
Every employee and every end user should have the right to find answers using data
analytics. But the current reliance on IT for key information is creating an unnecessary
bottleneck, says DataHero's Chris Neumann.
by Chris Neumann
Self-service is a term that gets used a lot in the business intelligence (BI) space these days. In reality, data analytics has largely ignored the group of users that really need self-service, even as that user base has grown. More than ever, people realize the value of data, but non-technical users are still left out of the conversation. While everything from storage to collaboration tools has become simple enough for anyone to download and begin using, BI and data analytics tools still require end users to be experts or seek the help of experts. That needs to change.

Users should be able to get up and running on data analytics and connect to the services they use most, easily. More employees in every department are expected to make decisions based on their data, but that doesn't mean everyone needs to be a data analyst or data scientist. Business users want to analyse data that lives in the services they use every day – like Google Analytics, HubSpot, Marketo, and Shopify, and even Excel – and they know the questions they need answered. What they need are truly self-service tools to get those answers.

Calls for change


While vendor jargon and the obsession with big data may be clouding the self-service cloud BI conversation, experts and enterprises are recognizing that things need to change. Leading analyst firms like Forrester and Gartner agree that BI must evolve. When business users depend on IT teams to get answers, a bottleneck is created. End users are demanding tools they can use on their own, without having to go to IT.
There are a number of vendors connecting to cloud services. But connecting in a way that facilitates effective data analysis presents a myriad of additional challenges, from navigating the sheer variety of formats to categorizing unspecified data. At DataHero, we've built the requisite connectors for accessing the data within cloud services. We've also taken the next step with a data classification engine that automates ETL and recognizes that what a cloud service might call "text" is actually an important field. In order to successfully integrate these connections, solutions must automatically normalize the data from disparate services, matching attributes and allowing the data to be combined and analysed. Without automatic normalization and categorization, self-service cloud BI isn't possible.

The whole is greater than the sum of its parts


While self-service cloud BI is already possible, the users are often new to the world of data analytics. That means that the tools, too, must evolve as the users become more sophisticated and new possibilities emerge.
For example, without data analytics, a marketer might log into a Google Analytics dashboard, then MailChimp, then Salesforce to take the pulse of a marketing campaign. Each service provides its own value, but when they are combined the marketer can use a common attribute, like email address, to create a third dataset. What comes out of that is a much purer answer to the marketer's question: how successful is my campaign?
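Mechanically, that third dataset is just a join on the shared key. A toy sketch in Java, with made-up field names – "visits" standing in for an analytics export and "revenue" for a CRM export:

import java.util.HashMap;
import java.util.Map;

// Toy sketch: combine two exported datasets on a shared email address.
public class CampaignJoin {

  public static Map<String, double[]> joinByEmail(Map<String, Integer> visitsByEmail,
                                                  Map<String, Double> revenueByEmail) {
    Map<String, double[]> combined = new HashMap<>();
    visitsByEmail.forEach((email, visits) -> {
      Double revenue = revenueByEmail.get(email);
      if (revenue != null) {                       // keep only contacts present in both datasets
        combined.put(email, new double[] { visits, revenue });
      }
    });
    return combined;                               // the "third dataset": visits and revenue per contact
  }
}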
Google Analytics, MailChimp and Salesforce are a common combination, but there are many combinations that may be just as valuable and have yet to be explored. With the proliferation of cloud applications, the possibilities are nearly endless.
The new users of BI and data analytics have also never had the opportunity to work with one another. To continue with the example, a marketer may have created the charts needed to monitor KPIs and put them into a dashboard, but these KPIs need to be shared with internal teams, clients and executives. Reporting is normally a one-way process when it should be iterative and collaborative, allowing clients and executives to provide real feedback on the most up-to-date numbers.

The consumerization of BI
BI and data analytics have largely missed the consumerization-of-IT trend, despite industry-wide use of the term "self-service". That doesn't mean that change isn't coming. The shift to the cloud is continuing to accelerate, and the emerging self-service cloud BI space is quickly heating up, driven by user demand and a need to decouple analytics from IT.
Chris Neumann is the founder and Chief Product Officer of DataHero,
where he aims to help everyone unmask the clues in their data. Previously he was the first employee at Aster Data Systems and describes
himself as a data-analytics junkie, a bona fide techie and a self-proclaimed
foodie.


In London since 2010

The Conference for Java & Software Innovation

Group discount: save 30%

October 12th – 14th, 2015
Business Design Centre, London


The Enterprise Conference on Java,
Web & Mobile, Developer Practices,
Agility and Big Data!
Follow us: @JAXLondon | www.jaxlondon.com


Join us for JAX London 2015

JAX London provides a three-day conference experience for cutting-edge software engineers and enterprise-level professionals, with attendees from across the globe. JAX brings together the world's leading Java and JVM experts as well as many innovators in the fields of Microservices, Continuous Delivery and DevOps to share their knowledge and experience. In the spirit of agile methodology and lean business, JAX London is the place to define the next level of ultra-efficient and super-adaptive technology for your organization.

Learn how to increase your productivity, identify which technologies and practices suit your specific requirements and learn about new approaches. Monday is a pre-conference workshop and tutorial day; the half-day and full-day workshops are overseen by experts. On Tuesday and Wednesday the conference proper takes place, with more than 60 technical sessions, keynotes, the JAX Expo, community events and more.
For more information and the latest news about the conference and our speakers, check out www.jaxlondon.com.

Keynotes
VC from the inside – a techie's perspective

Adrian Colyer (Accel Partners)
Adrian is a Venture Partner with Accel Partners in London, and the author of The Morning Paper, where he reviews an interesting CS-related paper every weekday. He's also an advisor to ClusterHQ, Skipjaq, and Weaveworks. Previously Adrian served in CTO roles at Pivotal, VMware, and SpringSource. Adrian's extensive open-source experience includes working with the teams that created the Spring Framework and related Spring projects, Cloud Foundry, RabbitMQ, Redis, Groovy, Grails, and AspectJ, as well as with team members making significant contributions to Apache Tomcat and Apache HTTP Server.

After many years in CTO roles with SpringSource, VMware, and Pivotal, and having experienced what it is like to work in a VC-backed company, in June of 2014 Adrian switched sides and joined the venture capital firm Accel Partners in London. So what exactly does a technologist do inside a venture capital firm? And having been part of the process from the inside, how do investment decisions get made? In this talk Adrian will share some of the lessons he's learned since embedding in the world of venture capital, and how you can maximise your chances of investment and a successful company-building partnership.

From Design Thinking to DevOps and Back Again: Unifying Design and Operations

Jeff Sussna (Ingineering.IT)
Jeff Sussna is Founder and Principal of Ingineering.IT, a Minneapolis technology consulting firm that helps enterprises and Software-as-a-Service companies adopt 21st-century IT tools and practices. Jeff has nearly 25 years of IT experience. He has led high-performance teams across the Development/QA/Operations spectrum and has a track record of driving quality improvements through practical innovation. Jeff has done work for a diverse range of companies, including Fortune 500 enterprises, major technology companies, software product and service startups, and media conglomerates. Jeff combines engineering expertise with the ability to bridge business, creative, and technical perspectives. He has the insight and experience to uncover problems and solutions others miss. He is a highly sought-after speaker and writer respected for his insights on topics such as Agile, DevOps, Service Design, and cloud computing. Jeff's interests focus on the intersection of development, operations, design, and business. He is the author of Designing Delivery: Rethinking IT in the Digital Service Economy, which explores the relationship between IT and business in the 21st century and presents a unified approach to designing and operating responsive digital services.

The era of digital service is shifting customers' brand expectations from stability to responsiveness. Optimizing delivery speed is only half of this new equation. Companies also need to optimize their ability to listen and to act on what they hear. In order to maximize both velocity and responsiveness, companies need to transform up-front design into a continuous, circular design-operations loop that unifies marketing, design, development, operations, and support.

The Art of Shifting Perspectives

Rachel Davies (Unruly)
Rachel Davies coaches product development teams at Unruly (tech.unruly.co) in London. She is the author of Agile Coaching and an invited speaker at industry events around the globe. Her mission is to create workplaces where developers enjoy delivering valuable software. Rachel is a strong advocate of XP approaches and an organiser of the Extreme Programmers London meetup.

Developers love writing code, but to build resilient industry-scale systems we often need to persuade others to make changes to both code and working practices. As a coach, my job is to help developers spot areas for improvement and act on their ideas. Core to this work is opening up different ways of seeing the work that lies ahead. In this new talk, I will share some stories of changes the teams I work with have made and explain some mechanisms that we applied to make those changes. Teams I work with at Unruly use eXtreme Programming (XP) techniques to build our systems. Modern XP has many counter-intuitive practices such as mob and pair programming. How did new ways of seeing old problems help us resolve them? Come along to this talk to hear about some practical techniques you can use to help solve tricky problems and get others on board with your idea by shifting perspective.


Timetable
Monday October 12th
09:00 – 17:00

Design & Implementation of Microservices

James Lewis

Designing and Operating User-Centered Digital Services

Jeff Sussna

Workshop: Lambdas and Streams in Java 8

Angelika Langer, Klaus Kreft

Workshop: Crafting Code

Sandro Mancuso

Workshop on Low Latency logging and replay

Peter Lawrey

Tuesday October 13th


09:00 – 10:00
10:15 – 11:05

KEYNOTE: From Design Thinking to DevOps and Back Again: Unifying Design and Operations
Benchmarking: You're Doing It Wrong

Jeff Sussna
Aysylu Greenberg

The Performance Model of Streams in Java 8

Angelika Langer

Open Source workflows with BPMN 2.0, Java and Camunda BPM

Niall Deehan

DevOps, what should you decide, when, why & how

Vinita Rathi

11:40 12:10
11:40 12:30

Java Generics: Past, Present and Future

Richard Warburton, Raoul-Gabriel Urma

Smoothing the continuous delivery path a tale of two teams

Lyndsay Prewer

14:30 15:20

2000 Lines of Java or 50 Lines of SQL? The Choice is Yours

Lukas Eder

From 0 to 60 Million Users: Scaling with Microservices and Multi-Cloud Architecture

Aviran Mordo

How to defeat feature gluttony?

Kasia Mrowca

Costs of the Cult of Expertise

Jessica Rose

Cluster your Application using CDI and JCache

Jonathan Gallimore

Distributed Systems in one Lesson

Tim Berglund

Garbage Collection Pause Times

Angelika Langer

Technology Innovation Diffusion

Jeremy Deane

Continuous delivery the missing parts

Paul Stack

Pragmatic Functional Refactoring with Java 8

Richard Warburton, Raoul-Gabriel Urma

Preparing your API Strategy for IoT

Per Buer

Use your type system; write less code

Samir Talwar

A pattern language for microservices

Chris Richardson

15:50 16:40

17:10 18:00

18:15 18:45

All Change! How the new Economics of Cloud will make you think differently about Java Steve Poole, Chris Bailey
Le Mort du Product Management

Nigel Runnels-Moss

20:00 21:00

KEYNOTE: VC from the inside – a techie's perspective

Adrian Colyer

Wednesday October 14th


09:00 09:45

KEYNOTE: The Art of Shifting Perspectives

Rachel Davies

10:00 10:50

Advanced A/B Testing

Aviran Mordo

Architectural Resiliency

Jeremy Deane

Cassandra and Spark

Tim Berglund

Lambdas Puzzler

Peter Lawrey

Coding for Desktop and Mobile with HTML5 and Java EE 7

Geertjan Wielenga

Intuitions for Scaling Data-Centric Architectures

Benjamin Stopford

Microservices: From dream to reality in an hour

Dr. Holly Cummins

Does TDD really lead to good design?

Sandro Mancuso

DevOps and the Cloud: All Hail the (Developer) King!

Daniel Bryant, Steve Poole

Fowler, Fielding, and Haussmann Network-based Architectures

Eric Horesnyi

Java vs. JavaScript for Enterprise Web Applications

Chris Bailey

The Dark Side of Software Metrics

Nigel Runnels-Moss

The Unit Test is dead. Long live the Unit Test!

Colin Vipurs

Events on the outside, on the inside and at the core

Chris Richardson

Architecting for a Scalable Enterprise

John Davies

11:20 12:10

12:20 13:10

15:30 16:20


JAX London Workshop Day


Workshop on Low Latency Logging and Replay
Peter Lawrey (Higher Frequency Trading Ltd)

A workshop for beginner to advanced developers on how to write and read data efficiently in Java. The workshop will cover the following: an advanced review of how the JVM really uses memory; what references are; what compressed OOPs are; how the fields in an object are laid out; using Maven to build a project using Chronicle; setting up a simple Maven project; using modules from Maven Central; assembling a Maven build; how memory-mapped files work on Windows and Linux; storing data in a memory-mapped file; sharing data between JVMs via memory-mapped files; what Unsafe is and how it works; using Unsafe to see the contents of an object in memory; using Unsafe to access native memory; writing and reading data to a Chronicle Queue; using raw bytes; using a wire format; and designing a system with low-latency persisted IPC, with a simple order-matching system as an example. Advanced content will be added to the early sessions to keep advanced users interested, and the later topics will have pre-built working examples to build on.

Designing and Operating User-Centered Digital Services
Jeff Sussna (Ingineering.IT)

With software eating the world, 21st-century business increasingly depends on IT, not just for operational efficiency, but for its very existence. In a highly disruptive service economy, IT-driven businesses must continually adapt to ever-changing customer needs and market demands. To power the adaptive organization, IT needs to become a medium for continuous, empathic customer conversations. This workshop teaches participants how to design and operate systems and organizations that help businesses create value through customer empathy. It introduces them to the theory and practice of Continuous Design, a cross-functional practice that interconnects marketing, design, development, and operations into a circular design/operations loop. Participants learn how to: align software designs with operational, business, and customer needs; maximize quality throughout the design, development, and operations lifecycle; and create highly resilient and adaptable systems, practices, and organizations. The workshop takes place in two sessions: Introduction to Continuous Design and Applying Continuous Design. Morning – Introduction to Continuous Design: this session introduces the principles of Continuous Design. It grounds those principles in the historical, philosophical, and economic underpinnings that link methodologies such as Design Thinking, Agile, DevOps, and Lean. By providing a strong theoretical grounding in new ways of knowing, this session gives participants the ability to evaluate the effectiveness of specific tools and practices, and to continually adapt them to meet their own needs and constraints. Afternoon – Applying Continuous Design: this session introduces a concrete methodology for applying Continuous Design to real-world problems.

Workshop: Crafting Code
Sandro Mancuso (Codurance)

This course is designed to help developers write well-crafted code – code that is clean, testable, maintainable, and an expression of the business domain. The course is entirely hands-on, designed to teach developers practical techniques they can immediately apply to real-world projects. Software Craftsmanship is at the heart of this course. Throughout, you will learn about the Software Craftsmanship attitude to development and how to apply it to your workplace. Writing clean code is difficult; cleaning existing code, even more so. You should attend if you want to: write clean code that is easy to understand and maintain; become more proficient in Test-Driven Development (TDD), using tests to design and build your code base; and focus your tests and production code according to business requirements using Outside-In TDD (a.k.a. the London School of TDD). Clean code necessitates good design. In the process of driving your code through tests, you will learn to understand the design principles that lead to clean code, and to avoid over-engineering and large rewrites by incrementally evolving your design using tests. Once you have an understanding of the principles at work, we will apply them to legacy code to help you gain confidence in improving legacy projects through testing, refactoring and redesigning. The content will be: the TDD lifecycle and the Outside-In style of TDD; writing unit tests that express intent, not implementation; using unit tests as a tool to drive good design; expressive code; and testing and refactoring legacy code.

Workshop: Lambdas and Streams in Java 8
Angelika Langer (Angelika Langer Training/Consulting), Klaus Kreft

This workshop is devoted to the stream framework, which is an extension to the JDK collection framework. Streams offer an easy way to parallelize bulk operations on sequences of elements. The stream API differs from the classic collection API in many ways: it supports a fluent programming style and borrows elements from functional languages. For instance, streams have operations such as filter, map, and reduce. The new language features of lambda expressions and method references have been added to Java for effective and convenient use of the Stream API. In this workshop we will introduce lambda expressions and method/constructor references, give an overview of the stream operations and discuss the performance characteristics of sequential vs. parallel stream operations. Attendants are encouraged to bring their notebooks. We will not only explore the novelties in theory, but intend to provide enough information to allow for hands-on experiments with lambdas and streams.

Design & Implementation of Microservices
James Lewis (ThoughtWorks)

Microservices Architecture is a concept that aims to decouple a solution by decomposing functionality into discrete services. Microservice architectures can lead to easier-to-change, more maintainable systems, which can be more secure, performant and stable. In this workshop you will discover a consistent and reinforcing set of tools and practices, rooted in the philosophy of small and simple, that can help you move towards a microservice architecture in your own organisation: small services, communicating via the web's uniform interface, with single responsibilities, and installed as well-behaved operating system services. However, with these finer-grained systems come new sources of complexity. What you will learn: during this workshop you will understand in more depth what the benefits of finer-grained architectures are, how to break apart your existing monolithic applications, and what the practical concerns of managing these systems are. We will discuss how to ensure your systems can be made more stable, how to handle security, and how to handle the additional complexity of monitoring and deployment. We will cover the following topics: principle-driven evolutionary architecture; capability modelling and the town planning metaphor; REST, web integration and event-driven systems of systems; and microservices, versioning, consumer-driven contracts and Postel's law. Who should attend: developers, architects, technical leaders, operations engineers and anybody interested in the design and architecture of services and components.



Cloud

Database solutions

The future of cloud computing
The cloud has changed everything. And yet the cloud revolution at the heart of IT is only getting started. As data becomes more and more important, we're beginning to realise how central a role the database will play in the future.

by Zigmars Raascevskis
Cloud computing engines today allow businesses to easily extend their IT infrastructure at any time. This means that you can rent servers with only a few clicks, and various software stacks including web servers, middleware and databases can be installed and run on these server instances with little to no effort. With data continuing to accumulate at a rapid pace, the database is becoming a large part of this infrastructure. By leveraging conventional cloud computing, every business can run its own database stack in the cloud the same way as if it were on-premise.
There's still a huge amount of potential to accelerate speed and efficiency by using a multi-tenant database. For multi-tenant distributed databases, a certain number of servers in a cloud footprint are set aside for managing databases, but these resources are shared by many users. This opens up the possibility of improving the speed and efficiency of the IT infrastructure within organizations. A combined database footprint has massive resources and the ability to parallelize a much wider range of requests than users with their own dedicated servers. Such a setup allows faster run times and avoids the painful sizing and provisioning process associated with on-premise infrastructure and traditional cloud computing. So what should businesses look for when selecting a database solution? A multi-tenant database solution is worth considering, given that it can help overcome the following challenges.

I. Failure tolerance of distributed systems
By design, distributed systems with state replication are resistant to most forms of single-machine failure. Guarding against single-machine hardware failures is relatively straightforward. With the distributed database design, every database is hosted on multiple machines that replicate each partition several times. Therefore, in the case of server failure, the system routes traffic to healthy replicas, making sure that data is replicated elsewhere and ensuring higher availability. However, making distributed systems tolerant of software failures is much more difficult, because such failures share a common cause, and this presents a hard challenge. The ultimate power of distributed systems comes from parallelism, but this also means that the same code is executed on every server participating in fulfilling the request. If working on a particular request causes a fatal failure that has a negative impact on the operation of a system, or even crashes it, the entire cluster is immediately affected.
Sophisticated methods are necessary to avoid such correlated failures, which, though they might be rare, have devastating effects. One method involves trying each query on a few isolated computational nodes before sending it down to the entire cluster with massive parallelism. Once failures are observed in the sandbox, suspicious requests are immediately quarantined and isolated from the rest of the system.
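In Java-flavoured pseudocode, the canary step might look roughly like the following sketch; all types here are hypothetical and not any particular vendor's API:

import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch of "sandbox first, then broadcast" query execution.
public class SandboxedExecutor {

  interface Node { Result execute(Query query) throws Exception; }
  static class Query {}
  static class Result {}

  private final List<Node> sandboxNodes;   // small set of isolated canary nodes
  private final List<Node> clusterNodes;   // the full cluster
  private final Queue<Query> quarantined = new ConcurrentLinkedQueue<>();

  SandboxedExecutor(List<Node> sandboxNodes, List<Node> clusterNodes) {
    this.sandboxNodes = sandboxNodes;
    this.clusterNodes = clusterNodes;
  }

  public void run(Query query) {
    // 1. Try the query on the sandbox first; a crash here only affects isolated nodes.
    for (Node canary : sandboxNodes) {
      try {
        canary.execute(query);
      } catch (Exception fatal) {
        quarantined.add(query);            // suspicious request never reaches the cluster
        return;
      }
    }
    // 2. Only now execute with full parallelism across the cluster.
    clusterNodes.parallelStream().forEach(node -> {
      try {
        node.execute(query);
      } catch (Exception e) {
        // per-node failure handling / retries would go here
      }
    });
  }
}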

II. Performance guarantees in a multi-tenant environment
Another common problem that often manifests itself in public clouds is the noisy neighbour issue. When many users share computational resources, it is important to ensure that they are prioritized and isolated properly, so that sudden changes in the behaviour of one user do not have an adverse impact on another. A common approach for computing engines has been isolation of resources into containers. This means giving each user a certain-sized box that it cannot break out of, which provides a level of isolation; however, it is not flexible in terms of giving users enough resources exactly when they need them. Effective workload scheduling, low-level resource prioritization and isolation are key techniques for achieving predictable performance.
A multi-tenant database software stack actually provides more opportunities to share and prioritize resources dynamically while providing performance guarantees. This is possible because the database software can manage access to critical resources, like a CPU core or a spinning disk, through a queue of requests that are accessing the resource. The provisioning process ensures that there are enough aggregated resources in the cluster. However, in the case that some user behaves unpredictably, the software stack is able to control the queues and can make sure that only the offender is affected, while other users whose resource usage patterns are unchanged remain unaffected. Additionally, management of request queues can ensure, through prioritization, that the end user's latency metrics are optimised when picking the next request from the queue.
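A greatly simplified, hypothetical sketch of such a queue: requests for one physical resource (a disk, a CPU core) are ordered so that the tenant with the least accumulated usage is served first, and a noisy tenant mostly slows only itself down.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a per-resource request queue with usage-based prioritization.
public class FairResourceQueue {

  static class Request implements Comparable<Request> {
    final String tenant;
    final Runnable work;
    final long tenantUsageAtEnqueue;            // snapshot used for ordering

    Request(String tenant, Runnable work, long usage) {
      this.tenant = tenant;
      this.work = work;
      this.tenantUsageAtEnqueue = usage;
    }

    @Override
    public int compareTo(Request other) {
      // Lower accumulated usage = higher priority.
      return Long.compare(this.tenantUsageAtEnqueue, other.tenantUsageAtEnqueue);
    }
  }

  private final PriorityBlockingQueue<Request> queue = new PriorityBlockingQueue<>();
  private final Map<String, AtomicLong> usageByTenant = new ConcurrentHashMap<>();

  public void submit(String tenant, Runnable work) {
    long usage = usageByTenant.computeIfAbsent(tenant, t -> new AtomicLong()).get();
    queue.put(new Request(tenant, work, usage));
  }

  // Called by the single worker thread that owns the physical resource.
  public void serveNext() throws InterruptedException {
    Request next = queue.take();
    long start = System.nanoTime();
    next.work.run();
    usageByTenant.get(next.tenant).addAndGet(System.nanoTime() - start);
  }
}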

III. ACID-compliant transactions: a NoSQL challenge
Another obstacle for massively parallel distributed systems has been consistency guarantees. For NoSQL distributed databases, ensuring transactional consistency and ACID properties has been a real problem. This is due to the fact that, with a distributed database, many nodes have to be involved in processing the transaction, and it is not obvious how to act in cases of failure. In addition, the state of the cluster has to be synchronized to ensure consistency, which presents high overheads in a highly distributed environment.
Instead of compromising performance or consistency, investment needs to be made to make database software scale while preserving consistency. For example, transactional consistency can be managed through the use of a transaction log, which can in turn be distributed and replicated for high throughput and durability.
Distributed databases can serve as a solid foundation for distributed computing that is massively parallel and instantly scalable. In this respect, NoSQL technologies and their community can leverage this trend to contribute to the architecture of the future computer. By understanding the benefits of a multi-tenant system and adopting the appropriate solutions, organizations can experience instant scalability and massive parallelism within their own data infrastructures.

Zigmars Raascevskis left a senior engineering position at Google to join Clusterpoint as the company's CEO, foreseeing that document-oriented databases would take the market by storm. Prior to joining Clusterpoint, Zigmars worked for eight years at Google, where among other projects he managed the web search back-end software engineering team in Zurich. Before his Google career, Zigmars worked for Exigen, a leading regional IT company, and Lursoft, a leading regional information subscription service company.


Tests

Dos and Don'ts

Testing the Database Layer
There's one thing we can agree on when it comes to database tests: they ain't easy. Testing guru and JAX London speaker Colin Vipurs runs through the strengths and weaknesses of common approaches to testing databases.

by Colin Vipurs

Over my many years of software development I've had to perform various levels of testing against many different database instances and types, including RDBMS and NoSQL, and one thing remains constant – it's hard. There are a few approaches that can be taken when testing the database layer of your code, and I'd like to go over a few of them, pointing out the strengths and weaknesses of each.

Mocking
This is a technique that I have used in the past but would highly recommend against using now. In my book "Tests Need Love Too" I discuss why you should never mock any third-party interface, but just in case you haven't read it (you really should!) I'll go over it again.
As with mocking any code you don't own, what you're validating is that you're calling the third-party code in the way you think you should, but – and here's the important part – this might be incorrect. Unless you have higher-level tests covering your code, you're not going to know until it hits production. In addition to this, mocking raw JDBC is hard, like really hard. Take for example the test code snippet in Listing 1.
Within this test, not only are there a huge number of expectations to set up, but in order to verify that all the calls happen in the correct order, jMock states are used extensively. Because of the way JDBC works, this test also violates the guideline of never having mocks returning mocks and in fact goes several levels deep! Even if you manage to get all of this working, something as simple as a typo in your SQL can mean that although your tests are green, this will still fail when your code goes to production.

Listing 1
@Test
public void testJdbc() {
  final Connection connection = context.mock(Connection.class);
  final ResultSet resultSet = context.mock(ResultSet.class);
  final PreparedStatement preparedStatement = context.mock(PreparedStatement.class);
  final States query = context.states("query").startsAs("pre-prepare");
  context.checking(new Expectations() {{
    oneOf(connection).prepareStatement("SELECT firstname, lastname, occupation FROM users");
      then(query.is("prepared"));
      will(returnValue(preparedStatement));
    oneOf(preparedStatement).executeQuery(); then(query.is("executed")); will(returnValue(resultSet));
    oneOf(resultSet).next(); when(query.is("executed")); then(query.is("available"));
    oneOf(resultSet).getString(1); when(query.is("available")); will(returnValue("Hermes"));
    oneOf(resultSet).getString(2); when(query.is("available")); will(returnValue("Conrad"));
    oneOf(resultSet).getString(3); when(query.is("available")); will(returnValue("Bureaucrat"));
    oneOf(resultSet).close(); when(query.is("available"));
    oneOf(preparedStatement).close(); when(query.is("available"));
  }});
}

A final note on mocking – no sane developer these days would be using raw JDBC, but one of the higher-level abstractions available, and the same rules apply for these. Imagine a suite of tests set up to mock against JDBC and your code switches to Spring JdbcTemplate, jOOQ or Hibernate. Your tests will now have to be rewritten to mock against those frameworks instead – not an ideal solution.

Testing Against a Real Database
It may sound silly, but the best way to verify that your database interaction code works as expected is to actually have


it interact with a database! As well as ensuring you're using your chosen API correctly, this technique can verify things that mocking never can – for example, that your SQL is syntactically correct and does what you hope.
In-Memory Databases: One of the easiest and quickest ways to get set up with a database to test against is to use one of the in-memory versions available, e.g. H2, HSQL or Derby. If you're happy introducing a Spring dependency into your code, then the test setup can be as easy as this (Listing 2).
This code will create an instance of the H2 database, load the schema defined in schema.sql and any test data in test-data.sql. The returned object implements javax.sql.DataSource, so it can be injected directly into any class that requires it.
One of the great benefits of this approach is that it is fast. You can spin up a new database instance for each and every test requiring it, giving you a cast-iron guarantee that the data is clean. You also don't need any extra infrastructure on your development machine, as it's all done within the JVM. This mechanism isn't without its drawbacks though.
Unless you're deploying against the same in-memory database that you're using in your tests, inevitably you will run up against compatibility issues that won't surface until you hit higher-level testing or – god forbid – production.

Listing 2
public class EmbeddedDatabaseTest {
private DataSource dataSource;
@Before
public void createDatabase() {
dataSource = new EmbeddedDatabaseBuilder().
setType(EmbeddedDatabaseType.H2).
addScript("schema.sql").
addScript("test-data.sql").
build();
}
@Test
public void aTestRequiringADataSource() {
// execute code using DataSource
}
}

The Unit Test is dead. Long live the Unit Test!
Hear Colin Vipurs speak at the JAX London: Unit tests are the lifeblood of any modern development practice, helping developers not only ensure the robustness of their code but also speed up the development cycle by providing fast feedback on code changes. In reality this isn't always the case, and even with the most diligent of refactorings applied, unit tests can actually become a hindrance to getting the job done effectively.


Because you're using a different DataSource to your production instance, it can be easy to miss configuration options required to make the driver operate correctly. Recently I came across such a setup where H2 was configured to use a DATETIME column requiring millisecond precision. The same schema definition was used on a production MySQL instance, which not only required this to be DATETIME(3) but also needed the useFractionalSeconds=true parameter provided to the driver. This issue was only spotted after the tests were migrated from using H2 to a real MySQL instance.
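To make that concrete, the difference looks roughly like the sketch below. The table, credentials and URL are made up, and the useFractionalSeconds parameter applies to the Connector/J driver versions current at the time:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

// Sketch of the precision mismatch described above.
public class FractionalSecondsExample {

  public static void main(String[] args) throws SQLException {
    // On H2 the plain column keeps millisecond precision by default:
    //   CREATE TABLE events (id BIGINT PRIMARY KEY, created DATETIME)
    //
    // On MySQL the column needs explicit precision, and the driver must be told
    // to send/read fractional seconds:
    try (Connection c = DriverManager.getConnection(
            "jdbc:mysql://localhost:3306/app?useFractionalSeconds=true", "app", "secret");
         Statement s = c.createStatement()) {
      s.executeUpdate("CREATE TABLE IF NOT EXISTS events ("
          + "id BIGINT PRIMARY KEY, "
          + "created DATETIME(3))");        // 3 = millisecond precision
    }
  }
}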
Real Databases: Where possible I would highly recommend testing against a database that's as close as possible to the one being run in your production environment. A variety of factors can make this difficult or even impossible, such as commercial databases requiring a license fee, meaning that installing on each and every developer machine is prohibitively costly. A classic way to get around this problem is to have a single development database available for everyone to connect to. This in itself can cause a different set of problems, not least of which are performance (these always seem to get installed on the cheapest and oldest hardware) and test repeatability. The issue with sharing a database with other developers is that two or more people running the tests at the same time can lead to inconsistent results and data shifting in unexpected ways. As the number of people using the database grows, this problem gets worse – throw the CI server into the mix and you can waste a lot of time re-running tests and trying to find out if anyone else is running tests right now in order to get a clean build.
If you're running a free database such as MySQL, or one of the many free NoSQL options, installing on your local development machine can still be problematic – issues such as needing to run multiple versions concurrently, or keeping everyone informed of exactly what infrastructure needs to be up and what ports it needs to be bound to. This model also requires the software to be up and running prior to performing a build, making onboarding staff onto a new project more time-consuming than it needs to be.
Thankfully over the last few years several tools have appeared to ease this, the most notable being Vagrant and Docker. As an example, spinning up a local version of MySQL in
Docker can be as easy as issuing the following command:
$ docker run -p 3306:3306 -e MYSQL_ROOT_PASSWORD=bob mysql

This will start up a self-contained version of the latest MySQL mapped to the local port 3306, using the root password provided. Even on my four-year-old MacBook Pro, after the initial image download, this only takes 12 seconds. If you need Redis 2.8 running as well, you can tell Docker to do that too:

$ docker run -p 6379:6379 redis:2.8

Or the latest version running on a different local port:

$ docker run -p 6390:6379 redis:latest

This can be easily plugged into your build system to make the whole process automated, meaning the only software your developers need on their local machine is Docker (or Vagrant), and the infrastructure required for the build can be packaged into the build script!
Testing Approach: Now that you have your database up and running, the question becomes "how should I test?". Depending on what you're doing, the answer will vary. A greenfield project might see a relational schema changing rapidly in the early stages, whereas an established project will care more about reading existing data. Is the data transient or long-lived? Most* applications making use of Redis would be doing so with it acting like a cache, so you need to worry less about reading existing data.
* Most, not all. I've worked with a fair few systems where Redis is the primary data store.
The first thing to note is that for functional tests the best thing to do is start with a clean, empty database. Repeatability is key, and an empty database is a surefire way to ensure this. My preference is for the test itself to take care of this, purging all data at the beginning of the test, not the end.
Listing 3
<dataset>
<USER FIRST_NAME="John"
SURNAME="Smith"
DOB="19750629"/>
<USER FIRST_NAME="Jane"
SURNAME="Doe"
DOB="19780222"/>
</dataset>

Listing 4
def "existing user can be read"() {
given:
sql.execute('INSERT INTO users (id, name) VALUES (1234, "John Smith")')
when:
def actualUser = users.findById(1234)
then:
actualUser.id == 1234
actualUser.name == 'John Smith'
}

Listing 5
def "new user can be stored"() {
given:
def newUser = new User(1234, "John Smith")
when:
users.save(newUser)
then
def actualUser = users.findById(1234)
actualUser.id == 1234
actualUser.name == 'John Smith'
}


In the event of a test failure, having the database still populated is an easy way to diagnose problems. Cleaning up state at the end of the test leaves you no trace, and as long as every test follows this pattern you're all good.
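A minimal version of that setup might look like this, reusing the DataSource from Listing 2; the table names are placeholders for whatever your schema uses:

import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import javax.sql.DataSource;
import org.junit.Before;

// Sketch: purge state at the *start* of every test, so failures leave data behind for diagnosis.
public class CleanDatabaseTest {

  private DataSource dataSource;   // injected or built as in Listing 2

  @Before
  public void purgeTables() throws SQLException {
    try (Connection connection = dataSource.getConnection();
         Statement statement = connection.createStatement()) {
      // Delete children before parents to keep foreign keys happy; table names are illustrative.
      statement.executeUpdate("DELETE FROM line_items");
      statement.executeUpdate("DELETE FROM users");
    }
  }
}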
A popular technique for seeding test data is to use a tool like DbUnit, which lets you express your data in files and have it easily loaded. I have two problems with this: the first is that if you're using a relational database there is duplication between the DB schema itself and the test data, so a schema change also requires changing the dataset file(s); the second is that the test data is no longer in the test class itself, meaning a context switch between tests and data. For an example of a DbUnit XML file see Listing 3.
One question I usually hear from newcomers to DB testing is whether they should round-trip the data or poke the database directly for verification. Round-tripping is an important part of the testing cycle, as you really need to know that the data you're writing can be read back. An issue with this, though, is that you're essentially testing two things at once, so if there is a failure on one side it can be hard to determine what it is. If you're using TDD (of course you are) then tackling the problem this way will likely feel very uncomfortable, as the time between red and green can be quite high and you won't be getting the fast feedback you're used to.
The technique I have adopted is a hybrid that lets me get the best of both approaches while mostly avoiding the drawbacks of each. The first test I write will be a pure read test that inserts the data by hand within the test itself. Although this seems like duplication – and it is, a little bit – the test code will bypass any logic the write path might apply. For example, a production insert with an ON DUPLICATE KEY clause won't be reproduced here; the test can simply assume the record does not exist, as it is in complete control of the state of the data. The test will then use the production code to read back what the test has inserted and – presto – the read-back is verified. An example of a read test can be seen in Listing 4.
Once the read path is green, the write tests will round-trip the data using production code for both writing and reading. Because the read path is known to be good, there is only the write path to worry about. A failure on the read path at some point in the future will cause both sets of tests to fail, but a failure only on the write path helps isolate where the problem is. In addition, if you're using a test DSL for verifying the read path, it can be reused here to save you time writing those pesky assertions! An example of a round-trip test can be seen in Listing 5.

Colin Vipurs started professional software development in 1998 and released his first production bug shortly after. He has spent his career working in a variety of industries using a wide range of technologies, always attempting to release bug-free code. He holds an MSc from Liverpool University and currently works at Shazam as a Developer/Evangelist. He has spoken at numerous conferences worldwide.


Finance IT

Financial services PaaS and private clouds: Managing and monitoring disparate
environments

Private cloud trends


Not all enterprises and IT teams can enjoy the luxuries of the public cloud. So let's take a look at the limits and the risks of the alternative: the private cloud and PaaS.

by Patricia Hines
Financial Institutions (FIs) find that deploying PaaS and IaaS
solutions within a private cloud environment is an attractive
alternative to technology silos created by disparate server
hardware, operating systems, applications and application
programming interfaces (APIs). Private cloud deployments
enable firms to take a software-defined approach to scaling
and provisioning hardware and computing resources.
While other industries have long enjoyed the increased
agility, improved business responsiveness and dramatic cost
savings by shifting workloads to public clouds, many firms
in highly regulated industries like financial services, healthcare and government are reluctant to adopt public cloud. As
a result of increased regulatory and compliance scrutiny for
these firms, the potential risks of moving workloads to public
clouds outweigh any potential savings.

Private cloud and PaaS trends


The definition of what comprises a private cloud deployment varies, with some analysts and vendors equating private cloud with Infrastructure as a Service (IaaS) and others broadening the term to encompass both IaaS and Platform as a Service (PaaS). Whatever the definition, many financial services firms have already deployed private cloud, IaaS and PaaS technologies, often driven by platform simplification and consolidation initiatives.
Vendor platforms for private PaaS are gaining popularity, with a wide range of proprietary and open source solutions available. Proprietary vendors include Apprenda and Pivotal (whose platform is a commercial version built on Cloud Foundry). Open source platforms include Cloud Foundry, OpenShift, Apache Stratos and Cloudify. Many banks are choosing open source-based solutions as an insurance policy against vendor lock-in. Moreover, with the source code under the pressure of public scrutiny, the quality of these applications is often higher than that of their proprietary rivals.

Business drivers for private cloud and PaaS adoption


According to Forrester, the top two business drivers for
private cloud adoption are improved IT manageability and


flexibility, followed by a transformed IT environment with


optimized systems of record and empowered developers. For
those citing improved IT manageability and flexibility, there is
a desire to collect, analyse and centralize error and event logs
to manage and monitor performance against SLAs. For those
adopting private cloud to empower developers, the choice is
viewed as a foundational element to allow developer self-service for provisioning application environments and deploying code throughout the application lifecycle. PaaS promises
to abstract applications from their underlying infrastructure,
enabling faster deployment and time to market.

Limitations of private cloud and PaaS


Most large banks have thousands of systems in place to support millions of customers. They host these systems on a
complex, heterogeneous mix of systems, many of which have
been in place for a long time. For example, many core banking systems are still running on IBM mainframes and AS/400
platforms because of their security, reliability, scalability and resiliency. FIs continue to depend on third-party hosted applications for functions ranging from bill pay to credit checks, which, along with SaaS applications for CRM and HR management, will remain outside of the private cloud's domain.
As firms evaluate their private cloud architecture, they need
to consider how they can achieve their business goals of improved IT manageability and empowered developers across a
heterogeneous, hybrid environment. Although it is possible
to re-host and re-architect core legacy systems onto modern
platforms like Java and .Net, these projects will extend far
into the future. As a result, financial institutions need to manage and monitor disparate environments, each with its own
challenges and restrictions, for the foreseeable future.
When a FI adopts private cloud and PaaS technologies to
simplify IT management for application deployment, they are
adding another technology stack to the already complex mix.
To make matters worse, some FIs have deployed (or are evaluating) multiple private cloud and PaaS platforms, often with
disparate capabilities and restrictions, and proprietary APIs.
With the mix of private cloud, IaaS, and PaaS environments that must coexist with legacy infrastructure, managing and monitoring the health of critical systems becomes more difficult.


Figure 1: MuleSoft

Even if a firm decides to eventually re-architect legacy applications for private PaaS hosting or move workloads across
multiple PaaS solutions, it is critical that organizations develop an overarching connectivity strategy to seamlessly tie
together systems, data and workflow that accommodates a
long-term migration journey. In order for the organization to
achieve a single pane of glass for managing and monitoring, organizations need the ability to connect and integrate
the various environments and enable service discovery, naming, routing, and rollback for SOAP web services, REST APIs,
microservices and data sources.

Managing disparate environments


The combination of endpoints – data sources, applications, web services, APIs and processes – is ever-growing and evolving. In order to orchestrate a well-governed but agile application landscape, IT architects need to reconsider their integration approach. A unified integration platform can handle any type of integration scenario, particularly high-complexity requirements for high performance, throughput and security involving a combination of application, B2B, and SaaS integration needs, whether on-premises or in the cloud. Organizations facing the need to manage heterogeneous architectural environments have an opportunity to address a wide range of requirements by means of a unified, full stack for connectivity on one platform: connectivity, orchestration, services, and APIs.
As firms adopt multi-vendor solutions, they need a way to abstract the complexity of their private cloud vendor and architecture decisions. With a unified connectivity solution, you can beta test multiple PaaS environments using an independent orchestration layer with a single API layer to back-end systems and databases. The connectivity layer helps you to avoid PaaS vendor lock-in while increasing interoperability and data portability.
A unified integration layer enables organizations to take
an API-led connectivity approach for xPaaS (Application
Platform-as-a-Service, Database Platform-as-a-Service, Middleware Platform-as-a-Service, etc.) integration and management. API-led connectivity packages underlying connectivity
and orchestration services into easily composable, discoverable and reusable building blocks. Reusable building blocks accelerate time to market for new products and services, whether packaged vs. custom, or on-premise vs. off-premise.
Rather than each developer needing to have a deep understanding of an external application's API intricacies, they can use the integration layer to compose their applications with connectivity as needed, easily automating tasks, accessing databases and calling web services by leveraging APIs.
Private cloud, IaaS and PaaS technologies are on the IT agendas of many financial services firms. But those technologies are just one piece of the infrastructure puzzle. In order to simplify IT management and empower developers, you need a blending and bridging of environments that delivers agility across infrastructure silos. MuleSoft's Anypoint Platform is the only solution that enables end-to-end connectivity across API, service orchestration and application integration in a single platform.
The single platform enables IT organizations to take a bimodal approach to private cloud management, driving speed to market and agility while enforcing a governance process to avoid fragmentation and duplication of services. MuleSoft, a proven on-premises, hybrid and cloud integration leader, provides a virtual agility layer, allowing new services on the PaaS to interact with legacy on-premise mainframes or SaaS environments in the cloud (Figure 1).
Each of the building blocks in Anypoint Platform delivers
purposefully productized APIs, powerful Anypoint core and
ubiquitous connectivity. Based on consistent and repeatable
guiding principles, the Anypoint Platform delivers tools and
services for runtime, design time, and engagement that enable successful delivery for each audience, whether internal
or external. MuleSoft's Anypoint Platform is architecturally independent: it is agnostic in terms of private cloud, IaaS or PaaS solutions, whether custom-built or purchased from a third-party provider. Customers have the freedom and agility to abstract connectivity and integration from the underlying infrastructure, platform and application environments, maximizing efficiency and business value.
Part of simplifying your architecture and becoming more agile is having flexibility. MuleSoft's unique connectivity approach allows you to plan for the future. You may start with an established infrastructure provider and move to an emerging pure-play PaaS provider. You may build applications for on-premises deployment but later decide to host them in the cloud. Anypoint Platform has a single code base for on-premises, hybrid and cloud deployment, adapting to changing business and regulatory conditions. This single code base ensures integration and interoperability across the enterprise, with transparent access to data, seamless monitoring and security, and the agility to respond to changing business needs.

Patricia Hines is the financial services industry marketing director at


MuleSoft, a San Francisco-based company that makes it easy to connect
applications, data and devices.



Web

The future of traffic management technology

Intelligent traffic management in the modern


application ecosystem
As application architecture continues to undergo change, modern applications are now living in increasingly distributed and dynamic infrastructure. Meanwhile, DNS and traffic management markets are finally
shifting to accommodate the changing reality.

by Kris Beevers
Internet-based applications are built markedly differently today than they were even just a few years ago. Application architectures are largely molded by the capabilities of the infrastructure and core services upon which the applications are built. In recent years we've seen tectonic shifts in the ways infrastructure is consumed, code is deployed and data is managed.
A decade ago, most online properties lived on physical infrastructure in co-location environments, with dedicated connectivity and big-iron database back ends, managed by swarms of down-in-the-muck systems administrators with arcane knowledge of config files, firewall rules and network topologies. Applications were deployed in monolithic models, usually in a single datacentre: load balancers fronting web heads backed by large SQL databases, maybe with a caching layer thrown in for good measure.
Since the early 2000s, we've seen a dramatic shift toward cloudification and infrastructure automation. This evolution has led to an increase in distributed application topologies, especially when combined with the explosion of database technologies that solve replication and consistency challenges, and configuration management tools that keep

track of dynamically evolving infrastructures. Today, most new applications are built to be deployed in at least two datacentres, for redundancy in disaster recovery scenarios. Increasingly, applications are deployed at the far-flung edges of the Internet to beat latency and provide lightning-fast response times to users who've come to expect answers (or cat pictures) in milliseconds.
As applications become more distributed, the tools we use to get eyeballs to the right place and to provide the best service in a distributed environment have lagged behind. When an application is served from a single datacentre, the right service endpoint to select is obvious and there's no decision to be made, but the moment an application is in more than one datacentre, endpoint selection can have a dramatic impact on user experience.
Imagine someone in California interacting with an application served out of datacentres in New York and San Jose. If the user is told to connect to a server in New York, most times they'll have a significantly worse experience with the application than if they'd connected to a server in San Jose. An additional 60–80 milliseconds of round-trip time is tacked onto every request sent to New York, drastically decreasing the application's performance. Modern sites often have 60–70 assets embedded in a page, and poor endpoint selection can impact the time to load every single one of them.

Solving endpoint selection


How have we solved endpoint selection problems in the past? The answer is, we haven't – at least, not very effectively.
If you operate a large network and have access to deep pockets and a lot of low-level networking expertise, you might take advantage of IP anycasting, a technique for routing traffic to the same IP address across multiple datacentres. Anycasting has proven too costly and complex to be applied to most web applications.
Most of the time, endpoint selection is solved by DNS, the domain name system that translates hostnames to IP addresses. A handful of DNS providers support simple notions of endpoint selection for applications hosted in multiple datacentres. For example, the provider might ping your servers, and if a server stops responding, it is removed from the endpoint selection rotation. More interestingly, the provider may use a GeoIP database or other mechanism to take a guess at who's querying the domain and where they're located, and send the user to the geographically closest application endpoint. These two simple mechanisms form the basis of many large distributed infrastructures on the Internet today, including some of the largest content delivery networks (CDNs).
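Conceptually, those two mechanisms boil down to something very small. The sketch below is illustrative only, not any particular provider's implementation: filter out unhealthy endpoints, then pick the closest remaining one.

import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Sketch of DNS-style endpoint selection: health check filter plus "closest datacentre" routing.
public class EndpointSelector {

  static class Endpoint {
    final String address;
    final double lat, lon;
    volatile boolean healthy;   // updated by a background health checker ("ping your servers")

    Endpoint(String address, double lat, double lon) {
      this.address = address;
      this.lat = lat;
      this.lon = lon;
      this.healthy = true;
    }
  }

  public Optional<Endpoint> select(List<Endpoint> endpoints, double userLat, double userLon) {
    return endpoints.stream()
        .filter(e -> e.healthy)                                                   // drop dead servers
        .min(Comparator.comparingDouble((Endpoint e) -> distance(e, userLat, userLon)));  // GeoIP-style choice
  }

  private double distance(Endpoint e, double lat, double lon) {
    // Crude flat-earth approximation; good enough for a "closest datacentre" sketch.
    double dLat = e.lat - lat;
    double dLon = e.lon - lon;
    return dLat * dLat + dLon * dLon;
  }
}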
On today's Internet, applications live in increasingly distributed and dynamic infrastructure, and the DNS and traffic management markets are finally shifting to accommodate these realities.
Modern DNS and traffic management providers are beginning to incorporate real-time feedback from application infrastructures, network sensors, monitoring networks and other sources into endpoint selection decisions. While basic health checking and geographic routing remain tools of the trade, more complex and nuanced approaches for shifting

traffic across datacentres are emerging. For example, some of


the largest properties on the Internet, including major CDNs,
are today making traffic management decisions based not
only on whether a server is up or down, but on how
loaded it is, in order to utilize the datacentre to capacity, but
not beyond.
Several traffic management providers have emerged that
measure response times and other metrics between an application's end users and datacentres. These solutions leverage data in real time to route users to the application endpoint that's providing the best service, for the user's network, right now, ditching geographic routing altogether. Additional
traffic management techniques, previously impossible in the
context of DNS, are finding their way to market, such as
endpoint stickiness, complex weighting and prioritizing of
endpoints, ASN and IP prefix based endpoint selection and
more.
The mechanisms and interfaces for managing DNS configuration are improving, as new tools mature for making traffic
management decisions in the context of DNS queries. While
legacy DNS providers restrict developers to a few proprietary
DNS record types to enact simplistic traffic management behaviours, modern providers offer far more flexible toolkits.
This enables developers either to write actual code to make endpoint selection decisions, or to use flexible, easy-to-use rules engines to mix and match traffic routing algorithms into complex behaviours.

What's next for traffic management technology?


As with many industries, traffic management will be driven by
data. Leading DNS and traffic management providers, such
as NSONE, already leverage telemetry from application infrastructure and Internet sensors. The volume and granularity
of this data will only increase, as will the sophistication of
the algorithms that act on it to automate traffic management
decisions.
DNS and traffic management providers have found additional uses for this data outside of making real-time endpoint
selection decisions. DNS providers are already working with
larger customers to leverage DNS and performance telemetry
to identify opportunities for new datacentre deployments to
maximize performance impact. DNS-based traffic management will be an integral part of a larger application delivery
puzzle that sees applications themselves shift dynamically
across datacentres in response to traffic, congestion and other
factors.
Applications and their underlying infrastructure have
changed significantly in the last decade. Now, the tools and
systems we rely on to get users to the applications are finally
catching up.

Kris Beevers is an internet infrastructure geek and serial entrepreneur who's started two companies, built the tech for two others, and has a particular specialty in architecting high-volume, globally distributed internet infrastructure. Before NSONE, Kris built CDN, cloud, bare metal, and other infrastructure products at Voxel, a NY-based hosting company that sold to Internap (NASDAQ:INAP) in 2011.

Benchmarks

Cost, scope and focus

Trade-offs in benchmarking
Is it quality you're looking to improve? Or performance? Before you decide on what kind of benchmark your system needs, you need to know the spectrum of cost and benefit trade-offs.
by Aysylu Greenberg
Benchmarking software is an important step in maturing a system. It is best to benchmark a system after correctness, usability, and reliability concerns have been addressed. In the typical
lifetime of a system, emphasis is first placed on correctness of
implementation, which is verified by unit, functional, and integration tests. Later, the emphasis is placed on the reliability
and usability of the system, which is confirmed by the monitoring and alerting setup of a system running in production for
an extended period of time. At this point, the system is fully
functional, produces correct results, and has the necessary set
of features to be useful to the end client. At this stage, benchmarking the software helps us to gain a better understanding
of what improvement work is necessary to help the system gain
a competitive edge.
There are two types of benchmarks one can create: performance and quality. Performance benchmarks generally measure latency and throughput. In other words, they answer the questions: "How fast can the system answer a query?", "How many queries per second can it handle?", and "How many concurrent queries can the system handle?" Quality benchmarks, on the other hand, address domain-specific concerns,
and do not translate well from one system to another. For
instance, on a news website, a quality benchmark could be the
total number of clicks, comments, and shares on each article.
In contrast, a different website may include not only those
properties but also what the users clicked on. This might happen because the website's revenue is dependent on the number
of referrals, rather than how engaging a particular article was.

Benchmarking: You're Doing It Wrong
Hear Aysylu Greenberg speak at the JAX London: Knowledge of
how to set up good benchmarks is invaluable in understanding
performance of the system. Writing correct and useful benchmarks
is hard, and verification of the results is difficult and prone to
errors. When done right, benchmarks guide teams to improve the
performance of their systems. In this talk, we will discuss what you
need to know to write better benchmarks.


Speaking of revenue, the goal of a benchmark is to guide optimizations in the system and to define the performance goal. A good benchmark should be able to answer the question "How fast is fast enough?" It allows the company to
keep the users of the system happy and keep the infrastructure bills as low as possible, instead of wasting money on
unneeded hardware.
There's a spectrum of cost and benefit trade-offs a benchmark designer should be aware of. Specialized benchmarks
that utilize realistic workloads and model the production environment closely are expensive to set up. A common problem
is that special infrastructure needs to exist to be able to duplicate the production workload. Aggregation and verification
of results is also a very involved process, as it requires thorough analysis and application of moderately sophisticated
statistical techniques. On the other hand, micro-benchmarks
are quick and easy to set up, but they often produce misleading results, since they might not be measuring a representative
workload or set of functionality.
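The quick end of that spectrum is easy to reach with a harness such as JMH, the OpenJDK micro-benchmarking tool. The sketch below times a single, hypothetical lookup path; QueryEngine is a placeholder, and whether the number means anything depends entirely on how representative that one call is of the real workload.

import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

// Micro-benchmark sketch: average latency of one query path.
@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class QueryBenchmark {

    // Placeholder for the system under test.
    static class QueryEngine {
        Object lookup(String key) { return Integer.valueOf(key.hashCode()); }
    }

    private final QueryEngine engine = new QueryEngine();

    @Benchmark
    public Object singleLookup() {
        // Return the result so the JIT cannot remove the call as dead code.
        return engine.lookup("user-42");
    }
}
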
To get started with designing a benchmark, it is helpful to pose a question for the system, e.g. "How fast does the page load for the user when they click to see the contents of their cart?" Pairing that with the goal of the benchmark, e.g. "How fast does the page need to load for a pleasant user experience?", gives the team guidance for their optimization work and helps to determine when a milestone is reached.
Benchmarking is both an engineering and a business problem. Clearly defining the question and the goal for the benchmark helps utilize compute and engineer hours effectively.
When designing a benchmark, it's important to consider how much "bang for the buck" the system will receive from the benchmarking work. Benchmarks with wide coverage of the system's functionality and thorough analysis of the results are
expensive to design and set up, but also provide more confidence in the behaviour of the system. On the other hand,
smaller benchmarks might answer narrow questions very well
and help get the system closer to the goal much faster.
Aysylu Greenberg works at Google on a distributed build system. In her
spare time, she works on open source projects in Clojure, ponders the
design of systems that deal with inaccuracies, paints and sculpts.


Security

Five tips to stay secure

Common threats to your VoIP system
VoIP remains a popular system for telephone communication in the enterprise. But have
you ever considered the security holes this system is leaving you open to? And what company secrets are at risk of eavesdropping, denial of service and Vishing attacks?

by Sheldon Smith

Using a VoIP system to handle calls for your company? You're not alone. In 2014, the worldwide VoIP services
market reached almost $70 billion and is on pace for another banner year in 2015. Despite the usability, flexibility
and cost-effectiveness of VoIP systems, companies need to
be aware of several common threats that could dramatically
increase costs or put company secrets at risk. Here are five of
the most common VoIP threats and how your company can
stay secure.

I Transmission issues

Unlike plain old telephone service (POTS), VoIP systems
rely on packet-switched telephony to send and receive messages. Instead of creating a dedicated channel between two
endpoints for the duration of a call using copper wires and
analog voice information, call data is transmitted using thousands of individual packets. By utilizing packets, it's possible
to quickly send and receive voice data over an internet connection and VoIP technologies are designed in such a way that
packets are re-ordered at their destination so calls aren't out
of sync or jittery.


What's the risk? The transmission medium itself. POTS
lines are inherently secure since a single, dedicated connection
is the only point of contact between two telephones. But when voice data is transmitted over the internet at large, it
becomes possible for malicious actors to sniff out traffic and
either listen in on conversations or steal key pieces of data.
The solution? Encrypt your data before it ever leaves local
servers. You've got two choices here: set up your own encryption protocols in-house, or opt for a VoIP vendor that bundles
a virtual private network (VPN), which effectively creates a
secure tunnel between your employees and whoever they
call.
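The same idea in miniature: whatever the vendor calls it, the goal is that signalling and media never cross the public internet unencrypted. The generic Java sketch below simply wraps a connection to a hypothetical signalling endpoint in TLS; real deployments would rely on the VoIP stack's own SIP-over-TLS and SRTP support or on the provider's VPN rather than hand-rolled sockets.

import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

// Illustration only: open a TLS-protected connection so nothing leaves
// the local network in clear text. Host and port are placeholders.
public class SecureSignallingClient {
    public static void main(String[] args) throws Exception {
        SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
        try (SSLSocket socket = (SSLSocket) factory.createSocket("sip.example.com", 5061)) {
            socket.startHandshake();  // force the TLS handshake up front, before any data is sent
            OutputStream out = socket.getOutputStream();
            out.write("keep-alive".getBytes(StandardCharsets.UTF_8));  // placeholder payload
            out.flush();
        }
    }
}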

II Denial of service
The next security risk inherent to VoIP? Attacks intended to
slow down or shut down your voice network for a period
of time. As noted by a SANS Institute whitepaper, malicious
attacks on VoIP systems can happen in a number of ways.
First, your network may be targeted by a denial of service
(DoS) flood, which overwhelms the system. Hackers may
also choose buffer overflow attacks or infect the system with
worms and viruses in an attempt to cause damage or prevent
your VoIP service from being accessed. As noted by a recent
CBR article, VoIP attacks are rapidly becoming a popular
avenue for malicious actors: UK-based Nettitude said that
within minutes of bringing a new VoIP server online, attack
volumes increased dramatically.
Dealing with these threats means undertaking a security
audit of your network before adding VoIP. Look for insecure
endpoints, third-party applications and physical devices that
may serve as jumping-off points for attackers to find their
way into your system. This is also a good time to assess legacy apps and older hardware to determine if they're able to
handle the security requirements of internet-based telephony.
It's also worth taking a hard look at any network protection
protocols and firewalls to determine if changes must be made.
Best bet? Find an experienced VoIP provider who can help
you assess existing security protocols.

III Eavesdropping

Another issue for VoIP systems is eavesdropping. If your traffic is sent unencrypted, for example, it's possible for motivated attackers to listen in on any call made. The same goes for former employees who haven't been properly removed from the VoIP system or had their login privileges revoked. Eavesdropping allows malicious actors to steal classified information including phone numbers, account PINs and users' personal data. Impersonation is also possible: hackers
can leverage your VoIP system to make calls and pose as a
member of your company. Worst case scenario? Customers
and partners are tricked into handing over confidential information.
Handling this security threat means developing policies and
procedures that speak to the nature of the problem. IT departments must regularly review who has access to the VoIP system and how far this access extends. In addition, it's critical
to log and review all incoming and outgoing calls.

IV Vishing
According to the Government of Canada's Get Cyber Safe website, another emerging VoIP threat is voice phishing or "vishing". This occurs when malicious actors redirect legitimate calls to or from your VoIP network and instead
connect them to online predators. From the perspective of
an employee or customer the call seems legitimate and they
may be convinced to provide credit card or other information. Spam over Internet Telephony (SPIT) is also a growing
problem; here, hackers use your network to send thousands
of voice messages to unsuspecting phone numbers, damaging your reputation and consuming your VoIP transmission capacity. To manage this issue, consider installing a
separate, dedicated internet connection for your VoIP alone,
allowing you to easily monitor traffic apart from other internet sources.

V Call fraud
The last VoIP risk comes from call fraud, also called toll fraud. This occurs when hackers leverage your network to make large volumes of lengthy calls to long-distance or premium numbers, resulting in massive costs to your company. In cases of toll fraud, meanwhile, calls are placed to revenue-generating numbers, such as international toll numbers, which generate income for attackers and leave you with the bill.
Call monitoring forms part of the solution here, but it's also
critical to develop a plan that sees your VoIP network regularly patched with the latest security updates. Either create a
recurring patch schedule or find a VoIP provider that automatically updates your network when new security updates
become available.
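In code, the monitoring half of that plan can start out very simple. The sketch below is a hypothetical check that groups recent outbound calls by destination prefix and flags anything over a threshold; the CallRecord type and the numbers are assumptions, and a real system would feed this from the PBX's call detail records.

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical toll-fraud check: flag destination prefixes that receive
// an unusual number of outbound calls within the observed window.
public class CallFraudMonitor {

    public static class CallRecord {
        final String destination;  // dialled number
        public CallRecord(String destination) { this.destination = destination; }
    }

    private final long maxCallsPerPrefix;

    public CallFraudMonitor(long maxCallsPerPrefix) {
        this.maxCallsPerPrefix = maxCallsPerPrefix;
    }

    // Returns the destination prefixes (first four digits here) that exceed the threshold.
    public List<String> suspiciousPrefixes(List<CallRecord> recentCalls) {
        Map<String, Long> callsPerPrefix = recentCalls.stream()
                .collect(Collectors.groupingBy(
                        c -> c.destination.substring(0, Math.min(4, c.destination.length())),
                        Collectors.counting()));
        return callsPerPrefix.entrySet().stream()
                .filter(e -> e.getValue() > maxCallsPerPrefix)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
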
VoIP systems remain popular thanks to their ease-of-use,
agility and global reach. They're not immune to security issues but awareness of common threats coupled with proactive IT efforts helps you stay safely connected.



Sheldon Smith is a Senior Product Manager at XO Communications. XO provides unified communications and cloud services. XO's solutions help companies become more efficient, agile, and secure. Sheldon has extensive product management and unified communications experience.


REST

No more custom API mazes

Why reusable REST APIs are changing the game
REST APIs make our lives easier but we're still in the dark ages when it comes to making our APIs
general purpose, portable and reusable. DreamFactory evangelist Ben Busse describes some common pitfalls of hand-coding custom REST APIs and explores the architectural advantages and technical characteristics of reusable REST APIs.

by Ben Busse
Where I work at DreamFactory, we designed and built some
of the very first applications that used web services on Salesforce.com, AWS and Azure. Over the course of ten years,
we learned many painful lessons trying to create the perfect
RESTful backend for our portfolio of enterprise applications.
When a company decides to start a new application project,
the business team first defines the business requirements
and then a development team builds the actual software. Usually there is a client-side team that designs the application
and a server-side team that builds the backend infrastructure.
These two teams must work together to develop a REST API
that connects the backend data sources to the client application.
One of the most laborious aspects of the development process is the interface negotiation that occurs between these
two teams (Figure 1). Project scope and functional requirements often change throughout the project, affecting API and
integration requirements. The required collaboration is complex and encumbers the project.

Dungeon master development: Complex mazes of custom, handcrafted APIs

You can get away with slow, tedious interface negotiation if you're just building one simple application. But what if you need to ship dozens, hundreds or even thousands of API-driven applications for employees, partners and customers? Each application requires a backend, APIs, user management and security, and you're on a deadline.
Building one-off APIs and a custom backend for each and every new application is untenable. Mobile is forcing companies to confront this reality (or ignore it at their own peril). With the acceptance of BYOD (bring your own device) and the proliferation of mobile devices, the modern enterprise may need hundreds or even thousands of mobile applications. Backend integration, custom API development, backend security and testing comprise the lion's share of a typical enterprise mobile application project (more than half of the time on average).
Most enterprises today are woefully unable to address API complexity at its root cause. Mobile projects typically have new requirements that were not anticipated by the existing REST APIs that are now in production. You could expand the scope of your existing API services, but they are already in production.

Figure 1: Interface negotiation


So the default option is to create a new REST API for each new project! The API building process continues for each new app with various developers, consultants and contractors. The result is custom, one-off APIs that are highly fragmented, fragile, hard to centrally manage and often insecure. The API dungeon is an ugly maze of complexity (Figure 2):

Custom, manually coded REST APIs for every new application project, written with different tools and developer frameworks.
REST APIs are hardwired to different databases and file storage systems.
REST APIs run on different servers or in the cloud.
REST APIs have different security mechanisms, credential strategies, user management systems and API parameter names.
Data access rights are confused, user management is complex and application deployment is cumbersome.
The system is difficult to manage, impossible to scale and full of security holes.
API documentation is often non-existent. Often, companies can't define what all the services do, or even where all of the endpoints are located.

Figure 2: The API dungeon

The future: reusable REST APIs

The core mistake with the API dungeon is that development activity starts with business requirements and application design, and then works its way back to server-side data sources and software development. This is the wrong direction.
The best approach is to identify the data sources that need to be API-enabled and then create a comprehensive and reusable REST API platform that supports general-purpose application development (Figure 3).
There are huge benefits to adopting a reusable REST API strategy:

APIs and documentation are programmatically generated and ready to use.
There's no need to keep building server-side software for each new application project.
Client-side application design is decoupled from security and administration.
The interface negotiation is simplified.
Development expenses and time to market are dramatically reduced.
Developers don't have to learn a different API for each project.
RESTful services are no longer tied to specific pieces of infrastructure.
Companies can easily move applications between servers and from development to test to production.

Figure 3: Reusable REST APIs

Technical characteristics of a reusable API


This sounds good in theory, but what are the actual technical characteristics of reusable REST APIs? And how should
reusable APIs be implemented in practice? The reality is that
there's no obvious way to arrive at this development pattern until you've tried many times the wrong way, at which point it's usually too late.
DreamFactory tackled the API complexity challenge for
over a decade, built a reusable API platform internally for
our own projects and open sourced the platform for any
developer to use. We had to start from scratch many times
before hitting on the right design pattern that enables our
developers to build applications out of general-purpose interfaces.
There are some basic characteristics that any reusable
REST API should have:
REST API endpoints should be simple and provide parameters to support a wide range of use cases.
REST API endpoints should be consistently structured for SQL, NoSQL and file stores.
REST APIs must be designed for high transaction volume, hence simply designed.
REST APIs should be client-agnostic and work interchangeably well for native mobile, HTML5 mobile and web applications.
A reusable API should have the attributes below to support a wide range of client access patterns:


Figure 4: SQL API and subsets

Noun-based endpoints and HTTP verbs are highly effective. Noun-based endpoints should be programmatically generated based on the database schema.
Requests and responses should include JSON or XML with objects, arrays and sub-arrays.
All HTTP verbs (GET, PUT, DELETE, etc.) need to be implemented for every use case.
Support for web standards like OAuth, CORS, GZIP and SSL is also important.
It's crucially important to have a consistent URL structure for accessing any backend data source. The File Storage API should be a subset of the NoSQL API, which should be a subset of the SQL API (Figure 4).
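As a sketch of what that consistency looks like in practice (the paths, service names and query values below are made up for illustration, not taken from a specific product), the same verbs and parameter names apply whether the backend is a SQL database, a NoSQL store or a file system:

GET /api/v1/sql/customers?filter=last_name='Smith'
GET /api/v1/nosql/customers?filter=last_name='Smith'
GET /api/v1/files/reports/2015/

A developer who has learned the SQL flavour already knows most of the file storage API, because the latter is simply a smaller subset of the same interface (query values shown unencoded for readability).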

Figure 6: Request URL

Parameter names should be reused across services where possible. This presents developers with a familiar interface
for any data source. The API should include automatically
generated, live interactive documentation that allows developers to quickly experiment with different parameters (Figure 5).
In general, the structure of the request URL and associated
parameters needs to be very flexible and easy to use, but also
comprehensive in scope. Looking at the example below, there
is a base server, an API version, the backend database (the
API name) and a particular table name in the request URL
string. Then the parameters specify a
filter with a field name, operator and
value. Lastly, an additional order parameter sorts the returned JSON data
array (Figure 6).
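To make the structure concrete, a request of that shape might look like the line below; the hostname, API name and field names are hypothetical, and the filter value would be URL-encoded in a real call:

GET https://api.example.com/api/v1/mysql/orders?filter=total > 100&order=created_date DESC

The base server is api.example.com, v1 is the API version, mysql is the API name, orders is the table, the filter supplies a field, operator and value, and the order parameter sorts the returned JSON array.
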
A huge number of application development scenarios can be implemented
just with the filter parameter. This allows any subset of data to be identified
and operated on. For example, objects
in a particular date range could be
loaded into a calendar interface with
a filter string (Figure 7).
Complex logical operations should
also be supported and the filter string
interface needs to protect against SQL
injection attacks. Other database-specific features include:
Pagination and sorting
Rollback and commit
Role-based access controls on tables
Role-based access controls on records
Stored functions and procedures

Figure 5: Interactive, auto-generated API docs




Figure 7: Find Task records

A comprehensive reusable REST API should also support operations on arrays of objects, but you can also specify related objects as a URL parameter.
This allows complex documents to be
downloaded from a SQL database and
used immediately as a JSON object.
The data can be edited along with the
objects (Figure 8). When committed
back to the server, all of the changes
are updated including parent, child and
junction relationships between multiple
tables. This flexibility supports a huge
number of very efficient data access
patterns.
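A hedged sketch of that pattern (the related parameter, table and field names here are invented for illustration): one GET returns a parent record with its child records nested as a sub-array, ready to use as a single JSON document.

GET https://api.example.com/api/v1/mysql/projects/42?related=tasks

{
  "id": 42,
  "name": "Website relaunch",
  "tasks": [
    { "id": 7, "project_id": 42, "title": "Draft wireframes", "complete": true },
    { "id": 8, "project_id": 42, "title": "Migrate content", "complete": false }
  ]
}

Editing the tasks array and committing the document back is what triggers the parent, child and junction updates described above.
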
The vast majority of application development use cases can be supported
with a reusable REST API right out of
the box. For special cases, a server-side
scripting capability can be used to customize behavior at any API endpoint
(both request and response) or create
brand new custom API calls. DreamFactory uses the V8 JavaScript engine for
this purpose.
Some of the special cases that you might want to implement with server-side scripting include:
Custom business logic
Workflow triggers
Formula fields
Field validation
Web service orchestration

Conclusion

Figure 8: Loading a project and all related tasks

Figure 9: Field validation and workflow trigger


REST API complexity is an important problem for companies building API-driven applications. The tendency to
build new APIs for each new project
has negative consequences over time.
Adopting a REST API platform strategy with reusable and general-purpose
services addresses this problem and
provides many benefits in terms of more
agile development and quicker time to
value.

Ben Busse is a developer evangelist with DreamFactory in San Francisco. He's passionate about open source software, mobile development and hunting wild mushrooms in northern California.


APIs

Milliseconds matter

Considering the performance factor in an API-driven world


With visitors demanding immediate response times, the fate of a website and the performance of
APIs are becoming increasingly intertwined.

by Per Buer
In recent years, web APIs have exploded. Various tech industry watchers now see them as providing the impetus for
a whole "API economy". As a result, and in order to create a
fast track for business growth, more and more companies and
organizations are opening up their platforms to third parties.
While this can create a lot of opportunities, it can also have
huge consequences and pose risks. These risks don't have to
be unforeseen, however.
Companies' checklists for building or selecting API management tools can be very long. Most include the need to offer security (both communication security, i.e. TLS, and actual API security, i.e. keys), auditing, logging, monitoring, throttling,
metering and caching. However, many overlook one critical
factor: performance. This is where you can hedge your bets
and plan for the potential risk.


There's an interesting analogy between APIs and the long path websites have travelled since the nineties. Back then, websites had few objects and not that many visitors so performance and scalability mattered less. This has changed dramatically over the last decade.

Preparing your API Strategy for IoT


Hear Per Buer speak at the JAX London: Not that long ago, API
calls were counted per hour. Evaluations for API management tools
typically have long lists of criteria, but performance is usually left
off. That might be fine in certain environments but not where IoT
and mobile are concerned. For these environments the number of
API calls has increased to the point that even the typical rate of
200 API calls per second is no longer enough.


Today, increasingly impatient visitors penalise slow websites by leaving quickly, and
in many cases never returning. Microsoft computer scientist
Harry Shum says that sites that open just 250 milliseconds
faster than competing sites, a fraction of an eye blink, will
gain an advantage.
APIs have travelled a similar path. Ten to fifteen years ago
most API management tools out there had very little to do
and performance wasn't an issue. The number of API calls
handled was often measured in calls per hour. Consequently,
these tools were designed to deal with things other than thousands of API calls per second. What a difference a decade can
make! According to Statista, worldwide mobile app downloads are expected to reach 268.69 billion by 2017. But API
management tools haven't caught up. Even nowadays many of the products in the various vendors' top-right quadrant will
only handle rates of 200 API calls per second per server. Their
focus has been on features, not performance.
If you open up your API platform, you probably want a
lot of developers to use it. However, most web services have
introduced a rate limit for API calls. If set high enough, the
limit is reasonable to ensure availability and quality of service. But what is high enough to provide a competitive advantage in our accelerated times? Take for example an industry
like banking, where many players are opening up their platforms in a competitive bid to attract developers who create
third-party apps and help monetise the data. The ones that set
the API call limit too low create a bad developer experience,
pushing them towards friendlier environments.
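Rate limiting itself does not have to be crude, either. A token-bucket limiter, sketched below in Java, allows a generous sustained rate while still absorbing short bursts; the class is a simplified illustration and the capacity and refill values callers choose are placeholders, not recommendations.

import java.util.HashMap;
import java.util.Map;

// Token-bucket sketch: each API key owns a bucket that refills
// continuously up to a burst capacity.
public class TokenBucketLimiter {

    private static class Bucket {
        double tokens;
        long lastRefillNanos;
    }

    private final double capacity;        // maximum burst size
    private final double refillPerSecond; // sustained calls per second
    private final Map<String, Bucket> buckets = new HashMap<>();

    public TokenBucketLimiter(double capacity, double refillPerSecond) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
    }

    public synchronized boolean allow(String apiKey) {
        long now = System.nanoTime();
        Bucket b = buckets.get(apiKey);
        if (b == null) {
            b = new Bucket();
            b.tokens = capacity;
            b.lastRefillNanos = now;
            buckets.put(apiKey, b);
        }
        b.tokens = Math.min(capacity, b.tokens + (now - b.lastRefillNanos) / 1_000_000_000.0 * refillPerSecond);
        b.lastRefillNanos = now;
        if (b.tokens >= 1.0) {
            b.tokens -= 1.0;  // consume one token for this call
            return true;
        }
        return false;         // over the limit: reject (e.g. HTTP 429) or queue
    }
}

Constructed as new TokenBucketLimiter(400, 200), for instance, it would sustain 200 calls per second per key while tolerating bursts of up to 400.
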
A limited number of API calls in web services also affects
the end-customer. Take for example online travel operators
or online media. In these environments a lot of data needs
to flow through the APIs. These are becoming more dependent on fast and smooth communication between their services
and their various apps. If these services slow down due to API
call limitations, customers will defect to faster sites.
I compared the situation of APIs with that of the web ten
years ago when performance started to matter. The situation
that actually developed is much more serious than I initially
predicted. Consumers increasingly demand instant gratification. This means that the window for companies to ensure
the performance of their APIs is closing. Being able to deliver
performance and set a higher limit of API calls can make a
huge difference. Otherwise, developers will go elsewhere to
help grow another company's business. If you want to future-proof for the API boom, it's time to consider the performance
factor.

Per Buer is the CTO and founder of Varnish Software, the company behind the open source project Varnish Cache. Buer is a former programmer
turned sysadmin, then manager turned entrepreneur. He runs, cross-country skis and tries to keep his two boys from tearing down the house.

Imprint

Publisher: Software & Support Media GmbH

Editorial Office Address:
Software & Support Media
Saarbrücker Straße 36
10405 Berlin, Germany
www.jaxenter.com

Editor in Chief: Sebastian Meyen
Editors: Coman Hamilton, Natali Vlatko
Authors: Kris Beevers, Per Buer, Ben Busse, Holly Cummins, Aysylu Greenberg, Patricia Hines, Eric Horesnyi, Werner Keil, Angelika Langer, Aviran Mordo, Chris Neumann, Lyndsay Prewer, Zigmars Raascevskis, Sheldon Smith, Colin Vipurs, Geertjan Wielenga
Copy Editor: Jennifer Diener
Creative Director: Jens Mainz
Layout: Flora Feher, Dominique Kalbassi
Sales Clerk: Anika Stock, +49 (0) 69 630089-22, astock@sandsmedia.com

Entire contents copyright © 2015 Software & Support Media GmbH. All rights reserved. No part of this publication may be reproduced, redistributed, posted online, or reused by any means in any form, including print, electronic, photocopy, internal network, Web or any other method, without prior written permission of Software & Support Media GmbH.

The views expressed are solely those of the authors and do not reflect the views or position of their firm, any of their clients, or Publisher. Regarding the information, Publisher disclaims all warranties as to the accuracy, completeness, or adequacy of any information, and is not responsible for any errors, omissions, inadequacies, misuse, or the consequences of using any information provided by Publisher. Rights of disposal of rewarded articles belong to Publisher. All mentioned trademarks and service marks are copyrighted by their respective owners.