Monday, June 13, 2016

Creating Lists with PDFBox-Layout

The last article gave you a brief introduction on what you can do with PDFBox-Layout. The new release 0.6.0 added support for indentation and lists, and that’s what this article is about.

Indentation

Indentation is often used to structure content, and it is also the base for creating lists. Let's start with a simple example using the Indent element.

paragraph.addMarkup(
    "This is an example for the new indent feature. Let's do some empty space indentation:\n",
        11, BaseFont.Times);
paragraph.add(new Indent(50, SpaceUnit.pt));
paragraph.addMarkup("Here we go indented.\n", 11, BaseFont.Times);
paragraph.addMarkup(
    "The Indentation holds for the rest of the paragraph, or... \n",
    11, BaseFont.Times);
paragraph.add(new Indent(70, SpaceUnit.pt));
paragraph.addMarkup("any new indent comes.\n", 11, BaseFont.Times);

So what do we do here? We add an indent of 50pt width. This indent is automatically inserted after each newline until the end of the paragraph… or until a new indent is inserted. That's what we do next: we insert an indent of 70pt:

indention

An indent may also have a label (after all, this is the foundation for lists). By default the label is right aligned, as this makes sense for lists. But you may specify an alignment to fit your needs:

paragraph = new Paragraph();
paragraph
    .addMarkup(
        "New paragraph, now indentation is gone. But we can indent with a label also:\n",
        11, BaseFont.Times);
paragraph.add(new Indent("This is some label", 100, SpaceUnit.pt, 11,
    PDType1Font.TIMES_BOLD));
paragraph.addMarkup("Here we go indented.\n", 11, BaseFont.Times);
paragraph
    .addMarkup(
        "And again, the Indentation holds for the rest of the paragraph, or any new indent comes.\nLabels can be aligned:\n",
        11, BaseFont.Times);
paragraph.add(new Indent("Left", 100, SpaceUnit.pt, 11,
    PDType1Font.TIMES_BOLD, Alignment.Left));
paragraph.addMarkup("Indent with label aligned to the left.\n", 11,
    BaseFont.Times);
paragraph.add(new Indent("Center", 100, SpaceUnit.pt, 11,
    PDType1Font.TIMES_BOLD, Alignment.Center));
paragraph.addMarkup("Indent with label aligned to the center.\n", 11,
    BaseFont.Times);
paragraph.add(new Indent("Right", 100, SpaceUnit.pt, 11,
    PDType1Font.TIMES_BOLD, Alignment.Right));
paragraph.addMarkup("Indent with label aligned to the right.\n", 11,
    BaseFont.Times);
document.add(paragraph);

indentionWithLabel

Lists

As already said, indentations were introduced in order to support lists, so let's build one. It's nothing but indentation with a label, where the label is a bullet character:

paragraph = new Paragraph();
paragraph.addMarkup(
    "So, what can you do with that? How about lists:\n", 11,
    BaseFont.Times);
paragraph.add(new Indent(bulletOdd, 4, SpaceUnit.em, 11,
    PDType1Font.TIMES_BOLD, Alignment.Right));
paragraph.addMarkup("This is a list item\n", 11, BaseFont.Times);
paragraph.add(new Indent(bulletOdd, 4, SpaceUnit.em, 11,
    PDType1Font.TIMES_BOLD, Alignment.Right));
paragraph.addMarkup("Another list item\n", 11, BaseFont.Times);
paragraph.add(new Indent(bulletEven, 8, SpaceUnit.em, 11,
    PDType1Font.TIMES_BOLD, Alignment.Right));
paragraph.addMarkup("Sub list item\n", 11, BaseFont.Times);
paragraph.add(new Indent(bulletOdd, 4, SpaceUnit.em, 11,
    PDType1Font.TIMES_BOLD, Alignment.Right));
paragraph.addMarkup("And yet another one\n", 11, BaseFont.Times);

list

Ordered Lists with Enumerators

Ordered lists are quite helpful to structure and reference text. You already have all the ingredients to build an ordered list: just use increasing numbers as labels, and that's it. What would I need an API for to do that?!? How about a list with roman numerals:
RomanEnumerator e1 = new RomanEnumerator();
LowerCaseAlphabeticEnumerator e2 = new LowerCaseAlphabeticEnumerator();
paragraph = new Paragraph();
paragraph.addMarkup("Also available with indents: Enumerators:\n", 11,
    BaseFont.Times);
paragraph.add(new Indent(e1.next() + ". ", 4, SpaceUnit.em, 11,
    PDType1Font.TIMES_BOLD, Alignment.Right));
paragraph.addMarkup("First item\n", 11, BaseFont.Times);
paragraph.add(new Indent(e1.next() + ". ", 4, SpaceUnit.em, 11,
    PDType1Font.TIMES_BOLD, Alignment.Right));
paragraph.addMarkup("Second item\n", 11, BaseFont.Times);
paragraph.add(new Indent(e2.next() + ") ", 8, SpaceUnit.em, 11,
    PDType1Font.TIMES_BOLD, Alignment.Right));
paragraph.addMarkup("A sub item\n", 11, BaseFont.Times);
paragraph.add(new Indent(e2.next() + ") ", 8, SpaceUnit.em, 11,
    PDType1Font.TIMES_BOLD, Alignment.Right));
paragraph.addMarkup("Another sub item\n", 11, BaseFont.Times);
paragraph.add(new Indent(e1.next() + ". ", 4, SpaceUnit.em, 11,
    PDType1Font.TIMES_BOLD, Alignment.Right));
paragraph.addMarkup("Third item\n", 11, BaseFont.Times);
document.add(paragraph);
enumerators
Enumerators ease the task of generating ordered lists programmatically (and the markup API could not live without 'em ;-). Currently the following enumerators are supported:
  • ArabicEnumerator (1, 2, 3, 4...)
  • RomanEnumerator (I, II, III, IV...)
  • LowerCaseRomanEnumerator (i, ii, iii, iv...)
  • AlphabeticEnumerator (A, B, C, D...)
  • LowerCaseAlphabeticEnumerator (a, b, c, d...)
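Under the hood, an enumerator is just a stateful label generator. To illustrate the idea, here is a self-contained sketch of how something like the RomanEnumerator could work; the class name and code are made up for illustration and are not the library's actual implementation:

```java
public class RomanNumbers {

    private static final int[] VALUES =
        {1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1};
    private static final String[] SYMBOLS =
        {"M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I"};

    private int current = 0;

    /** Returns the next number in the sequence, formatted as a roman numeral. */
    public String next() {
        return toRoman(++current);
    }

    /** Converts a positive integer to its roman numeral representation. */
    static String toRoman(int number) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < VALUES.length; i++) {
            while (number >= VALUES[i]) {
                sb.append(SYMBOLS[i]);
                number -= VALUES[i];
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        RomanNumbers e = new RomanNumbers();
        for (int i = 0; i < 4; i++) {
            System.out.print(e.next() + " "); // prints: I II III IV
        }
        System.out.println();
    }
}
```

The other enumerators work the same way, just with a different label format.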

Markup

The markup API eases the burden of creating these features programmatically, and of course you can do indentation and lists with markup. We're gonna start with some simple indentation. You start the indentation with -- at the beginning of a new line:
"--At vero eos et accusam\n"

Indents run until the end of a paragraph, or until another indentation starts. But you can also explicitly end the indentation with -!:
"-!And end the indentation.\n"
indentation-markup

The default indent is 4 characters, but you can customize this by specifying the desired indentation in pt or em, so the markup --{50pt} will give you an indent of 50pt. Be aware that this size is per indent level, so if you prefix a space, you will get 100pt in this case.
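To make the syntax concrete, here is a hypothetical sketch of how such an indent markup might be parsed; the IndentMarkup class and its parse() method are made up for illustration and are not part of PDFBox-Layout:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IndentMarkup {

    // Matches "--" at the start of a line, optionally followed by "{<size><unit>}".
    private static final Pattern INDENT =
        Pattern.compile("^--(?:\\{(\\d+)(pt|em)\\})?");

    /**
     * Returns the indent size of a line starting with indent markup,
     * or null if the line carries no indent markup.
     */
    static String parse(String line) {
        Matcher m = INDENT.matcher(line);
        if (!m.find()) {
            return null; // no indent markup on this line
        }
        if (m.group(1) == null) {
            return "4em"; // the 4-character default
        }
        return m.group(1) + m.group(2);
    }

    public static void main(String[] args) {
        System.out.println(parse("--{50pt}indented text")); // 50pt
        System.out.println(parse("--indented text"));       // 4em
        System.out.println(parse("plain text"));            // null
    }
}
```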

To create lists with bullets, use the -+ markup:
"-+This is a list item\n"
"-+Another list item\n"

You can specify different levels of indentation just by prefixing the indent markup with one or multiple spaces:
" -+A sub list item\n"

You can customize the indent size and the bullet character (after all, this is the indentation label). Let's do an indent of 8 characters and use >> as the bullet: -+{>>:8em}
list-markup

Enumerated lists are supported also, just use the -# markup:

"-#This is a list item\n"
"-#Another list item\n"
" -#{a:}A sub list with lower case letter\n"
"-#And yet another one\n\n"

enumerators-markup
Again, you can customize the indent size and also the enumeration type. The default type is arabic numbers, but let's use the roman enumerator: -#{I:6em}. The following enumerator types are built in:
  • 1 arabic
  • I roman, upper case
  • i roman, lower case
  • A alphabetic, upper case
  • a alphabetic, lower case
So let's use some custom separators here:

"-#This is a list item\n"
"-!And you can customize it:\n"
"-#{I ->:5}This is a list item\n"
"-#{I ->:5}Another list item\n"
" -#{a ~:30pt}A sub list item\n"
"-#{I ->:5}And yet another one\n\n";
custom-lists-markup

But you may also write your own enumerator and use it in markup; see the class EnumeratorFactory for more details.

Regards
Ralf
The human animal differs from the lesser primates in his passion for lists.
H. Allen Smith

Sunday, April 17, 2016

PDF text layout made easy with PDFBox-Layout

More than a decade ago I was using iText to create PDF documents from scratch. It was quite easy to use, and did all the stuff I needed like organizing text in paragraphs, performing word wrapping and marking up text with bold and italic. But once upon a time Bruno Lowagie - the developer of iText - switched from open source to a proprietary license for reasons I do understand.

So when I now had to do some PDF processing for a new project, I was looking for an alternative. PDFBox is definitely the best open source choice, since it is quite mature. But when I was searching on how to do layout, I found a lot of people looking for exactly those features, and the common answer was: you have to do it on your own! Say what? Ouch. There must be someone out there who already wrote that stuff... Sure there is, but google did not find him. So I started to write some simple word wrapping. And some simple pagination. And some simple markup for easy highlighting with bold and italic. Don't get me wrong: the stuff I wrote is neither sophisticated nor complete. It is drop dead simple, and does the things I need. But just in case someone out there may find it useful, I made it public under the MIT license on GitHub.

column


PDFBox-Layout

PDFBox-Layout acts as a layer on top of PDFBox that performs some basic layout operations for you:
  • word wrapping
  • text alignment
  • paragraphs
  • pagination
The API actually has two parts: the (low-level) text layout API, and the document layout API.

The Text Layout API

The text layout API is intended for direct use with the low-level PDFBox API. You may organize text into blocks, do word wrapping, alignment, and highlight text with markup. Means: most features described in the remainder of this article may be used directly with PDFBox without the document layout API. For more details on this API see the Text API Wiki page. What the document layout API gives you as a surplus is paragraph layout and pagination.

The Document Layout API

The ingredients of the document layout API are documents, paragraphs and layouts. It is intended to easily create complete PDF documents from scratch, and performs things like word wrapping, paragraph layout and pagination for you.
Let's start with a simple example:

document = new Document(Constants.A4, 40, 60, 40, 60);

Paragraph paragraph = new Paragraph();
paragraph.addText("Hello Document", 20, PDType1Font.HELVETICA);
document.add(paragraph);

final OutputStream outputStream = 
    new FileOutputStream("hellodoc.pdf");
document.save(outputStream);

We start with creating a Document, which acts as a container for elements like e.g. paragraphs. You specify the media box - A4 in this case - and the left, right, top and bottom margin of the document. The margins are applied to each page. After that we create a paragraph which is a container for text fragments. We add a text "Hello Document" with the font type HELVETICA and size 20 to the paragraph. That's it, let's save it to a file. The result looks like this:

hello

Word Wrapping

As already said, you can also perform word wrapping with PDFBox-Layout. Just use the method setMaxWidth() to set a maximum width, and the text container will do its best to not exceed the maximum width by word wrapping the text:

Paragraph paragraph = new Paragraph();
paragraph.addText(
    "This is some slightly longer text wrapped to a width of 100.", 
    11, PDType1Font.HELVETICA);
paragraph.setMaxWidth(100);
document.add(paragraph);

wrapped1

If you do not specify an explicit max width, the document's media box and the margins dictate the max width for a paragraph. Means: you may just write text, write text and more text without the need for any line breaks, and the layout will do the word wrapping in order to fit the paragraph into the page boundaries.
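To illustrate the idea behind word wrapping, here is a simplified, self-contained sketch. It wraps greedily by character count, whereas the library measures the actual text width in pt; all names here are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;

public class WordWrap {

    /** Greedy word wrap: fills each line with words up to maxChars characters. */
    static List<String> wrap(String text, int maxChars) {
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        for (String word : text.split("\\s+")) {
            // Start a new line if the word (plus a separating space) would not fit.
            if (line.length() > 0 && line.length() + 1 + word.length() > maxChars) {
                lines.add(line.toString());
                line.setLength(0);
            }
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(word);
        }
        if (line.length() > 0) {
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : wrap("This is some slightly longer text", 12)) {
            System.out.println(line);
        }
    }
}
```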

Text-Alignment

As you might have already seen, you can specify a text alignment on the paragraph:
Paragraph paragraph = new Paragraph();
paragraph.addText(
    "This is some slightly longer text wrapped to a width of 100.", 
    11, PDType1Font.HELVETICA);
paragraph.setMaxWidth(100);
paragraph.setAlignment(Alignment.Right);
document.add(paragraph);

wrapped-right

The alignment tells the draw method what to do with extra horizontal space, where the extra space is the difference between the width of the text container and the width of the line. This means that the alignment is effective only in the case of multiple lines. Currently, Left, Center and Right alignment is supported.
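How the extra space translates into a drawing offset for each line can be sketched like this; this is an illustration of the concept, not the library's code:

```java
public class AlignmentOffset {

    enum Alignment { Left, Center, Right }

    /**
     * Offset to add to a line's x position, given the width of the
     * text container and the actual width of the line.
     */
    static float offset(Alignment alignment, float containerWidth, float lineWidth) {
        float extraSpace = containerWidth - lineWidth;
        switch (alignment) {
            case Right:  return extraSpace;       // push the line to the right edge
            case Center: return extraSpace / 2f;  // split the extra space evenly
            default:     return 0f;               // Left: no offset
        }
    }

    public static void main(String[] args) {
        // A 60pt line in a 100pt container leaves 40pt of extra space.
        System.out.println(offset(Alignment.Left, 100, 60));   // 0.0
        System.out.println(offset(Alignment.Center, 100, 60)); // 20.0
        System.out.println(offset(Alignment.Right, 100, 60));  // 40.0
    }
}
```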

Layout

The paragraphs in a document are sized and positioned using a layout strategy. By default, paragraphs are stacked vertically by the VerticalLayout. If a paragraph's width is smaller than the page width, you can specify an alignment with a layout hint:

document.add(paragraph, 
    new VerticalLayoutHint(Alignment.Left, 10, 10, 20, 0));

You can combine text and paragraph alignment any way you want:

aligned

An alternative to the vertical layout is the Column-Layout, which allows you to arrange the paragraphs in multiple columns on a page.

Document document = 
    new Document(Constants.A4, 40, 60, 40, 60);
 
Paragraph title = new Paragraph();
title.addMarkup("*This Text is organized in Columns*", 
    20, BaseFont.Times);
document.add(title, VerticalLayoutHint.CENTER);
document.add(new VerticalSpacer(5));

// use column layout from now on
document.add(new ColumnLayout(2, 10));

Paragraph paragraph1 = new Paragraph();
paragraph1.addMarkup(text1, 11, BaseFont.Times);
document.add(paragraph1);
...

column

But you may also set an absolute position on an element. If this is set, the layout will ignore this element, and render it directly at the given position:

Paragraph footer = new Paragraph();
footer.addMarkup("This is some example footer", 6, BaseFont.Times);
footer.setAbsolutePosition(new Position(20, 20));
document.add(footer);

Pagination

As you add more and more paragraphs to the document, the layout automatically creates a new page if the content does not fit completely on the current page. Elements have different strategies for how they are divided across multiple pages. Text is simply split by lines. Images may decide to either split, or - if they fit completely on the next page - to introduce some vertical spacer in order to be drawn on the next page. Anyway, you can always insert a NEW_PAGE element to trigger a new page.
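The line-by-line splitting of text can be sketched with a simplified model that counts lines per page, ignoring the pt-based height calculation the library actually performs; all names here are made up:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class Pagination {

    /** Splits text lines into pages, each holding at most maxLinesPerPage lines. */
    static List<List<String>> paginate(List<String> lines, int maxLinesPerPage) {
        List<List<String>> pages = new ArrayList<>();
        for (int i = 0; i < lines.size(); i += maxLinesPerPage) {
            // Copy the slice so each page is independent of the source list.
            pages.add(new ArrayList<>(
                    lines.subList(i, Math.min(i + maxLinesPerPage, lines.size()))));
        }
        return pages;
    }

    public static void main(String[] args) {
        List<String> lines =
            Arrays.asList("line 1", "line 2", "line 3", "line 4", "line 5");
        // 5 lines with 2 lines per page yields 3 pages.
        System.out.println(paginate(lines, 2).size()); // 3
    }
}
```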

Markup

Often you want to use just some basic text styling: use a bold font here, some words emphasized with italic there, and that's it. Let's say we want to use different font types for the following sentence:

"Markup supports bold, italic, and even mixed markup."

If you want to do that using the standard API, it would look like this:

Paragraph paragraph = new Paragraph();
paragraph.addText("Markup supports ", 11, PDType1Font.HELVETICA);
paragraph.addText("bold", 11, PDType1Font.HELVETICA_BOLD);
paragraph.addText(", ", 11, PDType1Font.HELVETICA);
paragraph.addText("italic", 11, PDType1Font.HELVETICA_OBLIQUE);
paragraph.addText(", and ", 11, PDType1Font.HELVETICA);
paragraph.addText("even ", 11, PDType1Font.HELVETICA_BOLD);
paragraph.addText("mixed", 11, PDType1Font.HELVETICA_BOLD_OBLIQUE);
paragraph.addText(" markup", 11, PDType1Font.HELVETICA_OBLIQUE);
paragraph.addText(".\n", 11, PDType1Font.HELVETICA);
document.add(paragraph);

That's annoying, isn't it? That's what the markup API is intended for. Use * to mark bold content, and _ for italic. Let's do the same example with markup:

Paragraph paragraph = new Paragraph();
paragraph.addMarkup(
    "Markup supports *bold*, _italic_, and *even _mixed* markup_.\n", 
    11, 
    PDType1Font.HELVETICA, 
    PDType1Font.HELVETICA_BOLD,
    PDType1Font.HELVETICA_OBLIQUE,
    PDType1Font.HELVETICA_BOLD_OBLIQUE);
document.add(paragraph);

To make things even easier, you may specify only the font family instead:

paragraph = new Paragraph();
paragraph.addMarkup(
    "Markup supports *bold*, _italic_, and *even _mixed* markup_.\n",
    11, BaseFont.Helvetica);

markup
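Conceptually, the markup parser toggles a bold and an italic flag as it scans the text, and picks one of the four font variants for each run of characters. Here is a hypothetical sketch of that idea (not the library's actual parser; the class name is made up):

```java
import java.util.ArrayList;
import java.util.List;

public class MarkupStyles {

    /** Returns the font variant used for each text run, in document order. */
    static List<String> styles(String markup) {
        List<String> runs = new ArrayList<>();
        boolean bold = false, italic = false;
        StringBuilder current = new StringBuilder();
        for (char c : markup.toCharArray()) {
            if (c == '*' || c == '_') {
                // A control character ends the current run...
                if (current.length() > 0) {
                    runs.add(name(bold, italic));
                    current.setLength(0);
                }
                // ...and toggles the corresponding style flag.
                if (c == '*') bold = !bold; else italic = !italic;
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            runs.add(name(bold, italic));
        }
        return runs;
    }

    private static String name(boolean bold, boolean italic) {
        if (bold && italic) return "BOLD_OBLIQUE";
        if (bold) return "BOLD";
        if (italic) return "OBLIQUE";
        return "PLAIN";
    }

    public static void main(String[] args) {
        System.out.println(styles("Markup supports *bold* and _italic_."));
    }
}
```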


That’s it

This was a short overview on what PDFBox Layout can do for you. Have a look at the Wiki and the examples for more information and some visual impressions.

Monday, January 4, 2016

Avoid Vertical Limits in Microservice Architectures

The microservice architecture allows us to partition an application into tiny sub-applications, which are easy to maintain and to deploy. This pattern is already widely adopted to implement backend systems. But the frontend is usually still one large application, at least when it comes to deployment. This article describes some thoughts on how to address this problem.
The microservice architecture is en vogue, everybody seems to know all about it, and feels like having to spread the truth. Including me ;-) But honestly, we are just about to learn how to cope with this kind of architecture. I guess we are like a kid that just managed to make some first steps, when we suddenly try to start running… that's usually the moment when you fall over your own feet. Microservices are no free lunch; they definitely have their price (that's the part we have already learned). For developers it feels uncomfortable, since instead of developing one application, they have to deal with dozens or hundreds. Usually the microservice believers are the people who drive and maintain an application over a long period of time; those poor chaps who know the pain of change. And that is where their trust – or better: hope – in microservices comes from.

So what is this article about? I’m currently working in a web project for a customer, where we are using Spring Boot to create a microservice-based backend. We heavily use the Job DSL to drive our CI pipeline, and also all kind of configuration and tooling. And the front end is a single page application built with AngularJS. So far, so good, but there is a catch. One goal of this microservice idea is to have independent development teams, that drive features within their own development cycle. In the backend this is feasible due to the microservice architecture. But the front end is still one large application. Angular allows to partition the application into modules, but after all it is assembled to one application, so this is a single point of integration. Let me explain that point using a practical example. It is a stupid-simple web shop, stripped down to the bones. It is just a catalog of items, which we can put into a shopping cart:



I have prepared the example as a Spring Boot based application in the backend, with some AngularJS in the frontend. You will find the code on GitHub; the version we start with is on the branch one-ui. All projects have maven and gradle builds, and the readme will provide you enough information to run the application.
Just one note before we start: I'm not a web developer. In the last 20 years I have built – besides backend systems – various UIs in Motif, Qt, Java AWT, Swing and SWT. But I have never done web frontends; there has never been the chance or need for that. So please be merciful if my angular and javascript looks rather… er, surprising to you ;-)
The architecture of our stupid simple web shop looks like this:

initial2

The browser accesses the application via a router, which limits access to the backend, helps with the CORS problem, and may also perform some security checks. The web-server hosts the AngularJS application and the assets. The AngularJS application itself communicates with the API-Gateway, which encapsulates the microservices in the backend, and provides services and data optimized for the UI. The API-Gateway pattern is often used in order not to bloat the microservices with functionality where it does not belong. We will see its advantages later on. The gateway talks to the backend microservices, which then perform their dedicated functionality using their own isolated databases (the databases have been omitted for simplicity in the example code).

So far nothing new, so what's the point? Well, what is that microservice idea all about? Ease of change. Make small changes easy to apply and deploy. Minimize the risk of change by having to deploy only a small part of the application instead of a large one-size-fits-all application. In the backend this is now widely adopted. The microservices are tiny, and easy to maintain. As long as their interfaces to other services remain compatible, we may replace them with a new version in a running production environment, without affecting the others. And – if done right – without downtime. But what about the frontend? The frontend is mostly still one large application. If we make a small change to the UI, we have to build and redeploy the whole thing. Even worse, since it is one application, bug fixes and features are often developed in parallel by large teams, which makes it difficult to release minor changes separately. After all, that's what the microservice story is all about: dividing your application into small sub-applications which can be developed and deployed separately, giving each its own lifecycle. And consequently, this pattern should be applied to the complete application, from database via services to the frontend. But currently, our microservice architecture thinking ends at the UI, and that's what I call a vertical limit.

We investigated how other people are dealing with this situation, and – no wonder – there were a lot of complaints about the same problem. But surprisingly, most advice on how to address this was to use the API-Gateway… er, but this does not solve the problem?!? So let's think about it: we are dividing our backend into fine grained microservices in order to cope with changes. Consequently, we should do exactly the same in the UI. We could partition our UI into components, where features make up the logical boundaries. Let's take our silly web shop example to exercise this. We separate this UI at least into two logical components: the catalog showing all available items, and the shopping cart. Angular provides a mechanism to build components: directives! The cart and the catalog are already encapsulated in angular directives (what a coincidence ;-). So what if we put those parts each on its own web-server? Let's see how our architecture would look like:

multipleUI_oneGateway2

Hmm, obviously the API gateway is still a common denominator. The gateway's job is to abstract from the backend microservices and to serve the UI in the best suited way. So consequently, we should divide the gateway also. But since it is so closely related to the UI, we are packaging 'em both into one service in our example. Let's do so; you will find the code for this version on the branch ui-components. Now the architecture looks like this:

 componentsDashed

Hey wait! I've had a look at your UI code. It's not just the directives, there are also angular services. And the add-button in the catalog directive is effectively calling addItem() on the cart-service. That's quite true. But is that a problem? In our microservice backend, services are calling other services also. This is not a problem, as long as the service's API doesn't change, resp. remains compatible. The same holds on the javascript side. As long as our API doesn't change (remains compatible), new service rollouts and internal changes are not a problem. But we have to design these interfaces between components wisely: (angular) services may be used by other components, so we have to be careful with changes. Another way of communication between components is broadcasting events. Actually the cart component is firing an event every time the cart changes, in order to give other components a chance to react on that. So we have to be cautious with changes to events also. New resp. more data is not a problem; removing or renaming existing properties is. So we simply put all functionality dealing with the cart on the cart web-server, and the catalog stuff on the catalog web-server… including their dedicated directives, services, assets and whatsoever. Our communication looks like this:

communication

Big words and bloated architecture. It ain’t worth it! You think so? Let’s make a small change to our application and examine its impact: Our customer does not like our shopping cart: “It just shows each items article number, and its amount. That’s not quite helpful. The user needs the article’s name…its price…and the sum of all items in the cart.”


Well, yeah, she's right. So let's fix that. But the cart microservice does not provide more information. Article names and prices are the catalog service's domain. So the cart service could call the catalog service and enrich its data? But that's not the cart's domain; it should not have to know about that. No, providing the data in a way appropriate for the UI is the job of the API gateway (finally it pays ;-). So instead of delegating all calls to the cart service, the cart's API gateway merges the data from the cart with the catalog data to provide cart items with article names and prices:

cartGateway
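Such a merge in the cart's API gateway could be sketched like this; all types and names here are hypothetical and do not appear in the example project:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class CartGateway {

    // Hypothetical data types; the real project uses its own DTOs.
    record Article(String name, double price) {}
    record CartItem(String articleId, int amount) {}
    record EnrichedItem(String name, double price, int amount) {}

    /** Merges raw cart items with catalog data, so the UI gets names and prices. */
    static List<EnrichedItem> enrich(List<CartItem> cart, Map<String, Article> catalog) {
        return cart.stream()
                .map(item -> {
                    Article article = catalog.get(item.articleId());
                    return new EnrichedItem(article.name(), article.price(), item.amount());
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Article> catalog = Map.of("4711", new Article("Rubber Duck", 4.99));
        List<CartItem> cart = List.of(new CartItem("4711", 2));
        System.out.println(enrich(cart, catalog));
    }
}
```

In the real project the catalog data would come from a call to the catalog service rather than an in-memory map, but the merge itself stays in the gateway.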

Now we simply adapt the UI code of the cart in order to use and visualize this additional data. You will find the source code of this final version on the master branch. All changes we made are isolated in the cart UI resp. its API gateway, so we only have to redeploy this single server. Let's do so. And if we now reload our application, tada:


So, we made a change to the shopping cart without affecting the remaining application. Without redeploying the complete UI. All the advantages we take the microservice burden on our shoulders for now pay off in the frontend also. Just treat your frontend as a union of distinct features. In agile development it is all about features resp. stories, and now we are able to develop and release features in their own lifecycle. Hey wait, the index.html still assembles the cart and the catalog to a single page. So there is still a single point of integration! Yep, you're right. This is still one UI, but we have componentized it. Once a component is referenced in a page, that's it. Any further changes to the internals of that component do not affect the hosting page. Our microservices are also referenced by other parties, so this is quite similar.

The point is to avoid vertical limits in this kind of architecture. We have to separate our application into small sub-applications in all layers, namely persistence, service backend and frontend. In the backend this cut is often domain-driven, and in the front end we can use features to partition our application. For sure there are better ways and technologies to implement that, but I guess this is a step into the right direction.
We learn from failure, not from success!
Bram Stoker

1) This was poor attempt to avoid the word monolith, which is – at the time of writing – mistakenly flavored with a bad smack.

Wednesday, December 9, 2015

Java and JSON ain't friends

Most application frameworks provide some REST support, which is – depending on the language you are using – either dirt cheap, or quite complex. In the Java world part of these frameworks is some kind of mapping from JSON to Java and vice versa, most of them using the Jackson mapping framework. It feels quite natural: you model your domain objects directly in Java. If you don’t have any constraints, the JSON might even follow your model. If the JSON is predefined (as part of the API), you can either design your Java classes so they fit the generated JSON, or provide Jackson with some mapping hints to do so. But you know all that, right? So what am I talking about here?

The point is: domain models may vary in size from a few properties to x-nested sky high giants… and so do the resulting Java model classes. What makes things even worse is that domain models change over time. Often you don't know all the requirements up front; also, requirements change over time. So domain models are subject to change. All that is still not a big problem, as long as you don't need to communicate those domain models to others. If other parties are involved, they have to adapt to the changes. Let's take the following JSON describing a person:
{
 "id":"32740748234",
 "firstName":"Herbert",
 "lastName":"Birdsfoot",
}
We can write a simple Java class that will de-/serialize from/to this JSON:
public class Person {

 private String id;
 private String firstName;
 private String lastName;

 public String getId() {
  return id;
 }

 public void setId(String id) {
  this.id = id;
 }

 public String getFirstName() {
  return firstName;
 }

 public void setFirstName(String firstName) {
  this.firstName = firstName;
 }

 public String getLastName() {
  return lastName;
 }

 public void setLastName(String lastName) {
  this.lastName = lastName;
 }
}
We can write a simple test to verify proper mapping to JSON:
public class PersonTest {

 private final ObjectMapper mapper = new ObjectMapper();
 private String personJsonString;

 @Before
 public void setUp() throws Exception {
  personJsonString = IOUtils.toString(this.getClass()
    .getResourceAsStream("person.json"));
 }

 @Test
 public void testMapJsonToPerson() throws Exception {
  final Person person = mapper.readValue(personJsonString, Person.class);
  checkPerson(person);
 }

 protected void checkPerson(final Person person) {
  assertNotNull(person);
  assertEquals("32740748234", person.getId());
  assertEquals("Herbert", person.getFirstName());
  assertEquals("Birdsfoot", person.getLastName());
 }
}
If we run the test, everything is nicely green:

PersonTestGreen

That was easy. But now we have new requirements: we need to extend our person with some address data:
{
 "id":"32740748234",
 "firstName":"Herbert",
 "lastName":"Birdsfoot",
 "address":{
  "street":"Sesamestreet",
  "number":"123",
  "zip":"10123",
  "city":"New York",
  "country":"USA"
 }
}
If we run our test against that JSON, it will turn red:

PersonTestRed

If you have a look at the StackTrace, Jackson is complaining about unknown properties
com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException: 
Unrecognized field "address" (class rst.sample.Person), not marked as 
ignorable (3 known properties: "lastName", "id", "firstName"])
 at [Source: {
 "id":"32740748234",
 "firstName":"Herbert",
 "lastName":"Birdsfoot",
 "address":{
  "street":"Sesamestreet",
  "number":"123",
  "zip":"10123",
  "city":"New York",
  "country":"USA"
 }
}; line: 5, column: 13] (through reference chain: rst.sample.Person["address"])
 at com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException.
    from(UnrecognizedPropertyException.java:51)
    ...
Now we have two choices. We can extend our Java model by the missing properties. That's quite easy. And if the producer of that JSON is using Java too, we might just copy their model, can't we? Well… you may. I have seen this in a current microservice project: people have been passing model classes around on every change, often asking for some common domain model lib. Don't do that. Never ever. First of all, ask yourself: are you really interested in the data introduced by the change, or are you only adapting to satisfy the serialization? If you need all the data, you have to adapt. If you don't need the data, don't adapt; there are mechanisms to prevent serialization complaints. There is e.g. an easy way to tell Jackson to ignore superfluous properties:
@JsonIgnoreProperties(ignoreUnknown=true)
public class Person {
...
If you run your test again, everything is green. In fact, application frameworks utilizing Jackson, like e.g. Spring, configure Jackson to always ignore unknown properties, since this makes sense in most situations. But be aware of it, since it is not explicit to you.

PersonTestGreen

So what about that anger about common domain model libs? I have seen this quite often in customer projects: developers start to create some common domain model lib, so everyone in the project may use it. The point is: over time people are extending the domain models with their specific needs… whether anyone else needs it or not. And this leads to models bloated with all kinds of niche domain functionality, depending on a whole lot of other bloated domain objects. Don't do it. Duplicate the models and let every party evolve its own view of the domain.
You have to choose where to pay the price of complexity. Because DDD is about reducing complexity in the software, the outcome is that you pay a price with respect to maintaining duplicate models and possibly duplicate data.
Eric Evans, Domain-Driven Design
But – as always – things might not be that easy. What if you do not care about that superfluous data, but have to pass it on to another party? Hey, we give them the person ID, so they can retrieve all the data they want on demand. If that is the case, you are safe. But sometimes you don’t want to pass data by handing over a (foreign) key, which actually means: by reference. Depending on the business case, you may have to pass a snapshot of the current data, that is: by value. So what about that case, do I have to copy the model classes again in order to specify all possible properties?!? Damn, in dynamic languages like JavaScript or Clojure the JSON is generically “mapped” to the object, and I do not have to care about any class schema. Couldn’t we do that in Java also, at least in some – well, less comfortable – way? If you search online for solutions to that problem, you will often find this one:
public class Person {
 
 private String id;
 private String firstName;
 private String lastName;
 private final Map<String, Object> map = Maps.newLinkedHashMap();
...
 @JsonAnySetter
 public void add(final String key, final Object value) {
  map.put(key, value);
 }

 @JsonAnyGetter
 public Map<String, Object> getMap() {
  return map;
 }
}
Let’s give this solution a try. We will extend our test in order to check if the JSON output created by serialization equals the original input:
public class PersonTest {
...
 @Test
 public void testMapJsonToPersonToJson() throws Exception {
  final Person person = mapper.readValue(personJsonString, Person.class);
  final String newJson = mapper.writeValueAsString(person);
  JSONAssert.assertEquals(personJsonString, newJson, true);
 }
}
Let it run and, tada:

PersonTestRed2

… it fails, eh?!? Yep, the solution described above works for simple properties, but not for nested ones. So we’ve got to do better: instead of Object, use Jackson’s JsonNode:
public class Person {
...
 private final Map<String, JsonNode> map = Maps.newLinkedHashMap();
...
 @JsonAnySetter
 public void add(final String key, final JsonNode value) {
  map.put(key, value);
 }

 @JsonAnyGetter
 public Map<String, JsonNode> getMap() {
  return map;
 }
}
Run the test again, and <drumroll>:

PersonTestGreen

Phew, green :-)

So this is the solution that solves our problem. It is both robust against changes – as long as the properties we are actually using are not subject to change – and reconstructs the original JSON as we received it. Work done.

That’s it for today
Ralf
The only way to have a friend is to be one.
Ralph Waldo Emerson

Monday, September 14, 2015

Use MTOM to Efficiently Transmit Binary Content in SOAP

Currently JSON-based REST services are en vogue, but when it comes to integrating enterprise services, SOAP is still widely used. In a recent project I had to deal with binary content sent to a third-party SOAP service. Thanks to great tooling, calling a SOAP service is not a big deal. But the binary data varied in size from a few kB to many MB, and this brought up some issues in transmission size and memory usage. That's where MTOM comes to the rescue: a standard for efficiently transmitting binary data in a SOAP request. This article published on DZone describes what MTOM can do for you by converting a tiny Spring client-server project from default SOAP to MTOM.

Wednesday, June 10, 2015

Spring Integration Tests with MongoDB rulez


While unit testing is always preferable, integration tests are a good and necessary supplement, be it for end-to-end tests or for tests involving (third-party) backends. Databases are a candidate where integration tests make sense: usually we encapsulate persistence with some kind of repository service layer, which we can mock in tests running against the repository. But when it comes to testing the repository itself, integration tests are quite useful. Spring integration tests allow you to test functionality against a running Spring application, and thereby against a running database instance. But just as in unit tests, you have to perform a proper set-up of test data, and clean up the database afterwards. That's what this article published on DZone is about: proper database set-up and clean-up in Spring integration tests with MongoDB.

Monday, April 13, 2015

Job DSL Part III

The previous part of this little series on the Job DSL gave you some examples on maintenance, automating the job creation itself and creating views. This last installment will complete the little round trip through the Job DSL with some hints on documentation, tooling and pitfalls.

Documentation

If you search the internet for the Job DSL, one of the first hits will be the corresponding wiki. This is the most valuable source of information. It is well structured and maintained, so new features and missing pieces are filled in regularly. If you are looking for details on jobs, the job reference is your target. If you’d like to generate a view, there is a corresponding view reference.

job-dsl-wiki-png

Job DSL Source

The documentation on the Job DSL is quite extensive, but so is the Job DSL itself. The maintainers are steadily closing the gaps, but sometimes a piece of information is missing. A prominent example: enumeration values. There are some attributes that only accept a definite set of values, so you have to know them. Let’s take the CloneWorkspace Publisher as an example. There are two attributes with enumeration values: String criteria = 'Any', String archiveMethod = 'TAR'

cloneWorkspace-wiki

But what about all the other values that are acceptable for criteria and archiveMethod? The documentation (currently) says nothing about that. In cases like this, the easiest thing is to have a look at the source code of the Job DSL:

cloneWorkspace-source

Ah, there you go: criteria accepts the values Any, Not Failed and Successful, and archiveMethod accepts TAR and ZIP. But how can you find the appropriate source for the Job DSL? If you have a look at the Job DSL repository, you will find three major packages: helpers, jobs and views. As the name implies, jobs contains all the job types, and views the different view types. All other stuff like publishers, scm, triggers and the like is located in helpers, so that’s usually the place to start your search. Our CloneWorkspace Publisher is – good naming is priceless – a publisher, so if we step down from the helpers to the publishers package: ta-da, here it is :-)
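Putting those enumeration values to use, a job with a CloneWorkspace publisher might look like the following sketch. The method name and parameter order are what I read from the source shown above (workspace glob, exclude glob, criteria, archive method), and the job name is made up:

```groovy
job('sample-build') {
    // ... scm, triggers, steps etc. omitted ...
    publishers {
        // criteria: 'Any', 'Not Failed' or 'Successful'
        // archiveMethod: 'TAR' or 'ZIP'
        publishCloneWorkspace('**/*', '', 'Successful', 'ZIP')
    }
}
```

If the DSL ever rejects one of these values, the error message will usually list the accepted ones, which is another quick way to discover them.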

See you at the playground

Sometimes it’s not easy to get your DSL straight. Examples are outdated, you do not get the point, or you just have a typo. Anyway, you type your DSL into the Jenkins editor, save your change and retry again and again, until you fix it. But all this is quite time consuming, and developers are impatient creatures: we are used to syntax highlighting and incremental compilation while-u-write. This kind of typing feels a bit historical, so there should be something more adequate, and here it is: the Job DSL Playground is a web application that lets you type some DSL (with syntax highlighting) into an editor on the left side, and shows the corresponding Jenkins config.xml on the other side:

playground

Using the playground has two major benefits. First: no edit-save cycles, so you are much faster. Second: you see the generated configuration XML, which can be useful when you set up a DSL by reverse engineering, that is: you have an existing configuration and you want to create a DSL generating exactly that one. I highly recommend you give it a try, it’s pretty cool.

Nuts and Bolts

The Job DSL is a mature tool and bugs are rare, but sometimes the devil is in the detail. I’d like to introduce you to some pitfalls I fell into when working with the Job DSL… and how to work around ‘em.

ConfigSlurper

The ConfigSlurper is a generic Groovy DSL parser, which we have used in our examples to parse the microservice.dsl.
def microservices = '''
microservices {
  ad {
    url = 'https://github.com/ralfstuckert/jobdsl-sample.git'
    branch = 'ad'
  }
  billing {
    url = 'https://github.com/ralfstuckert/jobdsl-sample.git'
    branch = 'billing'
  }
  // ... more microservices
}
'''

def slurper = new ConfigSlurper()
def config = slurper.parse(microservices)

// create job for every microservice
config.microservices.each { name, data ->
  createBuildJob(name,data)
}


If you try to use the ConfigSlurper like this in Jenkins you will get an error message:

Processing provided DSL script
ERROR: Build step failed with exception
groovy.lang.MissingMethodException: No signature of method: groovy.util.ConfigSlurper.parse() is applicable for argument types: (script14284650953961421329905) values: [script14284650953961421329905@1563f9f]
Possible solutions: parse(groovy.lang.Script), parse(java.lang.Class), parse(java.lang.String), parse(java.net.URL), parse(java.util.Properties), parse(groovy.lang.Script, java.net.URL)

A possible solution is parse(String)?!? Well, that’s what we do, isn’t it? After searching for a while I stumbled over a post which explained that there is a problem with the ConfigSlurper in the Job DSL, and the workaround is to fix the class loader:

def slurper = new ConfigSlurper()
// fix classloader problem using ConfigSlurper in job dsl
slurper.classLoader = this.class.classLoader
def config = slurper.parse(microservices)

Ah, now it works :-)  This problem may be fixed by the time you are reading this, but just in case you experience this bug, you now have a workaround.

Losing the DSL context

When I tried out nesting nested views for the post Bringing in the herd, I stumbled over the following problem: you sometimes lose the context of the Job DSL when nesting closures. My first attempt was to just nest some nested views. I invented a new attribute group in the microservice.dsl, so I can assign a microservice to one of the (fictional) groups backend, base or frontend. For each of these groups a nested view is created. These group views are supposed to contain a nested view for each microservice in that group, which in turn contains the build pipeline view. Say what?!? The following pictures show the target situation:

nested-overview

nested-base

nested-base-help

That’s what I wanted to build, so I started straight ahead. I used the groupBy() Groovy method to create a map with the group attribute as keys and the corresponding microservices as values. Then I iterate over these groups and create a nested view for each. Within each group, I iterate over the contained microservices and create a nested Build Pipeline View:

// create nested build pipeline view
def microservicesByGroup = config.microservices.groupBy { name,data -> data.group } 
nestedView('Build Pipeline') { 
   description('Shows the service build pipelines')
   columns {
      status()
      weather()
   }
   views {
      microservicesByGroup.each { group, services ->
          nestedView("${group}") {
            description('Shows the service build pipelines')
            columns {
               status()
               weather()
            }
            views {
               services.each { name,data ->
                  view("${name}", type: BuildPipelineView) {
                     selectedJob("${name}-build")
                     triggerOnlyLatestJob(true)
                     alwaysAllowManualTrigger(true)
                     showPipelineParameters(true)
                     showPipelineParametersInHeaders(true)
                     showPipelineDefinitionHeader(true)
                     startsWithParameters(true)
                  }
               }
            }
         }
      }
   }   
}

Makes sense, doesn’t it? But what came out is this:

nested-bad

Ooookay. The (nested) Build Pipeline views are on the same nesting level as our intermediate group views backend, base and frontend. If you have a look at the generated config.xml, you will see that there is only one <views> element, and all <view> elements are actually children of that element… what happened? Obviously, creating the Build Pipeline view has been applied to the outer NestedViewsContext. I don’t know too much about Groovy, but closure code is applied to the delegate, so the delegate seems to be wrong here. Let’s see if we can fix that by applying the view creation to the correct delegate:

def microservicesByGroup = config.microservices.groupBy { name,data -> data.group } 
nestedView('Build Pipeline') { 
   description('Shows the service build pipelines')
   columns {
      status()
      weather()
   }
   views {
      microservicesByGroup.each { group, services ->
         view("${group}", type: NestedView) {
            description('Shows the service build pipelines')
            columns {
               status()
               weather()
            }
            views {
               def viewsDelegate = delegate
               services.each { name,data ->
                  // Use the delegate of the 'views' closure 
                  // to create the view.
                  viewsDelegate.buildPipelineView("${name}") {
                     selectedJob("${name}-build")
                     triggerOnlyLatestJob(true)
                     alwaysAllowManualTrigger(true)
                     showPipelineParameters(true)
                     showPipelineParametersInHeaders(true)
                     showPipelineDefinitionHeader(true)
                     startsWithParameters(true)
                  }
               }
            }
         }
      }
   }   
}

So now we explicitly use the surrounding views closure’s delegate to create the view, and…yep, now it works:

nested-overview

If you now inspect the config.xml, you will actually find an outer <views> and three inner <views> representing the groups, where each group contains the <view> elements for the Build Pipelines. Fixing the delegate is not a cure for cancer, but it will save your day in situations like these.
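Schematically, the corrected config.xml now nests like this (element names abbreviated to the essentials; the real file uses the full plugin class names):

```xml
<views>                <!-- outer 'Build Pipeline' nested view -->
  <view>               <!-- group view, e.g. 'base' -->
    <views>
      <view/>          <!-- one Build Pipeline view per microservice -->
      <view/>
    </views>
  </view>
  <view>...</view>     <!-- likewise for 'backend' and 'frontend' -->
</views>
```

Comparing this shape against the flat single-`<views>` output of the first attempt is the quickest way to verify the delegate fix took effect.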

Done

That’s all I’ve got to say about the Job DSL :-)

Have a nice day 
Ralf
Sure it's a big job; but I don't know anyone who can do it better than I can.
John F. Kennedy
Update 08/17/2015: Added fixes by rhinoceros in order to adapt to Job DSL API changes