Rhize Up

Rhize Up w/ David Schultz: UNS and Event Driven Architecture (feat. Andy German and Geoff Nunan)

David Schultz

David: Let’s revisit some of the topics we visited from the UNS. Just as you describe it, there’s a lot going on with the manufacturing data hub, but it’s just to ensure that there’s good fidelity of the data, once somebody needs to consume it.

I think that’s really the value we’re trying to create here; getting data in a database is not hard, collecting data is not necessarily hard, it’s always my ability to ask the data questions and to get actionable stuff out, so there’s a lot that goes into that. But, you mentioned a word, “orchestrates.” Geoff, I want to talk a little bit about a topic we talked about earlier in the UNS, this idea of orchestration versus choreography. 

To me, orchestration would suggest that if I’m the conductor of an orchestra, I’m going to be that central point—that anything that occurs when it’s time for the violins to play, I’m going to bring the violins in or when it’s time for another part of the orchestra section, they need to do something different, I’m going to orchestrate that. It means that there’s the conductor who is involved in all aspects of what’s happening with an orchestra, versus choreography.

ORCHESTRATION VS. CHOREOGRAPHY

If we think of choreography in terms of dance, the violins in this case are going to do something based on what, say, the cellos just did, but there’s really no method to ensure that that occurs. So when you talk about these concepts, orchestration versus choreography, what does that mean within the context of the manufacturing data hub?

Geoff: Yeah, it’s a good question. What it means is two different approaches to how logic happens. It’s best demonstrated with an example. So if I go back to the car that I was talking about and, you know, this scenario of monitoring the engine temperature and I want to have a service, some code, some way that listens for the engine temperature going above some sort of limit and books my car into the mechanic, if that happens. Probably says something about how old my car is, now that I’m thinking about this. 

So if we think about that example, we’ve got telemetry, we’ve got some rules about, you know, temperature going above a set point, and we’ve got some function that’s doing something about it. In choreography, what that means is that all of that’s really in one service. So one bit of code is deciding what it’s going to listen to, what the rules are for what it should do about it, and then going and doing the thing. And maybe doing that thing is publishing some other event that some other service is going to be listening to, to do its thing. 

So I might have two different bits of code: one listening to the temperature, detecting the alarm condition, and publishing an alarm, and another listening to the alarm and doing its thing of saying, when that alarm happens, I need to book the car in for a service. But the logic of the subscription, knowing what I need to listen to, sits with the code that’s taking the action. That’s choreography. 
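A minimal Python sketch of the choreography Geoff describes, under stated assumptions: the in-process `EventBus`, the topic names, the 110 °C limit, and the two service classes are all hypothetical stand-ins, not anything from the Rhize platform. The point to notice is that each service registers its own subscription and carries its own rule.

```python
from collections import defaultdict
from typing import Callable

TEMP_LIMIT_C = 110.0  # assumed limit, purely illustrative


class EventBus:
    """Tiny in-process publish/subscribe bus (a stand-in for a real broker)."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: dict) -> None:
        for handler in list(self._subscribers[topic]):
            handler(payload)


class AlarmService:
    """Owns its subscription, its rule, and its action (publishing an alarm)."""

    def __init__(self, bus: EventBus) -> None:
        self.bus = bus
        bus.subscribe("car/engine/temperature", self.on_temperature)

    def on_temperature(self, payload: dict) -> None:
        if payload["celsius"] > TEMP_LIMIT_C:
            self.bus.publish("car/engine/alarm", {"car_id": payload["car_id"]})


class BookingService:
    """Also owns its subscription: it decides by itself to listen for alarms."""

    def __init__(self, bus: EventBus) -> None:
        self.bookings = []
        bus.subscribe("car/engine/alarm", self.on_alarm)

    def on_alarm(self, payload: dict) -> None:
        self.bookings.append(payload["car_id"])


bus = EventBus()
alarm_service = AlarmService(bus)
booking_service = BookingService(bus)
bus.publish("car/engine/temperature", {"car_id": "ABC-123", "celsius": 95.0})
bus.publish("car/engine/temperature", {"car_id": "ABC-123", "celsius": 121.5})
```

Nothing in this sketch has the full picture: to learn why a booking happened, you would have to read both services.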

Orchestration, which you said is the conductor up front, separates those two things out. The subscription, listening to the event is separate from taking the action. So, we publish an event, but there’s a central orchestrator that’s saying when this event occurs, that action should happen, but the action is not part of the logic that’s saying, you know, when this event occurs, that action should happen. Those rules are separate from taking the action. 

Now, why is that important? What difference does it make? Well, both approaches can get you a system that books your car in for a service when the engine gets too hot, but there are very different impacts of the two. In the choreography approach, you’ve got two systems that are logically decoupled; they’re kind of separate systems and they’re both listening to their own events, determining what they’re going to do, and then going and doing it. 

It’s very difficult to trace through from one end to the other, what happened from, you know, the temperature in the car through to booking the service. In a simple example like I’ve just given you can probably look at the logs and look at the events and figure out what happened. But let’s say you’ve got hundreds of different services doing very complex stuff. It’s very difficult to trace through, from one end to the other, what happened here and if something goes wrong, it’s very difficult to figure out why it went wrong. 

The bigger you build a system like that, the more brittle it becomes when you’re relying on completely decoupled services working with contracts. You can’t really know what the impact of changing a contract is going to be across all of the other services in your system. An orchestration pattern decouples what should happen from going and doing it. But because you’ve got this conductor at the front who’s got the full picture, joining the dots across all the services, you’ve got a central place to look, trace, and create the paper trail: the temperature went high, and that created an event. The event came to the orchestrator. The orchestrator said, okay, that temperature went high, we need to now put the car in for a service, so I’ll call the service that goes and does that action. 

You’ve got the end-to-end thread if you like. I don’t want to use digital thread, but it’s kind of that concept, of what happened then what action was taken and what event did that result in, and then what happened in response to that event. So I can see the end-to-end process. That’s essentially the difference between choreography and orchestration. 
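For contrast, here is a hedged Python sketch of the orchestration pattern as described above. The `Rule` and `Orchestrator` classes, the event name, and the 110 °C threshold are illustrative assumptions. The rules live in the orchestrator, the action is a plain function that knows nothing about events, and the orchestrator keeps the end-to-end trace.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Rule:
    event_type: str                    # which event the rule reacts to
    condition: Callable[[dict], bool]  # when the rule applies
    action_name: str                   # which action to call


@dataclass
class Orchestrator:
    actions: dict = field(default_factory=dict)  # action name -> callable
    rules: list = field(default_factory=list)
    trace: list = field(default_factory=list)    # the central paper trail

    def handle(self, event_type: str, payload: dict) -> None:
        self.trace.append(f"event {event_type} received for {payload['car_id']}")
        for rule in self.rules:
            if rule.event_type == event_type and rule.condition(payload):
                self.trace.append(f"rule matched, calling {rule.action_name}")
                result = self.actions[rule.action_name](payload)
                self.trace.append(f"{rule.action_name} returned: {result}")


def book_service(payload: dict) -> str:
    """The action: no subscription logic, no rules, just the work."""
    return f"service booked for {payload['car_id']}"


orchestrator = Orchestrator(
    actions={"book_service": book_service},
    rules=[Rule("engine_temperature", lambda p: p["celsius"] > 110.0, "book_service")],
)
orchestrator.handle("engine_temperature", {"car_id": "ABC-123", "celsius": 95.0})
orchestrator.handle("engine_temperature", {"car_id": "ABC-123", "celsius": 121.5})
```

Reading `orchestrator.trace` afterwards gives the sequence of event, decision, and result in one place, which is the traceability benefit Geoff points to.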

David: So it’s I guess, what I’m getting at here is it’s not necessarily a good versus bad, it’s that they are two different ways of accomplishing the same thing, but we want to be mindful of what is the impact. For instance orchestration might require more services, it might require a little bit heavier of a system, but it ensures delivery, versus choreography; it separates out, it decouples, it has lighter overhead, or less overhead, or it potentially could. They both could be applied, so I think the important thing here is to understand when to use one versus the other. 

Geoff: Absolutely, and so much of what everybody does in manufacturing is pattern-based.

We see a problem, there’s probably somebody who’s seen that problem before. There’s already a couple of different relevant patterns on how that can be solved. And our role as architects is kind of saying, well, what’s the relevant pattern that could be applied to this problem, and when should I choose one and not the other?

David: Sure. Perfect. So, you know, thinking of some design considerations on that. You know, Andy, another topic we talked about within the unified namespace conversation we had was this idea of composability versus inheritance. Even though these are generally more attributed to when you’re doing programming, I think it’s important here to understand around, say, data models. So inheritance, you know, say I want to model a pump and I’m going to create a base pump or my core pump, and then I’m going to build all my other pumps off of it. So maybe I’ll have a standard pump, I’ll have a complex pump, or you know, something along those lines, but they’re all going to inherit from that core value. 

The benefit is that if I need to make a change to that pump core, then the others will inherit that, you know, versus composability, where all my pumps are going to stand alone, and it gives me a lot of flexibility in what is the data, what does that data model look like? 

This to me, I think, becomes another important, you know, design consideration. You know, where we landed is you know, absolutely use composability to make sure that all your data models can stand on their own. I think that’s a general rule of thumb, but, you know, maybe there can be some times where we want to use inheritance. So, you know, again, like I asked Geoff, you know, what does this mean in terms of the context of a manufacturing data hub? How do we apply these to contexts in there?

COMPOSABILITY VS. INHERITANCE

Andy: Yeah, there’s a couple of quite abstract concepts there. To try and bring a concrete example to composability versus inheritance, look within ISA-95, within the equipment hierarchy. We’ve got a couple of key ways to expose how equipment is configured and composed and what properties it has and that kind of thing. ISA-95 allows for equipment, which are effectively equipment instances, and equipment classes, and the equipment classes themselves define what properties and what attributes certain types of equipment should have.

In the equipment class hierarchy, what we do is we build a hierarchy of objects. So it may be that we’ve got an object of type, a pharmaceutical vessel, for example. This object type might have a volume attribute, and then a little bit further down the chain, we might have a pharmaceutical vessel, but this pharmaceutical vessel is a fermentation tank. It would inherit properties from the parent. It would have a volume, but it would also have a valve that might be a dump valve on there or something like that that allows you to move the materials into the next part of the process. 

So we’ve got the fermentation vessel, and any equipment instance that is defined as a type of fermentation vessel would automatically inherit the properties of the pharmaceutical vessel and the fermentation vessel. So the equipment instance would have two properties that have been inherited from the equipment class hierarchy. When we consider properties of equipment, what we do at Rhize is we look at the class that has been implemented and we work up the hierarchy and collect all the attributes from the parents, grandparents, and great-grandparents of those particular equipment classes, and we apply those to the instance.

Something we also support is multiple inheritance, so an equipment can inherit from the fermentation vessel class, but it could also inherit from an equipment type which was only capable, for example, of something like that which would lend it a bunch more properties. We talked about inheritance properties and methods and that kind of thing, when these objects, this equipment, the actual instance of equipment is instantiated, inherits all of these properties.
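The "work up the hierarchy and collect the attributes" idea can be sketched in a few lines of Python. This is an illustration only, not Rhize's implementation: `EquipmentClass`, `class_properties`, and `instance_properties` are hypothetical names, and `HeatedEquipment` is an invented second class used to show multiple inheritance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EquipmentClass:
    name: str
    properties: tuple = ()  # property names defined at this level
    parents: tuple = ()     # parent EquipmentClass objects


def class_properties(equipment_class, collected=None):
    """Collect property names from all ancestors first, then from the class itself."""
    if collected is None:
        collected = []
    for parent in equipment_class.parents:
        class_properties(parent, collected)
    for prop in equipment_class.properties:
        if prop not in collected:
            collected.append(prop)
    return collected


def instance_properties(classes):
    """An instance may belong to several classes (multiple inheritance)."""
    collected = []
    for equipment_class in classes:
        class_properties(equipment_class, collected)
    return collected


pharmaceutical_vessel = EquipmentClass("PharmaceuticalVessel", ("volume",))
fermentation_vessel = EquipmentClass(
    "FermentationVessel", ("dump_valve",), (pharmaceutical_vessel,)
)
heated_equipment = EquipmentClass("HeatedEquipment", ("jacket_temperature",))

single = instance_properties([fermentation_vessel])
multiple = instance_properties([fermentation_vessel, heated_equipment])
```

Here `single` holds the two inherited properties from the example (volume and the dump valve), and `multiple` adds the property lent by the second class.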

There’s an element of composition in there as well, because we’ve got this multiple inheritance, which means we can inherit from different classes. The composition side of this also is we need to look at how we organize objects within a structure, within a hierarchy. If you look at an inheritance hierarchy as a hierarchy, this equipment is our fermentation vessel. What you would also want to do is understand what equipment is contained within other equipment. 

This goes back to the classic way of looking at equipment hierarchies: we’ve got enterprise, site, area, line, and we’ve got equipment that contains other pieces of equipment. So we might have a production line that has a fermentation vessel and a heating vessel, and that has a different type of vessel, I’m not sure what that might be. We’re building not only an inheritance hierarchy on one side that brings properties to an individual piece, but the composition hierarchy allows us to set up parents and siblings in terms of position and containment within that. So those are probably the two key areas that I want to bring to light. 
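The containment side can be sketched separately from inheritance. In this hedged Python example the `Node` class and every name in the tree (the enterprise, site, area, and line names) are made up for illustration; only the level names come from the conversation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    name: str
    level: str  # e.g. "enterprise", "site", "area", "line", "equipment"
    children: list = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, child: "Node") -> "Node":
        """Place a child inside this node and return the child."""
        child.parent = self
        self.children.append(child)
        return child

    def path(self) -> str:
        """Position in the containment hierarchy, from the root down."""
        names = []
        node = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return "/".join(reversed(names))

    def siblings(self) -> list:
        if self.parent is None:
            return []
        return [n.name for n in self.parent.children if n is not self]


enterprise = Node("Acme Pharma", "enterprise")
site = enterprise.add(Node("Plant 1", "site"))
area = site.add(Node("Fermentation", "area"))
line = area.add(Node("Line 1", "line"))
fermenter = line.add(Node("Fermentation Vessel", "equipment"))
heater = line.add(Node("Heating Vessel", "equipment"))
```

Note that this tree says nothing about which properties a vessel has; that comes from the class hierarchy, which is why the two structures are kept apart.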

We’ve also got an inversion of that composition hierarchy when we need it. If you’ve got, for example, a hierarchy of vehicles, in the BMW range of vehicles, you might have a 3 series and you might have a 5 series and you might have an X5 or something like that. Those would all inherit properties from the upper class, for example, which might be that these are all cars and they all have doors and they all have wheels and that kind of thing. Contained within BMW 3 series would be an engine, and that engine would have properties of its own. It may have an engine speed, for example, an RPM on that. 

You would want to look at the relationship between the car and the engine, and understand where inheritance falls down in that instance. The engine wouldn’t inherit the properties of the car, even though it is a part of the car; it’s contained within the car itself, and it’s the car that has still got four wheels. What you could say about the car is that, if the RPM is at 3000 on the engine, then the car has an RPM of 3000. We’ve got this kind of inheritance on the way down, and also this sort of pushing of property values upwards to parent objects. We’ve also got this idea that the objects can contain other objects. It starts to become abstract. If you’re in a project and you’re trying to solve problems and trying to understand how objects are interacting with each other and how you need to place this data that’s coming in from the shop floor, then you need more than just one tool to be able to do that, to be able to express yourself in a correct way. 
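One possible reading of "pushing property values upwards" is a lookup that falls through to contained parts. The Python sketch below is an assumption about how that could work, with a hypothetical `Part` class; it is not a description of any particular platform.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Part:
    name: str
    values: dict = field(default_factory=dict)  # property name -> value
    parts: list = field(default_factory=list)   # contained parts

    def resolve(self, prop: str) -> Optional[float]:
        """Return own value if present, otherwise the first value found in a contained part."""
        if prop in self.values:
            return self.values[prop]
        for part in self.parts:
            found = part.resolve(prop)
            if found is not None:
                return found
        return None


engine = Part("engine", {"rpm": 3000})
car = Part("BMW 3 Series", {"wheels": 4, "doors": 4}, [engine])

car_rpm = car.resolve("rpm")           # value pushed up from the contained engine
engine_wheels = engine.resolve("wheels")  # containment does not push values down
```

The lookup only travels from container to contents, so the car can report the engine's RPM while the engine never acquires wheels.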

You do need to have a composition hierarchy. You need to have a top-down inheritance hierarchy. You need to have the inverted inheritance hierarchy, and you probably also need to be able to express relationships between objects that are more like a network than a specific hierarchy.

I think that that general domain and those discussions, certainly ISA-95, allows you to sort of explore this problem domain with a richness of tools, that allows you to sort of arrive at a solution that’s sophisticated enough to really represent what you’ve got. As you said, you know, two key tenets of that, composability versus inheritance, that’s kind of a starting point of how you would want to think about how these objects are related. 

That’s just the equipment hierarchy. If you start talking about recipes and process segments and that kind of thing, you get this sort of repeating network effect, but actually the patterns are the same. You just need the vocabulary there to be able to deal with that and further the discussion.

Geoff: We were talking about equipment properties there: whether it’s the car or the engine, the 3000 RPM is a property of the car and a property of the engine. When you start to think about it, I’m going back to the Uber broker functions here, a lot of us are talking about my Uber broker, I need to be able to do stuff. When you talk about behavior of things as well as the properties of things, that’s where it starts to get really interesting with composability versus inheritance. 

If I look at the engine and I say ‘I want this engine to have the behavior that, at whatever temperature, it automatically books you in for a service,’ I need to be able to place that behavior on that piece of equipment. But not all engines might have that; maybe it’s an optional behavior that I want to put in. It does add this extra layer of sophistication to the requirements of the platform when you say, yeah, I’ve got this property modeling of the equipment to say what properties it has, but then I’ve also got this behavior dimension to it: what behaviors does it have? That’s where you really end up using both the composability and the inheritance model to get what you need done.

In the Rhize platform, the rule engine is associated to the classes of things. So we define the rules and the behaviors on classes of things. If you model your classes as behaviors, you can then compose a piece of equipment by saying, okay, this piece of equipment has this behavior, and that behavior and that behavior. Technically in the platform you need to have quite a complicated capability which has multiple inheritance, which is a well-known computer science problem. If your platform has that, you can do all this kind of stuff.
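As a rough illustration of "classes as behaviors", the following Python sketch attaches a rule to each class and composes a piece of equipment from the classes it belongs to. `BehaviorClass`, `Equipment`, the rule functions, and the thresholds are all hypothetical; the real rule engine is not shown in this conversation.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class BehaviorClass:
    name: str
    rule: Callable[[dict], Optional[str]]  # returns an action name, or None


def over_temperature_booking(state: dict) -> Optional[str]:
    return "book service" if state.get("engine_celsius", 0.0) > 110.0 else None


def low_fuel_warning(state: dict) -> Optional[str]:
    return "warn driver" if state.get("fuel_litres", float("inf")) < 5.0 else None


@dataclass
class Equipment:
    name: str
    behaviors: list  # the behavior classes this piece of equipment is composed from

    def evaluate(self, state: dict) -> list:
        """Run every attached behavior's rule and collect the resulting actions."""
        actions = []
        for behavior in self.behaviors:
            action = behavior.rule(state)
            if action is not None:
                actions.append(action)
        return actions


auto_booking = BehaviorClass("AutoServiceBooking", over_temperature_booking)
fuel_warning = BehaviorClass("LowFuelWarning", low_fuel_warning)

premium_engine = Equipment("premium engine", [auto_booking, fuel_warning])
basic_engine = Equipment("basic engine", [fuel_warning])  # optional behavior omitted

state = {"engine_celsius": 120.0, "fuel_litres": 30.0}
premium_actions = premium_engine.evaluate(state)
basic_actions = basic_engine.evaluate(state)
```

The same hot-engine state books a service for one engine and does nothing for the other, because the behavior is optional and was only composed onto the first.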

David: Yeah, absolutely. For me, it’s in the labs. Where I was going with this is that it’s a data governance issue. It’s the ‘thou shalt’ versus the ‘thou may,’ and where I look at this conversation is that there are always implications for whatever decision we’re making here, and you really need to think through what it means to inherit. I would call that the ‘thou shalt’ piece, versus the composition or the composability, the ‘thou may’: you can extend this thing out, but you need to at least have these certain core things when you describe it. After that, hey, use it however you want and however it makes sense for that use case. I think it’s important to understand these concepts for sure. 

So one of the topics, Geoff, that we’ve talked about is the concept of event-driven architectures. It’s almost like when we talk about event-driven, you know, whether it be Pub/Sub or there’s an event that occurred, I want to make sure we understand exactly what we mean by that. There’s certain patterns that emerge in this. 

There’s a gentleman, Martin Fowler. He did a YouTube talk, and we’ll link that here below, that identified four patterns that are associated with this concept of ‘event-driven.’ Of course, there’s event notification, which is, you know, the car arrived, or there’s event-carried state transfer, which is here’s the payload that’s associated with it, so I’m going to give you the information about it, or even event sourcing of where it came from. Then you get into this CQRS or command query responsibility segregation. There’s some concepts that go along with that. 

Can you walk us through what all these things mean, so that we understand when we’re talking event driven, these are the types of things we’re looking at?

4 EVENT-DRIVEN ARCHITECTURES

Geoff: Yeah, and I think if we can, we’ll put a link to the presentation from Martin Fowler, the recognized expert in event-driven architectures generally, into the thread here. It’s a really good topic to explore. What is an event-driven architecture? Because we talk about it, right? I may say UNS is a platform for an event-driven architecture. It’s not just publishing when something changes; that’s an incredibly simplistic description of event-driven architecture. So what is it? 

Well, it turns out it’s actually this umbrella term that can mean kind of at least four different things we can be talking about. What are these four different things? Well, we can be talking about notification of something. 

 

EVENT NOTIFICATION

This is the telemetry use case. For those of us that have been in SCADA systems and machine control, mostly what we’re talking about there is telemetry, which is a notification that something has changed. It can be as simple as that. 

As you go from raw telemetry data into more complex events—I’ve had an over temperature alarm in my car—then it gets more context and there’s more state attached to that event. We can still be talking about a notification there, I’ve had an over temperature alert in my car, and I can publish just the car’s registration plate. It doesn’t tell me any context, it just gives me an ID that I can use to go and get the context if I want it. So that’s event notification. 
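A small Python sketch of event notification as described here, with assumed names throughout: the event carries only a type and the registration plate, and the consumer has to call back to a source system (`fetch_car_context`, a hypothetical lookup over made-up records) if it wants context.

```python
# Hypothetical source-of-truth records that live outside the event.
CAR_RECORDS = {
    "ABC-123": {"odometer_km": 84210, "last_service": "2023-11-02"},
}

lookup_calls = []  # records each call back to the source system


def fetch_car_context(car_id: str) -> dict:
    lookup_calls.append(car_id)
    return CAR_RECORDS[car_id]


def make_notification(car_id: str) -> dict:
    """The notification carries an ID and nothing else."""
    return {"type": "engine.over_temperature", "car_id": car_id}


def handle_notification(event: dict) -> str:
    context = fetch_car_context(event["car_id"])  # consumer goes and gets the context
    return f"book {event['car_id']} for service (last serviced {context['last_service']})"


notification = make_notification("ABC-123")
message = handle_notification(notification)
```

The event stays tiny, at the cost of one extra lookup per consumer that needs more than the ID.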

 

EVENT-CARRIED STATE TRANSFER

There’s this other pattern of event-driven architecture, which is event-carried state transfer, where if I wanted another system to be able to fully respond to this over temperature alert in my car, I might need to give it some more information, some more context than just the registration plate of my car. Maybe that’s enough, but maybe it needs some more. So, how high was the temperature? How long did it stay high for? How many kilometers or miles, depending on which country you’re in, had the car traveled? When was it last serviced? All this state, this context of what happened at that point in time. Maybe I want to actually transfer that from one system to another system. So this is event-carried state transfer, which is a different version of event-driven architecture. 
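The state transfer pattern described above can be sketched as an event that carries its own context. The field names and the decision thresholds in this Python example are assumptions chosen to mirror the questions in the paragraph (how high, how long, how far, when last serviced).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OverTemperatureEvent:
    car_id: str
    peak_celsius: float       # how high was the temperature?
    seconds_above_limit: int  # how long did it stay high?
    odometer_km: int          # how far had the car traveled?
    last_service: str         # ISO date of the last service


def decide(event: OverTemperatureEvent) -> str:
    """The consumer decides from the payload alone, with no call back to the source."""
    if event.peak_celsius > 125.0 or event.seconds_above_limit > 300:
        return "book urgent service"
    return "book routine service"


brief_spike = OverTemperatureEvent("ABC-123", 118.0, 40, 84210, "2023-11-02")
long_overheat = OverTemperatureEvent("ABC-123", 121.5, 600, 84350, "2023-11-02")

brief_decision = decide(brief_spike)
long_decision = decide(long_overheat)
```

Compared with a bare notification, the payload is larger, but the receiving system can act even if the source system is unreachable.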

 

EVENT SOURCING

Some people, when they’re talking about event-driven architectures, are actually talking about another version, which is event sourcing. It actually means the collection of events over time to build up state, and this is the bank account use case. My bank account, when I look at the balance of it, shows me a balance, but that balance is calculated based on all the transactions that have occurred, so I’ve had all these events where I’ve spent money and received money. Normally it’s spent, not so often received, but they’re all the events that happen, but, how much money have I got in my account? Well, I add up all the events to figure out how much money I’ve got in the account. 

This is a pattern of event-driven architecture called event sourcing where I never really transfer the state. Each system builds up their own view of the state based on the events that have occurred, so I’m transferring the events and each system is kind of figuring it out for themselves.  
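The bank account example translates almost directly into code. In this minimal Python sketch the `Transaction` type and the amounts are invented; the balance is never stored, only derived by replaying the events. Amounts are integer cents to avoid floating-point rounding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    kind: str   # "deposit" or "withdrawal"
    cents: int  # amount in cents


def balance(events) -> int:
    """Rebuild the current state by adding up every event that has occurred."""
    total = 0
    for event in events:
        if event.kind == "deposit":
            total += event.cents
        elif event.kind == "withdrawal":
            total -= event.cents
        else:
            raise ValueError(f"unknown transaction kind: {event.kind}")
    return total


events = [
    Transaction("deposit", 100_000),
    Transaction("withdrawal", 2_550),
    Transaction("withdrawal", 12_000),
]

current_balance = balance(events)
balance_after_two = balance(events[:2])  # state as of an earlier point in time
```

Because the events are the record, any system holding the same event list can compute its own view, including the state at any earlier point.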

 

COMMAND QUERY RESPONSIBILITY SEGREGATION

And then there’s this fourth, this fourth different thing, which is event driven architecture as well, called command query responsibility segregation, which is a bit of a mouthful, which is a whole different architectural pattern for how to do this kind of at scale, across distributed systems, and separate out the format of the data that might get published in an event from the format that I might want when I’m querying for that data. So allow me, if I think about this in UNS context, it’s what I publish might be in a different format, or I might want it to be in a different format when I subscribe to it than when I publish to it. That’s kind of command query responsibility segregation, CQRS. 
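A deliberately simplified Python sketch of the CQRS idea, assuming made-up function names and an in-memory store: what gets published (individual readings) has a different shape from what gets queried (a per-car summary). Real systems usually update the read model asynchronously and at much larger scale; here it is updated inline to keep the example short.

```python
event_log = []   # write side: readings exactly as published
read_model = {}  # read side: per-car summary shaped for queries


def project(event: dict) -> None:
    """Fold one published reading into the query-friendly summary."""
    summary = read_model.setdefault(
        event["car_id"], {"readings": 0, "max_celsius": None}
    )
    summary["readings"] += 1
    if summary["max_celsius"] is None or event["celsius"] > summary["max_celsius"]:
        summary["max_celsius"] = event["celsius"]


def publish_reading(car_id: str, celsius: float) -> None:
    """Command side: accept data in the publisher's format."""
    event = {"car_id": car_id, "celsius": celsius}
    event_log.append(event)
    project(event)


def query_summary(car_id: str):
    """Query side: return data in the format the consumer wants."""
    return read_model.get(car_id)


publish_reading("ABC-123", 95.0)
publish_reading("ABC-123", 121.5)
summary = query_summary("ABC-123")
```

The publisher never sees the summary format and the consumer never sees raw readings, which is the separation of responsibilities the name refers to.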

So, event-driven architecture can mean this simple event notification, the telemetry use case of ‘here’s my temperature,’ you know, ‘here’s my speed’ or this real-time publishing of current state, or notification of telemetry, but it also can mean any one of these other 3 or 4 quite complex architecture patterns. The more complex or the bigger the organization, or the more complex the use case that you’re looking at, you may need to consider some of these other patterns and ways of doing things.

David: So getting into the event-driven architectures, we tend to use that term very loosely, but there’s a lot that goes into it. I think much like the other topics that we talked about, whether it’s composability or choreography, there’s things that are associated with it, so it’s important to understand these are all the various things that can emerge when we start talking about event-driven architectures and, all that’s involved in that. I think why this is an important topic is just to understand when we say event-driven architecture, what does that mean in terms of what it is that we’re trying to build, what’s the problem we’re trying to solve? 

There’s a lot that goes into that. So, any other thoughts that you had on that, Geoff, for event-driven architectures? 

Geoff: Yeah, lots. I speak about this stuff all the time. There are two related questions I get asked quite often. The first is: you’re using a graph database, so why didn’t you just use one of the existing ones? We kind of had to go down a path of a very specialized database to make this whole thing work, so why didn’t you use Neo4j or one of the other graph databases? 

The answer to that is they’re too slow, and they don’t generally do schema safety and ACID compliance, which is a bit of a techie term that just means keeping the data consistent. They don’t do those things well at scale. When you start to look at these more complex event-driven architecture patterns, they drive very specific requirements into the database. We use a graph database, but it’s a very specialized graph database to make this all work. 

The other question we get is, ‘couldn’t you just do this in an SQL database?’ We kind of talked about that a little bit and said, well, you can at a small scale. Generally, an SQL database will start to struggle at about 12 traversals in a query, so if you’ve got 12 joins in one query, that’s about the maximum of what an SQL query can comfortably handle. A lot of the queries that we do when we talk about command query responsibility segregation, and some of these other event-driven architecture patterns, need to be this kind of really wide traversal with sometimes recursive queries in there that mean that we hit the limits. 

Then you’ve got to think about how do I make this all robust and reliable and how do I get it all to be highly available? Because we’re in manufacturing, you can’t just turn a factory off whenever you want to upgrade it, and those sorts of things. There are event-driven architecture patterns, and those architectures have the answers built into them for how we deal with all this stuff. 

Publishing an event when something changes, yeah that’s a good start, but there’s so much more, so much more that goes in there. 

Our job as software developers and as community members is to try and deal with all the hard stuff, and provide that to people who just want to get on and build applications.

David: Yeah, absolutely. I mean, that’s why we bring it up. I think this is important and it’s not something you want to overlook.
