Rhize Up

Rhize Up w/David Schultz: Defining the UNS (feat. Jeremy Theocharis, Kudzai Manditereza, Aron Semle)

July 15, 2024 · David Schultz · Season 1, Episode 3

David: All right, let’s go ahead and get this fired off. Good morning, good afternoon, good evening, and welcome to the Rhize Up podcast. My name is David Schultz, and today, we’re going to be talking about the Unified Namespace, or UNS for short. I put this together because there still seems to be some confusion. We think we know what it is, but we’re not sure exactly what it is.

So, the intent here is to try to socialize and define a Unified Namespace. As I said earlier, I spent about three hours with some people over the weekend, and we were just hashing it out.

WHAT IS A UNIFIED NAMESPACE?

David: Where we landed is that the Unified Namespace is an approach to an event-driven architecture that includes a Pub/Sub technology, like a broker, and some sort of DataOps tooling that allows us to create data models and publish and subscribe to those data models through some kind of defined topic structure.

That’s where we landed. Hopefully, by the end of this podcast, we’ll be able to gel around a clearer definition. But with that, I am now joined by three very well-known people within the industry. I would call them the experts within Unified Namespace. They all have different approaches to it.

So, let’s go around the room real fast and make introductions. And briefly: how would you describe a Unified Namespace when someone asks you what it is? Aron, we’ll go ahead and start off with you.

Aron: I think you did a great job, David. I think I’ll just steal yours. So, I’m Aron Semle, CTO at HighByte. My background is in industrial manufacturing for the last 15 years. I started at Kepware back in 2010, then PTC, and now I’m at HighByte. So if I had to define Unified Namespace, I think the definition you gave just now, David, is probably a really good one: report by exception, real-time data, and a single source of truth.

And I think the other folks will get into that. It’s nuanced, right? That’s where the confusion is. It’s part of the magic of it, and also the problem with it. But I’m looking forward to the discussion.

David: Yeah, absolutely. Thank you. Welcome, Aron. Jeremy?

Jeremy: Yeah, I’m Jeremy Theocharis, cofounder of United Manufacturing Hub. Because most of our customers are large enterprises, when we go in with United Manufacturing Hub, we need to convince hundreds of people about the Unified Namespace. So, what we like to do is make things very, very boring.

By boring, I mean the Unified Namespace is actually nothing really exciting. It’s just IT best practices applied. So what is it for us? It’s an event-driven architecture with message brokers. Very simple. Very well known. So what is it not? It’s not a database, because a database isn’t real-time. It’s also not point-to-point.

People typically come to understand that the Unified Namespace, at its core, is actually something very boring. A lot of people and industries already do it; they just don’t call it Unified Namespace.

David: Sure, absolutely. And Kudzai.

Kudzai: Cool. Thank you, David, and thanks, everyone. So yeah, my name is Kudzai Manditereza. I’m a developer advocate for HiveMQ. I do a lot of evangelizing for the Unified Namespace. I’m also the founder of Industry40tv, a media education company running a popular YouTube channel and podcast.

Unified Namespace is quite a nuanced topic, but fundamentally it’s about the event-driven architecture behind OT/IT data integration. A lot of times, though, I find that explaining it in mostly technical terms loses some of the audience.

So I think it’s more apt to talk about the Unified Namespace, at least in my case, as that data platform for continuous innovation. And I think we’ll get a little deeper into that. For me, it’s the platform that gives real-time access to all data from all parts of your enterprise, OT and IT, allowing you to continuously innovate on top of that data without a lot of friction around it.

So yeah, excited to be here and looking forward to exploring it some more.

David: Excellent. Well, it’s great to have you here as well.

And so I defined it as an event-driven architecture. There is a broker, and there are some DataOps in there. So Kudzai, could you lead us off by talking a little bit about the broker’s role in this whole process? Why is Pub/Sub so important to a Unified Namespace?

Kudzai: Okay, to set the context, it’s important to talk about how we arrived at this idea of a Unified Namespace. Right? And what are the challenges, really, that we’re trying to solve there? I think that will help us paint a good picture of the idea of Pub/Sub and the broker.

Typically, a company has this idea of becoming a data-driven company, because that helps address the many different data use cases you might want to tackle to maintain a competitive advantage. Whether it’s energy efficiency, OEE, or advanced analytics, traditionally, to address each of those use cases, you need to connect directly to all the different systems on the shop floor and in IT. So, you’ve got this OT/IT integration whereby you’ve got apps connected directly to equipment. Each time you need to address a certain use case, you need to set up that connectivity infrastructure. There’s a cost related to that, and there’s repetition involved. And the time it takes to bring data up to IT, especially from different sources, runs to months, even years.

The idea of the Unified Namespace is to say: you’ve created that hub of information where all these connections are already set up, and all this information is already being pushed to a central repository that gives you a real-time snapshot of the current events and state of your business, without having to worry about the underlying connections. So it’s this idea that you’re pushing data from the edge, and also from IT, into this central repository of information, where the data is in a common format, unified, and accessible through a single interface.

So, this is where the broker comes in. It acts as the platform that allows components or systems to connect from the edge and push into that hub of information, which then gives access to real-time events. And what really makes MQTT, or the broker, ideal for this scenario is that you don’t want to just push data as if you were using a legacy protocol, where the topic is, say, just a tag name. You want to be able to organize the data in a way that can be discovered semantically by anyone who understands the organization and its structure.

So MQTT gives you the ability to organize information using its topic namespace. That’s the part the MQTT broker plays in the Unified Namespace: it’s the source of real-time access to state through a single interface, without you having to connect to a thousand different servers on the shop floor. By connecting to one broker endpoint, you get access to all the unified, normalized data in a way that has already been organized and is being pushed using Pub/Sub communication.

I don’t want to go into much depth about what Pub/Sub is, but at a high level, that’s the broker’s role in a Unified Namespace architecture. At the end of the day, you still need some components that help get the data out. And I think this is where Aron will be able to expand a bit on how you get that data out, and how you marry OT data to IT data within an MQTT information hub.
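
To make that topic-namespace idea concrete, here is a minimal sketch of publishing a modeled payload to an ISA-95-style topic, assuming the Python paho-mqtt client (2.x). The broker address, topic layout, and payload fields are illustrative, not part of any standard.

```python
import json
import time

import paho.mqtt.client as mqtt  # pip install paho-mqtt (2.x assumed)

# Hypothetical enterprise/site/area/line/cell/asset hierarchy.
TOPIC = "acme/dallas/packaging/line4/filler/pump01"

payload = {
    "temperature_c": 64.2,
    "vibration_mm_s": 1.8,
    "timestamp_ms": int(time.time() * 1000),
}

client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
client.connect("broker.example.com", 1883)
client.loop_start()

# retain=True keeps the last value on the broker, so a new subscriber
# immediately sees the current state without waiting for the next event.
client.publish(TOPIC, json.dumps(payload), qos=1, retain=True)

client.loop_stop()
client.disconnect()
```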

David: Before we move on to Aron, I want to make sure that I understand the broker’s role. It’s around this event-driven piece. I need an easy way for edge devices or other systems to instantiate that connection. So that’s one of the first pieces: edge data can connect to something, rather than a client/server model where I have to go out and find all these sources. It enables easy access for all that data to get there. Then, of course, there’s the event-driven part: as new information is available, I don’t have to ask you for it. Pub/Sub enables that. You just send new information to me when you have it, and the broker makes sure it gets out there. Is that a pretty good synopsis of the role of the broker here?

Kudzai: Yes, it is. And maybe just to expand on that a bit, this idea of event-driven: as you mentioned, you don’t get a situation whereby applications are reaching down to the shop floor to ask for data. Instead, it is pushed from the edge into the data infrastructure. What that allows you to do is create this idea of a single source of truth, because all of the data that you’re currently viewing is the data that is actually being pushed from the edge, from the source of the data.

You can easily build your business model around it because it’s not data that has been aggregated through different layers. It’s coming straight from the device at the edge into the data infrastructure. And because you don’t have an application polling to get that information, whenever there’s a network failure, with Pub/Sub the edge component or the device is able to buffer that information and retransmit whenever that connection is restored. In any other scenario, you’d lose that information whenever there’s a break in connection.

So this is the importance of having that broker or Pub/Sub architecture in a Unified Namespace.
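
As a rough illustration of that buffering idea, here is a sketch of an edge-side store-and-forward loop, again assuming paho-mqtt 2.x and a hypothetical broker. paho can also queue QoS>0 messages internally; an explicit buffer just makes the behavior visible and bounded, and a real edge deployment would persist it to disk.

```python
import json
import time
from collections import deque

import paho.mqtt.client as mqtt

# Bounded in-memory buffer so a long outage cannot exhaust memory.
buffer = deque(maxlen=100_000)

client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
client.connect("broker.example.com", 1883)
client.loop_start()  # background thread also handles reconnects

def publish_or_buffer(topic: str, payload: dict) -> None:
    """Queue the reading, then flush as much of the queue as the link allows."""
    buffer.append((topic, json.dumps(payload)))
    while buffer:
        t, msg = buffer[0]
        if client.publish(t, msg, qos=1).rc != mqtt.MQTT_ERR_SUCCESS:
            break  # still offline; keep everything queued for the next try
        buffer.popleft()

while True:
    publish_or_buffer("acme/dallas/packaging/line4/filler/pump01",
                      {"temperature_c": 64.2, "ts": time.time()})
    time.sleep(1)
```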

David: All right. Excellent. Yeah, I certainly don’t want to overlook the importance of the ability to buffer data to ensure we’re getting everything there. So, I guess in that case, make sure that when you’re choosing a broker, you choose something that has that ability and supports all that technology.

So, Kudzai, you mentioned that as you were talking. Oh, go ahead, Jeremy, you had something.

Jeremy: This is more of a scientific explanation. We tend to go a little more theoretical. In one of our most popular blog articles, we actually compared different MQTT brokers. We only got two cease-and-desist letters for that article, and it’s still online in its original form.

In the introduction, we try to give a more scientific reason why you need a message broker in industrial IT systems, or, in general, in any system where you process a lot of data. You always have multiple building blocks; you cannot do everything in one application. And to connect those building blocks, thinking top-down, there are only three options.

You can send data through databases. That’s the first one. So we have all the PLCs, and they all send data into a database. Then you can do some batch processing on it. But as soon as you choose that architecture, with a store-first approach, you cannot do anything with real-time data anymore. All the stream processing, where data comes in and you change the color of a traffic light, is not possible anymore.

The other way you can connect those building blocks is via service calls, so point-to-point. You do a lot of point-to-point connections, and it’s actually a fairly reasonable approach. However, in manufacturing, a lot of these building blocks tend to stay there for ten years. Google or Netflix can do it with point-to-point connections, but they also rewrite their whole applications every three years. Factory systems tend to stay as they are, so you end up with these spaghetti diagrams, because you would never refactor.

And then there’s only the third option, which is called asynchronous message passing, basically a message broker. Unfortunately, that adds a new component that can fail, which is bad. On the other side, it allows everyone to talk to the message broker to exchange information in real time, and then you put the data in a database afterward. Everything has advantages and disadvantages, but in manufacturing, where the other two choices are mostly not a good way to go, a message broker is the only viable option left. And it’s usually a very good option.

David: It always helps when other people provide their own input. Certainly, a picture speaks more than a thousand words. It really helps me understand and unpack this, which is my favorite part of doing podcasts. I learn so much when I’m on them. So, again, I can’t thank everybody enough for their time and participation.

As Kudzai was talking, one of the things that came up was that we want to have these semantic data models, and we need to publish them to a defined topic name structure. Because if I’m just a device, and we all just start doing things willy-nilly, I don’t know that we’ll ever be able to leverage the value of this Unified Namespace.

Aron, I think it’s HighByte that might have even pioneered the term DataOps. So, when I talk to people about HighByte and what it does, I say it’s a DataOps tool. I even mentioned that in my definition. So can you share with us what exactly is a DataOps tool? What is HighByte doing and why is it so critical to the architecture and design of a Unified Namespace?

DATAOPS AND THE UNS

Aron: Yeah. Sure thing, David. I think we started back in 2018. We looked back and did a Google search on industrial DataOps back then, and there were 70 results. I think now there are 300,000. So we were talking about it pretty early. Obviously, that’s changed over time, but it’s still a pretty consistent message.

When we think about DataOps in industry, the classic example of a use case I like to give is a large industrial enterprise. I might have 50 facilities. I should be able to have a team of half a dozen folks, with the people, process, and technology, to manage my data movement from the edge to the cloud, in the cloud, and back to the factory at scale, with the goal of driving the cost of change down to near zero.

So what does that mean? Here’s a classic example. Some of our larger customers have AGVs in their factories—autonomous guided vehicles. So, forklifts drive around with nobody controlling them. A common problem with those is they’ll get stuck somewhere in the factory, and you need to be notified, and someone needs to go get them. It’s kind of like your Roomba getting stuck under your kitchen chair, right? So that’s a use case. It’s a very practical one.

Let’s say you have 20 facilities that have AGVs across three or four vendors. What you do with the DataOps platform in that team of six folks is you can go in and say, “What’s the data we need from these AGVs to be able to detect when they get stuck?”

With that, you create a data model. And that data model is nothing more than maybe its location, the last time it chirped data (maybe it was talking MQTT), and a few other additional bits of information that say, “This is the data we need from the AGV, regardless of the vendor, to enable this use case.”

So, with DataOps, you can quickly enable that in one factory. We’re not talking about months of integration; we’re talking about weeks or less. You can then pipe that data to some system in the cloud. Maybe it’s going to Snowflake or S3, maybe it’s going through MQTT as part of the UNS. Then you’d prove out that use case really quickly with the factory: okay, can we detect these AGVs getting stuck?

Once you’ve done that, you should be able to take that use case, that data model and the technology you use to push that data in, replicate it across all of your factories—all 20—in less than a month, and enable that use case.

At the same time, you might find that you tried that use case, but it delivered no value. You just throw it away. So DataOps is the concept of creating the process, people, and technology that allow me to test these use cases really quickly and then either scale them up or retire them. This is part of the UNS story, too. Once you make this data readily available, people will start to use it, and then they’ll want changes to that data.

They’ll say, “Hey, we don’t need these attributes in the data payload, but could we have these? Or could we scan at a faster rate?” The key to DataOps is ensuring the people, process, and technology can make those changes at scale, at near-zero cost. You should be able to add an additional data attribute, wire it up across all your facilities, and publish it to your end systems with very little work.

That’s the core of DataOps. The part that applies directly to the UNS is that data model, that semantic model, like you said. If I have every PLC just chirping raw data into the UNS, it’s not very usable. Traditionally, factories don’t have this concept of a nice, rich namespace that reflects the logical way we think of the factory, with that structure mapped to the actual data in the PLCs. MQTT and DataOps are a way to model that data at the edge, create those models, and then move them semantically into MQTT, in a way that a human could come in, look at, and say, “This makes sense. I can see the site, the area, the line; this is the machine; this is the temperature sensor, regardless of the vendor.”

It abstracts away the details and the mess of the different implementations and technologies at the edge, so by the time it gets into the broker, it makes logical sense.
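
A toy sketch of the modeling step Aron describes: mapping vendor-specific raw tags into one canonical AGV model so every consumer sees the same shape, regardless of vendor. The tag names and canonical fields here are invented for illustration.

```python
# Two vendors exposing the same physical data under different raw tag names.
RAW_VENDOR_A = {"AGV_POS_X": 12.4, "AGV_POS_Y": 3.1, "LAST_MSG_TS": 1721030400}
RAW_VENDOR_B = {"pos.x": 12.4, "pos.y": 3.1, "heartbeat": 1721030400}

def model_agv(raw: dict, mapping: dict) -> dict:
    """Rename raw tags to the canonical attribute names of the AGV model."""
    return {canonical: raw[tag] for tag, canonical in mapping.items()}

VENDOR_A_MAP = {"AGV_POS_X": "x_m", "AGV_POS_Y": "y_m", "LAST_MSG_TS": "last_seen"}
VENDOR_B_MAP = {"pos.x": "x_m", "pos.y": "y_m", "heartbeat": "last_seen"}

# Both vendors now produce the same payload shape for the "stuck AGV" use case.
assert model_agv(RAW_VENDOR_A, VENDOR_A_MAP) == model_agv(RAW_VENDOR_B, VENDOR_B_MAP)
```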

David: Excellent. So if I understand, we now have this broker. We’re going to use MQTT in our example of a Unified Namespace because it is fairly common. I have client-instantiated connections. I can update data as new information becomes available. And I can also buffer data if, for some reason, there’s a loss of connectivity.

The DataOps piece comes into what’s sometimes called data governance. There’s going to be a data contract between the producers and the consumers of the data. The “contract” says: I’m not just going to willy-nilly publish raw data; I’m going to create these agreed-upon semantic data models. I always use a pump or a compressor, some type of asset, to say this is what that payload is going to look like. And then I’m also going to publish it to a topic structure that has already been defined.
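
One way to picture that data contract is as a schema check on the producer side: a payload either matches the agreed pump model or it doesn’t get published. A minimal sketch, with invented field names:

```python
# The agreed-upon pump model: field names and their expected types.
REQUIRED_PUMP_FIELDS = {"flow_lpm": float, "pressure_bar": float, "running": bool}

def conforms(payload: dict) -> bool:
    """True if the payload carries exactly the agreed fields with the agreed types."""
    return (payload.keys() == REQUIRED_PUMP_FIELDS.keys()
            and all(isinstance(payload[k], t) for k, t in REQUIRED_PUMP_FIELDS.items()))

good = {"flow_lpm": 120.5, "pressure_bar": 4.2, "running": True}
bad = {"flow": 120.5}  # raw, willy-nilly tag: rejected by the contract

assert conforms(good) and not conforms(bad)
```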

So, commonly, we’ll use the master data model. There are a lot of references to ISA-95, using that hierarchy of enterprise, site, area, line, cell. So we want to publish data there. That’s the piece that DataOps takes care of. We now have an enabling technology, but let’s make sure we’re sending data in a way that everybody can understand, so that if I needed to go find data, I would know where to get it. Does that make sense?

Aron: Yeah, exactly. And I think DataOps and UNS overlap quite a bit. DataOps is the flexibility to model and move that data wherever you want. MQTT happens to be a really easy, useful way inside the factory and even out for multiple subscribers to come in and access that data. So, it kind of gives you that semantic ability but also the flexibility to move it wherever it needs to go.

David: So, continuing on with many of the conversations around the UNS, one of the first questions almost everybody asks early on is: is there a database? Where do I store my data? Because if I’m just publishing to it, does that mean the UNS is a database? Actually, no. It’s just the current state of the business, the way I understand it. But maybe, Jeremy, you can help us understand: does a UNS store data? And if it doesn’t, how do you go back and retrieve some of that data?

Jeremy: Yeah. This also comes back to our scientific approach to the message broker. It always sounds a little unnecessary, but in situations like this, it really helps bring the discussion forward. So, as I already said, there are only three ways to connect building blocks. And if you do it through a database, then you don’t have real-time.

So, we’ve already established that the Unified Namespace, at its core, is not a database; it has a message broker. But then come the questions: with all this real-time data, I want to calculate OEE, I want to see the history. Where is the history in it? How would that work?

You have a Unified Namespace with a message broker. You have this DataOps or contextualization layer, which helps you model everything. And then you have an additional database, so that whatever you have in the Unified Namespace, you can also put in a database.

And there, I think, you can go deep down into rabbit holes, because what do you want to store? You can send a lot of information through the Unified Namespace. I would say the easiest thing is time series data. You send it through, and you can define a certain schema: if data on a certain topic is in a certain format, it will then end up in a time series database. And you can apply the same logic to other things. For example, if you have more ISA-95-style information, like work orders, new products, etc., you could store it in a more relational database.

Some people in the community try to combine it all. They say, “Let’s also put the database in there and let’s try to mix it up and put a graph database in there.”

From our experience, that just makes communication very hard. So, there’s a message broker for real-time data, there’s a database, and there’s a microservice component that could also be the DataOps layer or something else. That component subscribes to certain topics on the message broker and then stores the information in the database.

And this is how we’ve seen most companies approach it. Of course, the exact tools change: the type of database, the message broker. But with this approach, we aim to be able to hand it to any IT person, and they will look at it and say, “Okay, yeah, this is boring.”
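
Here is a sketch of that “boring” pattern: a small microservice that subscribes to part of the namespace and hands each message to a database writer. store_point() is a stand-in for whatever time series client you actually use; the broker and topics are assumptions.

```python
import json

import paho.mqtt.client as mqtt

def store_point(topic: str, payload: dict) -> None:
    # Hypothetical sink: replace with your TimescaleDB/InfluxDB insert call.
    print(f"INSERT {topic} -> {payload}")

def on_message(client, userdata, msg):
    try:
        store_point(msg.topic, json.loads(msg.payload))
    except json.JSONDecodeError:
        pass  # non-conforming payloads are skipped, not crashed on

client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
client.on_message = on_message
client.connect("broker.example.com", 1883)
client.subscribe("acme/dallas/#", qos=1)  # everything under one site
client.loop_forever()
```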

Then there’s this step in between: what if I want to store data, but not permanently, only buffering it for a short period of time, for example during connection outages? This gets a little tricky, because you don’t want to use a database (those store things permanently), and I haven’t actually seen an MQTT broker yet with a proper queue system in it.

So we use, for example, Apache Kafka for this as an addition. You could also do it with your own microservice that buffers the data in between. But these are the main considerations. We could go into all kinds of details here. I could talk for hours about this.

David: Thank you for that. So if I understand, we’re limiting the Unified Namespace in how we define it: there’s no persistent data or data storage. It’s purely for the Pub/Sub functionality, using a DataOps tool to make sure we’re bringing semantic data models onto a defined topic structure. And then, depending on how you want to query or retrieve that data, that becomes more a function of what you’re trying to do. Did I understand that right?

Jeremy: I don’t know about the last part. Can you rephrase that?

David: Yeah. So the endpoint depends on how I want to consume or query that data. That’s not something I’m necessarily going to retrieve from the Unified Namespace; there’s going to be a database or some other piece of technology where I go to retrieve the data. It’s just that everything has traversed the UNS, and there’s an agreement on the data I’m going to find. Is that right?

Jeremy: Yeah, exactly. And what makes it really difficult is ensuring a consistent data model across everything. I’m still writing an article on this; it’s half-published. It takes a lot of time, because you have data models in the PLCs themselves, then you have a data model in the UNS, and then you have the data model in the database. And you want people on the PLC side to still understand whatever ends up in the database. So this is something that is really challenging: designing your data model so it works across all three of those components.

David: Yeah, I think that’s a very fair statement. And actually, one of the things that I wanted to talk to Aron about is the ideas behind building our data models. This is very common in programming: we have this concept of inheritance versus composition.

INHERITANCE VS. COMPOSITION IN DATA MODELS

David: When I’m thinking of creating these data models, I might be inheriting from a base model. So I’ll go back to my easy example, the pump. I might just have a basic pump model, but some of the assets in my equipment, some of the pumps, now have more information available on them.

Do I want them to be a child of a pump? Or should I just create two completely separate, independent models for this type of data? Or do I start bringing in other data models and nesting them within an overall model? Suddenly, it can get pretty cumbersome and complicated, on top of the three data models you already mentioned, Jeremy. So, from a DataOps standpoint, Aron, how do you generally approach data modeling across all these various scenarios?

Aron: Yeah. I look at those two extremes. You could start really simply and just model the bare minimum of what you need. We see some customers do that. We think they probably have more success being use-case-driven, as I mentioned with the AGV case. Start there. Go. Learn. Adapt.

You’ll have other customers that go to the other extreme. I would call that digital twin, where it’s almost academic. They’ll spend six months trying to model the factory perfectly, using UML and a bunch of other things, and then trying to break that back into semantic models. They’ll end up with the perfect model, go to implement it, and find out it’s really difficult. And hey, the end systems can’t use it anyway, because it has too much hierarchy, or we messed up here.

The sweet spot is probably somewhere in the middle. In terms of inheritance versus composition, we have a whole modeling guidebook. I think we wrote it a year or two ago, and it’s actually really useful. So I encourage people to go look at that. There’s a lot of practical advice there in terms of modeling, which is new to the space.

But between inheritance and composition, we tend to lean more towards composition. And this is kind of academic as well. The thing with inheritance is that you create a hierarchy as a result of it.

As you mentioned, David, you have a base model of a pump, which you derive for the Mitsubishi pump, and then for different pump types. For the audience: inheritance is an “is-a” relationship. I have a base model car, and then I have a Volvo and a Subaru. I start to add attributes on top of that base model and build it up. You end up with this implicit hierarchy that’s somewhat coupled. If you go the inheritance route, later down the road you’ll have to change some base models, and there might be implications across all of your model definitions that you didn’t quite anticipate. That can make changes more difficult.

The classic example, if folks are familiar with OSIsoft (now owned by AVEVA), is the PI System. They have a product called Asset Framework, which is over ten years old now and was kind of a pioneer in modeling historical data on a historian. If you’re familiar with their system, they use templates, which are models, and they use inheritance. And if you’ve ever dealt with a really large PI AF system, you’ll instantly be able to tell why inheritance gets tricky. That coupling starts to make things really difficult.

Composition, on the flip side, is a little more flexible. It just says: hey, I have a model, and I want to inject into that model a set of attributes that are, let’s say, asset metadata. The name of the manufacturer of the asset, when it was last serviced, the year it was installed, that kind of thing. You can place that set in any of your model definitions that are an asset, change that one set, and have a smaller change set across all of your model definitions, because only the things that are composed of it will change. So it gives you some more flexibility.

Composition is probably the better way to go, but it’s a great question. I would also say don’t go digital twin. Digital twins work really well in building automation because buildings don’t change. They’re built once, and there’s a code and everything set up.

When you look at discrete, process, or batch manufacturing, maybe not so much. With a lot of data, things are in constant flux. If you try to design your models around a digital twin that perfectly replicates the physical asset digitally, you’re just going to do a lot of work, and you might only leverage 20% of the value of that work at the end of the day, given the data you use.
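
To illustrate the composition approach Aron leans toward, here is a sketch using plain Python dataclasses: a reusable AssetMetadata block is injected into any model that is an asset, with no shared base class coupling them. The class and field names are invented.

```python
from dataclasses import dataclass

@dataclass
class AssetMetadata:          # the reusable, composable block
    manufacturer: str
    installed_year: int
    last_serviced: str

@dataclass
class Pump:                   # composed: a Pump *has* metadata
    meta: AssetMetadata
    flow_lpm: float

@dataclass
class InjectionMolder:        # same block reused, no inheritance hierarchy
    meta: AssetMetadata
    motor_rpms: list[float]   # two motors or three: just more entries

# Changing AssetMetadata touches one definition; only composed models follow.
p = Pump(AssetMetadata("Acme", 2019, "2024-06-01"), flow_lpm=120.5)
m = InjectionMolder(AssetMetadata("Acme", 2021, "2024-05-12"), [1450.0, 1480.0])
```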

David: Excellent. I’ve always liked the idea of inheritance because, from a data governance standpoint, it means I make one change, and I know that’s going to proliferate or federate across the entire system, versus the composition where I might have to go in and do some other activities.

But I hear what you’re saying. There’s a lot more flexibility. I tend to be more on the composition side myself. It just means part of our data governance is when there is an update, we’ll call it a best practice that everybody involved in this DataOps tool is responsible for updating their assets accordingly. Is that a good approach to handling that?

Aron: Yeah. And another practical example: you have an injection molding machine that has two motors, and then you have one that has three. You could do that with inheritance, but doing that model via composition is a little easier. The flip side of that, again, is that you might not want that level of detail by the time you get up to your cloud system and the AI and ML. If I’m coding against the data you’re providing, and I’ve got to deal with this one having two motors, and three motors, and five motors, I might not care. That just creates complexity for me.

So you always have to think about that end-consumer in your use case of the data and provide just enough that’s needed to simplify their lives. So, pros and cons.

CHOREOGRAPHY VS. ORCHESTRATION OF DATA

David: Yeah. Very good. Great discussion. Thank you, Aron. So, continuing on the same idea, there’s another concept that’s known as choreography versus orchestration.

Moving this data around, how do we ensure the right data ends up at the right place? The first concept here is choreography. If you’re watching a play or a dance group, choreography means I do something based on what somebody else is doing, but nobody is necessarily controlling me or telling me what to do. I have to pay attention to where everybody else is so I know where I’m supposed to be and what I’m trying to do. So it’s very independent. Orchestration is different: think of when you go to the symphony. The conductor is always providing that guidance, and I know I’m not going to play my instrument until the conductor has looked at me and said, okay, now it’s time for you to play. There’s a lot more control over that.

The benefit of choreography, of course, is that all my systems can act independently. There’s a lot less overhead that’s associated with that. Versus orchestration where now you have a system that is telling everything what to do, and it has to manage everything. And there’s a lot that goes into that. In this conversation about orchestration versus choreography, Jeremy, what are your thoughts on that? How do you approach it? What do you think works? What should you try to avoid?

Jeremy: Yeah. I think we can approach this topic from multiple perspectives. Let me start with one, and then you can ask a follow-up question.

The first point is that orchestration versus choreography could also be rephrased as central governance versus edge-driven. I would even say that orchestration, the central governance, is actually a failure mode for big industrial IoT projects, because this is how IT tends to think. Typically, in IT, problems are a little different. You have one cloud and one server; everything is very close together. So you can have one central Git repo and one central database, and everything else builds from there.

I would say it’s a failure mode in manufacturing because in manufacturing, you are working with distributed systems. So what does that mean? You cannot assume there’s always going to be an internet connection. The connections between all the components are often very unstable. You don’t want your production to stand still if the internet connection goes down. Also, from a security perspective, you have all these demilitarized zones, and every layer acts independently of the other layers. Think about the PLC: if there are longer outages, no information comes in from the SCADA system or the MES system, so production would probably stop after an hour, but it can work independently for a certain time frame.

Manufacturing itself is very much edge-driven, so it’s about choreography. There are a lot of different perspectives on this. The result of edge architecture is that it gets quite complex. If you’re designing a system, you cannot apply these typical cloud IT principles, and if you start doing it anyway, you might not be prepared for the result. Because how does it look if you start with choreography? How does it end up? You have a lot of edge devices everywhere. There could be multiple message brokers. People in IT often ask, “Why would you have multiple message brokers? Just have one.” No, we need every component to act independently of the others, so we will have a message broker per site at least, if not one per production line.

Then you need to sync between them. And now you can encounter a lot of different problems that arise from this architecture. The first thing is, you could argue: okay, maybe let’s streamline everything. Maybe you can comment on that. I think it’s impossible to streamline that in manufacturing, because everything is built to be edge-driven. So, if you want to work with it, you’re going into distributed systems. And there are a lot of challenges that arise there that a lot of people ignore. But as soon as they start to scale out, they realize this.
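
As a rough picture of what that syncing can look like in application code, here is a minimal one-way bridge: subscribe to a line broker, republish upward to a site broker. Hostnames and topics are assumptions, and native broker-to-broker bridging features are usually the more robust choice.

```python
import paho.mqtt.client as mqtt

# Uplink connection to the site-level broker (hypothetical hostname).
site = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2, client_id="line4-uplink")
site.connect("site-broker.dallas.example.com", 1883)
site.loop_start()

def forward(client, userdata, msg):
    # Republish on the same topic; QoS 1 so the site broker acknowledges it.
    site.publish(msg.topic, msg.payload, qos=1, retain=msg.retain)

# Local connection to the line-level broker.
line = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
line.on_message = forward
line.connect("line4-broker.local", 1883)
line.subscribe("acme/dallas/packaging/line4/#", qos=1)
line.loop_forever()
```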

One example is data flow: what type of data you’re sending through the system. Now imagine we have a message broker for each production line, one for the site, and one in the cloud to gather all the data. It depends on the use case. With time series data, for example, it’s really important that 99.9% of the data arrives in the cloud.

If there is an internet outage, it’s important that the data is buffered and stored on disk so that it never gets lost. But to be honest, it’s acceptable if a data point sometimes gets lost. So if you send a data point every second, and one gets lost once, okay. To be honest, I think a lot of people have this problem; they just don’t notice, because it’s fine.

But then, when you’re talking about things like adding a new order or sending commands (for example, in the cloud you have a button to start a machine), things start to get complicated, because how should the system behave if there’s no internet connection? You don’t want this command to get stuck somewhere and then execute hours later. This is what makes choreography so difficult.

We try to build a product that gives you an overview of all that. You have all of these instances, all of these edge devices, and we give you a single view that allows you to see the health of each of your nodes and each of your network devices. So in case something goes wrong, you are notified about it. And you don’t have to go through hundreds of VPN tunnels to get to that edge device just to find out the problem is somewhere else. But now I’m stopping my monologue and handing over to you, David, for any follow-up questions. I hope this was what you were asking with choreography versus orchestration.

David: Thank you for that. I tried to pick words that were going to be very difficult for you to say. And I’m glad I was successful with that.

So, your comment about orchestration versus choreography: you were talking about the pitfalls, especially if you do full-blown orchestration, where command is central to the governance.

I wonder if maybe we could come back to the Goldilocks principle. Can there be a balance? There will be some choreographed systems, but other systems will be more orchestrated, because we want more control over that data and to understand what’s going on. Would that be a good approach? Maybe we could try to keep it as choreographed as possible but add orchestration as needed.

Jeremy: Yeah. What we try to do is bring both worlds together because, at the core, it needs to be choreography; everything needs to work independently of everything else. But on the other side, just from a maintenance perspective, you don’t want to have all this chaos. You want a central view. You want something like Git and GitOps to configure it. So you need to meet in the middle. Our current hypothesis is that the data infrastructure needs to be distributed, but the configuration of it can still be centralized. That’s our current hypothesis on where the trade-off needs to be.

David: All right. Excellent. Thank you for the follow up on that.

So, another question that we get asked a lot, and I think, Kudzai, you can speak to this, follows from this conversation around inheritance and composition, and orchestration versus choreography.

QUALITY OF SERVICE WITHIN MQTT

David: The conversation is all about needing to know that my message made it to where it was intended to go. While we’re not here to design a point-to-point system per se, if I have a change that is relevant to some other system, there needs to be some assurance that the message got there. What comes up oftentimes, specifically within MQTT, is this concept of quality of service, or QoS.

So, while QoS 2 will ensure that the message makes it there, my understanding is that this really only covers the relationship between the client and the broker, not necessarily from one client to the other client. So, can you talk a little bit about quality of service in general and its implications? And then, for those messages that absolutely must make it there, what does that look like using QoS 2 between two clients?

Kudzai: That’s a question we often get from customers: they want to send commands over MQTT. And just before I go into QoS: most of the time, we tell customers that if you need to control your robot, if you need to perform some kind of closed-loop process control, you are better off using OPC UA or some other field-level protocol, because it guarantees synchronous communication where you get the response, the immediate feedback about what’s going on. So, MQTT is really not built for sending those critical commands or messages.

As far as quality of service is concerned, yes, you are 100% right: the quality-of-service relationship is between the client and the broker. With quality of service, the broker guarantees message delivery to the actual client. What that means is that when the publisher sends a message, it gets an acknowledgment that the message has been received by the broker and is being sent on to the subscribing client. However, what happens after the message is handed over to that client is not something you get within the MQTT framework; that would typically be addressed in a direct, point-to-point protocol. What QoS does guarantee is that even when the network fails during that exchange, the broker will follow up with all the clients that did not receive the message during the disconnection and make sure they send an acknowledgment. That acknowledgment is then available to the publisher, or whoever is interested in knowing the message was delivered. So that information is not lost, right?

So quality of service guarantees you all of that, as long as you’re using persistent sessions. But again, if you want to control a robot, or if you want to get data out of a database and search through what happened, you’re better off connecting directly to that application, and then that snapshot of information, that view, you can publish to the MQTT Unified Namespace for all the other components that might be interested in the result. So, MQTT is for real-time events and state across your entire enterprise, not for transactional interactions.
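
A sketch of the persistent-session behavior Kudzai describes, assuming paho-mqtt 2.x against an MQTT 3.1.1 broker: with a stable client ID and clean_session=False, the broker holds QoS 1 and 2 messages published while this subscriber is offline and delivers them on reconnect.

```python
import paho.mqtt.client as mqtt

client = mqtt.Client(
    mqtt.CallbackAPIVersion.VERSION2,
    client_id="oee-calculator-01",   # must be stable across restarts
    clean_session=False,             # ask the broker to keep our session
)

def on_message(client, userdata, msg):
    print(msg.topic, msg.payload)

client.on_message = on_message
client.connect("broker.example.com", 1883)
# QoS 2 is exactly-once, but only on the client<->broker hop.
client.subscribe("acme/dallas/packaging/line4/#", qos=2)
client.loop_forever()
```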

Jeremy: To follow up on what Kudzai said: first of all, please believe me, I studied mechanical engineering; I studied the vibrations caused by CNC milling machines. And I’ve realized that sometimes referring to the scientific IT literature, and taking a look at how the big companies solve it, really helps.

I think this question of quality of service is interesting because it’s kind of a subtle problem. What quality of service means is that it guarantees the communication between the client and the broker, that it happens only once. But it doesn’t guarantee that through the whole system. For example, with one message broker, it doesn’t guarantee that a message originating from one client actually ends up at the second client. And if you have multiple message brokers, bridged through to the cloud, it doesn’t guarantee delivery through all of them either. So you always need to solve this on the application layer.

It’s not a problem with MQTT; it’s just how IT protocols work. Even at the underlying, fundamental level, in TCP, you have this concept of resending messages, but it only goes from the network stack of one device to the network stack of the other. So if anything gets lost there, theoretically, there’s already a retry mechanism. With MQTT, there is also a retry mechanism, but there’s still the case where the message gets delivered and then lost: for example, it gets delivered to the client, but something goes wrong between the delivery of the message and the actual processing of it. How we typically solve that, and how we can send commands through the system, is by giving the message a certain type of payload, actually designing the protocol at the application level to ensure that some of these problems don’t arise.

For example, you would have a request-and-response mechanism, even through MQTT, if you really need to ensure that something was actually executed. Maybe we can link to the article further down; it goes well into the details of exactly this quality-of-service discussion. So if anyone is having exactly these types of thoughts, I can definitely recommend that article.
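
Here is a rough sketch of that application-level request-and-response pattern: the command carries a correlation ID and a reply topic in its payload, and the sender treats silence within a deadline as failure rather than letting the command fire hours later. The topic names and fields are invented; MQTT 5’s response-topic properties would be the more formal route.

```python
import json
import time
import uuid

import paho.mqtt.client as mqtt

REPLY_TOPIC = "acme/dallas/packaging/line4/filler/_replies/starter-ui"
corr_id = str(uuid.uuid4())
acked = False

def on_message(client, userdata, msg):
    global acked
    reply = json.loads(msg.payload)
    # Only a confirmation that matches our request counts as success.
    if reply.get("correlation_id") == corr_id and reply.get("status") == "done":
        acked = True

client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
client.on_message = on_message
client.connect("broker.example.com", 1883)
client.subscribe(REPLY_TOPIC, qos=1)
client.loop_start()

command = {"action": "start", "correlation_id": corr_id, "reply_to": REPLY_TOPIC}
client.publish("acme/dallas/packaging/line4/filler/_commands",
               json.dumps(command), qos=1)

# Bounded wait: no confirmation within 5 seconds is treated as failure.
deadline = time.time() + 5
while not acked and time.time() < deadline:
    time.sleep(0.1)
print("confirmed" if acked else "no confirmation -- escalate, do not retry blindly")
```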

David: Yeah. You shared that article during our conversation about this, and I found it exceptionally helpful because it brings everything that we’re doing full circle. One of the questions and struggles I have had is that we’ve always talked about this Unified Namespace for both the data and the events and transactions. And to me, these events and transactions are asynchronous. It’s not like, say, time series data that’s constantly flowing in. As we talk about events: the car arrived somewhere, that’s an event. Or you could put an event frame around something, like production time or downtime or a shift; an event that is time-bound is another aspect of it. And then, of course, there are transactions at the end or the beginning of those events, certain payloads that kick things off.

So, maybe these are the last questions. In general, is a UNS really geared for time series data, or can it be used for events and transactions? If so, what is the best way to approach the events and transactions? At this point, anybody can chime in with thoughts on that.

Jeremy: I think the UNS is suited, to some degree, for events. I’m still investigating this, but my gut feeling tells me that the more complicated it is, the better it is to have it in a different type of system.

One example: a state machine, where you want to enforce that production is always in a reliable state. That means a line PLC or a SCADA system. You can still use the UNS as the communication layer through everything: send a machine command, and show, with the techniques that I mentioned, that the data actually arrives there.

But who is taking care of the state of the production line? What do I mean by state? When the machine is starting, in order to go from the “starting” state to the “started” state, the PLC must first respond with a certain type of command. There must be a reset. Whatever. This type of event logic is much better suited to an external system, because the UNS cannot handle the state itself, but you can use the UNS to send this information through it.

I don’t know if that makes sense. Maybe the others can chime in if you have more experience in this regard.

David: Yeah. I think it’s a topic that will certainly get discussed on future podcasts. So, we are running a little long on the podcast; I do apologize. I believe, Kudzai, you need to head out. Do you have any final closing thoughts? We started off with “let’s define a UNS.” Do you think we’ve achieved what we set out to do here?

Kudzai: Thank you. Yeah, I think we did. So I’ll quickly go through some final thoughts about the Unified Namespace and the idea of events and transactions. Just to reiterate: MQTT is a protocol that gives access to the real-time events and state of your enterprise. It’s not so much for controlling robots or accessing a database. That needs to be handled outside of the Unified Namespace, with a direct connection to the database or the robotic system using a fieldbus protocol or a REST API endpoint, whatever the case may be. So MQTT holds that current state of information, and using features such as retained messages, it makes sure that the Unified Namespace is always living within your MQTT broker, not as a database, but as a snapshot. So, whoever joins the network is able to discover the current state of any component connected to the Unified Namespace.

And if the network happens to disconnect any one of the clients, that information persists, again not as a database, but as a mechanism to make sure the information is redistributed whenever those components rejoin the network.
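
A sketch of that retained-message behavior: a client that joins the network subscribes and immediately receives the last retained value on each matching topic, i.e., the current state, without a database. Broker and topics are assumptions, as before.

```python
import paho.mqtt.client as mqtt

def on_message(client, userdata, msg):
    # msg.retain is True for values the broker replays as "current state".
    print(f"{msg.topic} = {msg.payload.decode()} (retained={msg.retain})")

client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
client.on_message = on_message
client.connect("broker.example.com", 1883)
client.subscribe("acme/dallas/#")  # retained values arrive right away
client.loop_forever()
```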

We spoke about the database connectivity aspect earlier. For example, at HiveMQ, we provide connectors that allow you to persist all of this Unified Namespace data. You can select the different namespaces you want to consume and where to persist them, say to Snowflake. If there’s a certain namespace you want to persist, there’s a connector that allows you to do that. If you want to persist data to MongoDB, there’s a connector that allows you to do that. Or if you just want to send it over to Kafka, there’s a connector for that. So MQTT itself doesn’t persist data, but it gives you a mechanism to persist your data outside of the broker.

Lastly, on the idea of orchestration and choreography: I like to think of it this way. It helps to look at the Unified Namespace as a way of sharing data rather than collecting data. I think we’ve been accustomed to the idea that data is a byproduct that merely needs to be collected. But there’s a mindset shift where you start to look at it as sharing data, which really encourages the owners of that data to package it in a way that is usable on the other end, because now it is being shared rather than just being made available for collection.

And it’s kind of a mix, a compromise: you have central governance, and you’ve got this distributed approach to it. So it’s a balance that needs to be struck.

David: Yeah. So it sounds like there’s a lot that goes into a Unified Namespace. The way it can be approached, and the use case, will really dictate how you go about it and how you develop the formal architecture for it.

So, Kudzai, if you need to drop, please do. I do appreciate your time. Thank you for that. But before we sign off, Aron, we were talking a little bit about time series data versus events and transactions. Any thoughts that you’d like to offer on the topic?

Aron: Yeah, and hats off to Kudzai and Jeremy for their time. I think the message that you shouldn’t try to do transactions through MQTT, and shouldn’t try to do commands through it, is one we kind of need to deliver to the broader market. And it’s refreshing to hear, because we are in the early days of UNS. It’s even on the Discord channels and such now, people trying to figure it out, and it’s like: just don’t do it. In most cases, it’s not the way to do those things.

What’s interesting about the point-to-point connections is this: if you need a guarantee that the data is going to be delivered to S3 or some other target, and you’re running it through MQTT, can you kind of hack it, using the client IDs and such, to say, “Hey, if this client drops, then start storing”? Yeah, you can do it. But it might be a sign that you want to send that data down an alternate path. Maybe it should still go to MQTT, but it should also have a direct connection to S3 or Blob storage to guarantee that delivery.

So what I think is interesting is that UNS has caught fire. It’s definitely here to stay. I think MQTT is a core component of that, with report by exception. But all these conversations and all this confusion are really the market saying: this is great, but what about this? And what about this? Because we have slightly different use cases in manufacturing (MQTT was designed for oil and gas), and the protocol itself doesn’t encompass all of them.

And Kudzai was very honest about that, and I think that’s great. I think MQTT is a core part of UNS. And the interesting part is: how does UNS grow? How do we, as vendors and technologists, support our customers to enable these other use cases through best practices? I think, David, it’d be interesting to come back in six months and ask: okay, what did we say that was right? And what did we say that has now changed? I think it is moving that fast.

David: Oh, that’s an interesting perspective. I hadn’t thought about how we could revisit this and say, well, actually, that’s not what a UNS is. So, I think maybe the struggle is that we’re still aligning ourselves on what the technology is and what the approach is, because the intent is that we want to create these data-driven organizations. We want them to have access to all of that data in real time, so they can use it for solving problems, whether it’s asset health or predicting the failure of a piece of equipment. And what does all that look like? So, before we sign off, Jeremy, any final thoughts you want to share with us?

Jeremy: Yeah, to add to that: the opinion I stated earlier stands. I said it would be fine to send some commands through the UNS, but for more complex things, it’s probably not that well suited. You can still use the UNS for controlling production, but it’s not that intuitive to do; just sending messages through it hides a lot of complexity. So if you’re unsure about whether to do this or not, I would tend to say: for now, you can use it, but be careful with what you’re doing.

David: Excellent. Thank you. And really, Kudzai was talking about Industry40tv, there’s the HighByte website, and certainly, with UMH and Jeremy, there’s a lot of great content about the things to think about as we do this.

So, at the beginning, I defined the UNS as an approach to an event-driven architecture that utilizes a Pub/Sub technology like MQTT, along with a DataOps platform, where we’re creating semantic data models and publishing them to a defined topic structure. It seems like there are a lot of other considerations, like composition versus inheritance or orchestration versus choreography, and tools to take advantage of along the way. Just make sure that the tool you’re using is the right tool for the job.

Well, thank you, everybody, for your time. I also want to thank the other people who were involved in all these conversations, because this is certainly something I’m trying to get my head wrapped around. Thank you, everybody, for your time. I look forward to seeing you on our next podcast.
