Rhize Up

Rhize Up w/ David Schultz: Giving Manufacturing Data Full Context (feat. Andy German and Geoff Nunan)

July 23, 2024 David Schultz Season 1 Episode 7

David: Another concept that we talked about – and this is some pretty heavy stuff – is one we covered back in our first two episodes on the ISA-95 standard. When you go back to when it was created – and even in some recent books that have been published on it – the intent of the standard was to ask how we exchange information between our level four and our level three systems. What does that look like? How do we integrate our business systems with our manufacturing and/or operations systems?

That’s the intent of the standard. How do we want to model the data? What do we want it to look like? What are all the things that go into it? But as we’ve evolved the Manufacturing Data Hub, we almost look at it as an ontology for describing the manufacturing process.

Can you go into a little more detail about what we mean by that? It’s not just how we move data around. We’re using this as the ontology to describe what our company does.

THE RICHNESS OF THE ISA-95 LANGUAGE

Geoff: Yeah, and it’s such an important piece. For those of you who are observant about backgrounds, you’ll be able to tell that my background and Andy’s background are remarkably similar. That’s because we’re sitting in different rooms of the same hotel at the moment.

But, we were chatting yesterday, actually, about how the ISA-95 language is this incredibly rich, very specific language that gives you words to describe, very precisely, exactly what you’re talking about in this very complex manufacturing domain that we are working in. It’s a very detailed structure for laying stuff out.

The language lets us communicate in a way that those who are familiar with the ISA-95 language can understand. Andy, you and I were chatting about this in some engagements with customers, and we’ve gotten deeper into a problem because we’ve had this common language. I wonder if you have an example that maybe we could talk about here.

Andy: Yeah. Particularly in IT or technically-led engagements, where you’re looking to solve a specific set of problems or a specific set of use cases, you can often find yourself constrained by the domain that you can see before you, particularly if you’ve not got too much experience in general manufacturing.

So you can end up in a situation where you can only deal with what you can see and with what the customer communicates with you. You can only deal with that which has already been laid out.

A specific instance is where you’ve got data streams of incomplete data coming up from the shop floor. They look a bit like MES data, but there’s not much structure to it. This data can end up landing in a data science or machine learning environment, and that’s where the problem-solving begins, with really great technical people looking at the problem and taking the approach of domain-driven design.

The domain-driven design approach is where we talk to our customers and our stakeholders, and we understand that language as they understand it. Then we bake that into our solution, and the objects that we create and the things that we’re dealing with have the same names. Through that, what we can end up doing is having this greater, closer affinity with the customers and the stakeholders.

But there’s a flaw in that plan: the language might not be right. The language might not be sophisticated enough. So what we end up with is a domain language that is quite limited, and data scientists or engineers who are trying to solve very difficult computer science problems are restricted to talking about columns, rows, streams, tables, and databases. What they’re not talking about is materials. They’re not talking about whether this is a work order, an operations segment, or a process segment. What ISA-95 gives you is a domain language and a place to put your data and concepts, which you can bring in to take your discussion a bit further.

So, take a first glance at a particular situation: you’ve got a material table somewhere. There’s a material object in a first-stage prototype MES, and that material table has got a run speed column – the speed at which this material runs on a specific machine. Then you try to take this further and apply it to the new MES you’ve got on a second line, but for this same material, the run speed is different on the second line. So what you can end up doing is adding another column to your database saying “run speed for line two,” and your journey through the problem space is defined by running into the next problem, and the next problem, and the next problem after that.

That’s a really simple example that will be obvious to a lot of people, but it’s how these systems evolve. Sometimes we find customers starting from scratch with their domain, whether that backend is SQL Server or PostgreSQL or something like that, and they cannot actually get the conversation all the way through to where it needs to be. With the way we approach consulting, we’ve got a root-and-branch approach to really analyzing the process and trying to understand what the data is. And it may be that the run speed does belong as a material property. It may be that the run speed belongs not to the material property but to a process segment. Or it could be that it belongs on an operations segment or even on a work request, so that somebody in planning decides what the speed needs to be at the time of planning because of some variable that’s only available at that point in the process.
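To make that concrete, here is a minimal sketch in Python. All the names are illustrative, not Rhize or ISA-95 API calls; it just contrasts the growing flat table with treating run speed as a property that can be placed wherever the analysis says it belongs.

```python
# Approach 1: the flat table that grows a new column per problem.
material_table_row = {
    "material_id": "WIDGET-A",
    "run_speed": 120.0,          # speed on line 1 (say, units per minute)
    "run_speed_line_two": 95.0,  # bolted on when line 2 appeared
}

# Approach 2: ISA-95 offers a menu of places the value could live. Here run
# speed is modelled as a parameter of an operations segment that binds a
# material to a specific piece of equipment (names are illustrative only).
operations_segment = {
    "id": "PACK-WIDGET-A-LINE-2",
    "equipment": {"id": "LINE-2"},
    "materialSpecification": {"materialDefinition": "WIDGET-A"},
    "parameters": [
        {"id": "runSpeed", "value": 95.0, "unitOfMeasure": "units/min"},
    ],
}
# The same value could instead sit on a material property, a process segment,
# or even a work request, depending on where the decision is actually made.
# That is the modelling choice ISA-95 makes explicit.
```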

So, with ISA-95, what we give ourselves the opportunity to do is make a selection. We’re not confined to just this little materials table because that’s the extent of our domain knowledge so far on that project. We’ve got a big menu, a big selection of places where we can choose to land the data. And in a specific project that we’ve been doing recently, there’s been a really good team trying to solve a lot of problems for a couple of years really. And they’ve been constrained by this, and we’ve been able to sort of come in and say, “What we are looking at here, it looks like a job. It looks like machine data, but actually, it’s not machine data. These are job responses. And actually, we need to create a job response for every time this group of machine data changes, and we probably need to add some properties to the job responses.”

And then when we get that far into the discussion and everybody understands it, we’ve got many heads able to talk the same language and discuss, “If these are job responses, then we can relate our job responses back to maybe a process segment or maybe an operations segment. Where do we put that bit of data?”

Very quickly, rather than getting bound up in conversations about columns and rows and a partial domain language, we can advance the discussion into something a bit more sophisticated and take the problem much further. That happens quite a lot when we use ISA-95 as a backbone for data persistence in our database, which is just a technical thing. But we’re also using ISA-95, as a minimum, when we engage with customers to say, “Here’s part two. Here’s part three. Here’s part four. These are the checklists of things that we need to visit, look at, and consider if we’re going to take this group as a whole.” Just like in a surgical environment, where there’s somebody there with a checklist making sure that all the basics have been done.

ISA-95 gives us that backdrop of confidence. We really are quite confident in how to engage with this domain and lift some of the nuance that’s happening within these data streams, within these data structures, and put them into a place that’s sensible and common and can be well understood.

Then from that beachhead that you’ve established in the early stages of a project, you can move forward and go on to solve the software problems and then the computer science problems.

David: Excellent. So again, as I introduced this topic, it was ISA-95 as you typically think about it: exchanging data. I think the power here is in your example of this property, what I would call the standard rate at which this material runs on this particular piece of equipment. It gives a lot of flexibility. So it’s not so much that I’m just here to exchange data back and forth. The ontology means that I have this property existing in a certain location because that properly reflects how we go about our manufacturing process. I don’t want it on a process segment or on an operations segment. I need it down at that work request level because that’s how we describe our manufacturing process. That’s what makes sense in this context.


So, with that power, when people want to consume the information back, that’s fundamentally where we’re trying to get. I need to be able to query and interrogate data so I can start making better decisions. If we have an ontology for describing it versus just a mechanism for exchange, that’s where the power comes in on the idea of the ontology here.

Andy: Absolutely.

APPLYING ISA-95

David: So with that, Andy, one time you and I talked a little bit about what would happen if I took these ISA-95 models and applied them to the edge. So, for instance, in a UNS-type architecture, I may be bringing down a work order. And I say at runtime, meaning the operator is getting ready to select a work order and I’m going to now process it. There’s a whole payload associated with that. There’s a lot of data there that allows me to run the plant at runtime because there’s a BOM. There are some specifications associated with it. You know, there’s a lot of information there. What if I were to apply the ISA-95 model to all these events?

For instance, in my ERP, I have a new work order that’s been created. It’s going to give me this operations schedule, and I’m going to present the data that way. Or I’m going to have an operations request, because once I’ve scheduled it, there’s now a request for all the operations and all the segments associated with that. And then, of course, I’m also going to get the operations responses. So that’d be the level four to level three communication. Or, if I’m working within part four of the standard, where level three is exchanging information internally, I’m going to have my work master, my work schedule, my work request, and my work response. That’s going to be something even more granular at that level.

And why I brought this up is because if I’m looking at an enterprise with multiple sites, they may go about building their UNS differently, because I have found that these are typically standalone efforts and there’s no data governance piece in place. Can we abstract that out so that you can build it however you want but still utilize these objects?

I’m just curious about your thoughts on that approach. Instead of building a semantic data model that just says, “These are my lots,” I can utilize the model and the relationships within the model in doing that.

Andy: Yeah. It’s an interesting approach to take. Earlier on, I mentioned how REST and MQTT, or broker-based publishing, can be similar in that part of the technical implementation is where you go to get the data: the topic structure in the broker, or the path or URL structure in REST, defines what you’re going to be able to get off that endpoint, what’s going to be available on it.

So, if you take the approach with a broker that says, for a particular piece of equipment, we’re going to have an enterprise/site/area/line structure, then underneath that we can have a complex ISA-95 object available, like an operations schedule, like a job order, that kind of thing.

That’s perfectly possible and probably very convenient for the key events you know you’re going to be interested in: the concrete events, the things you can build functionality off, the cases where you’ve got a dashboard in Grafana and you know that dashboard is going to visualize a very specific payload that’s a combination of job responses, maybe with some time series data and that kind of thing in there.
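As a rough illustration of that approach, here is a minimal sketch of publishing an ISA-95-style job order onto a UNS-style topic. The topic path, payload shape, broker address, and library choice are assumptions for the sketch, not a prescribed Rhize structure.

```python
import json

import paho.mqtt.client as mqtt  # assumes the paho-mqtt package is installed

# Hypothetical UNS path (enterprise/site/area/line) with an ISA-95-style
# object published underneath it. None of these names are prescribed.
topic = "acme/plant-1/packaging/line-2/jobOrder"

job_order = {
    "id": "JO-2024-0153",
    "workMasterId": "PACK-WIDGET-A",
    "equipmentId": "LINE-2",
    "materialRequirements": [
        {"materialDefinitionId": "WIDGET-A", "quantity": 5000, "unitOfMeasure": "EA"}
    ],
    "dispatchStatus": "Released",
}

client = mqtt.Client()  # on paho-mqtt >= 2.0, pass mqtt.CallbackAPIVersion.VERSION2
client.connect("broker.example.local", 1883)
client.publish(topic, json.dumps(job_order), qos=1, retain=True)
```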

I think if you know, upfront, how you want to consume the data, then it’s very easy and convenient to compose that data structure and put it onto a small number of topics, or a small number of REST endpoints, for example, that allow you to conveniently respond to changes in that data and reassemble the relationships between those different pieces of data.

And it might be that you could choose the approach (I wouldn’t), but you could choose the approach that says, “For this Grafana dashboard that I’m building, the backend data payload is going to be this complex JSON object, and I’m going to fetch that off my broker. I’m going to fetch that from a specific REST endpoint. I’m just going to keep calling that REST endpoint. I’m going to keep subscribing to that broker topic. And that’s going to give me everything I want for my Grafana dashboard.”

That could be a convenient way of doing it, and that’s great. But then someone asks you another question on the project; somebody asks you to do something a little bit more complicated, and you need to bring in another variable from another part of the graph that suddenly becomes interesting. The burden, then, is that you are going to have to rebuild that payload structure somehow. So whatever is actually populating that REST endpoint, whatever is resolving that data, needs to change.

I think if you’re going to build ISA-95 endpoints for these more complex objects and put them into your broker, then the trade-off is going to be around the flexibility of how you go about building complex relationships to inject into your payload.

I think there’s probably a balance to be struck between assuming that everything just gets baked into one payload and goes onto one topic versus having a distribution of the kinds of data that you might be interested in – like availability, performance, that kind of thing, against, you know, current quantity, quantity left. You might have some core, foundational objects on there that are convenient to reassemble and to manage the relationships between those at runtime, query time and display time.

As an architect, you need to be aware of the burden that you may be putting on the client and the consumer. They’re probably going to need to rebuild the relationships at some point. That’s the thing. Whoever’s pulling the data, whichever actor is pulling the data onto those topic structures, has to also be aware of those relationships.

The other thing is you’ve got a concrete topic structure or URL structure in place, which everybody that’s consuming needs to understand somehow. And so every consumer is going to end up, somewhere in its configuration, with some kind of model that reflects what is in that broker or on that REST URL structure. So we’ve got duplication there. And if we do go and change our equipment hierarchy, for example, that may – unless you’re really clever about the way you design everything – have quite a knock-on effect.

I think if you’re introducing these more complex events into a UNS environment, then I think it’s about understanding the trade-offs of that. And I think it’s about understanding what you’re trying to get done for the customer. The thing about architects is that they love to architect things, go the whole hog, and get everything perfect. In actual fact, what you really might want to do to get value to the customer is just get job responses onto a topic structure for that equipment so that they can consume it in Grafana.

And that’s great, but then, you know, it’s your next problem. It’s the problems that come to you down the line that you need to be aware of and not necessarily over-architect for, but just be aware that they may be coming.

David: Perfect. Yeah. I mean, a lot of this came from the idea, “How do I abstract the models that I’m doing here?”

So if I have a company that makes widgets at one plant but makes juice at another plant, I might go about architecting this data a little bit differently. But if I were to utilize the models that exist within ISA-95, I can abstract that out. So now my resources, my assets, my equipment, my personnel, and my materials all fall in there.

The next logical piece in this mental exercise was that it also makes it easier for a Manufacturing Data Hub to consume that information because it’s already somewhat predetermined. But say I take that UNS, and now I am going to put it into some kind of either a time series database or data warehouse where it’s just a large table and bring all that data together. I’m going to have this really great time series representation of all of these models and all that information. But it became pretty clear – and you’ve mentioned it a couple of times, Andy – unless you understand the relationships of all that data, you’re really going to struggle to create value out of the data that you’ve already collected. Let’s explore that a little bit. So, Geoff or Andy?

Andy: Go on, Geoff, I’ll let you go.

CREATING VALUE THROUGH BETTER DATA RELATIONSHIPS

Geoff: We’ve probably covered a couple of examples there that are really good to call on. If you’ve got a table of product information and you’re running a particular product on your machine, you might publish in your Unified Namespace topic a JSON that says, “I’m running this product on this machine.” This is okay, but how do I get to the other information if I want to?

So, if I want to know who was operating that machine at the time – going back to what Andy was talking about before – I need to subscribe to this topic, which says what product was running on the machine. I need to look at the time, and maybe the timestamp is on that message somewhere. Then I need to use that information to go back and either subscribe to or query from perhaps a different topic that has who was operating the machine at that time, so that I can join all this stuff together. Because it’s the joining it all together that’s the valuable bit. Each of these bits of information on its own is interesting but not hugely valuable. It’s when you join it all together that you get the value.

So, why do we have a graph database, or why do we have a GraphQL endpoint in Rhize? It’s so that I can very easily ask whatever questions I want to ask of the Data Hub. So, if I want to know who was operating the machine when it made this product, well, I ask for that. I don’t have to care about if that is actually spread across different topics and it’s in several different payloads that need to be pulled in and then joined together to find the answer out of that. The Data Hub is doing that for you. You just ask it for what you want to know. And it’s built to be incredibly fast and efficient at doing all that work of giving you back the answer to the very specific question that you asked from it because it’s got all of those relationships built into it.
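As a sketch of what asking that kind of cross-topic question might look like as a single GraphQL request, here is a minimal example. The endpoint, field names, and filter syntax are illustrative assumptions, not the exact Rhize schema.

```python
import requests  # assumes the requests package is installed

# Illustrative question: "what was running on this machine, and who was
# operating it at the time?" asked as one request instead of joining topics.
QUERY = """
query WhoWasRunningWhat($equipmentId: String!) {
  equipment(id: $equipmentId) {
    id
    jobResponses(filter: { state: RUNNING }) {
      id
      startDateTime
      material { id description }
      personnelActuals { person { id name } }
    }
  }
}
"""

def ask_data_hub(equipment_id: str) -> dict:
    # Endpoint URL and authentication are placeholders.
    resp = requests.post(
        "https://datahub.example.local/graphql",
        json={"query": QUERY, "variables": {"equipmentId": equipment_id}},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["data"]
```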

As a consumer of the information, it’s still Pub/Sub. You can still subscribe, but you don’t have to subscribe to a topic. You can subscribe to a query across topics. So if I want to subscribe to a set of information that says, which operator was operating a machine and what was running on it at the time, if I, as a subscriber, want that payload, I can ask the broker for that payload and subscribe to that.

There’s nothing on the publisher side that has to pre-build that payload and send it to me. The publisher doesn’t need to know what I’m interested in. And this is the power of the Manufacturing Data Hub. Publishers can publish their information. The Data Hub joins it all together in the ISA-95 standard model. And a subscriber can ask for what they’re interested in across all the various bits of information on topics that went into there and get exactly what they want back out of it. That’s the really powerful part.

David: Excellent. So, Andy, is there anything that you’d like to add to that?

Andy: Well, I always use the phrase “recurse and traverse.” That’s where the value comes from: when we’re querying the graph database, we can recursively visit objects and work our way up and down hierarchies pretty conveniently. Then, when we get to the point that we want in the hierarchy, we can traverse across to get context, and you can keep going in a graph database. It’s quite intuitive to be able to start with a material lot that you’re interested in and then, in old-fashioned object notation, say “material lot” dot “material actual” dot “equipment.” Then you can look at where that material actual, or where that material lot, was produced.

So you’ve got this ability to go up the chain and visit equipment from the material. But equally, you can look at a piece of equipment and ask, “What job orders ran on this equipment between these dates?” Grab your job order and then go into the job responses. Then go into the material actuals and find all the equipment and material lots that were produced and consumed on that particular job order.
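Here are two hedged sketches of that “recurse and traverse” idea, one starting from a material lot and one starting from the equipment. All object and field names are illustrative, not the exact Rhize schema.

```python
# Direction 1: start at a material lot and walk up to the equipment that
# produced it ("material lot" dot "material actual" dot "equipment").
FROM_MATERIAL_LOT = """
query {
  materialLot(id: "LOT-1234") {
    id
    materialActuals {
      jobResponse {
        equipmentActuals { equipment { id description } }
      }
    }
  }
}
"""

# Direction 2: start at the equipment, find job orders in a date range,
# then drill into job responses and the material produced and consumed.
FROM_EQUIPMENT = """
query {
  equipment(id: "LINE-2") {
    jobOrders(filter: { dispatchedAfter: "2024-07-01", dispatchedBefore: "2024-07-07" }) {
      id
      jobResponses {
        id
        materialActuals { materialLot { id } quantity use }
      }
    }
  }
}
"""
```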

The two approaches to the material lot yield the same or similar understanding, but the journey through the graph is different, and it depends on where your starting point is and what you’re thinking about at that moment in time: starting in a convenient place and being led somewhere else. That’s got a degree of convenience and exploration that’s quite intuitive. As I’ve said, doing that in a relational database is a degree harder and much more technical, so it inhibits your ability to think the way you might want to think when you’re exploring these complex object models.

David: Oh, absolutely. That’s a perfect example of how I would do a genealogy or a track and trace. If I need to get into lots and sublots, I can certainly model those; they’re part of the material model. Again, I think the important takeaway is that if you don’t understand the relationships in all this data, even dumping it into something is really not going to benefit you, because now, at runtime or at query time, if a person wants to understand this data, I might have to rebuild those relationships in my query. And that’s part of the problem.

EXPLORING GENEALOGY AND TRACEABILITY USE CASES

Geoff: That’s a great example of exploring the genealogy and traceability use case. This is one many people in manufacturing deal with all the time. If I have a piece of material, what was it made out of? What are all the components that went into making it? Can I look at this sublot, in the ISA-95 language, and all of the components that we used to make it?

But then maybe I put that thing, let’s say it’s a mobile phone. I might put 20 of them together in a carton that has a label on it. And so then I’ve got this as part of another lot. And then maybe it got shipped out. So what is in it, and where did it go? What did it become later on in this upstream and downstream traceability question? And then you’ve also got what happened in the process. So, on which machine was it made? What were the process variables of the machine that made that phone? When was it made? Who was the operator who operated the machine when it was made?

And what the relationships in the data, when combined with the GraphQL language, give you is the ability to ask for all of that information in one request, where no developer has to think about building you a particular answer to that question. You as the consumer of the API can do all of that yourself without needing to go and get anybody to develop anything for you if you’ve already got that data in the database.

If you change your mind and say, “Well, actually, I don’t need to know the operator, but I really do want to know the average temperature of the machine that particular mobile phone was made on,” then I just change that in the query, and that’s the response I get back. And I get it in one request. I don’t have to pull all the separate bits of information, join them together myself in code, and then figure it out. I can ask for all of that and, in one request taking 20 or 30 milliseconds, get that answer back in the format I asked for.
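To show how little changes when you change your mind, here is a hedged before-and-after of that kind of traceability query. Only one branch of the query changes; the names are illustrative, not the exact Rhize schema.

```python
# Same traceability question; only one branch changes when the question
# shifts from "who operated it" to "what was the average temperature".
WITH_OPERATOR = """
query {
  materialSubLot(id: "PHONE-SN-0042") {
    assembledFrom { materialLot { id materialDefinition { id } } }
    producedBy {
      equipment { id }
      personnelActuals { person { name } }
    }
  }
}
"""

WITH_AVERAGE_TEMPERATURE = """
query {
  materialSubLot(id: "PHONE-SN-0042") {
    assembledFrom { materialLot { id materialDefinition { id } } }
    producedBy {
      equipment {
        id
        property(id: "temperature") { history(aggregate: MEAN) { value } }
      }
    }
  }
}
"""
```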

David: Yeah, absolutely perfect. This segues us beautifully into the last thing that I wanted to talk to you about when you said the average temperature. We’ve been spending a lot of time talking about events within manufacturing, but there’s also this time series data that has been referenced a couple of times here.

So, within the context of a Manufacturing Data Hub and in Rhize specifically, how do we go about working with the time series data? For instance, you want to know the average temperature. There’s going to be this temperature sensor out there, and I’m going to look at this event. There’s going to be this job response that had a start and end time, and now I can bring that back. But how do we integrate with that? And what does time series data look like within a Data Hub?

Geoff: That’s a great one for you, Andy, because I know you’ve been working on that question recently.

HOW THE MANUFACTURING DATA HUB HANDLES TIME SERIES DATA

Andy: Yeah. Time series. How does that work with the data set we’ve got?

We’ve got particular types of data that lend themselves very well to time series, and they’re reasonably obvious to most people. Tag data being streamed out from the operation into a time series database is a fairly obvious use case. Time series databases let you treat time and date as first-class concepts.

So we’ve got convenient features that are highly performant. We can query the database and do grouping and windowing on time. Effectively, you can group by minute and bring all your data out of the database. If you’re querying a number of different measurements or tables in a time series database, you can structure your query so that, for a given 24-hour period, no matter how many events are in your database, aggregating by minute brings out exactly 1,440 records. Then you join those up with measurements from all the other tables in the time series and match them up. You can also synthesize the data that appears in that data set: if, in a given minute, you’ve got a thousand values for a temperature sensor, you don’t want to yield a thousand records. You’d yield one record with the average, the high for that minute, or some other aggregate.

But in a given minute, you might have no records in your database because the temperature only changes once for the whole day. But you still want 1,440 records in your data set for the convenience of feeding into Tableau and doing the turbo pivot Tableau is so good at. 

So, what you want to do with the time series database is synthesize the data using last observation carried forward. You take the value you had four hours ago and use it to populate – to “copy down,” to use Excel terminology – and treat that as your value. You’ve got ways of grouping, aggregating, or interpolating these data structures stored in the time series, which can be a powerful way to lift that data out of your database.
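Here is the same windowing and “copy down” idea sketched with pandas, purely for illustration; a time series database would do this natively in the query, and the readings below are made up.

```python
import pandas as pd

# Made-up raw readings: a burst of values in the first minute, then one
# change hours later - sparse data, as described above.
raw = pd.DataFrame(
    {"temperature": [71.2, 71.4, 71.3, 72.0]},
    index=pd.to_datetime([
        "2024-07-01 00:00:01",
        "2024-07-01 00:00:20",
        "2024-07-01 00:00:45",
        "2024-07-01 03:15:00",
    ]),
)

# One-minute buckets across the whole day: exactly 1,440 rows, whether a
# bucket held a thousand readings or none at all.
full_day = pd.date_range("2024-07-01", periods=1440, freq="1min")
per_minute = raw.resample("1min").mean().reindex(full_day)

# "Last observation carried forward": copy the previous value down into
# the empty minutes, like filling down a column in Excel.
per_minute = per_minute.ffill()

print(len(per_minute))      # 1440
print(per_minute.iloc[60])  # 01:00 still shows 71.3, carried forward from 00:00
```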

That’s a first-class capability of those tools. So, you might want to go directly into the time series database using native protocols, and we’ve got a few ways of supporting this. We use a CQRS pattern, for example, when we’re working with time series, so we’ve got a dedicated Query Engine: a way into the time series that allows you to go in and pull the data the way you like. But what we’ve also got, as an implementation piece within Rhize, is that we federate the time series data into the super graph, into the GraphQL API that we’ve got.

So, on specific objects that are going to have a time series database behind them – equipment, for example – what you might want to do is construct a GraphQL query that says, “Show me all the equipment.” GraphQL automatically gives you a group-by behavior. If you start your query by asking for all the equipment whose name begins with the letter “B” or something like that, you might get three records at the top level. Then you might ask for the ID or the name of the equipment, and you’ll get a handy JSON structure with an array of three items at the top level for your three pieces of equipment. But then you might want to drill into this equipment and say, “Okay, give me the value of property x.” If you just ask for the value, then what happens in Rhize is it gives you the current value – the last transmitted value, if you like, that’s come through the broker.

So you’ll get these three data structures at the top level in your array. Then what we can do with this federation of time series is ask for the history of that property value. So let’s get the history of property value “temperature.” 

You might submit a query that asks for equipment, description, value, and then a history object on that property. That history object allows you to supply a couple of parameters – start date and end date – and whether you want to do some aggregation work on that historized property.

And let’s say you just want to look at the history for the last hour for those pieces of equipment. You’ll get a JSON structure back from the database that is three pieces of equipment, with the history embedded inside it in the correct place. It’s important to emphasize that we’ve got this natural group-by going on: when I explore my data, when I’ve got my payload back, underneath each piece of equipment, as a child of the equipment, I’ve got my history values there in a JSON array.

So for the first piece of equipment, your thousand values are going to sit under that piece of equipment. For the next piece of equipment, you might have a different number of values. The data gets embedded right there within the GraphQL response.
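Here is a hedged sketch of that federated query shape, with the historized values nested under each piece of equipment. The argument and field names are assumptions, not the exact Rhize schema.

```python
# All equipment whose name starts with "B", the current value of a property,
# and one hour of history nested as a child of each piece of equipment.
EQUIPMENT_HISTORY = """
query {
  equipment(filter: { name: { startsWith: "B" } }) {
    id
    description
    property(id: "temperature") {
      value  # last transmitted value via the broker
      history(
        startDateTime: "2024-07-01T09:00:00Z"
        endDateTime: "2024-07-01T10:00:00Z"
      ) {
        timestamp
        value
      }
    }
  }
}
"""
# The response mirrors the query: an array of equipment, each carrying its
# own history array as a child - the natural group-by described above.
```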

And we can extend that kind of concept out to something a little bit different. It’s not that interesting to look at the history of a piece of equipment, really. If you really want to dig in and make improvements, it may be that you want to look at the telemetry values for a particular piece of equipment for a particular job order.

Or you might want to compare two different materials that get consumed on the same machine to see if the process values differ. In that case, if you’re querying job responses, what you might ask the database is: for this equipment, give me the job responses for the last 20 days, with this material as a filter. Then you might say, within each job response that comes back, give me the equipment actual object, and under the equipment actual objects there will be some history data. What we actually end up doing, at the point where we want to draw history for a particular job response on a particular piece of equipment, is use our federation technique to go off to the time series database, use the start and end date of the job response, and filter for the equipment ID. That brings back the history for that specific job response on that specific piece of equipment. And again, we’ve got the convenient, natural, intrinsic grouping happening within GraphQL that seats this historized data underneath our object.
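Extending the same shape to the job-response case looks roughly like this; again, the names are illustrative, and the federation resolver is assumed to scope the history to each job response’s own start and end times.

```python
# Job responses for one piece of equipment over the last 20 days, filtered
# to a material, with history scoped (by the assumed federation resolver)
# to each job response's own start and end times.
JOB_RESPONSE_TELEMETRY = """
query {
  equipment(id: "LINE-2") {
    jobResponses(
      filter: {
        startedAfter: "2024-06-15T00:00:00Z"
        material: { id: { eq: "WIDGET-A" } }
      }
    ) {
      id
      startDateTime
      endDateTime
      equipmentActuals {
        equipment { id }
        history(property: "temperature") { timestamp value }
      }
    }
  }
}
"""
```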

So that’s great for feeding Grafana or other reporting tools, maybe for data exploration or whatever that might be. And that’s just one method among several, because it’s not the only way to skin that cat. When we place these time series resolvers at strategically important places within our data model, it allows us to automate the contextualization of the time series data that’s stored there and naturally include it in response payloads from the database, which gives us this grouping and this contextualization.

We can invert that as well. That’s one way of doing it with time series, but in a lot of cases people want to go directly to the time series database to look at data. For that case, when we push data into the time series, part of the technique we use is to not only include the raw values we receive from the plant floor: we also do our best to resolve context before we persist to the time series. So what we include is not just the raw data; we also include primary keys into our graph that we think are going to be convenient later for joining the context data that lives in the graph with the data set being pulled back from the time series.

So, for example, if we’re streaming data into the time series from live production, we would probably include the equipment ID, the conversion ID, the equipment class ID, a job response ID, a material ID – whatever context we know is in play at that moment in time. We would generally, if we can, push that data into the time series. Then, when we’re building dashboards and trying to work with that time series data, we might pull a data set back that’s only 20 rows long but is actually an aggregation of 10 million rows that live in the time series.
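A minimal sketch of what one “pre-contextualized” time series record might carry is shown below. The storage format and key names are assumptions; the point is simply that graph IDs ride along with the raw value.

```python
# One telemetry point as it might be written to the time series store.
# Only "temperature" is the raw plant-floor value; the tags are keys back
# into the graph, resolved at ingest time.
point = {
    "measurement": "equipment_telemetry",
    "time": "2024-07-01T09:15:02Z",
    "fields": {"temperature": 71.3},
    "tags": {
        "equipmentId": "LINE-2",
        "equipmentClassId": "FILLER",
        "jobResponseId": "JR-2024-0153-01",
        "materialDefinitionId": "WIDGET-A",
    },
}
```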

So we’ll use the performance of the time series database and the goodness of the time series queries we have access to, to traverse those millions of rows, doing all the aggregation and doing our filtering. What we end up with is a data structure that’s got the aggregations we’re interested in across only 20 rows, plus a bunch of IDs that allow us to then go and visit the graph database to pull all that interesting context out.

If we’ve got a material ID, then we can hit the graph database and pull the material description, and also things associated with the material – maybe the material class, and, if you’ve got a material class, maybe a property on the material class that defines the material supplier or something, though that’s probably a bad example.

But again, if we can use the power of the time series to bring back this smaller aggregated data set, we can then go off and visit the graph and collect all that context of interest again. Then we pull that back into the data structure, manipulate it, and put it on screen.
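A sketch of that inverse journey in Python: a small aggregated result comes back from the time series side carrying graph IDs, and those IDs are then used to fetch descriptions and class context from the graph in one request. The endpoint, query text, and shapes are assumptions.

```python
import requests

# Step 1: a small aggregated result from the time series side - a handful
# of rows, each carrying IDs back into the graph (shape is illustrative).
aggregates = [
    {"materialDefinitionId": "WIDGET-A", "avgTemperature": 71.4, "rows": 1_200_000},
    {"materialDefinitionId": "WIDGET-B", "avgTemperature": 69.8, "rows": 950_000},
]

# Step 2: use those IDs to pull descriptions and class context from the
# graph in one request, then merge it back into the small result set.
CONTEXT_QUERY = """
query Context($ids: [String!]!) {
  materialDefinitions(filter: { id: { in: $ids } }) {
    id
    description
    materialClasses { id description }
  }
}
"""

def add_context(rows: list[dict]) -> list[dict]:
    ids = [r["materialDefinitionId"] for r in rows]
    resp = requests.post(
        "https://datahub.example.local/graphql",  # placeholder endpoint
        json={"query": CONTEXT_QUERY, "variables": {"ids": ids}},
        timeout=10,
    )
    resp.raise_for_status()
    context = {m["id"]: m for m in resp.json()["data"]["materialDefinitions"]}
    return [{**r, "description": context[r["materialDefinitionId"]]["description"]} for r in rows]
```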

So we’ve got this inversion of context exploration that we’re able to do. I suppose what we’re really saying is that we’ve got a composition of two very different databases. We work hard with GraphQL federation to find a route into the time series through the graph, and we work hard with primary-key pre-contextualization within the time series to give us a convenient route out of a time series response and back into the graph to collect our context. In that second case, we avoid the need to duplicate lots of context data that would take up a lot of storage: we get the storage efficiency of the graph rather than duplicating data across the persistence layers.

And so yeah, that’s the whistle-stop tour, I think. 

David: It sounds like there are a lot of ways the Data Hub interacts with time series data. It could be raw. It could be contextualized as we push that data in. Or we’re bringing it back based on a query, and there might be something predetermined – we know we’re going to want this type of information, so we construct that ahead of time to take advantage of the efficiencies within it.

A lot of great stuff. Geoff, is there anything you wanted to add in relation to how we do time series data? 

Geoff: Yeah. Just one, and that is that this is not a new problem. To pick a phrase that was used, I think, in the last podcast or one of the previous ones: “Everything old is new again.”

ISA-95: THE KEY TO CONTEXTUALIZING TIME SERIES DATA IN A MANUFACTURING PROCESS

How we put time series data in the context of the manufacturing process is something that companies have been trying to solve for a long time. The key to it for us is ISA-95. Yes, there’s all the computer science magic that has to happen, and there’s a lot behind getting it to happen.

But the key is that ISA-95 is a bit like this fishing net. Rhize, as a word, comes from the term rhizome, which is an interconnected network of things. So you can think about that like a fishing net and imagine you’ve got the fishing net sort of laid out on your kitchen table. You can go and grab any of the knots and pick that up, and it kind of brings the net with you when you pick up any part of that net. That’s like a GraphQL query, bringing data connected through relationships with it when you pick that up. 

ISA-95 gives us this pre-laid-out fishing net of all the relationships and all the bits of data. But it also tells us which of those should have what type of time series data attached to them. I don’t mean which sensors; I mean, where does it make sense for an equipment property value to be attached to that fishing net, that big network of information? Because it lays that out for us, we can pre-build the relationships that make sense.

Then you can go and grab a part of that net and pull on it, and the time series data will come with it in the right place and end up in the right place in the JSON file that would come through. It’s this combination of how we do contextualization and how we do ingestion and how we manage the relationships. Then the API and the ISA-95 layer on top, it’s like this magic that happens when you get all the right components put together.

David: Excellent. So, we’ve certainly covered a lot here on the podcast. Believe it or not, we’ve been going for almost two hours now. So we’re probably going to break this into a couple of different episodes.

I’d like to get into some final thoughts. For me, it goes back to how we started the podcast series: if we’re going to talk about a Unified Namespace, let’s get it defined and figure out how we can apply it. And it almost seems like the Uber Broker, and then ultimately the MDH, is that logical progression of how we’re going to digitally transform our manufacturing processes.

Let’s get the data and get those data models built. Then we want to start building in some functionality so we can run functions and methods and queries and even add in some metadata – that’s the next logical progression.

And finally, we’re going to have this Manufacturing Data Hub that, for all intents and purposes, gives us this ontology to describe our manufacturing data. So, you know, to me it’s almost like all of it is important, and it’s all necessary. It’s just that logical progression.

The key is understanding that, given the relationships in the data, we can’t just have one of these pieces. We need to evolve and advance what it is that we’re doing. So, Andy, any final thoughts as we close out here?

Andy: I think the only thing I’d like to add is a general overview of Rhize and our use of ISA-95. When you talk to the ISA-95 consultants, they love ISA-95 because it allows them to express themselves. They’ve got OCD, these guys, it’s ridiculous. They really want to find somewhere to put the data. ISA-95 allows them to actually finish the job of doing that and get to the end to be able to say, “Yeah, this is analyzed.” It allows them to express themselves. That’s why on the consultancy side, we like that so much. And there are no limitations because our API provides for all of that. It provides for the whole ontology. So it’s really convenient for them. 

What we’ve also got on the technical or architectural side within our stack is a bunch of architectural building blocks. That’s broker, graph, time series, federation, microservices, rule engine, core, edge agent, BPMN, and some stream processing in there. As an architect with a specific problem to solve on a specific use case for a customer, that bunch of architectural building blocks allows you to express yourself as an architect to a fuller extent than I’ve been able to see with other systems. It also allows you to actually, from a project delivery point of view, understand the use case really well because the consultants can do their job on the ISA-95 side. 

You can put down a really finite architectural piece. Because of the way that the tools are constructed, you can plan out a project delivery program where you know what needs to happen. There are no mysteries. The computer science problems have been solved.

So we’ve got this convenience of expression for the consultancy and the manufacturing analysis side, and the convenience of expression from a technical architecture side, which has got a kind of breadth and depth that you don’t really see if you’re just going to consider an MES or a UNS or a partial piece. We’ve got a lot more tools in our box. We can wield them with care to bring about quite efficient solutions for specific use cases.

David: Yeah. Great point. Geoff, any final thoughts, comments you’d like to make? 

ISA-95 MAKES A DATA HUB A MANUFACTURING DATA HUB

Geoff: Yeah. Probably two. So, what makes a Manufacturing Data Hub a Manufacturing Data Hub and not just a data hub is an interesting question, and the answer is ISA-95. If it wasn’t for ISA-95, it would just be a data hub, not a Manufacturing Data Hub. It’s this incredibly rich, incredibly precise language that provides that data governance and consistency.

The other thing is we’ve talked about this toolbox of components, with its rule engines, stream processing, workflow, APIs, and all this good stuff. It’s possible to go and collect individual standalone components that do all of those different things. It’s even possible to assemble them into a solution. Given enough time and enough smart people, you can do that. But it’s really, really hard to do. And it’s even harder to make the end result usable and sustainable.

What we’ve had to do is actually not use a lot of the off-the-shelf components you could collect to assemble a solution, but purpose-build a lot of these things so they work closely coupled together – so they work as a harmonized solution platform that our customers can go and get and use, individual components and all. If you had to pull them all together yourself, it’s an unwieldy beast when they’re all individual, standalone components of the solution.

This is exciting – the problems that we can solve. You know, we talked about genealogy and traceability and things like that: the problems we talk to customers and companies about, which we’re solving now with the technology and the language we’ve got to describe these things.

The problems aren’t new. Some of them have sat there unsolved for a long time because the approach and the technology haven’t been ready to solve them. What we’re finding now is that we can get in and solve problems that were previously unsolvable because of the combination of the tools, the language, and the approach. That unlocks value for companies. That’s how companies can really take that step forward. It’s exciting.

David: Yeah. Agreed. We’re very deliberate about what’s here. As I first started exploring the Manufacturing Data Hub, or what we now call Rhize, I noticed we were very deliberate about the technology we were using. Until we had this robust technology as part of our platform, a lot of these problems were difficult. Now it’s easier. We’re not solving anything new; we just now have the ability, the programs, and the applications that we can stitch together to solve it.

So, with that, Geoff, Andy, thank you so much for spending some time here today. And to our audience, thank you. I know we covered a lot of information. I look forward to seeing you on the next episode of the Rhize Up podcast. Everybody, enjoy the rest of your day.

Geoff: Thank you. 

Andy: Cheers, guys.
