How Incident Management Supplements Data Governance

Matthew Blasa, a Data Scientist at The Home Depot, stopped by the MAD Data Studio to talk about how incident management as a part of observability and governance can affect various data areas. Learn how Matthew and his team tackle data observability, as well as different ways of approaching data governance on a small scale versus an enterprise scale.

 

Check out Matt’s Medium @ blaza-matt and his YouTube channel @ DataLife360

About Our Guests

Matthew Blasa

Data Scientist, The Home Depot

Matthew Blasa is a data scientist who describes himself as a person who lives to experiment and learn. He is constantly seeking out knowledge and is excited by the challenge of learning something new in the data science space. His most passionate topics include MLOps, machine learning, data quality and data governance. In his free time, Matthew enjoys mentoring people making the transition into the data science field, which in turn helps drive him.

Episode Transcript

DISCLAIMER: The following is the output of transcribing from an audio recording with the use of AI. Although the transcription is largely accurate, in some cases it is incomplete or inaccurate due to inaudible passages or AI transcription errors. It is posted as an aid to understanding the proceedings of the meeting, but should not be treated as a valid record.

 

Ryan: Welcome back to the Mad Data Podcast. My name is Ryan. I’m one of the hosts over here with Databand, and we also have an awesome person on the line. His name is Matthew Blasa. He is a data scientist over at Home Depot currently. He’s really into data quality and data governance and also incident management, which we’re going to talk about today. And you also mentor people, Matthew, which is really cool. You just told me that before we got on the line here. You don’t mentor them in video games or anything, do you? Or how do you do your mentoring?

Matthew: Oh, I’m awful at video games. I haven’t played them in years. No, it’s more like I just go through video chat, I talk to them a little bit, and we go over the stuff. It’s not like I give them the fish; it’s more that I teach them to fish. So I teach them more about ways to think about things and view them from their own point, their own perspective, because it’s more natural that way.

Ryan: I always like to hear when people are giving back and helping people out, and I think that’s really cool, especially in the data space. That space, I feel like, is always changing.

Ryan: I feel like titles are rapidly changing too. You know, you’re an analyst, then you’re an analytics engineer, and now you’re an ETL person. I mean, it’s just all over the place. I don’t know how many titles you’ve had in the past.

Matthew: A lot.

Ryan: Well, I mean, so today we’re going to be talking about how incident management supplements data governance. And that’s a big topic that’s kind of coming into today’s space. But what I want to do quickly, and we do this every podcast, is get a background on who we’re talking to. You know, tell me a little bit about yourself. Tell me how you got into the data space, what you’re doing today and what you’re passionate about.

Matthew: Yeah. So I’m a data scientist with over five years of experience in the data field. I’m usually very focused on analysis and helping improve governance processes for machine learning. A little bit on my past: I’ve made three switches, so the military, marketing strategist, and now data scientist. As a side gig, I focus mostly on helping mentor people, which is my big thing. I really learn a lot by teaching them, and I learn a lot from their questions. Lots of questions. Shout out to you guys, thanks for helping me out there. I learn more from them than I do from books.

Matthew: I am currently upskilling to be an ML engineer. On the side, I’m really passionate about reading, running and learning about history, especially military history, and philosophy. Those have really influenced my approach to data, how to respond to incidents and how to organize teams.

Ryan: That’s interesting. You talk about how the topics of military history and philosophy have kind of impacted your worldview of how you operate with data. What is a philosophy book you’re reading right now that’s kind of making those connections?

Matthew: I mean, there’s a ton. I don’t even know where to begin. But The Art of War and The Book of Five Rings are the ones that I always look at consistently. Those are pretty much the consistent two that I always go back to. They’re pretty much generalist, but they’re very flexible and you can really gain a lot of information out of them.

Ryan: Yeah. It’s interesting, and I’ve only read parts of The Art of War. I haven’t read the whole thing. I probably need to. I remember getting it when I was at a startup company called QASymphony and we were going to a sales kickoff, and everyone was given the little book, because it’s not that big of a book, right? It’s pretty smallish.

Matthew: Yeah. The Shambhala version, I’m guessing. I have that one too.

Ryan: Okay. Is there a bigger version? I’m assuming there’s...

Matthew: Tons of versions all over the place. Yeah, different sizes.

Ryan: But what I do know is it’s interesting how that same book can be applied to different areas of business. For example, we were talking about sales and marketing, and now, strangely, with you, it’s in the data space. I think that’s just really cool. A couple of podcasts ago, the person I talked to and I were talking about books that were shaping our lives. One of the other books that was pretty substantial in his life was on the art of negotiation, Never Split the Difference. He was talking about how it applied to the internal politics around data contracts and being able to negotiate as a part of that process to make sure everyone’s on the same page.

Ryan: So it’s just cool to see how different books that may be viewed as only for business management or whatever actually translate to these different fields. I think it’s really cool.

Matthew: Yeah. Yeah, definitely.

Ryan: So I know we had talked about this topic today, specifically around how incident management kind of supplements governance, and obviously you’re really passionate about data governance. You talked about that as kind of one of your specialties. What attracted you to the data governance aspect as a data scientist starting out? What made that so exciting to you?

Matthew: It was kind of an accident, actually. What happened in some of my previous jobs was that I would be building out a model. And of course most data scientists end up doing half data engineering, but it ended up being a case where I would not be spending as much time building a model; I’d be spending more time doing the data engineering. And that led to data quality and data governance. For data quality, it was just, am I getting the right inputs, and is this the right data that the end user could use? And it got to governance where it’s like, okay, not only do we have to have the correct formats and the correct outputs, we have to have someone responsible for it. We also need to have someone responsible for how processes are written and how data is deleted. There have to be deletion rules, and we need to know what the business logic behind the tables is.

Matthew: Data governance just looms really large. It’s not a new field, but it’s evolving quite a bit due to all the technology changes in the cloud that have been happening.

Ryan: For the governance side too, I’m always curious to see, if you had to come up with one sentence or a mantra that you view data governance through, given that you’re talking about how it’s changing constantly, how would you describe data governance?

Matthew: I would say data governance is a set of rules and responsibilities required to make sure that data is accurate, trusted and reliable. Data quality actually fits under that, and data management also fits under it to a point. So those two kind of overlap into the space, but they’re not part of the space. Governance is not only the rules, but the processes needed to make sure that those rules are complete. It gives order to the data.

Matthew: So, I mean, ethics: you can’t build machine learning models off certain data. Privacy: you can’t include certain data in a machine learning model or in an analysis, or if you do include it, how do you anonymize it? Those sorts of rules. Or even what the standards for the data are: what’s considered a standard for the data by one part of the business, like the marketing department, and what’s the standard for the sales department, and being able to define what that actually means without being too rigid.

Matthew: For me, sometimes data governance goes back to the idea of, in legal terms, what’s a good law? A good law is something that’s not too rigid, but not too broad. It’s a framework, not a rigid definition of it has to be this, but a framework that is defined by individuals in the organization, that they come up with and that sets precedent.

Ryan: Okay. Well, I appreciate that. I know you gave a longer answer instead of just one sentence. That’s okay. I like your answer anyway.

Ryan: I know one of the things we had talked about initially for the podcast was that you’ve been through different phases of data governance growth, whether it was small scale or enterprise. People listening to this podcast may be like, hey, Matthew, I don’t work at an enterprise. I’m at a small startup right now. What nuggets can I take away as a startup? Or, from the enterprise side, what things am I lacking in my enterprise governance planning? Maybe I’ve gotten my head in the clouds, so to say, and I’ve forgotten some of the roots of governance that I need to get back to. Would you like to talk through how you view those different levels, from the startup side to enterprise scale, for governance?

Matthew: I mean, governance can really depend on the organization. Right now we’re in kind of a phase, I feel like, where governance is still being defined, like how it’s supposed to be done, and people are still kind of testing it out. So there is that caveat there.

Matthew: But as far as small scale versus enterprise scale, for the small scale, it’s more the responsibility of the individual contributor. If you mean data governance in a ten-person startup, well, data governance at that scale is the responsibility of either the data engineer or the data scientist, with them going back to the founders, or working with whoever is responsible for the data, and making sure: okay, is this data correct? Is that data correct? Is this the correct format? So it’s more of an individual thing, I found, at a smaller scale. Data governance does exist even in startups right now, but it’s more individual contributors doing what they can. It’s not defined roles.

Ryan: If you had to, let’s say, take that ten-person team, right, would you say that somebody needs to lead this? Which department, or which person in a department, do you think should lead that?

Matthew: Ideally it should be the person who can sit in the middle. In general, it’s usually the data engineers or the people ingesting the data who are heavily responsible for the data governance, which 100% makes sense. I mean, they’re touching the data, they’re looking at the data, they’re the ones ingesting it and making sure that it’s correct and that private information is being hidden, among other things.

Matthew: Now, if you want to really make data governance fly, I feel like you have to find someone who can sit in the middle, someone who can sit between the business users and the engineering users. Because the one thing I found out with governance is that if you have pure engineers in there, your governance will start to end up reading like a CASE statement in SQL. It will just be pure technical logic, which is not wrong. But a good governance document is able to be read by the tech people and also by the business. So you need someone who can translate between the two.

Matthew: And I found out, being a governance analyst in my previous roles, that you end up being a translator. You often talk to the business, and it’s like, hey, what is this technical term? Or you’re going back and forth with the technical teams and trying to break that down into simple English that the founders or the business users can use. So there should be someone in the middle who can sit on both sides somewhat and can understand both. They’re a translator, essentially, at a small level and at a large level.

Ryan: That’s helpful. I appreciate that. It’s interesting, you initially were kind of like, hey, if you had to pick one person, maybe it’s the engineers, but then really it’s the person who can sit in the middle to do the translation, because maybe the engineers aren’t the best at translating the technical stuff. I don’t know if that’s true or not.

Matthew: It depends. It’s very individual. The key for a good data governance analyst is that they have to be very process oriented, detail oriented and ask tons of questions. Usually the people I’ve found who come from a technical background make really good data governance analysts, because sometimes the business will say, okay, this is a sale. A good data governance analyst, regardless of whether they’re at an enterprise or at a startup level, is going to ask: okay, can you define that? Where does this source from? Where are you using this in metrics? What are your use cases? How are you doing this? What is the process to obtain the data? And are you doing it in a way that’s legal and ethical? They’re constantly asking questions. That’s a very key part for data governance analysts, along with being very process oriented.

Ryan: Awesome. Okay. So you mentioned what you see at the small scale. When you get more into the enterprise scale, what starts to change? I guess maybe there’s a difference between small, which is under a hundred people, then maybe 100 to 500, and then 500 plus. Maybe 500 plus is more of the smaller enterprise, right?

Matthew: So the larger you get, the more specialized the roles start to become. At a smaller level, you might have the data engineer and the data scientist doing the governance, asking about the business logic and making sure of everything else.

Matthew: When you start to go to a medium company, like when I was at a medium company, they started to have a specialized data governance analyst, who was only responsible for the documentation, explaining the technical details, and interviewing end users or executives on how that data is actually being used, and putting it into a document that really details the strategic picture, the business logic, the tables and any data models or machine learning models that are being used. So it gets more specialized.

Matthew: Larger organizations might have a data governance manager who has two data governance analysts under them whose jobs might be different. One person just focuses on machine learning governance, or machine learning models, which is a really specialized form of governance, by the way. And one person might just be in the normal data governance role: trying to pull in the data, trying to explain data models, trying to explain tables, dealing with table lineage, that sort of thing.

Matthew: So it gets more specialized. And at a certain level, and I haven’t seen this one yet, you’ll actually have a person who’s a dedicated data steward, whose whole role is just working with that, or a data governance team where it’s two data governance analysts, a quality analyst and a data steward who’s responsible for all the data in that realm.

Ryan: When you talk about a quality analyst, could you describe what that is? Because I know we could talk about data quality and technical data quality. What would that be? Would they really be called a data quality analyst?

Matthew: I’ve seen a data quality analyst, yeah. These people would usually be building out data dashboards. I’ve done this myself, as a half quality analyst. They’d be building those dashboards, and they’d also be working with the business and the governance analyst to refine down the conditions of what counts as quality for each department, so that the data is considered reliable and accurate.

Matthew: So for example, you go to marketing and you ask marketing, hey, what are the quality standards for you for a sale? It has to have this logic, it has to be in this format and it has to be a certain character. That’s a lot easier. And they also put larger constraints in that regard, where it’s like, okay, what percentage of the data is acceptable to be null, or not to be in the correct format? So your threshold might be, oh, 75% needs to be in the correct format, or 80%.

Matthew: Now that changes when you go to another department like finance or accounting, where the threshold has to be close to 100, because that’s money that they’re dealing with. So the governance analyst and the quality analyst work together to refine that down, set the standards and document them.
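The department-specific thresholds Matthew describes above can be sketched in a few lines of Python. This is only an illustration, not anything from his team: the names `QualityThreshold`, `valid_fraction`, `meets_threshold` and `is_sale_amount` are hypothetical, the 0.80 figure echoes the marketing example from the conversation, and 0.99 is an assumed stand-in for "close to 100" in finance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityThreshold:
    department: str
    # Minimum share of rows that must be non-null and correctly formatted.
    min_valid_fraction: float


# Hypothetical thresholds agreed between governance and quality analysts.
THRESHOLDS = {
    "marketing": QualityThreshold("marketing", 0.80),
    "finance": QualityThreshold("finance", 0.99),  # assumed value for "close to 100"
}


def valid_fraction(values, is_valid):
    """Return the share of values that are not None and pass is_valid."""
    if not values:
        return 0.0
    ok = sum(1 for v in values if v is not None and is_valid(v))
    return ok / len(values)


def meets_threshold(department, values, is_valid):
    """Check a column against the threshold documented for a department."""
    threshold = THRESHOLDS[department]
    return valid_fraction(values, is_valid) >= threshold.min_valid_fraction


def is_sale_amount(value):
    """Illustrative format rule: a sale amount is a non-negative number."""
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    return is_number and value >= 0


# Ten sample rows: one null and one badly formatted value.
sales = [19.99, 5.00, None, 42.50, "n/a", 12.00, 7.25, 3.10, 88.00, 1.50]

print(valid_fraction(sales, is_sale_amount))
print(meets_threshold("marketing", sales, is_sale_amount))
print(meets_threshold("finance", sales, is_sale_amount))
```

The same column passes the marketing threshold and fails the finance one, which is the point of documenting thresholds per department rather than applying a single global rule.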

Ryan: So there are basically certain error thresholds that you’re willing to accept, within ranges based on the different departments. Because like you were saying, finance needs to be dead-on accurate. For most other things, you know, we may have some give and take in there. That’s okay.

Matthew: Right. And it’s interesting you mention that, because there’s also the opportunity cost of doing that. Let’s say your governance analyst is being paid X salary and your quality analyst is being paid Y salary. I mean, how much work time are they costing when they have to go after a certain thing? So the thresholds exist to maximize not only the amount of work but also the budget of the department that has them. There’s a really big opportunity cost and a tradeoff for that.

Ryan: I was talking to somebody recently on a radio show and they were talking about how they really push for SLOs versus SLAs within groups. SLOs being, and I’ve heard some of you use this as well, service level objectives versus service level agreements. So kind of like that threshold that you were talking about, they make these objectives. Is that kind of what this is like?

Matthew: I feel like the service level agreements, service level objectives and data contracts, which is a new thing that’s coming out, are the next stage of this. Because when I was doing this before, it would be haphazard, all over the place. There wouldn’t be a hey, Department B, your data needs to be in this format. It was more in the documentation of, okay, this is what it is and this is what we agreed on in the last Data Governance Council meeting.

Matthew: So the service level agreements are a step forward, I feel like, because that’s really useful for codifying it. You can hold people to it, and not only that, you know what the scope of your work is in the long run.

Ryan: An SLA is more customer facing: hey, we’re going to give you 99.9% uptime, blah, blah, blah. Right. SLOs are internal, where it’s like, okay, getting to your data contract thing, hey, we’re going to set some sort of objectives that we all agree upon within these different organizations, from engineers to analysts, scientists or whatever, that we’re going to measure our internal KPIs against, which I think is really important. I think that’s a really good way to step forward and get everyone’s buy-in that these are the objectives we’re going after.

Ryan: And it feels more team oriented. You know, language is important in data and everywhere, right? But the change from agreement to objective, I think, is an important thing for getting the teams together and breaking down silos: we are agreeing on this shared objective for the entire business, you know?

Matthew: Right. And I mean, it’s a good thing, because data is a little bit different than, let’s say, software, where the output is the most important and it’s very black and white. In data, your output can be gray sometimes. You might get a machine learning model that doesn’t perform quite well, but you fulfill the minimum of what an objective is. And the product-focused mindset, I think, like what you mentioned with the SLOs, is a pretty good way of managing the data.

Ryan: Sweet. Okay. So we talked a little bit about governance, a little bit about how you see it from startup to enterprise, and a little bit about contracts and having some objectives we can all agree upon. One of the things we were mentioning before the call was data observability and data incident management, and you were talking about how that was a new area that you were getting more and more involved in as it related to data governance.

Ryan: So could you talk a little about how you view data incident management as a part of observability and also the overall governance? What’s your perspective on that, and what should companies be focused on there?

Matthew: So, for incident management, it’s very similar to what I did a long time ago, when I was a software intern. Incident management is basically that when a problem happens, it pings someone on call, and they’re able to triage the response, get a sense of the problem and root-cause it as best they can. Then when the workday starts, they can report the issues and figure out what to do from there.

Matthew: For data, it works similarly to software. The way I got to it was from my background in the military. In the military we have something called a quick response force. A quick response force is always ready to go at a moment’s notice. They’re always ready to respond to any incidents that occur. That’s very similar to how I think about this.

Matthew: Good incident response begins with effective communication and an understanding of what you’re supposed to do in a situation. That means you have to create standard operating procedures or playbooks: what do you do in these common situations? And if there’s a situation that’s not common, what should you be documenting? This way, the individual can respond a lot quicker, they can be a lot more flexible, and they can document the problem so that when the workday starts, at least they can determine the steps to root-cause it and the next steps to address it. Because no two incidents are the same, even though they fit within that.

Matthew: Besides having standard operating procedures and good communication, there needs to be a process that they’re used to and drilled on. You can’t have SOPs or playbooks for incident response without having drilled or simulated them, so that the person who’s on incident response can at least diagnose the data problem. That’s really important. And this also assumes that you can even check it. You can’t do incident response without good data quality checks.

Matthew: You have to have checks along the pipeline from ingestion all the way to output, whether that output is a dashboard or a transformation of data. There have to be data quality checks. Without that, incident response will fail, because incident response requires you to be systematic. Assuming that the infrastructure is correct and there are these checks, the person who’s on incident response can check the tables all the way upstream. So if a dashboard is the endpoint, they’ll check the dashboard. Okay, dashboard’s fine. Okay, ingestion to the dashboard, fine. API endpoint to the dashboard, fine. And they keep going until they get to the source. That’s why this infrastructure is necessary to even be able to do the incident response. You’re looking at the lineage, you’re looking at the code, you’re looking at the data, and once you do that, you look at the operational environment.

Matthew: And the whole time that the incident responder is doing this, they’re noting anomalies and oddities that are in there. So when the next person comes in, let’s say at the beginning of the day or later in the day, that person can double-check with them. Incident response isn’t just, oh, hey, I’m one person, I’m going to do all this stuff for you. Incident response means there’s a person responding, getting all the data, then trying to figure out what the potential problems are, then working with another person who’s more knowledgeable, like a senior data governance analyst or a manager, to really pinpoint the problem and zero in on it.

Matthew: Once those two have done that, they can describe the problem. And when they go to the stakeholders, who have probably noticed it by this time, they can say, this is the problem, this is what we’ve found, and these are the next steps we’re taking. Whereas with the normal response that we have now, your data pipeline fails and you’re just like, let me go check into that, which is really unnerving for stakeholders who rely on your end use, whether that’s a machine learning model output or a dashboard.
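The upstream walk Matthew describes earlier in this answer (dashboard, then ingestion, then API endpoint, back to the source, noting anomalies along the way) can be sketched as follows. This is a minimal illustration under stated assumptions: the lineage map, the node names and the functions `walk_upstream` and `most_upstream_anomaly` are all hypothetical, each check is reduced to a callable returning True or False, and treating the most upstream anomaly as the first suspect is a simplification of this sketch, not something stated in the conversation.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical lineage: each node maps to the node directly upstream of it.
UPSTREAM: Dict[str, Optional[str]] = {
    "dashboard": "dashboard_ingestion",
    "dashboard_ingestion": "api_endpoint",
    "api_endpoint": "source",
    "source": None,
}


def walk_upstream(
    endpoint: str,
    checks: Dict[str, Callable[[], bool]],
    upstream: Dict[str, Optional[str]] = UPSTREAM,
) -> List[Tuple[str, str]]:
    """Run each node's quality check from the endpoint back to the source.

    The walk does not stop at the first failure; every node is recorded so
    the notes can be handed to the next person who picks up the incident.
    """
    notes = []
    node: Optional[str] = endpoint
    while node is not None:
        passed = checks[node]()
        notes.append((node, "fine" if passed else "anomaly"))
        node = upstream[node]
    return notes


def most_upstream_anomaly(notes: List[Tuple[str, str]]) -> Optional[str]:
    """Return the anomaly closest to the source, or None if all checks passed."""
    anomalies = [node for node, status in notes if status == "anomaly"]
    return anomalies[-1] if anomalies else None


# Stand-in checks: in practice these would query tables or monitoring tools.
example_checks = {
    "dashboard": lambda: False,
    "dashboard_ingestion": lambda: False,
    "api_endpoint": lambda: False,
    "source": lambda: True,
}

notes = walk_upstream("dashboard", example_checks)
for node, status in notes:
    print(f"{node}: {status}")
print("first suspect:", most_upstream_anomaly(notes))
```

Because every node is recorded rather than only the first failure, the resulting notes double as the handoff document for the senior analyst or manager who reviews the incident later.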

Ryan: Yeah. So you did a really good job describing the different layers of incident response. I mean, when we’ve had conversations about this with people in the past, the first thing to understand is: are your data processes running? Like the pipeline problem: if that’s not running, there’s a problem, and nothing gets done. Like you’re saying with your ML model and things like that, it’s consuming the data, and you can’t send me anything because the pipeline is not running. But the pipeline running, that’s table stakes for making sure things work.

Ryan: Then you get into: hey, is the schema correct? Are there any null records sent through, any data set problems? And then you’re talking about the lineage perspective of upstream and downstream, how all these impact one another.

Ryan: So I think you hit the nail on the head in terms of it’s really about detecting, triaging and resolving data incidents as soon as you get them, versus waiting until somebody down the road tells you, oh, something’s wrong here, possibly. Go figure it out. Go re-engineer how it all happened.

Matthew: Right. And the most important part I forgot to mention is the mindset that you go into it with, because when things fail like this, it’s very easy to blame everybody. It’s like, “Oh, you didn’t do your job,” or “Hey, that’s your pipeline, it’s yours.” I mean, you have to go into it with the idea that it’s a learning experience and try to understand what the facts are, because not everyone engineers a pipeline correctly. I mean, I’ve engineered a few crap pipelines myself, embarrassingly enough.

Matthew: But really look at that, and then also use it as a way to think about your readiness for future incidents. So one thing we did in the military was an after action report. Let’s say you’ve fixed the pipeline, you found the problem, everything is done. That’s the time to do a quick after action report: call a 30 minute meeting, explain what was wrong and what was done wrong, and then put it in Confluence or somewhere and say, “Okay, this is what we changed in our operating procedure.”

Matthew: And if it’s significant, like if the turnaround time or your data downtime was really bad, that’s a great time to go and update your standard operating procedures and your playbooks, because that will save the next person who’s on incident response a lot of time later. And also, if it’s really bad, you’ve got to revisit your service level agreements, like you said, because it might be outside the scope of both. And I’ve seen that happen once.
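The after action routine Matthew describes (write down what went wrong, record what changed in the operating procedure, and escalate to playbook and SLA updates when the data downtime was bad) could be captured in a small record like the one below. This is a hedged sketch, not anything described in the conversation: the `AfterActionReport` class, its field names, and the single `sla_limit` threshold are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class AfterActionReport:
    """Hypothetical record filled in after an incident is resolved."""
    pipeline: str
    detected_at: datetime
    resolved_at: datetime
    what_went_wrong: str
    procedure_changes: list = field(default_factory=list)

    @property
    def downtime(self):
        """Data downtime: time between detection and resolution."""
        return self.resolved_at - self.detected_at

    def follow_ups(self, sla_limit):
        """List follow-up actions; escalate when downtime exceeds the assumed limit."""
        actions = ["record the report where the team can find it"]
        if self.downtime > sla_limit:
            actions.append("update standard operating procedures and playbooks")
            actions.append("revisit the service level agreement")
        return actions


if __name__ == "__main__":
    report = AfterActionReport(
        pipeline="orders_daily",
        detected_at=datetime(2024, 1, 1, 8, 0),
        resolved_at=datetime(2024, 1, 1, 14, 0),
        what_went_wrong="upstream schema change was not communicated",
        procedure_changes=["add schema check before load"],
    )
    for action in report.follow_ups(sla_limit=timedelta(hours=4)):
        print(action)
```

A single threshold is used here only to keep the example short; a team could just as well use separate limits for updating playbooks and for renegotiating the SLA.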

Ryan: Yeah, well, maybe the SLA that was put together was just too lofty of a goal. I mean, when I was at this company called Tricentis, one of the things that we always talked about was, you have development and you have testing, software engineering and test automation and quality. Just like governance, software quality is everyone’s job. It’s not, “Hey, the Jenkins pipeline completed, it was all green, that means nothing went wrong.” Well, just because it went green doesn’t mean that everything is good, because I just found these ten bugs on my side. And so, yeah, maybe they did 80% of the testing that needs to get done. Well, that’s way better than 20% of the testing. So I do think it’s really important.

Ryan: You mentioned that this incident response thing is meant as a learning activity. It’s not, “Hey, go fix it,” and then you don’t understand how to make your upstream pipeline better the next time. Right? You want to be able to take it and go, “Yeah, we want to help make this a lot better next time.” We don’t want to just fix it and forget it. We don’t want to keep patching holes on a ship. We want to be making the vessel more sturdy over time, with a new chassis and all kinds of stuff, right?

Matthew: Correct.

Ryan: You know, I have mentioned that incident management is kind of a part of data observability. Do you agree with that statement?

Matthew: I do. I completely agree with that, because incident management is a reactive response. It’s not a proactive response. Proactive responses are, you know, infrastructure, good definitions, that sort of thing. But you still have to put something in place in case everything goes to heck. So it’s definitely a big part of data observability. It’s something that you don’t want to happen, but something that is necessary to ensure the quality of the data and the credibility of the data that you’re using.

Ryan: Yeah. Talking about the proactive part of it, I was just talking to a major credit card and payment provider recently. They’re saying, “Hey, we’re setting up all this stuff on Airflow, and we’re setting all this stuff up in the cloud. It’s going to be great.” And he’s like, “But I know that something’s going to go wrong. As I scale, I know things are going to break, and when they break, I want to know how to go fix it right away. I can’t wait for problems to occur before I have an observability layer over top of it.” And so that was what they were looking at: either build their own or look at a solution for it.

Ryan: But it was just interesting that he basically said, “I’ve been there, done that.” Just like you’ve said you’ve messed up coding a data pipeline before, he knows that when these data pipelines get up and running, they’re not always going to be running 100% perfect. And he wants to be able to understand, when things do break, how to go fix them and how to resolve things. So he’s looking at implementing something, either doing it themselves or with somebody else, to help him out as they scale the number of pipelines they have.

Ryan: Okay. I think we’re pretty much up on time, man. This is a great 30 minute podcast that we had together. I wanted to ask, though, before we get off: what’s one thing that you want people to take away from what you talked about, so they can remember one thing? What is it?

Matthew: Process. I mean, process, process, process. I know there are times when you need to have a quick turnaround for things, but once in a while you still have to encode process into it: process that isn’t so overly focused that it becomes bureaucratic, but process that exists so that it can deal with not only data incidents, but also make sure that the data can be trusted and is credible for the end users. So process is a very good foundation to start with.

Ryan: Process with a purpose, I’m sure.

Matthew: With a purpose.

Ryan: That’s probably a catchphrase at a company somewhere. And how do people connect with you? You have LinkedIn, you have Substack. Are you going to any conferences?

Matthew: I wish I could go to a conference right now. Right now I have a LinkedIn. You can search for me; I’m pretty easy to find on there. I also have a Medium, which I’m going to start writing on, and I will be starting a YouTube channel about more of these unusual data topics called DataLife360, which is going to be coming out in the next month.

Ryan: Sweet. That’s awesome. Well, let us know whenever it’s published and I’ll put it on our channels to promote it some more. So definitely let us know when that comes out.

Matthew: Of course.

Ryan: Well, thanks, Matt. I really appreciate you coming by the Mad Data Studio, and we will talk soon. Thanks so much for coming on.

Matthew: Thank you. Really appreciate it.