Lights On Your Dark Data Quality Issues

George Firican is an award-winning data governance leader, founder of LightsOnData, and podcast host of the Lights On Data Show. George swung by the MAD Data studio to talk about what actually makes a company data-driven and why companies need to be more aware of their data quality issues.


About Our Guest

George Firican

Founder LightsOnData

George Firican is the founder of LightsOnData Consulting & Training, focused on providing informative content such as online courses, templates, guides, best practices, articles, white papers, and other useful resources to help others with their data governance and data management questions and challenges. He’s also the host of the Lights On Data Show, a podcast and LinkedIn Live weekly show that provides insights into the data world and showcases the people who help shape it. In addition, George is an international keynote speaker, conference speaker, and panelist who has spoken at over 60 different conferences worldwide. He loves being on camera, hiking, surfing, filming, and video editing.

Twitter: @georgefirican

 

Episode Transcript

Josh: Hi everyone, welcome to the MAD Data podcast hosted by Databand. I’m Josh, co-founder and CEO at Databand, where we’re helping data engineering teams deliver reliable data to the analysts and data scientists that depend on them. Today I have a great guest: George Firican, founder of Lights On Data. Excited to have you on the podcast today. George, would love to hear a little bit about yourself; if you could intro yourself to our audience, that would be great.

George: My pleasure. And hello, everybody, and thank you Josh for having me on your podcast. It’s really a pleasure to be here. So like Josh mentioned, I’m the founder of Lights On Data. At heart, and in practice, I’m a data governance and BI practitioner, but through Lights On Data I also try to deliver a lot of educational content just to get people up to speed on what data governance is and the other data management areas, data quality and whatnot. I do this through online courses and YouTube videos and articles and free templates and things like that. And I also have my own podcast, the Lights On Data Show, where we also try to put the lights on different data topics.

Josh: Amazing. Well, it’s good to have you on the podcast today. There’s a certain area you cover around data quality that I’d really like to focus on for the conversation today, because it’s really core to what we do at Databand: helping teams ensure better data quality. But seeing all the different areas you’ve put a lot of thought and attention into, I think there are some other interesting topics we could dive into as well. A good place to start would just be to understand how you got into data in the first place. What brought you into the data economy?

George: Great question. Well, I used to be a programmer, and programming obviously means interacting with databases, so I think that was maybe my first tap into the world of data, though I didn’t quite understand its implications for the end user. Then I got into a business analysis and project management role and I got to interact a lot more with the clients we were building products for. That’s where I could really understand: listen, we need to understand the use of that data so that we can capture it properly and give it back in reports in a meaningful way. So I think that’s when I really started to understand the impact of the analysis that needs to be done and the requirements that need to be gathered, and how that affects how the data gets collected and afterwards maintained in the system we’re programming. That got me into the world of data quality, data governance, and data management, and then I made the jump and stepped into a role as data quality manager, and afterwards data governance director.

Josh: What do you think are some of the biggest differences in the data space today versus when you started out?

George: I like to think that there is a lot more awareness of what goes into managing and maintaining this data. From what I’m seeing, a lot more companies are starting to put their lights on it and actually having roles that are dedicated to data management, whereas before I felt that different people would just wear different hats: one would say, sure, let me do this data entry, but then not have to worry about it afterwards, while somebody else was a little more involved and would try to maintain its quality. But I think now there’s a lot more focus on policies, standards, and procedures. Maybe it’s because of privacy concerns, maybe because of GDPR and all the other legislation we need to abide by, and because otherwise we’ll be fined a lot of money for not adhering to those regulations. But at the same time, looking at companies that have started to invest in data management, data quality, and data governance, we can see that they’re some of the leading companies that are data driven. They have an advantage compared to companies that have not embarked on that journey yet.

Josh: We talk a lot about companies that are data driven at Databand, because it’s a really important qualifier for us when we’re selling our products. We want to go to companies that are really invested in their data, and it’s become a bit of a cliché to say we’re riding this trend of data-driven organizations. I’m curious, in the companies that you’ve worked with, are there any commonalities or patterns that you see in companies that are really data driven? What are the ingredients or signals that you look for that add a little more substance to that notion?

George: First, I want to preface this by saying that I’ve interacted with quite a few companies out there saying they’re data driven, but they’re not really, when you actually look at the practices they have or haven’t adopted. You can see: you want to be data driven, you say you want to be data driven, maybe because you heard at a conference that you have to be, but you’re not really investing in those efforts, you’re not really investing in those roles and responsibilities. On the other side, when you look at those that are truly data driven, I think one differentiator is that it needs to come from the top. That support needs to come from the C-level individuals in the company: they see the value of it, they’re behind it, they’re not second-guessing it, and they’re really investing resources into it. So I think that’s the biggest differentiator that I’ve seen in these companies.

Josh: And is there a point in time where you can say, OK, you are data driven? What makes a company cross that threshold?

George: That’s a very good question. It also depends on the magnitude and scale of the company, but when you realize they have some sort of data analytics or data quality piece in the job description for almost any role that interacts with data, even if it’s from a consumption point of view, then you can get a sense that, OK, culturally this company is really data driven, because data, and data quality in particular, is everyone’s responsibility. Because they’re acknowledging it as part of everyone’s responsibility, you’re like, OK, they care about this stuff. They’re putting their money where their mouth is.

Josh: So more organizations becoming data driven is definitely a trend that I think we’re all capitalizing on in the market, and it gives us the space to deliver our products and services. Are there other trends in the data space that you’re following closely or that you’re excited about?

George: Well, I think the whole blockchain piece is trying to disrupt the data science world a little bit as well, along with data security and data privacy. It’s an interesting field to watch, to see how it will evolve and what it means to govern data within the blockchain. For example, there are organizations, governments in particular, that are investing resources into trying to deconstruct the blockchain and find different pieces of data and metadata about it that technically should be hidden or unavailable. So I think there’s a completely different field being created just out of the blockchain.

Josh: Interesting. Have you seen any really compelling use cases in particular with that technology?

George: Yeah. I think privacy and security are what’s emerging out of it right now. Instead of having an API, let’s say, and a social login, you can have the blockchain also authenticate you as a trusted user of an application, or authorize you to have access to that data. Or when you’re actually transferring a piece of data through the blockchain, you know it won’t be able to be modified after the fact. So once it’s been cleaned, validated, and trusted, it’s sent out there and transferred, and you know, through the blockchain technology, that it will not be manipulated along the way.

Josh: So it’s about securing access and usage of data resources that a company already has, is that right? Interesting. Would that lead to new applications or products using blockchain to lock down that data? Or are you also seeing interesting use cases around analysis of the blockchain itself?

George: Yeah. As I mentioned, I think governments in particular are very interested in analyzing blockchain data, understanding its ins and outs, and making predictions based on it, because most of these blockchain applications are still in the financial sector, dealing with cryptocurrency. It’s something governments in particular are very interested to find out more about. And maybe, in a sense, the financial sector, the banks, are a little bit threatened by it, and they also want to learn from it and see how they could capitalize on that system and use it within their own transactions.

Josh: Interesting. Another topic that you’ve written about, which caught my attention, was the concept of dark data, which I take it you can define for us. I’m an astronomy enthusiast, and that might be why it stood out to me, but I’m really curious about this idea and would love to hear more about it.

George: I mean, I can’t take credit for it; I’m not the one who came up with the term. I think it came from Gartner the first time I saw it, and IBM also has a definition for it. But to put it simply, dark data is any data that an organization acquires through various processes and just stores during its regular business activities. And here’s the critical thing: this data is not used in any manner to derive insights, decisions, or monetization. So it’s data that just sits there in your systems, in your cloud, and it’s not being used at all.

Josh: Interesting. It reminds me of the concept of tech debt in a software application, where you have engineers who may have written too much code around a set of features that don’t turn out to be super useful, or that just end up making more important functionality harder to manage, and it accumulates this weight that acts like a liability on the balance sheet of an organization’s software engineering efforts. Is that a good parallel for how to think about dark data as a problem?

George: It’s definitely one of the issues, for sure. Yeah.

Josh: So what are the main issues with this? Why is it such a bad thing to have a bunch of data that’s not really being used? I think there was, and probably still is, this idea alive in a lot of companies that you want to be storing all of your information, that you want to create a big data lake of all the data you can possibly capture. And it’s so cheap to store things these days that it’s OK if it just sits there totally unused; maybe one day it turns out to be important. So why should we care about the issue of dark data?

George: There are various reasons. Sometimes within dark data you don’t even know what type of data you’re storing. It could be personal, PII information that you’re storing and you don’t even know, because you haven’t even bothered to look at it; it’s just stored there, you know, just in case. Things like call center audio logs, but also the audio conversations themselves, and within some of those audio conversations you might have personal data being recorded unbeknownst to you. That’s definitely a liability if that data ever gets out there and gets accessed by people who shouldn’t access that information. It’s a risk for companies to store that data if they’re not even using it. And like you said, data storage is very cheap, so that’s one of the main reasons we keep on storing this data: why not, it doesn’t cost us a lot of money. But if you do add up the storage costs, it actually comes out to real savings if we wanted to get rid of it. I know there were some studies, some surveys done, and it could range from, you know, $10,000 to maybe $50,000 or more in storage that we could get rid of for data we’re not using at all, and that’s a significant saving a company could have. Another issue is, of course, the opportunity cost. We’re storing data that we’re not doing anything with. If we were to analyze that data, if we were to actually manage it and use it, maybe we could derive some interesting insights about our customers or about our own internal metrics and what have you. So that’s definitely a cost we’re losing out on for data that we’re not using.

Josh: Interesting. So there’s a liability aspect of data that you might be saving which you shouldn’t be saving, and definitely don’t want anybody getting access to. There’s the accumulated cost: individually it’s really cheap to store, but once it accumulates it can get pretty expensive. And then you mentioned the opportunity cost of not using data that you otherwise should be using.

George: And we’re seeing this a lot in organizations that still have a siloed approach: one department might actually make use of data that another department is capturing, but they’re not even aware that the other department is capturing this data and not using it. So in essence, it becomes dark data that another department could actually benefit from.

Josh: So what would your approach be? How do you help organizations take inventory of all this information and decide which of those problems they need to solve? When you go in, is most of your focus around putting the lights on this dark data, or around cleaning it out, deleting it, and saving costs? Where are you usually prioritizing attention?

George: Yeah, I think you need to have some sort of dark data policy, let’s say, or really, as part of your data policy, to review any data coming in and then have clear rules for data retention. And whatever you’re not going to use anymore, get rid of it. A simple example is unstructured data: versions of old documents or presentations that you’re not going to use anymore, that are just there because you created them. Like you said, storage is cheap, but really, is there a need for us to keep them? So if we make it part of policy and then incorporate it into our regular processes when we’re creating new data and when we’re maintaining it, and we have a retention, archival, or destruction policy, we can get rid of it. That’s a really simple way of reducing some of the dark data that we’re holding at any time.
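To make the retention and destruction piece concrete, here is a minimal sketch of what such a rule could look like if, purely as an assumption for the example, the unused exports lived in an S3 bucket. The bucket name, prefix, and day counts are illustrative, not anything specified in the conversation:

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical retention rule: archive unused "dark" exports after 90 days,
# then delete them after a year. All names and numbers are assumptions.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-company-data",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "retire-unused-exports",
                "Filter": {"Prefix": "exports/"},
                "Status": "Enabled",
                # Move rarely used objects to cheaper archival storage first...
                "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
                # ...then expire (delete) them once the retention window passes.
                "Expiration": {"Days": 365},
            }
        ]
    },
)
```

The point is less the specific service and more that retention stops being a record-by-record judgment call once it is encoded as a rule that runs on its own.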

Josh: How do you rank this issue of dark data against other problems data teams may be facing, like data quality, which I would consider to be a pretty hyped issue right now that’s getting a lot of attention?

George: Well, interestingly enough, sometimes dark data gets created because of data quality issues. Maybe it’s something that we’re acquiring or storing or creating, but because of data quality, maybe because it’s incomplete, we know that, OK, we’re not going to use it for anything. Let’s say we’re getting some customer data, but for whatever reason we can’t tie it back to a particular customer, and then we say, OK, well, let’s not worry about it right now. Or let’s say we have a transcription of audio conversations such as this one, but the transcription, because of the algorithm being used, doesn’t yield complete results, or it’s not really accurate because AI transcription isn’t there yet. And we just want to keep it, because maybe at some point we’re going to revisit it once we have a better algorithm to scan through the audio log and spit out a better transcript. So because of that data quality issue, we’re creating this dark data. But back to your original question: definitely, I think data quality issues are all the way at the top in terms of importance; in comparison, dark data would fall at a lower level.

Josh: It sounds like you use the concept of dark data in some ways to understand the sources of bad data quality or their effects. It’s a filter for looking at the problem in some cases.

George: Yeah, it’s definitely one of them. You bring up a good point: so often when organizations are looking at data quality and solving the issue of bad data quality, they’re missing what’s causing the issue. They’re just looking at the effects and cleaning the data, but not really going in depth to understand what caused the bad data quality in the first place and trying to resolve that.

Josh: Yeah. Well, I think the example you just mentioned around transcription or NLP is a good one. Map that ultimate transcript into an analytics system we might actually encounter: maybe there’s some analytics being done on sales calls, and the analysts are extracting things like how many times a client reacts negatively when we raise our pricing. That might be an end analytic that comes out of those kinds of transcriptions. And it can be very difficult, first of all, to identify that there is some quality problem if an analyst is looking at that data in a dashboard somewhere downstream. And then, if that quality problem is identified, finding where it originated is a whole other ball game, a whole other task. So chasing down those issues to the origin is particularly tricky for a lot of teams. Do you have any best practices or standards that you’ve promoted for identifying those issues early, or in general for tackling the data quality problems you’ve encountered?

George: Yeah, absolutely. I follow, let’s say, a three-step process when it comes to data quality management, and we can go a little bit more in depth into each one. The first step is to analyze and identify your as-is situation. Step two is to fix the data and prevent bad data from reoccurring. And step three is to communicate. That one doesn’t come sequentially after step two; it happens along the way: communicate at all times what’s happening, what you’ve done, what you’re doing, what you’re going to do, what the impact is, and how you can connect that back to the business goal.

Josh: So what are the most important stages in each of those steps? Take analyzing and understanding what the status quo looks like, your first step. How are you actually doing that?

George: Well, I think you need to analyze your environment. That covers your technical environment and even, culturally, what some of the issues are there. Identify your data standards, if any, or the lack of them. Analyze the data quality level and the business impact it has, because that helps you prioritize the data quality issues, of course. Figure out what resources you might need to tackle this in a sustainable way, and whether there are any data owners, stewards, or custodians. And of course, most importantly, identify and analyze the root cause of the bad data. For that you can use various root cause methods such as the 5 Whys technique or the fishbone diagram. Depending on the situation, they’re fun ways of trying to get to the bottom of it all.

Josh: Is there a really common data quality issue that you see consistently across the organizations you interface with?

George: A lot of it comes from a lack of standards, I think. People are just recording the same thing in different ways, and that’s causing a lot of problems. Even if we’re just talking about recording names: maybe some would be recorded with a capital letter, some without; some systems would accept non-English characters and some wouldn’t. If you don’t account for that, you can have weird data transformation issues where a non-English character gets converted into some sort of strange ASCII character. Or maybe you have a limit on the name field in your database, let’s say 24 characters, but for whatever reason a name is 25 characters and the data gets truncated without you ever knowing about it. Dates are another great example: the format in the U.S. is so different from Europe, and that causes issues if you never specify what standard you’re using. So I think a lot of it comes from misunderstanding what the standard is, or from the lack of a standard.
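As a rough illustration of how these symptoms surface in a table you already have, the sketch below flags three of the patterns George describes: values sitting exactly at a column’s length limit (a hint of silent truncation), mojibake-style characters left over from a bad encoding conversion, and date strings that resolve to different days depending on whether you assume U.S. or European ordering. The 24-character limit, the column names, and the sample values are assumptions made up for the example:

```python
from datetime import datetime

import pandas as pd

NAME_LIMIT = 24  # assumed field limit for the example

df = pd.DataFrame({
    "name": ["RenÃ©e Dubois", "Christopher Throckmorton", "Li Wei"],
    "signup_date": ["03/04/2021", "2021-04-03", "13/04/2021"],
})

# 1. Possible silent truncation: values sitting exactly at the field limit.
truncated = df[df["name"].str.len() == NAME_LIMIT]

# 2. Possible encoding damage: "Ã" usually means UTF-8 text was decoded
#    as Latin-1 somewhere along the way (e.g. "é" becoming "Ã©").
mojibake = df[df["name"].str.contains("Ã", na=False)]

# 3. Ambiguous dates: strings that parse to different days under
#    US (month first) vs. European (day first) conventions.
def is_ambiguous(value: str) -> bool:
    try:
        return datetime.strptime(value, "%m/%d/%Y") != datetime.strptime(value, "%d/%m/%Y")
    except ValueError:
        return False  # parses only one way (or not at all), so not ambiguous

ambiguous_dates = df[df["signup_date"].map(is_ambiguous)]

print(truncated, mojibake, ambiguous_dates, sep="\n\n")
```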

Josh: A lot of the examples you raise there I would consider data quality issues that arise at the source of the data, at the input, or when it’s initially being pulled into an organization. Do you think that’s where most data quality problems are consistently being created, with garbage in just leading to garbage out? Or do you also see a lot of issues arising downstream?

George: Both. But I think a lot of it is from how the data gets recorded, and then, as part of maybe an ETL process as well, the data gets transformed without knowing the original standard and what the new standard needs to be. One example that comes to mind is the NASA Mars orbiter that crashed, I think in 1999 or early 2000. The source of the issue was that the two teams working on the design, on the mathematical equations, were actually using two different measurement systems. One was using imperial units, the other metric, and they both assumed they were working with the same measurement system. So as part of the data transformation that took place to converge the calculations and combine them into the one source of truth, that’s when the issue occurred.

Josh: Are there any techniques that you generally suggest folks bring in to catch these kinds of issues early, if they’re coming from the point of data entry or the point of data integration? What are some solutions that you’ve tried to promote with teams to surface those issues at that part of the process?

George: Yeah, so again, you need to have those data standards, and then you need to validate against them to make sure you’re complying. And of course, you want to add in as much data validation at the point of entry as possible. Sometimes that’s not possible, because you might not be the owner of those source systems. That’s when you also want to develop some sort of data quality audit, to make sure the data coming in, which you have no control over how it gets recorded, complies with your standards and your needs. And if it doesn’t, you need to have some sort of data transformation process to bring it into your own standards.
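Here is a minimal sketch of what that kind of point-of-entry check, or a recurring audit over a batch you don’t control at the source, could look like. The field names, the 24-character cap, and the ISO date standard are assumptions chosen for the example rather than anything prescribed in the conversation:

```python
import re

# Assumed standards for the example: names are required and capped at 24
# characters, and dates must arrive in ISO 8601 (YYYY-MM-DD) form.
NAME_LIMIT = 24
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")

def validate_record(record: dict) -> list:
    """Return the list of standard violations for one incoming record."""
    problems = []
    name = record.get("name")
    if not name:
        problems.append("missing name")
    elif len(name) > NAME_LIMIT:
        problems.append(f"name longer than {NAME_LIMIT} chars; would be truncated")
    date = record.get("signup_date", "")
    if not ISO_DATE.match(date):
        problems.append(f"date '{date}' not in the agreed YYYY-MM-DD standard")
    return problems

def audit(records: list) -> dict:
    """Summarize violations across a batch received from an external source."""
    failures = {i: validate_record(r) for i, r in enumerate(records)}
    failures = {i: p for i, p in failures.items() if p}
    return {"records": len(records), "failed": len(failures), "details": failures}

batch = [
    {"name": "Li Wei", "signup_date": "2021-04-03"},
    {"name": "", "signup_date": "03/04/2021"},
]
print(audit(batch))
```

Run at the point of entry, the same checks can reject a record outright; run as an audit, they simply quantify how far an external feed drifts from the standard before any transformation step.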

Josh: The audit that you raise is interesting because it reminds me of our approach, where the first value proposition we offer with our solution is really awareness: integrating our system with the existing pipelines and data flows that you have. We have an intrinsic bias internally to try to catch issues proactively. We see similar things, with a lot of issues coming from bad or unreliable data sources, external issues, or the point of data entry. So we want to try to catch those issues early and build that awareness, and I think, maybe in your terms, an audit that says, hey, here’s your ranking of data health in terms of the data sets you’re working with, and these are the ones that produce the most issues. That’s something we really strive to do when folks first get onto our platform. I’m curious, in the audit process that you set up, are there certain measurements you’re using to quantify that in some way, to say these sources, these providers, these data suppliers, or this data integration process is less healthy than others, and you should prioritize there?

George: Yeah, definitely. You can tie it back to different data quality metrics, whatever those are as defined by your organization: completeness, accuracy, consistency, and so on. But you can also do a really quick data profiling exercise to pinpoint a particular table that’s maybe holding data that isn’t really at the level of the data quality metrics you want it to be.

Josh: Every organization is going to be really different; we see this all the time ourselves. Are there any generically applicable measurements that you see around building these standards? Taking the data profiling that you mentioned, are there certain kinds of data profiles that you would always suggest a team run, regardless of the nuances of the data they’re working with or the industry they’re in? In other words, generally applicable principles?

George: Yeah, I think taking a look at every column, for example, and seeing how many unique values you have in there, how many null values you have in there, and what the standard format is that comes out of it. Just those things, or, you know, minimum, maximum, and average at a high level, will give you a very good indicator of whether there’s a data quality issue in those columns that you need to tackle.
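As a rough sketch of that kind of quick column profile, assuming the table can be loaded into a pandas DataFrame (the column names and values here are made up for the example):

```python
import pandas as pd

# Hypothetical table; in practice this would be read from your warehouse or lake.
df = pd.DataFrame({
    "customer_id": [1, 2, 2, None, 5],
    "order_total": [19.99, 250.0, 250.0, -3.0, None],
})

profile = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "nulls": df.isna().sum(),                    # missing values per column
    "null_pct": (df.isna().mean() * 100).round(1),
    "unique": df.nunique(),                      # distinct values per column
    "min": df.min(numeric_only=True),
    "max": df.max(numeric_only=True),
    "mean": df.mean(numeric_only=True),
})

print(profile)
# A negative order_total, a surprisingly low unique count, or a high null
# percentage are the kind of quick signals that a column needs a closer look.
```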

Josh: You mentioned before the idea of ownership: do you own the data, do you own the process? In terms of the team members you typically see in organizations, are you seeing any interesting evolution of who owns what within a data team as the different roles have become more specialized?

George: That’s a very interesting question. I’ve noticed a bit of a difference in culture between Europe and North America. In North America, there’s, I think, more of an understanding of ownership of data in the sense of: I as an individual, or we as a department within the company, own this data, what I say goes, my rules and my standards are what we need to abide by, and it’s my responsibility to manage that data. Whereas in certain organizations in Europe, that ownership is treated a little bit differently. If it’s customer data, for example, the understanding is that our customers own their data, and we are just their trusted stewards within the company, trying to manage the customer’s data as best we can. So I thought that was a very interesting differentiator, culture-wise.

Josh: Yeah, we’ve also worked with clients across different geographies and seen differences in culture as well. With certain clients that we’ve had outside the US, for example, I think in our experience we’ve seen more fluidity between responsibilities. You may have a team outside the US where the data engineer is also doing all the analytics and all the data science, and a lot of these multidisciplinary contributors on the team, as opposed to clear separation of responsibilities. Whereas in the US, it seems there are more clear distinctions between who owns what. The data engineers we encounter are typically deeper in the code; if there’s a lot of code being used to drive pipelines, they’re often sitting further upstream on the platform. And we have the newer role of the analytics engineer, which is an even greater specialization, this combination between analyst and data engineer, and it’s a role that’s popping up more frequently. So I think we see a lot of that separation of responsibility hardcoded into titles in the US, as opposed to some other geographies we’ve worked in.

George: Definitely. Definitely. And also, I think it’s an indicator of the scale of the company, the scope and size of the organization. The bigger they are, the more differentiation between these roles you encounter. Whereas if you’re a smaller company, let’s say a startup with five to 50 people in it, one individual will definitely wear multiple hats if they have to.

Josh: If you were starting a new data team yourself, in a typical enterprise, what do you think are the most important roles to have?

George: Well, it depends on what type of company you have and whether data is your product or not. I mention that because if data is your product, maybe you want to put a little bit more emphasis on one area rather than another. But I think you definitely need to have your data scientist, and you need to have your data quality person, whatever title you want to assign, whether it’s director, manager, or coordinator; you need somebody who’s focused on data quality. You need your data steward and your data custodian, so there are people on that side of the fence who manage the infrastructure of the data and the data itself, but who also have the business knowledge to be able to put context into how the data should best be managed. And of course, lastly, there’s the visualization, analysis, and prediction, drawing information out of your data. However you want to define those roles, and there are different titles that mean different things at different companies, in essence I think these are the areas we need to cover with our data.

Josh: What about in terms of technologies that different organizations are investing in now? Do you have a favorite stack that you would suggest teams start with? Or what are the tools out there that you’re really excited about?

George: I can’t say that I have a favorite stack. I know there is a move towards the cloud environment, and I understand why. Sometimes, unfortunately, that’s not possible for certain organizations, for legislation and regulation purposes and the need to keep a close hold on their data. But besides that, in the data science space there’s a bit more of a move into AI and machine learning to try to automate redundant tasks, which is helpful to have, and of course into the whole data profiling and data management piece there. As for data visualization, I think you kind of have the big three with Tableau, Power BI, and Qlik, and the others are starting to come into focus too. And the data space is growing so much; every year there’s that one-page graphic with all the different logos of the companies within the data space, tackling different areas of data, and it just grows each year.

Josh: Do you have any perspective or bias in the lakehouse versus warehouse discussion, the Databricks way of working versus the Snowflake way of working? Any sort of leanings there?

George: Well, I like the Snowflake approach because it gives you the flexibility of not having to have a database manager, or a database individual who knows the ins and outs, and it’s easy to learn; I think the barrier to entry is very low. So I feel it’s empowering data engineers, data scientists, and BI professionals to just tend to their own needs, and they don’t have to wait in a queue for somebody else to spin up a new instance of the database they need to work with.

Josh: Zooming out to the data space more generally, I’m curious what you’re working on at the moment. Any big projects you’re undertaking in these different areas, whether data quality, dark data, or something else that’s got your attention these days?

George: Well, I’m trying to reach as many people as I can to educate them on the importance of data governance. I’ve wrapped up a course that’s been very successful on practical data governance implementation, which walks you through how to implement a data governance program and why one is needed. It comes with templates and everything, so it really can’t get easier than that. But I’ve been asked to put out more content on the fundamentals of data governance, for people trying to understand data governance from the inside out: really, simply, what it is and how they can convince their CEOs or CISOs to invest in data governance. So that’s an area I’m trying to tackle right now.

Josh: What are your main suggestions for teams that are beginning to tackle that issue? What are the policies or best practices that you see effective organizations bring into place?

George: Well, as we’ve mentioned before, I think it does start with upper management: you need to gain their support and you need to have a sponsor at that C-level. I think that’s crucial. And there’s nothing more frustrating, as a data professional, than having to spend your time making a business case every time you try something new, whether you’re trying to bring a new tool into the environment, or you want to draw some insights out of the data, or you’re trying to build an analytical model; you just want to do it. Sometimes I think you need to be allowed to try things and be allowed to fail, and that should be part of the culture too. Instead, I feel like data professionals out there are wasting a lot of their valuable time: rather than creating these resources or bringing these capabilities into the organization, they’re spending their time trying to convince their managers and directors to allow them to do it.

Josh: Makes sense. Well, as we wrap up, I’m curious if you have any calls to action for our listeners as they build out their own data teams. Any first steps you would recommend, or things that you’d suggest they definitely keep in mind as they get things off the ground?

George: Yeah, to me, communication is a very valuable tactic, and it’s something that I think usually gets missed. We forget to communicate; we just want to do our work and we forget to talk about it. So I think at all times we need to remind ourselves to mention what we’ve done, what we’re doing, and what we’re going to do, and how it ties to the business goal, how it ties to the specific department we’re doing this for, or even the individual or the particular role that we’re helping out.

Josh: Curious if you have any big predictions for this year?

George: Well, I think the whole AI space will grow, and we’ll just see it as a capability that’s part of more and more tools out there, helping us better manage our data and our data quality and whatnot.

Josh: I share your enthusiasm about that end of the market. It’s been a while since companies started investing in machine learning and AI capabilities, and 2022 could be the year where a lot of those investments really start paying dividends. And hopefully at Databand we’ll play a part in making that ROI happen by making sure the reliability of the data going into those systems is there. Is that a good marker for starting to bring them more into production?

George: I think so.

Josh: Yeah. Well, it was great having you on, George. Thank you for joining the podcast today. Just one last thing as we wrap up: where can folks reach you if they have any questions or want to talk further about some of these ideas that you’re bringing to the market?

George: Well, you can reach me on LinkedIn, George Firican. I post a lot of content daily, so please feel free to engage with me. I love meeting new people, online or in person, so don’t be shy. And of course, LightsOnData.com.

Josh: Awesome. Thanks a lot.

George: Thank you.