Unveiling The IBM Acquisition Of Databand

In our first two-part guest special, we’re happy to introduce Maxime Beauchemin, CEO and Founder of Preset. Eager to learn more about the recent IBM acquisition of Databand, Maxime asks us to describe it in more detail and explain what it means for the future of the data observability space.

About Our Guests

Maxime Beauchemin

CEO & Founder, Preset

Fun fact – Maxime Beauchemin is the original creator of Apache Airflow and Apache Superset. He is now the CEO and founder of Preset, an open analytics data platform built on Apache Superset, which helps make any team productive with data. Maxime also has over a decade of experience in data engineering at companies like Lyft, Airbnb, Facebook, and Ubisoft.

Episode Transcript

Ryan: Hey everyone, welcome back to Databand’s MAD Data Podcast. My name’s Ryan. I’m one of the people here talking to other people. We also have Josh on the line. Josh is our CEO over at Databand. We also have a very elusive person, Max Beauchemin, who we’ve been trying to get on the podcast for how long now? Probably like two months or something like that.

Maxime: Sounds about right. I’m not sure if I’m elusive or if you guys are busy getting acquired maybe? I don’t know what’s happening there.

Ryan: Yeah, well, we’ll get to that in a minute. But yes, there are some reasons why it’s taken us a while to get you on the podcast. So just real quick, Max is the CEO and founder over at Preset. He speaks at a lot of conferences as well. I feel like you keynote a lot, don’t you, Max? Like you’re kind of all over the place here.

Maxime: Yeah. You know, I’m busy doing a lot of different things. So, you know, amongst other things, running a company, being close to open source projects, but also kind of touring around and fairly public too. So I do podcasts and conferences, love to talk about data, you know, I love to build stuff too. So.

Ryan: Well, tell us a little bit about yourself, because I know we always like to have people explain what their field was like and how they got into data. For you, starting open source projects and now Preset, what does a day in your life look like, and how did you get to where you’re at today?

Maxime: Long story to tell here. I’m going to try to go fairly quickly through it, but yeah, I started a career in data about 20 years ago, actually more than 20 years ago at this point. I was a data warehouse architect and a business intelligence engineer for the first decade or so, so very much a data practitioner. For quite a while I worked at a company called Ubisoft, on their data warehouse. Back then we had central data teams managing all the data stuff, small and mighty teams that would do everything data on behalf of the company. That was before data was really democratized. I joined Yahoo! around 2007, at the birth of Hadoop, and went on to places like Facebook, Airbnb, and Lyft, very much as what I would describe as a data engineer now, though back then it just wasn’t called that. Joining Facebook was in 2012 or so, and I felt like Facebook was a decade ahead in some ways, or at least five years ahead, in terms of how they had reinvented a lot of things internally, because the tools that existed outside of Facebook were not able to manage the scale. That prompted them to rebuild everything in a very distributed, big data, cloud type of way on top of Hadoop. There was this Cambrian explosion of new tools and new ways of doing things, a little bit of a microcosm of what the modern data stack has become today. A lot of what you see today, the complexity and the amount of data, already existed in that microcosm of internally built tools at Facebook. So it was really inspiring for me. That’s when I pivoted to become more of a software engineer and more of a tool builder. I joined Airbnb in 2014 with the premise of working on open source. 
So the motive there was, you know, you can join a company and have a huge impact within that company. But if you work on open source, you can have an impact that’s unbounded and very much on the world stage, as opposed to being bounded to this company’s garden or playground. So that was the idea. I never thought that the open source projects I started to work on would become as successful as they did. Airflow kind of took off around that time. Early on I was working on it alone, but very quickly people internally and externally started to contribute, and Airflow took off. I also started Superset. Airflow is the orchestrator; I presume most people know a little bit about Airflow, they’ve heard of it and know what it does: data pipeline orchestration. And Superset is an open source competitor in the data visualization and exploration space, very much an open source challenger to tools like Tableau and Looker. So the goal there was very much to have an impact. I really do think open source is social progress in a lot of ways; there are a lot of virtuous things that come with it. After a few years of working internally at Airbnb, and then at Lyft, where I was also working on open source, it became clear that I really wanted these projects to become really solid challengers and fulfill the prophecy, fulfill the promises that I wanted for them. Superset was really my baby, the thing I really wanted to push forward, and I think raising money and starting a company around it was the absolute best way for me to grow this thing, get a team of like 5000, 250 people eventually working on this, and get open source to succeed. 
So, you know, the question then was: is there a way to do that without the commercial aspects perverting the open source vision? I talked to a bunch of founders in the commercial open source space at the time and realized that you really can find a balance. And I thought I was going to be able to find this balance, where you have to get the commercial company successful enough and the open source project successful for everything to work well. So I’m on this journey, and it’s been super exciting. Now I’m a founder, so I’m sort of building software, I’m building a company, you know, and it’s been super tight.

Josh: Did you have any role models in the open source space that you feel like struck a really great balance between the commercial and community or open source aspect of things?

Maxime: Yeah, I talked to Jay and Ali. I think it’s not necessarily always super easy to navigate, you know, the commercial versus the openness of things, finding the boundaries and making everyone happy. So Jay Kreps at Confluent, the creator of Kafka and a bunch of other things in the streaming space, I spoke with him at the time, and Ali from Databricks, so closer to Spark. I was inspired to see, you know, everything that they’ve done. And you talk to a bunch of founders and you realize, like, these people maybe are not that different from me. And if they’ve done it, maybe I can do it. Maybe it’s not that foolish to think that I can try it and give it a shot. So, yeah, I would name these two as people that inspired me at the time.

Josh: Very cool.

Ryan: Well, we’re talking about building companies and getting good people to be part of that company and raising money and, well, what do you know, Databand actually just got acquired by IBM. So that’s... well, thank you.

Maxime: Well congrats!

Ryan: I feel like it was all me. I feel like this podcast is what really got IBM interested in acquiring us. Just kidding. But yeah, one of the things we wanted to talk about today was in light of building a startup culture, as we’ve been a part of Databand for a while, and really embracing a lot of the open source tools as well. They are part of our platform now, and we’re able to really broaden that as part of a broader IBM initiative. So I know that you had some questions that you wanted to lay out there, in the same vein as what we talked about, which is: you have commercial offerings, you have open source tools. How are those going to come together? And what’s the reason behind this IBM acquisition of Databand?

Maxime: Yeah, I would say my question personally, just reading about the acquisition, is, you know, what are they buying? Are they buying the people, are they buying the technology, and what’s their vision for the place of Databand, you know, the people behind Databand or the technology of Databand, as part of whatever offering they have?

Josh: Great question. So they are buying the entire organization. This is a very strategic investment from IBM that includes a lot of investment that they’re going to put into our product, our team and the ultimate impact of data observability for their customers. So I think just like us, when we originally set out on this journey, we took a bet that data observability would be a really critical piece of the modern stack and of growing importance to the organizations out there that are depending on data. IBM figures the same thing. I mean, they’re seeing the same thing within their customer base. This escalated up from, I think, initially IBM clients telling folks in product, “We need services to help us measure the reliability of our data and ensure that it is trustworthy in all these different services that the data is flowing into.” And they decided to take a similar bet and put their money on Databand as the team to get them there. So we’re really excited about this initiative, this acquisition, because it means more growth for our organization, more growth toward becoming a market leader with our technology, and the personal growth of everyone on the team. So we just have very aligned visions.

Maxime: Super interesting. You know, when you think about it, the sheer scale of a company like IBM, I don’t even know how many employees they have. It might be counted in the hundreds of thousands, which is kind of insane. And your ability, Josh, to tap into this infinite amount of resources to grow your vision and your company, I think if you do it well, that can be a huge catalyst and an awesome garden to grow into, if you can pull the right people from all the resources and departments. You know, their go-to-market, their sales team is probably incredible, or just the scale of it. So that gets fit. It’s going to be a challenge. But at the same time, you know, if you tap into it well, you can really accelerate.

Josh: Yeah, we’re excited about it. I mean, that’s exactly the process that we are going through now: learning how we tap into those resources and how we sort of ride the IBM wave and take them in this future direction towards better data observability for their clients. I describe it as us moving from a rowboat way of operating as a startup, just furiously trying to churn the water and get us to the shore. And now we have this huge wave behind us. So we went from a rowboat into more of a sailboat. And we just really want to catch that wave of the IBM machine and use that to grow the vision and the company and the team. So we’re really excited about it. A lot of challenges, a lot of things to figure out. But I’m most excited about what this means for our team internally and also our customers out there, who will be enjoying a lot more service from the resources that we’ll be able to tap into.

Maxime: That’s cool stuff, you know, because for them, they get new blood with a fresh vision and kind of a modern team where they’re not dragged by, you know, things like Cognos and DataStage and whatever older product lines they have. This is fresh and new. And I think if you go and shake that cage a little bit, you can get some activism, people excited internally about, you know, rallying and pushing this thing forward. So I think that’s super key to navigate well. The challenge of figuring out how to navigate a large organization like IBM becomes super important, critical. And if you do it well, you know, the sky’s the limit.

Josh: Yeah, for sure. I think there is a lot of need for what we offer within existing IBM clients that are using all those services. And then there’s the strategic addition that we bring into the IBM portfolio, which is more tied into the modern data stack and a spearhead to give IBM more penetration into that end of the market, which they may not be playing too deeply in today. So I think there are a lot of exciting directions for us. We sort of have this wide angle of coverage on the market now between growth businesses, mid-market businesses and the biggest enterprises in the world, which IBM has a huge existing footprint in. So we’re excited about exploring these and helping to lead the direction, but there are definitely a lot of challenges in figuring out how we infuse our startup culture into that big organization and evolve our culture through our stages of growth.

Maxime: You got to evolve their culture, you know, that’s the way you got to think about it. But yeah, that wave is really exciting if you can surf it. But the danger too is, you know, it’s a big ocean, a big wave too. So it becomes a matter of figuring it out. But here’s a random window into my past. I used IBM DataStage a lot back in 2008. I believe we used that technology at Yahoo! And it’s kind of interesting to see the engine behind this ETL tool. It’s pretty solid, you know, pretty solid parallel processing. It was a drag-and-drop tool. And the whole premise of Airflow is to bring, you know, code, to say that pipelines need to be managed as code as opposed to a drag-and-drop type of tool. But I remember thinking that the engine behind IBM DataStage was interesting and could allow you as a data engineer or, you know, data warehouse architect back then to define your parallelism scheme as well. So you could drag and drop your data transformation with parallelism in mind, being directive around how you’re going to implement parallel processing of a large amount of data. It’s kind of crazy to think that back in 2008 they had pretty solid stuff. It was really good technology. We keep reinventing the same things and then forget some of the lessons from the past.

Josh: Well, DataStage is only one of them; there are a lot of services that will now be brought into our field of view. There is actually, I think, more than one pipelining service within the IBM product suite that we’re looking at and exploring partnership and integration with for IBM customers. I think it’s also important for our customers and the market to understand that while this will for sure increase the partnership between Databand and the other IBM product lines, we are continuing to invest heavily in all the modern services that we grew up with at Databand, right? That’s the open source services out there like Airflow, which is probably our most used integration, plus Databricks, Snowflake, BigQuery, all the various services that make up the modern data stack, open source and commercial. We’re still putting a lot of investment into those. And I think that’s key to our strategy of helping to bridge between those IBM services that we’re talking about and the new breed of technologies that are really important for all data teams today.

Maxime: I think they’re big on open source too. I don’t know too much about the nature of their involvement in data open source projects, but I know IBM was pretty involved in Spark, for instance. They were driving Apache Spark, investing in it and contributing back to it. So I don’t know exactly how it works, but a company that size with that many customers, they’ve got to be deeply involved in open source because, you know, the market is.

Josh: I mean, I think, knowing my team, we will have an eye out for the different opportunities to continue contributing to the open source community. And yeah, there are a lot of just amazing engineers that we’ve met at IBM, and frankly, I hadn’t really seen this or expected it at a company like that. It’s a bit unheard of today: we meet a lot of people that have been at IBM for like 15, 20 years. They’re doing something right over there that is getting really smart people engaged and involved. So we’re excited to learn more and go through that learning together as a team.

Maxime: Well, sweet. Well, congrats. And so people will have more questions and, you know, we’ll get to know more as you guys kind of, you know, integrate.

Ryan: Yeah, and by the way, for those listening, we weren’t planning on talking about this today. We just wanted to talk about it because Max was our first podcast after this acquisition, and we were like, we might as well talk about it, or at least address it, because it’s kind of a big deal.

Maxime: I wanted to talk about it. See you as a founder, period. I went to your website, I saw “an IBM Company,” and I was like, wait a minute. I’ve got questions for Josh. I’m not having this podcast if we’re not talking about this.

Ryan: We welcome those conversations. And I mean, all the same questions you asked, we’re already communicating with our customers about, so these are all things that we’re already actively talking about. And one thing I will just mention, too, is that we’re also really excited, at least from my side, from a marketing standpoint. You know, data observability is still a very early endeavor in the data space, and there are lots of really good observability solutions out there that you can go and take a look at; just search for it and you’ll find them. Lots of them have raised good money from VCs as well. But what’s great about this is that we’re basically now the most capitalized observability tool on the market. So that really helps us unlock a lot of different resources that I know IBM is very much on board to get behind, as Josh said. So anyway, that’s a close-out of that conversation. All right.

Maxime: Well, before we move on, you know, talking about DataOps in the abstract, too, I think there’s catching up to do. If you think about it, DataOps has the potential to become, you know, as strong and solid of a movement over time as DevOps, right, on the software engineering side. I think in general the data space, data engineering, is catching up with software engineering and will become potentially as big or bigger, or as important, and we’re going to see a sheer diversity of companies and tools and frameworks and libraries and things. Right. I think sometimes we assume that in data it’s winner takes all. You know, in DevOps it’s okay that there are hundreds of very successful companies, but in DataOps we ask which one is going to succeed. I think what we’re going to see there is a diversity of solutions and frameworks and tools and things. You know, over time we’re going to develop that similar diversity. So there’s room for a lot of people to succeed there too.

Josh: Yeah, I think that’s a great point, and there are a lot of angles at which Databand is a bridge within IBM. You know, we hope to bridge between the IBM enterprise stack and the modern data stack, as one. Another really interesting bridge that our product will form involves the data fabric, the data services that IBM offers customers, which we have obvious relevance to. That’s a unit that we’re inside because we sell to data teams, right? The other angle where we are a bridge, on that side, is within the automation services, and this houses IBM’s various observability products. So this is like Instana and Turbonomic, as some of their acquisitions, and that’s another vector where we have overlap and product integration work that we’re talking about. That will serve as a bridge between the DevOps units that use services like Instana to make sure that their cloud operations and their infrastructure are running effectively, and the data platform where companies are using IBM services to move and analyze and ship data. So we sort of sit in between that observability product suite, the full spectrum of observability, and the data fabric services within the Data and AI unit. So that’s another area of bridge that just makes that DevOps versus DataOps question really a focal point for us.