The Top 25 Data Engineering Influencers and Content Creators on LinkedIn
Interested in data engineering? You’ve come to the right place.
Whether you’re a data engineering pro looking to stay up to date on the latest trends or new to the space and want to learn more, following the right leaders and joining the right conversations can make all the difference when it comes to plugging into the data engineering community.
And one of the best places to do just that? LinkedIn.
LinkedIn is full of influencers sharing new ideas and sparking conversations on all kinds of topics, and data engineering is no exception. But knowing who to follow is important to getting the information you want on your home feed and not just a bunch of noise.
So without further ado, we’ve compiled a list of the top 25 data engineering influencers and content creators on LinkedIn. Happy following!
1) Joseph Machado
Senior Data Engineer at LinkedIn
Joseph is an experienced data engineer, holding a Master’s degree in Electrical Engineering from Columbia University and having spent time on the teams at Annalect, Narrativ, and most recently LinkedIn. He has deep expertise in distributed systems, data engineering, API design, data integration from multiple sources, and machine learning.
Joseph also manages the Start Data Engineering newsletter, which features tutorials, data design patterns, open-source tools, and techniques used by data-driven companies to help others become better data engineers and land their dream data engineering job. We’d be remiss not to share that Joseph was a recent guest on Databand’s MAD Data Podcast, where he discussed ways to keep data systems from becoming unwieldy and shared tips for data teams to manage their data warehouses and keep data pipelines running reliably. You can also watch the video recording.
2) Charles Mendelson
Associate Data Engineer at PitchBook Data
Charles is a skilled data engineer focused on telling stories with data and building tools to empower others to do the same, all in the pursuit of guiding a variety of audiences and stakeholders to make meaningful decisions. Currently, Charles works at PitchBook Data and he holds degrees in Algorithms, Network, Computer Architecture, and Python Programming from Bradfield School of Computer Science and Bellevue College Continuing Education. Charles is also an Instructional Assistant for the Python certificate program at the UW School of Professional and Continuing Education.
Notably, Charles combines his data engineering experience with a background in business and entrepreneurship, having previously worked as a business and data analyst and earning a Master of Liberal Arts Degree in Psychology from Harvard Extension School and a Certificate in Entrepreneurship Essentials from Harvard Business School Online. This blended experience shows on LinkedIn, where he discusses data, Python, creativity, psychometrics, and data engineering.
3) Deepak Goyal
Azure Instructor at Microsoft
Deepak is a certified big data and Azure Cloud Solution Architect with more than 13 years of experience in the IT industry. He also has more than 10 years of experience in big data, being among the few data engineers to work on Hadoop Big Data Analytics prior to the adoption of public cloud providers like AWS, Azure, and Google Cloud Platform.
Currently, he helps companies define data-driven architecture and build robust data platforms in the cloud to scale their business using Microsoft Azure. He runs the high-ranking Azure blog, azurelib.com, which features tutorials to help people understand cloud concepts and technologies like Azure Data Factory, Azure Databricks, Apache Spark, Azure Synapse Analytics, Azure Key Vault, encryption and decryption, Azure Blob Storage, Azure Monitor, logging, the Snowflake cloud data warehouse, Fivetran, and more. It also features tidbits from Deepak’s personal experience and advice on acing interviews to help land your dream job. Deepak regularly shares blog content and similar advice on LinkedIn.
4) Sarah Floris
Senior Data and ML Engineer at Zwift
Sarah describes herself as a curious, self-starting data scientist who is willing to challenge the status quo. With more than five years of experience as a data engineer, Sarah currently works at Zwift, where she leads a team of vendors to build data pipelines and deploy machine learning models. She also owns e-commerce datasets, handling data quality and data contracts and resolving pipeline downtime.
She also runs dutchengineer.org, which features a blog and newsletter full of tips for landing your dream job in data science, and offers digital courses and one-on-one mentoring for data scientists and data engineers.
Sarah focuses on designing, analyzing, and visualizing large (>500 TB) datasets, building and managing multiple ETL infrastructures to identify text patterns and detect anomalies, and preparing key metrics to guide product decisions in people, hardware, and chemical analytics. She holds a Master’s degree in Physical Chemistry with an emphasis on data science from the University of Washington and is multilingual, speaking Dutch, English, and Spanish.
5) Caleb Keller
Principal Solutions Architect at Elastic
Caleb is a mechanical engineer turned data scientist turned machine learning practitioner. He is focused on solving the problems of enterprise data, starting with how we can “Do Data Better.” Caleb has over a decade of experience in the data and engineering space, currently working as a solutions architect at Elastic.
Caleb also offers executive coaching, IT consulting, project management, and more, and his LinkedIn page focuses on sharing advice and experiences around data reporting, strategic planning, data analytics, machine learning, and data engineering.
6) Gowtham SB
Data Engineer II at PayPal
Gowtham is a big data enthusiast who works with various big data and AWS technologies to help companies build data frameworks and more. With more than eight years of dedicated experience in big data and AWS, including time spent at PayPal, AstraZeneca, and Capgemini, Gowtham has trained more than 2,000 big data professionals.
Gowtham is an active blogger on Stack Overflow and Quora and a technical speaker on emerging technologies like big data and cloud computing. He also runs a YouTube channel, Data Engineering Videos. On LinkedIn, he focuses largely on Spark, Hadoop, big data, big data engineering, and data engineering.
7) Zach Wilson
Staff Data Engineer at Airbnb
Zach is a data engineer with nearly a decade of experience and Bachelor’s degrees in Applied Mathematics and Computer Science from Weber State University. His experience includes growth analytics at Facebook, security infrastructure at Netflix, and, most recently, pricing infrastructure at Airbnb.
Zach runs a YouTube channel, Data with Zach, that teaches all about data engineering as well as a popular blog at www.zachwilson.tech. His LinkedIn page focuses largely on data science, data engineering, machine learning, software engineering, and even mental health.
8) Shashank Mishra
Data Engineer III at Expedia Group
Shashank is a data engineer with over six years of experience working in service and product companies. He has solved data mysteries across aviation, pharmaceutical, fintech, and telecom companies and designed scalable, optimized data pipelines that handle petabytes of data in both batch and real-time modes.
He currently runs a YouTube channel, E-Learning Bridge, focused on video tutorials for aspiring data professionals, and regularly shares advice on data engineering, developer life, careers, motivation, and interviewing on LinkedIn.
9) Andreas Kretz
Founder at Learn Data Engineering
Andreas has over a decade of experience in data engineering, which he has used to found Learn Data Engineering, the ultimate data engineering academy that teaches everything you need to know to become a data engineer or add data engineering to your skillset.
He also runs a popular YouTube channel under his name, where he shares the tools, techniques, and topics he encounters in his day-to-day work. On LinkedIn, Andreas talks heavily about big data, data science, and data engineering.
10) Mehdi Ouazza
Staff Data Engineer at Trade Republic
Mehdi is a self-proclaimed data geek and entrepreneur passionate about big data, data science, web apps, and music. With more than seven years of experience, Mehdi has worked on multiple aspects of data engineering, including data pipelines (stream and batch), data modeling, orchestration, infrastructure, and, from time to time, analytical reports using dashboarding tools.
Mehdi is also the creator of the Data Creators Club (https://datacreators.club/), where you can suggest new data influencers and listen to his podcast with the top minds in data.
11) Xiaoxu Gao
Senior Data Engineer at Dott
Xiaoxu is a developer with a focus on Python and data engineering. She holds a Master’s degree in Computer Science from KTH Royal Institute of Technology and has experience working in data engineering roles at companies like Meltwater, ING, and Dott.
She publishes a popular blog on Medium featuring advice for data engineers, and she posts frequently on LinkedIn about coding and data engineering.
12) Bob Haffner
Data Engineer and Co-Founder at Inventive Data Solutions
Bob has over 20 years of experience in data analytics and data engineering, most recently having founded Inventive Data Solutions. He is also an AWS Certified Solutions Architect and AWS Certified Big Data expert.
Bob also hosts The Engineering Side of Data podcast, which is dedicated to discussions around data engineering and features a variety of guests from the data engineering space. He carries these discussions over to LinkedIn as well, with frequent conversations about everything from functional data engineering to data lakes and data warehouses.
13) Robert Sahlin
Data Engineering Lead at Mathem
Robert has over 15 years of experience in data engineering, including architecting, building, and running data platforms for enterprise companies across multiple industries, often within the domain of e-commerce or consumer-oriented digital services.
A data engineer at heart, Robert remains hands-on with coding and is active in many open source communities in the data domain. He also helped start the Data Mesh Learning community on Slack and publishes a blog on his thoughts and learnings as a data engineer at robertsahlin.com. His posts on LinkedIn focus on machine learning, BigQuery, data engineering, stream processing, and streaming analytics.
14) Jessie Snyder
Program Director, Product Management, IBM Data Integration
Jessie Snyder currently leads the Product Management team for the IBM Data Integration portfolio, covering DataStage and Information Server on Cloud Pak for Data, DataStage as a Service, QualityStage, IBM Address Verification, and the InfoSphere Information Server suite.
With almost a decade of experience in product development, she specializes in areas of Data Fabric, Cloud, Multicloud and Hybrid Cloud Data Integration. She holds a Computer Science degree, and has authored eight patents. Jessie recently stopped by Databand’s MAD Data Podcast to talk about the past, future, and next big thing for the Data Fabric. Tune in to also hear her go in-depth on the topic of data integration and how it’s quickly gaining popularity within modern data teams. You can also watch the podcast’s video recording.
15) Simon Späti
Data Engineer and Technical Author at Airbyte
Simon is an entrepreneurial data engineer with more than 15 years of experience. In his current role at Airbyte, he both works as a data engineer and helps educate other data engineers through articles, tutorials, and best practices guides.
He publishes a data engineering blog focused on big data, Python, open source, and ETL at sspaeti.com and shares insights on the same topics regularly on his LinkedIn page.
16) Deepanshu Karla
Data Engineer at Google
Deepanshu is a data engineer with more than a decade of experience at companies like TiVo, Microsoft, and Google. He describes himself as a creative thinker, continuous learner, and technologist who is adept at implementing advanced technology and business solutions, especially in customer and growth analytics. Deepanshu’s skills include SQL, data engineering, Apache Spark, ETL, pipelining, Python, and NoSQL, and he has worked on all three major cloud platforms (Google Cloud Platform, Azure, and AWS).
Beyond his work at Google, Deepanshu also mentors others on career and interview advice at topmate.io/deepanshu. He also shares thoughts and advice regularly on LinkedIn, centered around topics like SQL, data engineering, careers, and interviews.
17) Darshil Parmar
Data Engineer at Wayfair
Darshil is a data engineer and solution architect specializing in data strategy, technology selection, building data warehouses, creating and automating data pipelines, real-time streaming and ETL processes, building interactive dashboards, data cleaning, data processing, machine learning models, and data migration.
He has experience across cloud platforms (AWS, GCP, Azure), databases and storage (SQL Server, Redshift, BigQuery, Snowflake, RDS, PostgreSQL, MySQL, S3, DynamoDB, MongoDB, Cloud Data Store), data integration/ETL solutions (Talend, Stitch, Informatica, SSIS, AWS Glue & EMR, Alteryx, GCP DataFlow & DataProc), BI/visualization (Tableau, Power BI, Spotfire, Excel, Google Data Studio, AWS QuickSight), and machine learning (Natural Language Processing, Keras, Jupyter Notebook, Python, TensorFlow, Pandas, NumPy, PyTorch, JS).
Darshil also offers freelance data engineering services, including database development, cloud application development, business analytics, and information consulting. He runs a successful YouTube channel under his name focused on data engineering, architecting systems, data science, machine learning, and interview preparation, and posts frequently on LinkedIn about similar topics.
18) Matthew Blasa
Machine Learning Engineer Consultant at Aspire Analytics
Matthew Blasa is a data scientist who describes himself as a person who lives to experiment and learn. He constantly seeks out knowledge and is excited by the challenge of learning something new in the data science space. The topics he is most passionate about include MLOps, machine learning, data quality, and data governance. In his free time, Matthew enjoys mentoring people making the transition into the data science field, which in turn helps drive him. Check out the MAD Data Podcast episode with Matthew, where he explains how incident management as a part of observability and governance can affect various data areas. He also shows how he and his team tackle data observability and compares approaches to data governance on a small scale versus an enterprise scale. You can also watch the video recording.
Check out Matt’s Medium @ blaza-matt and his YouTube channel @ DataLife360.
19) Ananth Packkildurai
Principal Software Engineer at Zendesk
Ananth has over a decade of experience in software and data engineering, having worked at companies like Sephora, Bazaarvoice, Slack, and Zendesk. He has worked on everything from building big data platforms to developing scalable visibility infrastructure for log, metrics, and traces to applying observability to traditional data platforms.
Ananth is also the editor of the popular newsletter Data Engineering Weekly, where he writes about the latest developments in the data engineering world. He shares similar content on LinkedIn, with a focus on topics like big data, data quality, data science, data analytics, and data engineering.
20) Madison Schott
Analytics Engineer at Winc
Madison has more than five years of experience as a software and data engineer, working with Capital One and Winc. She holds a Bachelor’s degree in Mathematical Finance and Marketing from Seton Hall University.
As the author of the Learn Analytics Engineering newsletter, Madison’s goal is to help others modernize their data stacks by thoughtfully choosing and implementing the right tools to move from data-informed to data-driven. She also posts frequently on LinkedIn, sharing various articles and advice around topics like analytics engineering, data engineering, and the modern data stack.
21) Charles Verleyen
CEO and Lead Architect at Astrafy
Charles is an expert data engineer with experience working on the full spectrum of the data journey, from business workshops and designing architecture to implementing solutions.
He is experienced in DevOps, DataOps, and SecOps, with specialties in data engineering, supply chain management, and project management. He also holds eight certifications in Google Cloud Platform as well as certifications in Python, AWS, and more.
Charles also shares his experience and advice on LinkedIn, regularly discussing topics like dbt, Google Cloud, data analytics, data engineering, and data architecture.
22) Saikat Dutta
Senior Specialist Data Engineer at LTIMindtree
Saikat has over a decade of experience as a data engineer, working at companies like Tata Consultancy, LTI, and LTIMindtree. His specialties include Microsoft SQL Server, Azure Databricks, Azure Data Factory, SQL Server Integration Services (SSIS), and Azure Data Lake.
Saikat is also passionate about guiding others to grow their careers, serving as a mentor for junior data engineers and publishing a Medium blog focused on data engineering skills and career tips. He regularly shares advice on LinkedIn about working at startups, data lakes, data engineering, Azure, and continuous learning.
23) Matt Weingarten
Senior Data Engineer at Disney Streaming Services
Matt is an experienced data engineer who has worked on the teams at Nielsen, Facebook, and Disney, helping standardize data engineering practices and improve data pipelining, data quality, and data visualization. He holds a Master’s degree in Computer Science from the University of Florida.
Matt writes frequently about all things data engineering on his Medium blog, covering everything from data quality and data security to platforms like Snowflake and AWS. He also shares his thoughts on LinkedIn as a regular contributor around topics like AWS, data analytics, and data engineering.
24) Marc Lamberti
Head of Customer Education at Astronomer
Marc is a data engineer turned instructor, having worked as a data engineer for several years before joining Udemy and now Astronomer as an instructor in data engineering. As an instructor, Marc focuses on Apache Airflow, how to identify and learn new technologies, and striking a balance between theory and practice in data engineering. Marc is also experienced in machine learning and holds a Master’s degree in Information Technology from The Hong Kong University of Science and Technology.
He shares advice regularly on LinkedIn, offering observations and tips around Airflow, general training, and data engineering.
25) Tobias Macey
Associate Director of Platform and DevOps Engineering at MIT
Tobias is a multi-disciplinary engineer with over a decade of experience across many industries, technologies, and responsibilities. Currently, as the Associate Director of Platform and DevOps Engineering at MIT, he is focused on marrying the worlds of software engineering, systems automation, and data analysis.
Data Developer at Shopify
Matthew has a knack for building ecosystems and tools that gain adoption. He also likes to think of himself as a Data Pioneer. These days, Matthew sets out to build systems to make sure his team succeeds, and he’s been working in MLOps ever since. In addition, he’s known for being able to do everything data. In fact, he has experience in almost all aspects of the data life cycle, from dashboards, analytics, and statistical tests to setting up servers, building machine learning pipelines, and data warehouses. Furthermore, he is experienced in most types of datasets having built deep learning models in NLP, CV, and RL tasks.
If you are interested in building and improving your career in data, Matthew loves to talk about clean code, data careers, and data science.
Mark Freeman II
Founder of On the Mark Data
Mark is a community health advocate turned data scientist interested in the intersection of social impact, business, and technology. His life’s mission is to improve the well-being of as many people as possible through data, especially among those marginalized. Mark received his M.S. from the Stanford School of Medicine where he was trained in clinical research, experimental design, and statistics with an emphasis on observational studies.
Mark is also certified in Entrepreneurship and Innovation from the Stanford Graduate School of Business. He is adept at coding in Python, R, and SQL and at using big data tools such as Spark. Mark is the founder of On the Mark Data, a platform he uses to share impactful ideas through content creation and to push for innovation by consulting for startups. He encourages others to reach out to him if they want to learn more about how he is pushing toward data systems and products that are representative of all communities.
IT Manager-Data Engineering at Cloud Untitled
Suraz is a data engineer with 11 years of IT experience, the last nine of them in data engineering. He has worked with companies and clients such as Capgemini, UHG, CreditVidya, Prokarma, and Apple, and he currently works as a Manager of IT Data Engineering at Cloud Untitled.
He also offers a Data Engineering Program designed to help people from non-IT and non-coding backgrounds transition into data engineering and to help existing data engineers strengthen their skills with in-depth knowledge of the field. You can learn more about his courses at Online Learning Center.
Founder & CEO of Forest Rim Technology
Best known as the “Father of Data Warehousing,” Bill has become one of the most prolific and well-known authors worldwide in the big data analysis, data warehousing, and business intelligence arena. He is the founder of Forest Rim Technology, the world leader in converting textual unstructured data to a structured database for deeper insights and meaningful decisions.
Bill was also named by ComputerWorld as one of the ten most influential people in the history of the computer profession. You can check out three of Bill’s latest books: The Data Lakehouse Architecture, Integrating Data, and Building the Data Lakehouse.
Co-Founder & CEO of Monte Carlo Data
Barr is an entrepreneur and a well-known figure within the data engineering community. In 2019, she founded Monte Carlo, a data reliability company backed by Accel, GGV, Redpoint, and other top Silicon Valley investors. You can find her frequently sharing her knowledge and expertise on her Medium blog.
You might be thinking, “Why would we include a content creator who’s also viewed as a competitor to Databand?” Well, it’s because we think Barr has some great things to say about data quality and observability, even if her product competes with Databand.