About
About the Author
I am Amin Siddique, a Senior Data Engineer at Mercedes-Benz with over 6 years of experience building production data systems. I specialize in large-scale SQL migrations, pipeline architecture on Databricks, and integrating AI tooling into real engineering workflows.
My day-to-day work involves migrating enterprise-scale data warehouses (Oracle, Exasol) to modern lakehouse architectures on Databricks, building dbt models that serve thousands of downstream consumers, and optimizing Spark jobs processing billions of rows under tight SLAs.
Credentials and Experience
- Current role: Senior Data Engineer at Mercedes-Benz, working on enterprise data platform modernization
- Specialization: SQL dialect migration (Oracle and Exasol to Databricks), dbt at scale, Spark optimization, lakehouse architecture
- Certifications: Databricks Certified Data Engineer, Azure Data Engineer Associate
- Tools in daily use: Databricks, Apache Spark, dbt, Python, Azure Data Factory, Delta Lake
- Scale: Pipelines processing 10B+ rows daily and 500+ dbt models managed in production
Why This Blog Exists
Most data engineering content falls into one of two categories: vendor marketing or academic theory. Neither helps when your pipeline fails at 2 AM or when you need to migrate 500 stored procedures to a new platform.
Reliable Data Engineering fills that gap. Every article is grounded in hands-on experience with production systems. I write about what actually works, what breaks, and what I wish someone had told me before I learned the hard way.
The data engineering landscape is shifting fast. AI agents are starting to write SQL, manage pipelines, and automate the grunt work that used to define the job. I started this blog to document that transition honestly — not with hype, but with benchmarks, code, and real-world results.
What You Will Find Here
- Data pipeline patterns that survive production load — architectures tested against billions of rows and tight SLAs. See: Our $2M Data Lakehouse Is Just Postgres With Extra Steps
- SQL migration strategies across dialects, including the edge cases that vendor docs never mention. See: How We Cut LLM Token Usage by 90% in SQL Migration
- AI in data engineering — how LLMs, agents, and new tools are changing the way we build pipelines. See: Your Data Stack Wasn’t Built for This
- Honest tool reviews with real benchmarks, limitations, and trade-offs that go beyond the marketing page. See: Databricks Agent Bricks Is Quietly Changing How Data Engineers Work
- Research paper breakdowns that translate academic AI and systems research into practical takeaways. See: 10 Research Papers Every AI Engineer Must Read
My Engineering Philosophy
I believe data pipelines deserve the same engineering rigor as application code. Version control, testing, code review, and CI/CD are not optional; they are the baseline. I worked in software development before moving into data engineering, and that background shaped this perspective.
I do not write sponsored content. When I recommend a tool, it is because I have used it in production. When I criticize something, I explain why with specifics. Every article includes limitations and honest assessments.
I test everything I write about. If an article includes a benchmark, I ran it. If it includes a code snippet, I executed it. If it covers a tool, I installed it and used it on real data before forming an opinion.
Some articles on this site contain affiliate links to books I genuinely recommend. These are clearly disclosed and do not influence what I write.
Connect
- LinkedIn: linkedin.com/in/amin-siddique
- Medium: medium.com/@amin-siddique
- Email: amin.siddique@outlook.com
Have questions or feedback? Visit the Contact page.