I wrote something about *Real* Data Engineering… then added the AI bit
martinchesbrough.net/data-and-ai-...
@mchesbro.bsky.social
Socio-technical data designer
I am signing up for Chad Sanderson's Shift Left Data Manifesto www.gable.ai/blog/shift-l...
It is more than data contracts but we all need to start somewhere ...
Adding more layers to your data architecture is unlikely to make it easier to manage data or make it work better - consider alternatives martinchesbrough.net/against-laye...
#dataengineering #dataarchitecture #datamesh #softwarearchitecture #datamodeling
The argument is more Layers AND than Layers OR
We need to evolve data architectures.
So I wrote this: medium.com/thinking-abo...
15.03.2025 04:02 — 👍 1 🔁 0 💬 1 📌 0
I open up my browser and search for “data architecture” and back come articles talking about Medallion Architecture, Source-Warehouse-Consumption Layers and so on.
I don’t see many alternatives.
Are we allowed to have a different opinion?
It probably started around 10 years ago when every Big Data/Hadoop project I was involved in seemed to mimic the good ol' data warehouse architectures of the 1990s. I was frustrated ...
15.03.2025 04:02 — 👍 0 🔁 0 💬 1 📌 0
I've been meaning to write this for a while - it is titled "Against Layers" and is a plea for pluralism in data architecture.
15.03.2025 04:02 — 👍 2 🔁 0 💬 1 📌 0
If you work with groups and teams, the Fun Retrospectives website is an excellent resource for ideas. I enjoy using my favorite techniques and experimenting with new ones to keep things exciting. #QiSky #Facilitation
18.12.2024 16:20 — 👍 6 🔁 5 💬 0 📌 0
A ClickHouse query optimization guide. SQL-based observability in the wild. PIVOTing in ClickHouse, and more. It must be our final newsletter of 2024! 🎄
Few typos above - apologies
15.12.2024 12:39 — 👍 1 🔁 0 💬 0 📌 0
I’m not saying that building a better LLM model is not important. Of course it is.
But for GenAI app developers it is less crucial.
Designing the app is paramount.
When I look at the features that OpenAI, Google and Anthropic are launching, they are more dependent on app features than on the LLM model.
15.12.2024 12:37 — 👍 0 🔁 0 💬 1 📌 0
My argument was that I can build an app, swap out GPT4o for Claude 3.5 and get similar results.
You get better results from better prompting. Or getting another model to check your results.
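A minimal sketch of what I mean, assuming each model can be wrapped as a plain "prompt in, text out" function. The names `answer_with_check`, `fake_model_a` and `fake_model_b` are hypothetical stand-ins, not any vendor's API; in a real app you would wrap the provider SDK calls to fit this shape.

```python
from typing import Callable

# Assumption: a "model" is any function from prompt text to response text.
Model = Callable[[str], str]


def answer_with_check(question: str, writer: Model, checker: Model) -> dict:
    """Ask one model for a draft, then ask a second model to review it."""
    draft = writer(f"Answer concisely: {question}")
    review = checker(
        f"Question: {question}\n"
        f"Draft answer: {draft}\n"
        "Reply OK if the draft is correct, otherwise explain the problem."
    )
    accepted = review.strip().upper().startswith("OK")
    return {"draft": draft, "review": review, "accepted": accepted}


# Hypothetical stand-in models so the sketch runs without API keys.
def fake_model_a(prompt: str) -> str:
    return "Paris"


def fake_model_b(prompt: str) -> str:
    return "OK"


question = "What is the capital of France?"
result = answer_with_check(question, fake_model_a, fake_model_b)
# Swapping the models is a change of arguments, not a change to the app.
swapped = answer_with_check(question, fake_model_b, fake_model_a)
```

The app logic lives in `answer_with_check`; which model sits behind `writer` or `checker` is just a parameter.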
I was talking to a friend who’s very into Gen AI at a recent Xmas party and I put it to him that my view is that the LLM model “wars” are now pretty well over …
15.12.2024 12:37 — 👍 1 🔁 0 💬 1 📌 0
- then I moved on to asking Gemini to read the code on my screen (in VSCode) and tell me what it did
- it did that pretty well (after an initial hallucination)
It's exciting!!!
- then I got it to explain the paper and research similar papers (ScholarGPT does this but not from a shared screen - you have to copy-paste text)
- and I asked it to identify gaps in the literature and research (it did a reasonable job)
I've had a lot of fun today playing with Gemini 2.0 through aistudio.google.com
- I got it to read an academic paper on my screen (Claude does this as well but Gemini does it better)
I don't agree with everything Andrea Gioia writes but I do think this book is a valuable addition to the body of knowledge on data products, data management and data architectures.
Go read it!!
This is a lot of heavy stuff to get into with the first 3 chapters of a book.
10.12.2024 23:10 — 👍 0 🔁 0 💬 1 📌 0
I'll provide a word of warning as well - within the first 3 chapters we get a diagnosis of data platforms using System Dynamics, we get an interpretation of the organisation that builds them using the Viable System Model and we get straight into Hexagonal Architecture as an approach to design.
10.12.2024 23:10 — 👍 0 🔁 0 💬 1 📌 0
Let me provide an excerpt from Chapter 1 to do with the failure of data platforms as monolithic architectures
10.12.2024 23:01 — 👍 0 🔁 0 💬 1 📌 0
There are too many books written that simply explain how to do things.
Now that's not a bad thing - the last book I mentioned, "Deciphering Data Architectures", is such a book. These books teach you stuff which is useful.
But "Managing Data as a Product" challenges you to re-think what you know.
Why should you read this?
If you liked Zhamak Dehghani's book on Data Mesh then this is (in my view) a book of the same ilk. It is not overtly data mesh, although it makes plenty of data mesh references.
"Of the same ilk" means that it challenges existing thinking on data and I like that.
Another day, another book recommendation. This time it is "Managing Data as a Product" by Andrea Gioia, published by Packt
www.packtpub.com/en-au/produc...
A screenshot of slides drawn from 7 slide decks, showing the range of topics addressed on the webpage
Calling univ profs & lecturers! If you want to bring concepts from Doughnut Economics into your teaching, we've just launched a webpage bursting with resources for you: 7 slide decks, reading lists, videos & activities - all open access. Dive in & pls reshare! doughnuteconomics.org/university-c...
25.11.2024 12:17 — 👍 452 🔁 187 💬 19 📌 11
My advice is that before you decide to build a data xxxx have a good think about what problem you are solving and the pros and cons of xxxx
08.12.2024 01:39 — 👍 3 🔁 0 💬 0 📌 0
This diagram is a good example - I like it conceptually but if you look closer it gets a few things wrong. Doesn’t change the core message though.
08.12.2024 01:37 — 👍 2 🔁 0 💬 1 📌 0
I will offer some words of caution.
I don’t agree with everything James says. I suggest readers also do their own research and check things.
At the end of the day what “Deciphering Data Architectures” does is give you (the reader) a level of knowledge and some tools to understand the options.
In my work there is too much jumping in to build a data xxxx and too little understanding of the different options and their pros and cons!