Beam is a job orchestration and data processing framework from the Apache Software Foundation. From a bit of reading, it seems similar to Airflow.
Looking for good book recommendations on the Apache Beam Python SDK. I know I can read the documentation, but it seems quite scattered and takes a lot of navigation to build a foundation or mental model.
Some books that I did come across focus on the Java SDK, which is not what I want. Hence wondering if anyone can recommend a book focusing on the Python API.
Beam is kinda outdated, but look at the resources in r/dataengineering.
Why would you say it’s outdated? I don’t see any other technology that has a scope similar to Beam’s.
Correction: meant to say niche. As a DE, it’s not one of those tools I think about in the space. I’ve heard of it and played around with it, but it’s definitely not something you often hear of being used.
And its Python API is very, very immature and poorly documented.
Okay, that I would agree with.