Dolphin Scheduler

jackneary · on Jan 9, 2020

I tried the DolphinScheduler online demo: http://106.75.43.194:8888/

It's easy to use.

jackneary · on Jan 9, 2020

Email dailidong66@gmail.com and tell him you want to try the online demo.

Xiali · on Jan 9, 2020

DolphinScheduler ranks among the top 10 most valuable projects in OSChina GVP (Gitee Most Valuable Project).

Ozzie_osman · on Dec 26, 2019

How does this compare to something like Airflow or Luigi?

luckypeter · on Jan 9, 2020

Maybe you can refer to their proposal: https://cwiki.apache.org/confluence/display/INCUBATOR/Dolphi...

foo_barington · on Dec 26, 2019

This may be of interest:

https://github.com/mikub/titanoboa

newcrobuzon · on Dec 27, 2019

Thanks for linking! There seem to be similarities, but from a (very) brief look at DolphinScheduler, these are the potential differences:

- number of contributors :D

- titanoboa can process even a potentially cyclic graph

- in titanoboa you can write step functions directly in high-level programming languages such as Clojure and Java (so not just bash or Python) and deploy them directly at runtime

- the clustering setup in titanoboa is master-less

- titanoboa does not have such direct integration with Spark as it employs some map-reduce patterns internally

But all in all I have to say that DolphinScheduler seems quite nice! I'd also have to compliment it on the nice documentation (again, just briefly skimming through it).

(edit: formatting)
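
For context on the cyclic-graph point: schedulers that model a workflow as a DAG usually derive a run order with a topological sort, which by definition fails when the graph contains a cycle. Below is a minimal sketch in Python, assuming a hypothetical `deps` mapping from each task name to the set of tasks it depends on. It is not taken from DolphinScheduler's or titanoboa's code and only illustrates the general technique (Kahn's algorithm).

```python
from collections import deque


def topological_order(deps):
    """Return a valid run order for tasks, or raise ValueError on a cycle.

    `deps` is a hypothetical mapping: task name -> set of upstream task names.
    """
    # Collect every task that appears as a key or as a dependency.
    tasks = set(deps)
    for upstream in deps.values():
        tasks.update(upstream)

    # Count unmet dependencies per task and record who waits on whom.
    remaining = {t: len(deps.get(t, ())) for t in tasks}
    dependents = {t: [] for t in tasks}
    for task, upstream in deps.items():
        for u in upstream:
            dependents[u].append(task)

    # Start with tasks that have no dependencies (sorted for determinism).
    ready = deque(sorted(t for t, n in remaining.items() if n == 0))
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for d in sorted(dependents[task]):
            remaining[d] -= 1
            if remaining[d] == 0:
                ready.append(d)

    # Tasks on a cycle never reach zero unmet dependencies.
    if len(order) != len(tasks):
        raise ValueError("cycle detected; no valid run order exists")
    return order


if __name__ == "__main__":
    print(topological_order({"load": {"transform"}, "transform": {"extract"}}))
```

An engine that accepts cyclic graphs cannot plan the whole run up front like this; it has to decide the next step while the workflow is executing.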

eitland · on Dec 26, 2019

That looks interesting, but it reminds me more of Apache Camel.

rb808 · on Dec 26, 2019

Also interesting that it's one of the first open source applications from China I've seen.

y4mi · on Dec 26, 2019

I've actually stopped paying attention to https://github.com/vitalets/github-trending-repos because there are so many Chinese repositories each week.

It's just rare to get them on HN, because it's a nightmare to go through their docs and they're usually not even attempting to write their code in English. Basically unusable for all intents and purposes, even if it were quality software.

tasogare · on Dec 26, 2019

> Basically unusable for all intents and purposes, even if it were quality software.

Only if you don’t have anyone who can read Chinese on your team. Also, most repositories are not documented well, or at all, anyway, so language hardly matters.

y4mi · on Dec 26, 2019

Yes, most repositories aren't documented well either; that's definitely true and was part of my point, really.

How are you going to figure out why you're encountering a bug if not even the code itself is written in English?

It's fine for learning repositories or simple toy projects, but if you actually want your code to be used... please use the world language. (And no, English isn't my native language either.)

hinkley · on Dec 26, 2019

I wonder if we’re hitting a point where a better decompiler would be useful. Transliterate the code into your first language, English or not.

JonathonW · on Dec 26, 2019

A decompiler can't pull contextually appropriate variable and function/method names out of nowhere (not to mention comments), which is the big roadblock when reading foreign-language code.

That is, you're just as likely to be able to follow foreign-language code as you are decompiled code. Either way, you've basically thrown out all the documentation and swapped out all the names for gibberish.

bathtub365 · on Dec 26, 2019

It seems like having only some team members be able to really understand the code would still be a risk.

rb808 · on Dec 26, 2019

This looks great, I've always wanted something like this. I've always had AutoSys or Control-M at work and they both suck.

I'd just prefer if it had been around longer. Any other open source alternatives out there? I only know of Airflow and k8s CronJobs.

massive · on Dec 26, 2019

There you go: https://github.com/meirwah/awesome-workflow-engines

hartzell · on Dec 26, 2019

> Any other open source alternatives out there?

Here's a big handful of them: https://github.com/pditommaso/awesome-pipeline

daveFNbuck · on Dec 26, 2019

There's also Luigi. https://github.com/spotify/luigi

elcritch · on Dec 26, 2019

I was just reading up on Broadway, written in Elixir (https://hexdocs.pm/broadway/Broadway.html), which provides the fundamentals of batching/job control. It’s by the creator of Elixir and is based on 7 years of libraries in the area, so the fundamentals are pretty well honed.

monstrado · on Dec 26, 2019

I've had a lot of success using Apache NiFi as a distributed scheduler / general purpose workflow tool.

jpitz · on Dec 26, 2019

I'd love to see what kind of complexity you are managing there, and how.

powerbook5300CS · on Dec 26, 2019

The only other one I can think of off the top of my head is dagster: https://github.com/dagster-io/dagster

It’s made by Nick Schrock of GraphQL fame, among others. I’m sure there are hundreds of these projects though.

TeMPOraL · on Dec 26, 2019

From the beginning of Dagster's readme:

> Dagster is a system for building modern data applications.

> Combining an elegant programming model and beautiful tools, Dagster allows infrastructure engineers, data engineers, and data scientists to seamlessly collaborate to process and produce the trusted, reliable data needed in today's world.

Two paragraphs, communicating zero bits of information. I wish GitHub repositories, of all places, didn't contain such uninformative copy.

pgoggijr · on Dec 26, 2019

This looks almost exactly like Airflow - I wonder what Apache’s plan is for both of these to coexist.