LLcolD's comments

Maybe this is a chicken-and-egg problem? I see it in my country. Investors would like to come and build a factory, but there is no workforce. So they go elsewhere.


What do you call your product?


Peervine


It all depends. I consider myself to be a generalist at the place where I am at the moment. But if I were to switch to some other company, I would be considered an expert there.


Great. Webpagetest reminds me of https://browsershots.org/, which has been down for some time.


I really like webpagetest.org. Thank you!


For me the perfect setup would be about 25 hours of work per week. Something like 4 days, 6 hours each day. Or 6 days with 4 hours of work.


This could work if you have people who know how to operate your servers. I guess that having your own rack in some data center is not the same as having AWS servers. What about routers, networking and all of the other things?


He does "disclose" in a related blog post [1]: "I don’t work for Neo4j anymore, why am I here defending them? Well… that and the fact that I still have a dinghy load of vested shares I have to sell so I can buy a place in the Villages and begin a new life as a golf cart driving day drinker."

This seems like a series of posts on benchmarking results from different vendors, so if he "disclosed" it once I don't think that there is a need for another one.

[1] https://maxdemarzi.com/2022/12/06/khop-baby-one-more-time/


Yes, running a benchmark on your data is the only way. I've taken a look at both benchmarks (the one from OP and the one from Memgraph). They seem like different types of benchmarks and different approaches. But I still find it interesting that although the numbers in OP's are not so much in favor of Memgraph, it turns out that Memgraph is faster than Neo4j in a large number of benchmark queries. So yes, it all comes down to the type of benchmark and the data that you use.

I've also noticed (from OP's tweet https://twitter.com/maxdemarzi/status/1613075177704677376) that he used the Enterprise version of Neo4j, but it doesn't say which Memgraph version was used. I don't have experience with these two databases, but usually Enterprise versions are somewhat better than community ones.

[EDIT]: I fixed a few typos.


Memgraph compared the freely available open source editions of both databases. Neo4j Enterprise seems to have more performance optimizations compared to the Community Edition.


He didn't run the Memgraph one, only took their numbers https://hackertimes.com/item?id=34368362

I'd assume it's what the disclaimer at the top was about: that the code is in Python which he's not familiar enough with. The license bit on not having the right to integrate it if you're building a competing product might not be decisive but a good enough reason to not invest more time into running the code.


I mean it should come as no surprise that an in-memory graph DB outperforms one that stores data on a hard disk, even an NVMe SSD.

I would also add that the primary sell for Memgraph seems to be “fast enough that it can process data as it comes in via a stream, and present it to the user in a reasonable timeframe”. Anyone facing this use-case would want to use Memgraph regardless of how much faster it is than Neo4j.


That's their claim, but who knows. The article shows:

* Memgraph's benchmarks only show SQL ~where clauses, not graph ones

* (nor streaming ones)

* The existing memgraph numbers are questionable, and if the competitor tuned, who knows

* The memgraph team refuses to use community-defined graph benchmarks for these articles, so we won't know

* Memgraph uses weird patterns like doing bulk loads as a query stream of atomic singleton creations vs batching (csv, arrow, ...), so even if it was graph/streaming, a proper benchmark would show tools going way faster b/c the relevant task would instead be for csv/arrow/etc bulk loaders or some other form of micro/macro batching

It's not just this article but the others too. It's frustrating to watch the memgraph leaders take their VC money and dump it into a big negative campaign lying about basically anyone in the community. They even spend money punching down at academics doing OSS. I haven't been this annoyed at a seemingly real tech company in a long time.


DISCLAIMER: I'm a cofounder and the CTO at Memgraph.

The workload and software used to benchmark are public on GitHub, which means they can be validated and tested. Memgraph as a company is committed to improving Memgraph and benchmarking further. That's why, in addition to other reasons, we raised funding. We have made no false statements and our findings are replicable. Everything, Memgraph source code + benchmark methodology, is public.

Benchmarks are always workload dependent and we always encourage people to test on their workload. The workload in the benchmark closely resembles the ones our customers have most often (mixed highly concurrent read/write with real-time analytics), and we perform well on it. Our default Snapshot Isolation consistency level further enables a vast class of applications to be built on top of our system which would simply break due to the weak consistency guarantees of legacy graph databases. That's precisely the reason why our customers choose us. You should always test on your workload because your mileage may vary and Memgraph might not be the right fit for you.

The main reason Memgraph is performing that much better is that Neo4j Community Edition 5.0 is limited for anybody in terms of how it uses available resources. On the other side, Memgraph Community (equivalent offering, it's not 100% the same, but it's closest to compare, no two systems are the same) does not restrict the performance of our public offering, and that's also something we want to highlight as just one of Memgraph's competitive advantages. So, all this is about comparing offerings rather than the underlying tech. Even if you take Neo4j Enterprise (which Max did, on completely different hardware, which is... "creative"), Memgraph has an advantage.


Fair enough. I didn’t realize they were being so shady with the benchmarks.

Isn’t Neo4j written in Java and Memgraph written in C++ (with lots of Python extensibility)? By that alone I would think Memgraph would be more performant most of the time, unless Memgraph is poorly-written/optimized vs Neo4j, which is very possible.

I work on the “R&D” team for my company so we spend a lot of time researching and building PoC apps. I did one with Memgraph a few months ago after concluding it ought to outperform Neo4j, however I did not build the app with Neo4j to do a side by side comparison of performance. Both support Cypher so I wasn’t attached to one or the other, but I’ve always liked the idea of using in-memory stuff (like RAMDisk) to achieve extreme performance, and I figured at worst Memgraph would be “as fast” as Neo4j… that is 100% an assumption though and assumes that Memgraph is well-written. It sounds like it’s not though.


Totally agree with doing your own benchmark, and when performance matters, work with someone who knows the systems.

I'm not a neo4j expert, and am not paid to write this. That said, their GDS subengine from the last couple of years appears to be distributed in-memory, essentially a view, and their year-over-year improvements there have been substantial. There might be no difference at the checkbox level. Likewise, when we did billion-scale work here with a variety of common queries, we found that the existence of basic features like indexes quickly changed what was fast vs slow. Historically, C++ vs Java is often < 2X of a difference, so when we're talking parallel & distributed hardware with tricky query planners & data representations... I have many questions beyond the language. If they were targeting something like FPGAs, I might feel differently.


Diagrams that show LDA, STA, cycles, etc. Like this one -> https://iitestudent.blogspot.com/2011/06/timing-diagram-for-...



Wow. I never thought that something like this existed. I've heard about UML, but I've always thought that it was something related only to programming.

