
Kinda weird to see people going through this immense pain to get the whole import-pipeline and efficient search problem solved with OSM and Postgres + PostGIS when it's very much a "solved" problem.

Incremental updates and intelligent, flexible, efficient search are all immediately doable with existing open source software.

Why are people hellbent on using Postgres where it's suboptimal for a dataset this large that needs intelligent searching?

Seriously people: https://github.com/ncolomer/elasticsearch-osmosis-plugin

Learn the JSON document and query formats, and then proceed to jump with glee whenever you encounter a problem well served by a search engine, instead of doing hackneyed manual indices, full-text queries, and poorly managed shards on Postgres.

Postgres is for important operational data. Excellent at that. Not so great for search, bulk, or static datasets.

ElasticSearch is so well-designed that I just expose a raw query interface to it to our frontend JS guy and he just builds his own queries from the user inputs.

ElasticSearch is probably like 20-30% of the technological secret sauce of my startup.



A key feature of spatial databases, for me, is being able to store and query shapes more complicated than just points.

In particular, roads with long straight segments don't have many nodes, so the nearest road isn't always the road the nearest node is on - and a road that travels through a bounding box won't always have a node in the bounding box.
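To make the nearest-road point concrete, here is a minimal sketch in plain Python. It assumes planar (already projected) coordinates, and the two roads and the query point are made-up illustration data, not anything from OSM. It compares "road owning the nearest node" against "road with the nearest segment".

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b in planar coordinates."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # Degenerate segment: both ends are the same point.
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, then clamp to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_road(p, nodes):
    """Distance from p to the road geometry (all consecutive node pairs)."""
    return min(point_segment_distance(p, a, b) for a, b in zip(nodes, nodes[1:]))

def distance_to_nearest_node(p, nodes):
    """Distance from p to the closest node of the road, ignoring segments."""
    return min(math.hypot(p[0] - x, p[1] - y) for x, y in nodes)

# Hypothetical data: one long straight road with only two nodes,
# and one short road whose nodes happen to sit near the query point.
roads = {
    "long_straight": [(0, 0), (100, 0)],
    "short_side": [(48, 6), (52, 6)],
}
query = (50, 1)

by_node = min(roads, key=lambda name: distance_to_nearest_node(query, roads[name]))
by_geometry = min(roads, key=lambda name: distance_to_road(query, roads[name]))
print(by_node, by_geometry)
```

With this data the node-based answer and the geometry-based answer disagree, which is exactly the failure mode of a points-only index.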

Does ElasticSearch support indexing on geometry more complicated than points?


ElasticSearch isn't a spatial database per se; it's an exceptional search engine. It natively understands geo points, radii, bounding boxes, polygonal geo filters, geo faceting, etc.
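As a rough illustration of the point-based geo support, a bounding-box filter can be built as a plain JSON body. This is only a sketch: the field name `location` and the coordinates are hypothetical, the `bool`/`filter` wrapper follows recent versions of the query DSL (older releases wrapped filters differently), and nothing here talks to a cluster.

```python
import json

def geo_bounding_box_query(field, top_left, bottom_right):
    """Build a query body matching documents whose geo point lies in a box.

    `field` is assumed to be mapped as a geo point in the index.
    `top_left` and `bottom_right` are (lat, lon) tuples.
    """
    return {
        "query": {
            "bool": {
                "must": {"match_all": {}},
                "filter": {
                    "geo_bounding_box": {
                        field: {
                            "top_left": {"lat": top_left[0], "lon": top_left[1]},
                            "bottom_right": {"lat": bottom_right[0], "lon": bottom_right[1]},
                        }
                    }
                },
            }
        }
    }

# Hypothetical field name and box, purely for illustration.
body = geo_bounding_box_query("location", (52.6, 13.2), (52.4, 13.6))
print(json.dumps(body, indent=2))
```

Note that this matches documents by their indexed point, so a path is only found if one of its points falls inside the box.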

If it can be represented as a geo point or a composite of multiple geo points, then ElasticSearch can grok it. Otherwise, no.

If you want to query arbitrary paths, that's on you to bridge the gap between a spatial database and a graph store.

I'm not really sure what you're looking for. This post was about OpenStreetMap.


Then, in answer to the question "Why are people hellbent on using Postgres" I'd say it's to do the kind of searches I described, which are a native feature of spatial databases like PostGIS and Oracle Spatial.


You never actually clarified as to what you wanted that wasn't included in the list I provided.


Here is a diagram: http://imgur.com/yHbS7

I store paths, each made up of an ordered sequence of points. Depending on what spatial tool you're using, you might call this a path, a linestring, a line, or an ordinate array. In the diagram the points are black dots and the path is shown in purple.

I want to do a bounding box query - finding the paths that are entirely or partly inside a given box. In the diagram, the box I'm querying for is shown in red. As you can see in the diagram, the purple path passes through the red box, but none of the black points defining the path are within the box.
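A minimal Python sketch of the case in the diagram, assuming planar coordinates and made-up numbers: a check on the path's vertices misses the path, while a per-segment clipping test (Liang-Barsky style) finds it. This is not how any particular spatial database implements the query, just the geometry behind it.

```python
def point_in_box(p, box):
    """True if point p lies inside the axis-aligned box (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = box
    return xmin <= p[0] <= xmax and ymin <= p[1] <= ymax

def segment_intersects_box(a, b, box):
    """True if segment a-b touches or crosses the box (Liang-Barsky clipping)."""
    xmin, ymin, xmax, ymax = box
    x0, y0 = a
    x1, y1 = b
    dx, dy = x1 - x0, y1 - y0
    t_enter, t_exit = 0.0, 1.0
    # One (p, q) pair per box edge: left, right, bottom, top.
    for p, q in ((-dx, x0 - xmin), (dx, xmax - x0), (-dy, y0 - ymin), (dy, ymax - y0)):
        if p == 0:
            if q < 0:
                return False  # Parallel to this edge and outside it.
            continue
        t = q / p
        if p < 0:
            if t > t_exit:
                return False
            t_enter = max(t_enter, t)
        else:
            if t < t_enter:
                return False
            t_exit = min(t_exit, t)
    return True

def path_intersects_box(path, box):
    """True if any segment of the ordered point sequence touches the box."""
    return any(segment_intersects_box(a, b, box) for a, b in zip(path, path[1:]))

# Hypothetical path and box: the path crosses the box between two vertices.
path = [(0, 0), (10, 10), (20, 10)]
box = (4, 4, 6, 6)

any_vertex_inside = any(point_in_box(p, box) for p in path)
crosses = path_intersects_box(path, box)
print(any_vertex_inside, crosses)
```

The vertex-only check reports no match, while the segment test reports a hit, which is the difference between indexing points and indexing the shape itself.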

I can accomplish this with a single query using Oracle Spatial [1] or PostGIS [2]. It requires that the spatial database understand shapes more complicated than just points. Can elasticsearch? There aren't any examples of this I can find in the documentation.

[1] http://docs.oracle.com/cd/B12037_01/appdev.101/b10826/sdo_op...
[2] http://postgis.refractions.net/docs/ST_Intersects.html


You could do this pretty easily with ElasticSearch given that vectors and paths are composed of start/end points, with varying degrees of customizability, but whether or not that would be a good idea depends a great deal on your query patterns and how much customizability and scalability you need.

The strengths of ElasticSearch are trivial sharding and replication, and intelligent, fast, soft real-time search.

It's also got a very powerful, easy to understand, highly programmable query syntax that is very easy to generate in code.

It's not a spatial database and what I was originally talking about wasn't designed to solve pathing/graph traversal, but you could still do n-dimensional indexed spatial search in ElasticSearch and that is something I do on a regular basis although it's not the "base" use-case for their geo API.


I agree, the data is not optimized for searching since it's supposed to serve all sorts of purposes.

We didn't look into ElasticSearch, since we wanted to give ArangoDB a try. We will have a look at it, thanks for the hint!


ArangoDB isn't designed to solve the same problems as ElasticSearch, it's a database/data store.

ElasticSearch is a search engine, first and foremost, and while you could use it as a database-of-first-resort, I'd be hesitant to recommend as much. For one thing, it doesn't take durability very seriously.

As a result, I have to assume you chose wisely if you're using ArangoDB for a standard database use-case.


Just wondering if you played with the hstore column in the Postgres ways/nodes tables before diving into ArangoDB? I see hstore as NoSQL-on-demand within a relational schema: http://www.postgresql.org/docs/current/static/hstore.html


Nope, we didn't. As I said in the post, using ArangoDB was one of the things we decided up front.

It is developed locally, and we wanted to see whether it scales up and helps us, or whether we should go the "traditional" Postgres way that everybody else goes.



