Some time ago, I wrote a simple tool (command line and C++ library) for generating random data for the same purpose.
You specify a regular expression, and every random string the tool outputs matches that regexp. Link: https://github.com/vrok/randodo
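To illustrate the general idea of regexp-driven generation, here is a minimal Python sketch. It is not how randodo works internally and does not use randodo's syntax. It only handles a tiny assumed subset (literal characters, `[...]` classes, and `{n}` / `{m,n}` quantifiers), and the names `expand_class` and `random_match` are made up for this example.

```python
import random
import re


def expand_class(body):
    """Expand a character-class body such as 'a-z0-9_' into a list of characters."""
    chars = []
    i = 0
    while i < len(body):
        # 'x-y' denotes an inclusive range; anything else is a single character.
        if i + 2 < len(body) and body[i + 1] == "-":
            chars.extend(chr(c) for c in range(ord(body[i]), ord(body[i + 2]) + 1))
            i += 3
        else:
            chars.append(body[i])
            i += 1
    return chars


def random_match(pattern, rng=random):
    """Return a random string matching a tiny regex subset (hypothetical helper).

    Supported: literal characters, [classes], and {n} or {m,n} quantifiers.
    Escapes, groups, alternation, '*', '+', and '?' are NOT supported.
    """
    out = []
    i = 0
    while i < len(pattern):
        # Read one atom: either a character class or a single literal.
        if pattern[i] == "[":
            end = pattern.index("]", i)
            choices = expand_class(pattern[i + 1:end])
            i = end + 1
        else:
            choices = [pattern[i]]
            i += 1
        # Read an optional {n} or {m,n} quantifier; the default is exactly once.
        lo = hi = 1
        if i < len(pattern) and pattern[i] == "{":
            end = pattern.index("}", i)
            bounds = pattern[i + 1:end].split(",")
            lo, hi = int(bounds[0]), int(bounds[-1])
            i = end + 1
        out.extend(rng.choice(choices) for _ in range(rng.randint(lo, hi)))
    return "".join(out)


pattern = "[a-z]{3,8}@[a-z]{4,10}[.]com"
sample = random_match(pattern)
print(sample, bool(re.fullmatch(pattern, sample)))
```

Because the subset is also valid syntax for Python's `re` module, each generated string can be checked with `re.fullmatch` against the same pattern.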
Faker is Python-specific, and not really about creating databases, but of course you can tack that on easily. It also gives you specific values instead of a set schema and supports different languages. I use faker too, and my favourite is to tack it on to factory_boy (http://factoryboy.readthedocs.org/en/latest/) to generate unit-test data with random but reasonable values.
Maybe a cool idea is to merge these two projects or use faker to do the value generation part here, but add more stuff in this project that's about generating a consistent schema of related values - fake users that post fake posts from fake locations in a consistent manner.
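A rough sketch of what "consistent" could mean here, using only Python's standard library as a stand-in for faker: users are generated first, then every post references an existing user and takes its location from that user. The field names, the word lists, and `make_dataset` are all hypothetical, not taken from either project.

```python
import random

# Tiny hard-coded pools standing in for a real value generator such as faker.
FIRST_NAMES = ["Ada", "Grace", "Linus", "Margaret"]
CITIES = ["Lisbon", "Osaka", "Toronto", "Nairobi"]
WORDS = ["lorem", "ipsum", "dolor", "sit", "amet"]


def make_dataset(n_users, n_posts, seed=None):
    """Build fake users, then fake posts that only reference those users."""
    rng = random.Random(seed)  # a seed makes the dataset reproducible
    users = [
        {"id": uid, "name": rng.choice(FIRST_NAMES), "home_city": rng.choice(CITIES)}
        for uid in range(1, n_users + 1)
    ]
    posts = []
    for pid in range(1, n_posts + 1):
        author = rng.choice(users)  # every post has a real author
        posts.append({
            "id": pid,
            "user_id": author["id"],
            "location": author["home_city"],  # location stays consistent with the author
            "body": " ".join(rng.choice(WORDS) for _ in range(rng.randint(3, 8))),
        })
    return users, posts


users, posts = make_dataset(n_users=3, n_posts=5, seed=42)
for post in posts:
    print(post)
```

The point of generating parents before children is that foreign keys such as `user_id` can never dangle, which is what independent per-column random values cannot guarantee.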
Btw, they have counterparts in the Ruby world: https://github.com/stympy/faker (which says it is a port from Perl) and https://github.com/thoughtbot/factory_girl
I use them in a Rails db/seeds.rb file to create a new dev db every time I need new data to play with, especially after changes to the schema.
Oh, makes sense. I did not see that this is a bridge between DBs and faker. Mixer does something similar directly on top of SQLAlchemy, if I recall correctly.
I'm starting to learn and use MongoDB quite a bit, and the one thing I tend to struggle with is getting enough data into a DB to test and run queries on. I'll be happy to give this a go.
emirozer (the author), you're marked as [dead], which means that your comments are hidden by default and nobody can reply to you. Try emailing hn@ycombinator.com and see if a moderator can sort that out.
It seems like this project is basically just fake-factory applied to a few different databases. It doesn't look that useful on its own, but it's a nice demo of those packages.
Using Faker with Fabrication is pretty swell as well. You'd already have Fabricators defined for your tests, so you just call the Fabricator however many times you need in your seed file and you're done.