Thank You for 20 Years of Discogs (discogs.com)
440 points by paulcapewell on Nov 4, 2020 | 89 comments


Love me some Discogs. Always scrape against their API to tag my downloads. Marketplace is sick too; never would have imagined that I'd own so much weird shite from Portugal or Germany or Japan or wherever. Maybe one of the few sites left that makes me believe in the internet.


Exactly, it's one of those sites that remind us of the 90s/early-00s internet and how great it once was. Thankfully they stuck with the "conservative" revenue system and never followed the caravan of laggy and oversimplified UIs.

As a vinyl collector, thank you Discogs!


I've been hunting for an auto-tag solution for my collection. What tools are you using?


Beets is a decent and battle-tested solution. It has a Discogs plugin as well.

https://github.com/beetbox/beets

It takes a while to import a large library, but that's mostly because of intractable problems with ambiguity in metadata searching. I think it's about as fast as is reasonably possible.


My favorite thing about beets is that it doesn't stop processing when it needs user input. Meaning you can tell it to import a bunch of files, completely forget about it, and come back the next day and only give it input for the files that need it.


I wouldn't let an auto tagging solution anywhere near my collection - at some point it will get things wrong. However if you don't mind manually checking and editing tags I find Meta [1] to be an absolutely fantastic app.

Every new release that gets added to my collection will spend a few seconds in Meta so that I can check and edit the meta data and rename the files in a consistent way. Obviously you can also edit multiple releases at once.

I used quite a few tagging apps over the years, Meta is without a doubt the best of them.

[1] https://www.nightbirdsevolve.com/meta/


Discogs is great. In my experience doing entity matching/record linkage across all the different music APIs:

- Discogs seems to have the most complete release data.

- MusicBrainz has the best organized encyclopedic data. For example they have entities for specific instruments, relationships between artists, entries for supporting roles like recording engineer, etc.

- Wikipedia still has the best common-sense/canonical info about singles. For example, Wikipedia will tell you in the first sentence (or infobox) "song X was first released on album Y in year Z." Answering that can actually be quite challenging on other services.

- Spotify is the go-to for finding useful qualities of individual songs (due to their acquisition of Echo Nest). Things like popularity, danceability, energy, etc.

A long-term project of mine has been wrapping all these up in a more approachable API. So far that has only resulted in this big GraphQL schema project: https://github.com/exogen/graphbrainz – but the idea is to build on that and make it even simpler.
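
For anyone curious what the entity matching step can look like, here is a minimal sketch, not the approach graphbrainz uses. It assumes each service's result has already been reduced to a plain dict with "artist" and "title" keys; the function names and the 0.85 threshold are made up for illustration, and only the Python standard library is used.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def similarity(a, b):
    """Similarity ratio in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def best_match(record, candidates, threshold=0.85):
    """Return the candidate most similar to record, or None if no
    candidate reaches the (arbitrary, hypothetical) threshold."""
    best, best_score = None, 0.0
    for cand in candidates:
        # Average artist and title similarity; a real matcher would also
        # weigh year, label, track count, and so on.
        score = (similarity(record["artist"], cand["artist"])
                 + similarity(record["title"], cand["title"])) / 2
        if score > best_score:
            best, best_score = cand, score
    return best if best_score >= threshold else None
```

Plain string similarity breaks down quickly on suffixes like "(Remastered)" or on generic titles, which is a large part of why this kind of linkage is hard in practice.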


> - Wikipedia still has the best common-sense/canonical info about singles. For example, Wikipedia will tell you in the first sentence (or infobox) "song X was first released on album Y in year Z." Answering that can actually be quite challenging on other services.

Sad fact: For a while Discogs had an experimental feature aptly called "Tracks". It allowed individual tracks, not just the whole release, to be viewed as separate entities along with references to all releases on which said track appeared. A database of canonical tracks.

It enabled me to see if that great track from an ultra-rare and hard to find album perhaps appeared on a compilation or any other much easier to buy release. Sure, you can achieve similar results by simply searching, but it's much more cumbersome and downright difficult when it comes to rather generic track titles leading to countless irrelevant results.

Sadly the feature was abandoned: https://www.discogs.com/track


There's also AcousticBrainz for audio features, coupled to MusicBrainz: https://acousticbrainz.org/


Does Spotify expose much of that API? I remember using, or coming across, Echo Nest around the time of its acquisition, only to find that Spotify crippled its functionality.


As far as I know they still support all the same fields that the Echo Nest did! At least, I can't remember any additional ones that aren't here:

https://developer.spotify.com/documentation/web-api/referenc...


Wow, "Echo Nest" really threw me for a loop (Amazon Echo? Google Nest?) - it is a research project created at MIT Media Lab to analyze audio.

https://en.wikipedia.org/wiki/The_Echo_Nest


Very well explained. I am a programmer who DJs and a music collector, so I am naturally interested in these technologies.

I used to do a lot of data mining on data from various musical sources. Mostly with the goal to make the discovery process faster.


Discogs is responsible for too many rare album purchases that I paid irresponsible amounts of money for. It's awesome.


I love them for this also, they found a single copy of a rare German 45 for me.


> German 45

Can you explain what that refers to? Because if I search for that I obviously get WW2 related topics instead


A ‘45’ is a small vinyl record played at 45rpm instead of the more common 33rpm for the regular ‘long play’ or ‘LP’ vinyl record.

A ‘45’ fits only a song each side, so usually for releasing singles.

And ‘German’ means German.


Hahaha, thanks for the laugh, and the explanation (you too redwoolf)


Just for some anecdata, what is your age range?

My family who grew up with vinyl records are obsessed with figuring out when knowledge of that technology will really start fading.


I'm 37, and my parents do own tons of vinyls. However, they're all albums and we never used any 45s, so I guess the whole 33/45 thing didn't really stick in my memory


Oh OK!


It refers to a 45 rpm vinyl record album from Germany.


With that title I thought they were closing down or the guy was killing himself


Had the same thought. But he only used the language with its original intent/meaning, in contrast to our perception, which has been spoiled by euphemisms and corporate speak like "sunsetting".


These two comments were my exact train of thought upon opening the thread. How poisoned we've become by corporations burying the lede!


Well put. It seems to have that immediate effect of invoking the image of a slightly sad man with a smile walking slowly into a calm lake.


I actually laughed way too much.


What I never quite understood is that I cannot filter or sort on score or something like "just give me the 12" singles". Apart from that Discogs and Whosampled are amazing resources for music freaks that thank god escaped the "Facebook everything" internet of 2020. I do see those weird little music blogs and forums declining, though places like Drexciya Research Labs and Gearslutz still seem to go strong.


Or sort by both artist, and original release date.

But they also have a pretty good API with generous limits. Really need to find the effort to restart work on my "spotify shuffle/playlist" style app that uses your collection to build out a play session. Had a lot of momentum at the start of covid and just fell off.


Check out https://ogger.club, which adds features to the collection display, like tagging and other search options. It's based on uploading your exported-from-discogs collection, but it might scratch your itch. I'm not involved, it's just something I know about.


I love Discogs. They're such a great site for everything that they offer.

Discogs not only has an API [0] but they also provide regular database dumps for free [1]. Last year, I converted their database into SQLite and queried it in various ways to discover new artists or releases that I might be interested in. Ultimately, I realized how much of my favorite music was put out by labels that were headquartered in UK flats that have since been re-leased.

[0] https://www.discogs.com/developers/ [1] https://data.discogs.com/
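
A rough sketch of that kind of dump-to-SQLite conversion, in case it helps someone get started. This is not the poster's actual code. The element and attribute names (release, title, country, labels/label with a name attribute) are assumptions about the dump layout and should be checked against a real file; the inline sample and label names are invented so the example runs on its own.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET

# Tiny stand-in for a releases dump. The structure is an assumption,
# and the releases and labels are made up.
SAMPLE = b"""<releases>
<release id="1"><title>First EP</title><country>UK</country>
  <labels><label name="Example Records"/></labels></release>
<release id="2"><title>Second LP</title><country>Germany</country>
  <labels><label name="Example Records"/></labels></release>
<release id="3"><title>Third 12in</title><country>UK</country>
  <labels><label name="Other Label"/></labels></release>
</releases>"""


def load_releases(xml_file, conn):
    """Stream <release> elements into a SQLite table, one at a time."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS releases "
        "(id INTEGER PRIMARY KEY, title TEXT, country TEXT, label TEXT)"
    )
    for _event, elem in ET.iterparse(xml_file, events=("end",)):
        if elem.tag != "release":
            continue
        label = elem.find("labels/label")  # first label only, for brevity
        conn.execute(
            "INSERT INTO releases VALUES (?, ?, ?, ?)",
            (
                int(elem.get("id")),
                elem.findtext("title"),
                elem.findtext("country"),
                label.get("name") if label is not None else None,
            ),
        )
        elem.clear()  # drop the parsed children so memory stays bounded
    conn.commit()


conn = sqlite3.connect(":memory:")
load_releases(io.BytesIO(SAMPLE), conn)

# Example query: which labels have the most releases?
rows = conn.execute(
    "SELECT label, COUNT(*) FROM releases "
    "GROUP BY label ORDER BY COUNT(*) DESC, label"
).fetchall()
print(rows)
```

For a real dump you would pass a file object (for a gzipped file, one opened with gzip.open(path, "rb")) instead of the BytesIO sample, and write to an on-disk database rather than ":memory:".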


This is very helpful. I have found their Advanced Search to be a little quirky when I am looking for something highly specific (which seems to be "what I do" as a person), and it misses things I would later find manually.


Re-leased, as in the entire building is no longer private accommodation? (I'm assuming you didn't knock on the door and meet some new tenants?)

Also curious what labels are on that list of favourites!


Discogs is the Wikipedia of the music industry. It’s amazing to see the vast amount of detailed information on every release from mainstream to obscure.


I don't know, Discogs may have better info for the releases it has, but my experience is that the completeness of their data is pretty spotty.

For example, lots of popular artists in Japan (not trying to cherry pick something obscure; Japan after all has the second largest music industry in the world) don't even have an article there, let alone a complete discography.

This is obviously due to lack of volunteers, or their volunteers have limited areas of interest, which is totally understandable. But I think some basic automatic systems to at least add the new releases from major labels is needed for such database.

I personally have a better experience on https://musicbrainz.org/. It suffers from a similar problem to the one mentioned above, but to a much lesser degree. For most of the "mainstream" artists I checked, at least, it has a complete or close to complete discography. It is also a more "ambitious" database, which has data for each track, recording (my fav part), etc.


A lot of (particularly older) music is on Discogs but not MusicBrainz, including much from Africa, South America, etc., as well as blues, electronic, etc.

I think Discogs mainly caters to record collectors (vinyl etc.) which is not often covered by other sources. While honestly musicbrainz info is more common/shared.


>I think Discogs mainly caters to record collectors

This makes sense, I do have better experience when looking for vinyls, regardless of country of origin.

Still think they should at least have a basic dataset for newer CDs, though (to be fair, MusicBrainz's new music catalog is often lagged by a few months too.)


Discogs also contains digital releases.


However those are somewhat neglected by their user base. Digital releases are rarely even listed along with their physical counterparts on Discogs. Of course it's up to the users to actually create and maintain those entries.


>This is obviously due to lack of volunteers, or their volunteers have limited areas of interest, which is totally understandable. But I think some basic automatic systems to at least add the new releases from major labels is needed for such database.

They don't exist. Look at SoundExchange, the US organization that handles music royalties from streaming. They don't even have complete records on the releases for which they collect money! You're talking about a myriad of Hollywood-accounting companies (if that) that were fired up and dismantled willy-nilly, for decades.

Oh, that Japanese label that put out two 45s in 1962...where would that be recorded except in the collections of music fans who own them? That's what Discogs is based on. You want those records to be in the database? Buy them and put them in. The fact that you don't is what you're describing as "limited areas of interest."


Not sure what you mean by "don't exist". I meant that they could write scrapers and fetch such information periodically from, say, recording companies' websites or more reliably, from e-commerce stores.

I'm saying so because I know plenty of 3rd party fandom/crowdsourced databases are doing exactly this, for books, CDs, etc. The data obviously need some manual checks later, but this step saves lots of time.

Again, Discogs already has a very complete database of old releases, especially vinyl. My point is that it is lacking for newer ones (CDs or digital releases), and I'm just suggesting a systematic (and tried) way to improve in that regard (considering their goal now is "everything").

For something in 1962? Sure you're right. But you don't really need some fans to own a copy to prove the existence or get information for a hit album in 2010s.


Discogs is fantastic for electronic music, though. Even quite old and obscure stuff, I don't recall ever not finding something.


Discogs started as an electronic music database, so it's not surprising.

I guess we can say that all these crowdsourced music databases have an obvious priority or bias, even though they appear to be, or want to be, comprehensive, which isn't very realistic given the sheer amount of music produced. In Discogs' case, even though it gradually expanded to "everything", its strength is still at its core (where it started): electronic, hip-hop, rock, jazz (in this order, as named in the article), but probably not pop.

One probably would have much better luck looking for something in one than the other, totally depending on the genre. Like, if I'm looking for game music, I definitely will check VGMDB first.

Of course, it's not limited to music either. It can happen in much, much smaller fields. One particular case I want to mention is that there used to be three wikis for a single game, Dota 2, and each have their own strengths (there are still two today!).


I gave MusicBrainz a few chances in the past. To my disappointment, I was able to find maybe 20% of the albums I was looking for compared to Discogs. And even when I found one, it was missing plenty of metadata that's available on Discogs.


I had the same issue with musicbrainz but recently it seems to have much better metadata for my stuff. But usually between the two you can find what you are looking for.


What I really want for Discogs is some kind of database feature, recognising that unique tracks make up releases.

So all the individual credits for e.g. the song "The Boys Of Summer" by Don Henley could be updated in one place, and then updated on all the releases it appears on.

That would make the site extremely useful to see who played what on which songs, who produced and mixed it etc.

You CAN find that information for songs now, but since a track might appear on hundreds of releases, it's impossible to know easily which release has the credits for it.
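
To make the idea concrete, here is a toy sketch of the data model being asked for, where credits live on a canonical track and releases only reference track IDs. It is purely illustrative and says nothing about how Discogs stores data internally; the class names, IDs, release titles, and the credited name are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Track:
    """A canonical track; credits are stored once, here."""
    title: str
    artist: str
    credits: dict = field(default_factory=dict)  # role -> list of names


@dataclass
class Release:
    """A release only points at canonical tracks by ID."""
    title: str
    track_ids: list


tracks = {"t1": Track("The Boys Of Summer", "Don Henley")}
releases = [
    Release("Hypothetical Studio Album", ["t1"]),
    Release("Hypothetical Compilation", ["t1"]),
]


def credits_for(release, tracks):
    """Resolve a release's tracks to their shared credits."""
    return {tracks[tid].title: tracks[tid].credits for tid in release.track_ids}


def releases_containing(track_id, releases):
    """List every release title on which a canonical track appears."""
    return [r.title for r in releases if track_id in r.track_ids]


# One update on the canonical track is visible from every release.
tracks["t1"].credits["producer"] = ["Example Producer"]
for release in releases:
    print(release.title, credits_for(release, tracks))
```

The hard part in practice is not the schema but deciding when two tracks on different releases are really the same recording, for example a remaster, an edit, or a live version.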



Discogs actually had this feature (called 'tracks'), but recently disabled it: https://www.discogs.com/forum/thread/803017


I agree. Sideways related and maybe more of a UI issue, but the site also makes it unnecessarily hard to limit my search to a "release" instead of a version of that release (i.e. I am looking for general info on a U2 album, not the liner notes of the Asian release of that album).


If you press "View All" on the "Other Versions" it will bring you to the master release of said release.


Sadly they could have been a major player in the standardisation of metadata or at least even conformed to one of the current (imperfect) standards that exist. Their priorities lay elsewhere I guess.


I'm curious, I never really used discogs since it wasn't integrated with any music players I used, but how does it fare against MusicBrainz? Do they have the same goals? Are they both opensource and open data? I see musicbrainz becoming the standard in opensource projects for music information lookup, but discogs seems absent.


I wish they somehow merged. It's unnecessary and a bit ridiculous that music nerds would need to update information in two places.


As a contributor to yet another music database (SecondHandSongs), let me offer the opposite perspective:

Information propagation is good IF AND ONLY IF the information was correct in the first place. But it's a big problem how much incorrect information already exists in music databases (including copyright databases, which make a living off providing supposedly correct information), and blindly copying information between sources propagates such errors.

Preserving the diversity of sources, and only judiciously updating information, at least gives a chance of evolving toward a more correct representation of the world.

Discogs, for instance, seems fairly accurate for release dates (Something that e.g. AllMusic can't be trusted for at all). And discogs is a wonderful source for photos of album front and back covers and record labels, which by definition are first hand evidence (even if they themselves not infrequently contain errors).

Information on e.g. track composers is unreliable, and musician and label information is sometimes a mess (multiple entries per musician, inconsistent attribution of releases to label / country sublabels, musician / band). All of this is hard to avoid in a huge database like this.

I have not worked with Musicbrainz as a source, but it does not have a reputation as being particularly reliable.


> I have not worked with Musicbrainz as a source, but it does not have a reputation as being particularly reliable.

This seems like a roundabout way of saying it has a reputation for being unreliable?

Do you mind clarifying? I know there's certainly issues of completeness, as with any user-compiled library of data, but I haven't heard charges of unreliability levied against it before.


I admit that I'm guilty of pretty much what I accused those music databases of — passing along unvetted secondhand information...

But let me try: An ideal data source has a traceable provenience to a reputable authority. E.g.

https://iswcnet.cisac.org is in the business of establishing songwriting royalties.

The Discography of American Historical Recordings https://adp.library.ucsb.edu/index.php/basic/search has very sound scholarship behind it and documents their sources (usually ledgers kept by the recording studios).

The Encyclopedic Discography of Cuban Music https://latinpop.fiu.edu/discography.html seems to be compiled mostly by one scholar, but he seems to have an incredibly detailed knowledge of the field.

Musicbrainz, in contrast, is a crowdsourced collection that is a mile wide and an inch deep. Within literally seconds I found a bogus artist entry: https://musicbrainz.org/artist/729f9e2f-c993-40b6-ae20-bd21b...

I'm willing to bet that the last album shown for James Carter is not by him: https://musicbrainz.org/artist/a8880ecc-10d8-4492-b17e-02715...

This one has a typo in 13, and 12 is completely wrong: https://musicbrainz.org/release/81c40619-26b9-421a-906a-4ef0...

And all of this is just a few random checks. To be sure, the sheer volume is not without value to start a search (Quantity has a quality of its own, as they say), but it should probably not be taken as a sole source of authority without a backup source, and one would have to be careful that the backup source did not, in turn, take musicbrainz as its source.


Your examples are fair, but allow me to offer a counterpoint.

Your argument seems to verge on absolutism, as in "if it's not perfect, it is worthless", which makes me feel like it's a little dismissive of the value that musicbrainz has managed to build.

I'd like to argue that most if not all data collections like this, including Discogs, Musicbrainz, web stores like iTunes or Google Music or Amazon are always going to have some data quality issues. It's the nature of the beast. I've seen data quality errors even in Apple Music and Spotify, where you'd think they have vetted data from labels directly.

MB has not one scholar but thousands of enthusiasts contributing to it. Some are very methodical and provide source links with each edit (creating, as you say, traceable provenance to a reputable authority). Some are newcomers and just throw data in, and need to be educated on how to do it better.

The great thing about MB is the powerful data model, and the UI that makes it possible both to fix the invalid data, and to trace the history of the edits, with links to the sources used for the changes. The other great thing is that it has, so far at least, somehow managed to elude most of the attention of vandals and trolls that plague places like Wikipedia. It feels like most if not all contributors to the site act in good faith to their best knowledge (lacking though their research skills may sometimes be), and when their submissions are lacking, they are easily improved.

Finally, I do agree with your ultimate statement - none of these data sources should be taken as the sole source of authority.


A great part of the internet community, or what's left of it.

To a very specific group of people collecting old electronic and dance music, it is indeed invaluable and unique.


It's what IMDb was originally, but for music. Except you can’t download the database, I think.

Edit: actually, the database is available at data.discogs.com


Discogs is pretty awesome

A friend of mine even found an old LP of his band there after he lost the last copy he had. (They disbanded decades ago and only pressed a small number of copies.)


I need this more than ever now because after many years of using music streaming I've started buying all the music from my youth.

Sometimes the disc isn't recognized or you need cover art and discogs is there for free.


Their cookie consent dialog is super annoying. There are a ton of entries (though not the most I've seen). There is no "opt out of all" button. The confirmation to save the opt-outs is hidden down at the bottom of the dialog, on the opposite side from the button to accept-all. Most of the entries are marked as "always active". There is this confusing "object to legitimate interests" button, which I have no idea what that is supposed to mean. I kind of get the idea that it's not an actual opt-out.

So no, I'm closing this window and not even reading the content.

I thought the EU court clarified that this is not what they meant about reasonable opt-outs being required.


Good to know they "care about my privacy" ... what a clusterf*ck of a consent mechanism and so much stuff I can't turn off.


It's been quite a while, so I might be misremembering some of the details, but I was one of the early contributors to the Discogs database.

In the beginning it was very much a community driven project and it seemed like the Discogs database would remain open like e.g. Wikipedia with the GFDL, so after pouring in a lot of work it was very disappointing when the project was commercialized and the database closed with a restrictive license.


What data are you referring to? The database is available as CC0 at https://data.discogs.com/ and has been for many years AFAIK.

I really wish they would include the marketplace statistics, because from that one could estimate popularity of releases. But since that data is not crowdsourced, it's fair enough that they keep it private. Of course, one could scrape it from the web, but that is impractical and unethical to do for all the millions of releases.


It went from being open without a clear license to being closed off. Good to see that they opened it up again (in 2008 by the look of it).


Some time back, I bought/sold on Discogs in the 12" vinyl category (music_mike) from thrift store finds. I was able to throw together a Cordova app within a day to price stuff while I was out, and their API is simply fantastic.

The database is the total accumulation of literal decades of work with almost nothing thrown away, so it's not the cleanest to navigate for a newbie, but it is deep.

Thank you, Discogs!


Discogs is the only place I've been able to find certain LPs, I had no idea it was used as a data service as well!


Same. It's an amazing resource and just got more amazing.


Discogs is amazing in the depth and breadth. It's the only place (other than scanning my own) that I've managed to find cover art and track listings for things like 80s and 90s New Zealand compilation disks, for example. It's a remarkable project.


I don't know much about the cut they take, but it seems like a pretty good opportunity for small record shops and collectors to generate revenue through the marketplace. They have gotten some of my money!

An excellent feature is their wishlist. Add releases you would like to obtain, and you can get an email "newsletter" with new/good deals on these releases. You also get to see if one seller has multiple wishlist items to help reduce shipping. Or just browse through the seller's other listings and add a few cheap records without raising the shipping cost.


Wonderful site, wonderful resource.

I wish Metal Archives weren't offensively lazy sacks of shit about their own data and made an API. I'd work on that shit for free for them.


I interviewed at Discogs about 3 years ago (after discovering they were local to me in Portland, OR). Seemed like a great group of people working there!


I love Discogs and have used it to find a huge amount of new (or new to me) music over the past 10 or so years. I would love to see a Discogs streaming service so you can much more easily find things that collaborators have done. I use Spotify currently but find a lot of the user-generated playlists on Discogs to be higher quality.


Agreed! I would love for Discogs to be used as a database of sorts for a streaming service. In fact I started a side project a couple of years ago to do that. It's called https://tubeplates.github.io and uses YouTube as the music backend.


I lament that they did something to the comments on albums. The comments were always extremely poignant and spot on.


What did they do? I still see comment sections.


Love Discogs, have been using it for many years!

Some interesting musicological statistics can be computed using their public dumps. For example: http://hdl.handle.net/10230/32931


Discogs is amazing. It's what the internet should have been all along.


Love Discogs. Represents the best of the Web.


I guess it's intentional that you're blocking access to a thank you post until I agree to accept your cookies?


I <3 Discogs. Such a great site. The Marketplace is a drain on my wallet, but hey, everybody needs a hobby :)


Hopefully during the next 20 years they become not a shit hole to work for... have only heard numerous awful things about working for discogs for literally years now. Shame somewhere so nostalgic is so awful for its employees.


Can you elaborate more?


I am surprised that they lasted this long. I am curious, how do they make money?


They have a marketplace where lots of vinyl is sold. They take an 8% fee iirc.


Also it's really good! The UI is a little dated, but it gets the job done. I used their marketplace to buy records from my favorite record shop earlier this year when the lockdowns made it too difficult to go in person.


Agreed! I've definitely purchased really obscure records I didn't think I'd be able to find at like 2 am on discogs. My favorite moment on discogs was looking for a random record and realizing the shop selling it was the store I used to go to all the time when I was in college.


The best



