Hacker Times | fuzz_junket's comments

Archivist here. Google is not an archive. Neither is Tumblr or Flickr or any other platform that might delete your content at any time. They're companies and it's their job to make money. This is why my profession exists. We don't make money, which is why we're not well funded, but we have a whole lot of training, technical knowledge, and professional ethics around saving information and making it accessible. If you want to preserve your records, talk to an archivist because you can't assume some faceless corporation will do it for you.


I commend you for your work and I think it's incredibly important, and I fully agree with what you've posted here.

However, this is still a noteworthy story because they aren't complaining about their own data being deleted. It's the entire data history for political ads, and its whole point of existing was transparency (it's even in the URL of the Google site). This is a reversal of an almost 10-year-old policy.


The data is not yet deleted but will be in 2 days, would you be interested in archiving it?

https://hackertimes.com/item?id=45412855


Yup. Thanks for your work.

If we want to preserve something, then it's up to us to ensure that it's preserved.

If we pay someone else (like you) to do it, then we expect them to preserve it, but not if we aren't paying for it.

That said, preserving stuff, even electronic stuff, is a challenge.


I think the reasonable position here is that it's within Google's rights if they want to take data down, but at least give us warning and an archive of the data. If they'd said "we're going to take all this data down in two months but here's an archive of all of it if anyone wants to download it", I think very few people would have a problem with it.


> Google is not an archive.

Agree. They're a company with (I assume) a PR department that continues to allow the company to make some really bad choices that continue to erode their reputation.


I don’t think it’s PR’s job to tell the company what to do. It’s their job to spin what the company does in a way that’s beneficial to it.


That's an interesting point. But then I might have said instead that Google, the company, is sure making their PR department's jobs much more difficult.


[flagged]


I'm reasonably confident you literally mean the corporations have to do the archiving since the article is about a corporation not doing it. Philosophically you just picked a random subgroup in society to do the archiving. If we're going to pick people who have to do this by law, why not force the archivists to do it? They've already got the skills and experience. Probably willing to do it voluntarily if there is some money in it from the government, but I suppose if we're committed to forcing people to do it there can be some sort of taskmaster to drag them back to the archives if they try to sneak out early.


It's kinda interesting, since many countries already have taxpayer-paid archivists.

Not sure if the laws have changed, but the publisher of every book published over here in my country had to send a few copies to our "national library" for archival.

edit: https://www.nuk.uni-lj.si/informacije/obvezni-izvod-fizicni-... (yeah.. Google Translate it)


Ironic considering it seems to be a change in law that spurred this action in the first place.

If it's worth saving from a societal standpoint, maybe a third option of funding and maintaining a public archive could be taken. Wild idea, we can tax the faceless corporation to pay for it.


You really want mandatory data retention laws? Think about the side effects of this.

First of all, any such regulation is a regressive tax on small businesses. Small companies will find it harder to comply than large ones. The cost to Google would be trivial but for a small startup it might kill them, especially if retaining data isn't important to their business.

Secondly, there are privacy implications. It's sometimes good when data is purged.


In general, this point is absolutely correct, and regulatory capture as a mechanism to stifle smaller competitors is shockingly common.

But in the case of requiring brokers of political advertising to maintain transparency about the reach of that advertising - that seems far more palatable and far more in the public interest. If you want to play in that specific sandbox, you owe accountability to the public at a level where dynamics across election cycles can be analyzed.

Of course, all this is just a thought exercise, since the background of the original post is that Google is removing its archive because its response to the EU regulatory environment has been to pull out of the political ads market entirely on a go-forward basis. Regulations did not require it to maintain any historical archives, apparently, and so the natural consequence would be that Google had no reason to air its historical dirty laundry with no benefit to them at all.


"Brokers of political advertising" is just speech.


All communications are speech in one way, but whether this should be considered equivalent (from a regulatory perspective) to an individual's speech was not apparent to 4 out of 9 Supreme Court justices in the US in 2010 during Citizens United - and, certainly, this opinion does not bind (or speak for) the entire world's description of speech.


"just speech" is just speech too, right?

I seem to be missing the point here. Are you claiming that this term cannot be applied categorically in a realistic fashion? I think that's wrong.


I agree with what you said here.

I do think that with a slight modification, OP's statement can be improved in a practical way.

"Regulations that allow consumers to export their data should be more comprehensive and standardized."

I'll also note that many of these big services do allow you to export user data fairly easily.


For political advertisement via the internet? I absolutely want mandatory data retention and transparency. It must be clear what was published using which targeting criteria by whom and when. 100%. Our societies are in grave danger.


What are the privacy implications in a database of advertising, an activity specifically intended to make information as public as possible?


Pretty sure GP was being sarcastic. But in any case, there's no reason we can't recognize that Google and other such massively-influential companies are hugely different from a small business and act accordingly.


>First of all, any such regulation is a regressive tax on small businesses.

your first argument is that it harms small businesses

it's really not an issue to set up laws such that small businesses do not have to follow them. the DMA is a perfect example

your second argument is that there are privacy implications

okay then require the data to be anonymised


I think your comment exemplifies why people have an issue with "just regulate it" because there are endless nitpicks and carve-outs that seem arbitrary and will likely have unintended consequences. It's easy to go "then just do this" but in reality the government and private sector can only deal with so much from an enforcement and compliance perspective.


We need to start working on the premise that large corporations are different beasts than small businesses. I mean as a people of the world as a whole.

There is a tipping point somewhere and that is definitely up for conversation but we need to pick a point and start making sure regulation hits where it does good.

Frankly, the outcomes of both "regulate it" and "don't regulate it" have already both been captured by the biggest offenders to use as they wish.


saying "businesses over a certain size must comply" and "data must be anonymised" are not endless nitpicks, they're simple rules that can be and are regularly enforced the world over. I think your comment exemplifies why people have so much distaste for the corporate sphere and its disingenuous ideology in general


Say you build a hobby website for your photos and allow people to comment on them. Boom, now you are responsible for keeping archives for other people posting there and cannot take your site down. Why do you think this is correct?


This is about political advertisements, not about a hobby website for photos. As a society, we need to hold those who influence and track us responsible, and require them to be transparent.


The author seems to be frustrated at something but I'm not sure what. There is value in learning how to implement something from first principles. Teachers aren't sitting around scheming about how they can waste their students' time with "meaningless calisthenics", they're trying to help them. Calling them "clueless professors" isn't great either. There's a degree of disrespect in undergraduates dunking on professional tertiary-level educators for making them do homework.

Also, Prolog does not have a "standard library". What predicates are implemented varies greatly by implementation, and if you want to write portable code then you have to stick as closely as possible to the ISO standard.


Funny, I was just thinking how much this reminded me of Stanisław Lem's "Memoirs Found in a Bathtub". Must have been one hell of a vibe in Iron Curtain bureaucracies.


I love O'Bama, the famous O'lympic athlete.


Technically Doc Brown was alive in 1885, having travelled there from 1955, so he was the first time traveller by a good 100 years before Einstein.


I worked for Livescribe from 2008 to 2010. The author of this article describes the Nuwa Pen as a "game-changer" but, as others have pointed out in this thread, other products have done the same thing for a long time. Livescribe's pens captured both handwriting and audio, which was great for meetings and lectures. For a while they also supported an app ecosystem, though the apps' usefulness was never successfully demonstrated to be more than a gimmick.

I still believe there is a niche where a product like this would be very much at home, but Livescribe's smartpens in particular were undone by a combination of bad internal decisions and a market that changed from underneath them. Who knows, maybe the Nuwa Pen will be able to target that niche market more successfully. I could certainly find a use for one, given the right combination of price/features.


Adding to this that before Livescribe there was the Fly pentop computer: https://en.wikipedia.org/wiki/Fly_(pentop_computer)


Does anyone know what this actually does? There's no explanation in the documentation, and the only screenshot is of the build process.

I get the impression whatever it is requires metadata from a digital camera, which isn't present in my photos because I like to shoot film and scan it. My first thought was, "Wow, how is it going to analyse the subject matter and classify all my scans in 'seceonds'?" My second thought was, "Ah, it won't." The author is clear that this tool is based solely on their own workflows, which is fair enough, but I'd at least like to know what those workflows are.


It moves photos between local files, dropbox, and google drive, depending on how you configure it.

From a quick look at the code, so far you can achieve the same with exiftool if you mount those sources as local drives. There isn't much else yet, but it wouldn't take a lot of work to hook this up with a vision model to add some labels or metadata.

For example, I normally run the following to import files from an SD card to a local folder and organize them by date:

exiftool -r -d ~/Pictures/%Y-%m/ '-FileName<DateTimeOriginal' -o . /Volumes/LEICA\ M/DCIM/100LEICA
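
For anyone who would rather see the date-based filing idea spelled out, here is a minimal Python sketch. It is not a reimplementation of the exiftool command above and makes no claim about exiftool's exact renaming rules: it only builds a `YYYY-MM` folder path from a timestamp. The function names and the example filename are hypothetical, and file modification time is used purely as a stand-in for a real capture date (which film scans would not have anyway).

```python
from datetime import datetime
from pathlib import Path


def monthly_destination(root: Path, taken: datetime, filename: str) -> Path:
    """Return root/YYYY-MM/filename for a given capture time."""
    return root / taken.strftime("%Y-%m") / filename


def plan_import(files, root: Path) -> dict:
    """Map each source path to a destination folder by month.

    Assumption: modification time stands in for the capture date.
    Nothing is moved or copied; this only computes the plan.
    """
    plan = {}
    for src in files:
        taken = datetime.fromtimestamp(src.stat().st_mtime)
        plan[src] = monthly_destination(root, taken, src.name)
    return plan
```

Computing the plan separately from moving anything makes it easy to print and check the destinations before touching the card.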

A bit off-topic: I have the same problem that you face with shooting film with Leica and manual lenses.

It would be interesting to see if there are tools to automatically populate metadata based on the content:

1) field of view + rough estimation of lens data can be computed directly from perspective cues

2) this can be narrowed down to specific objectives either:

  2a) using a short list (all objectives a specific photographer owns), or 

  2b) using enough lens data + images (e.g. something that DXO or Adobe could do)

While I am skeptical about doing (2b), it should be possible to do (2a). A middle ground is to manually label a few images and use semi-supervised learning to propagate them to the rest of one's photo collection.
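
A rough sketch of what (2a) could look like in Python, assuming the field-of-view estimate already exists (estimating it from perspective cues is the hard part and is not shown here). It uses the standard relation for a rectilinear lens focused at infinity, FOV = 2 * atan(frame width / (2 * focal length)), and picks the owned lens whose field of view is closest to the estimate. The lens list, the constant names, and the function names are all hypothetical.

```python
import math

# Hypothetical short list: focal lengths (mm) of the lenses one photographer owns.
OWNED_LENSES_MM = [28.0, 35.0, 50.0, 90.0]

# Horizontal width of a standard 35mm film frame (36 x 24 mm).
FRAME_WIDTH_MM = 36.0


def horizontal_fov_deg(focal_mm: float, frame_width_mm: float = FRAME_WIDTH_MM) -> float:
    """Horizontal field of view in degrees for a rectilinear lens at infinity focus."""
    return math.degrees(2 * math.atan(frame_width_mm / (2 * focal_mm)))


def closest_lens(estimated_fov_deg: float, lenses=OWNED_LENSES_MM) -> float:
    """Return the focal length whose field of view is nearest the estimate."""
    return min(lenses, key=lambda f: abs(horizontal_fov_deg(f) - estimated_fov_deg))
```

With a short list like this the estimate does not need to be very precise, since the candidate lenses are far apart in field of view; two lenses of similar focal length would be much harder to tell apart this way.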


> From a quick look at the code, so far you can achieve the same with exiftool if you mount those sources as local drives. There isn't much else yet, but it wouldn't take a lot of work to hook this up with a vision model to add some labels or metadata.

Thanks for walking me through it, and for the handy example with exiftool. I like that you're doing this with command-line tools because you can see what's going on more easily. I think it's more accessible than requiring a Java + Gradle build environment.

> 1) field of view + rough estimation of lens data can be computed directly from perspective cues

> 2) this can be narrowed down to specific objectives

Really! That is so cool, I wasn't aware that kind of thing was possible. I never write down the lenses I've used (because lazy) so extracting that kind of metadata from the image would be an interesting exercise.


Your friends are just poking fun, of course it's a real library. I also have a large number of ebooks (in addition to a 1-2k book physical library) and the ebooks are, paradoxically, less accessible. It's too easy to just download hundreds of books in one go (let's say if you wanted all the original Goosebumps books) and not actually look at them, whereas every physical book has to be obtained and shelved individually. Then ebooks disappear into Calibre where I utterly forget about them, whereas a physical book's presence on the shelf is a constant reminder that it exists and is waiting for me to read it.


Yeah I came here thinking it was about books. Apparently Magic people call their cards a library? Here I am sat with a plain old deck of cards like a complete lemon.


In Magic parlance, you play the game with a deck, but the collection of all your cards that you can build decks out of is called your library.


This is wrong.

Deck is the set of cards that you bring to a match.

Library is a region of the game state: the cards that are yet-to-be-drawn. This only exists as part of the gameplay. Other gamestate regions are the battlefield, the graveyard...


Strictly speaking, it doesn't have to be undrawn cards, given the myriad ways you can put cards from other zones into your library; those cards may or may not have been drawn from the library as well.


the library is your collection of cards from which you can draw and from which you do draw every turn.

It's not your entire compendium in and out of game, it's simply the deck of cards from which you (usually) draw and is distinct from your hand, graveyard and banished cards.


I came here to say exactly this! I got thrown by the second image right at the top before the article has even started: "K(AW)MPL(E)KS". All well and good, if you want to write in an American accent.

On a related note, writing accents is usually a bad idea. I remember reading Asimov's "Foundation" when I was a teenager in Australia. One of the characters is a lord and speaks like this: "Ah, Hahdin. You ah looking foah us, no doubt?" Because of course everyone in the future is American and everyone who's not American speaks with a comical (and completely unreadable) accent. Trying to read that dialogue makes me feel like I'm having a stroke.

