Which, as someone running a website, raises a fascinating question. If Google is just going to crawl my sites and present the information as an AI summary on its own site, then what exactly do I gain by allowing Googlebot to crawl my sites?
A couple of years back I worked with a company which maintained specific data which was the main traffic driver on that page.
Google approached them and wanted to pay for the rights to get the data and display it on top of the search results, a feature which was fairly new back then.
This was an interesting dilemma because it was very clear that the money was way less than the loss in ad revenue due to the traffic drop, but it was also clear that if we didn’t take the deal, a more desperate competitor would, which would result in the same traffic loss but without the extra Google money.
So the company took the deal.
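To make the reasoning behind that decision concrete, here is a minimal sketch of the three outcomes. All numbers are made-up placeholders (not the real figures), and `outcome` is a hypothetical helper; the only assumptions carried over from the story are that the payment is smaller than the lost ad revenue and that traffic drops as soon as anyone supplies the data.

```python
def outcome(take_deal: bool, competitor_takes: bool,
            ad_revenue: float, traffic_loss: float, payment: float) -> float:
    """Revenue for the company under one scenario (illustrative only)."""
    # Traffic drops as soon as the data shows up on the results page,
    # no matter who sold it.
    data_on_results_page = take_deal or competitor_takes
    revenue = ad_revenue * (1 - traffic_loss) if data_on_results_page else ad_revenue
    # Only the company that takes the deal gets paid.
    return revenue + (payment if take_deal else 0.0)


# Hypothetical placeholders: payment (10) is well below the lost ads (40).
AD_REVENUE, TRAFFIC_LOSS, PAYMENT = 100.0, 0.4, 10.0

take = outcome(True, False, AD_REVENUE, TRAFFIC_LOSS, PAYMENT)
refuse_competitor_takes = outcome(False, True, AD_REVENUE, TRAFFIC_LOSS, PAYMENT)
refuse_nobody_takes = outcome(False, False, AD_REVENUE, TRAFFIC_LOSS, PAYMENT)
```

Refusing only wins if no competitor takes the deal either. Once you expect somebody to take it, accepting is the less bad option, even though it is worse than the world where the deal never existed.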
History repeats itself here, with the difference that instead of paying for the data, the AI crawlers simply take it for free.
That's the problem with the current monopolistic system: the money won't flow downstream; it's like a dam. One big dam owned by a few people is worse than many small dams.
I agree. For what it's worth, I think Google's business model is in trouble. They're putting in heroic effort to stay relevant during this shift to AI search, but how does their advertising model work when people expect summarised and accurate information instead of scrolling pages of ads?
The search space is actually not that unhealthy. There are competitors. They just lack the reach of Google. These competitors will become increasingly useful when our personal AIs plug into them to search for the right info.
That said, I think the old internet business model is dying. Why will people write articles and pay for hosting if they can't receive any advertising revenue? I think I'm okay with this. Before advertising decimated the free internet, people ran sites for fun. Maybe we return to that.
>> but how does their advertising model work when people expect summarised and accurate information instead of scrolling pages of ads?
1) because that fancy summarised results page will effectively be the ad. The results will push you towards Google's partner sites or promote behavior beneficial to them.
2) because partners may soon be able to pay to change the results of searches. Who won an election 20 years ago? That "fact" from the AI could depend on which answer has been sponsored by which political interest. What happened in China in 1989 ... how much money do you have?
3) the user becomes the product. No ads in search results, but our searches and online behavior tracked by them will be sold to whoever is most interested in what we do online.
TL;DR: Google is heavily stuffing more and more ads into every monetizable SERP, which explains the ever-increasing revenue.
But my theory (expanded in more detail here https://hackertimes.com/item?id=47957708) is that its cash cow is heavily optimized towards its current anti-competitive, multi-sided chokehold of the online ad market -- look up the AdTech Antitrust findings around manipulating ad auctions, e.g. Project Bernanke -- which relies on ad volume.
With agents however, I suspect that volume will shrink drastically even if they stuff ads at each step of the conversation. Because it's only the final click that will matter, and those ads will be meaningless.
It's not clear that Google can charge enough for that final click to compensate for the loss of both the value of all the ads that lead to it and their ad auction manipulation premium.
Hanging my comment here, not blaming the parent poster, it's just relevant...
All the "AI search" I've tried is far less relevant than a proper search, using search engines without it seems much faster and more relevant.
Whenever I get AI search results, the summaries lack substance, are often just plain wrong, and so on. When I check, it's often using the wrong aggregate data to return results.
Search engines these days, traditional ones, have been hobbled by targeting the lowest common denominator. They drop search terms, alias them (eg, Joe=Joseph,Josephene, Jodiene), and thus return endless bad results.
Using "web" search on Google, for example, putting all search terms in quotes, helps enormously. If searching for Joe Baker, you don't want aliased responses.
My point is, AI is yet another abstracted layer. It not only aliases things, it hallucinates, and further, it doesn't even show the context; it simply summarizes.
It's the next layer of crappiness.
As a side note, there is a product in my home country called the Swiffer. It is like a broom, but uses disposable sticky paper instead of broom bristles and a scoop.
Point is, it isn't very environmentally friendly. I weirdly see environmentalists using it, instead of just a good old broom.
And now I see environmentalists using AI for search. What? Why! The sheer additional power and cost, all the new datacenters, hardware, just for what??
Even if one believes it is mildly better (which it isn't, if you use traditional search properly), are you now replacing your broom with a Swiffer?!
The amount of environmentalism that goes out the window, for convenience, astonishes me.
It is totally dependent on context. Ask about product purchases/prices and it will give you great results. Ask for basic data and it will quote Wikipedia or government websites all day. Ask for a coding solution and it will spit out answers ripped from the corners of GitHub. But I asked Google about hydrogen-3 recently and got back a result saying that we were mining it on the moon. An entire space economy was invented because AI cannot tell sci-fi from reality.
The real test: don't test AI by asking it questions you already know the answer to. Ask it a proper question and then go out and find the answer yourself. That is where AI really falls down, the non-trivial stuff.
I’ve experienced the opposite. I get good summaries with links to original sources for when something seems off, which it almost never does. I also like that I can glance to see where the data came from.
Several months ago, I was getting hallucinations and silly answers, like Jaco Pastorius being the bass player for Metallica.
But lately it’s been great. Especially for technical syntax stuff, like asking for syntax on a jq command, or SQL syntax, or something trivial like movie cast members….
Edit: I also assume there’s now some pre-processing filter to retrieve answers from a cache or something, because I’m getting pretty long answers very quickly….
I've noticed something of a hybrid of what you posted and what one of your responders below posted.
For average results that can have an acceptable answer tending toward the mean or societal mainstream view on a subject, the Google answer works for me. For others, which unfortunately are most of my queries these days, the Google answer is brainwashing and propaganda. It feels like little more than a continuation of the SARS-CoV-2 era fact checkers, which were actually failed attempts at propaganda machines for enforcing certain views as true while others were cited as "evidence does not suggest" (hand-selecting the evidence of course means this is nothing more than confirmation bias).
I think the AI technology used in most LLMs is an acceleration of this: it will tout mainstream views or hand-picked views as truth while calling others false. And without the ability to search for opposing views any more, you get left with a really warped view of reality that might work for most. Though not for everyone.
>All the "AI search" I've tried is far less relevant than a proper search, using search engines without it seems much faster and more relevant.
I've had the opposite experience with Perplexity Pro. Yesterday it took just under nine minutes to deliver a detailed and accurate answer as to the nearest locations where I can buy Meta Ray-Ban Display glasses (at least 3 hours drive away FWIW) that day.
In the past several months I've spent countless hours using search engines with dismal results that weren't close to that.
Regarding the Swiffer: it's a zero-sum game. Where did the materials come from? The earth. Where did they return after use? The earth. The only thing that changed was the energy used to create and transport them.
If conserving energy was important, none of these AI datacenters would be built, none of our items would be shipped across the planet from Asia. They would conserve energy and import raw materials only.
While I agree in principle, I think it was inevitable. People have been gaming SEO for so long that even with judicious use of search operators, it was getting harder to find the things I wanted (probably drowned in a sea of spam in such a way as to fall outside of the optimized search index). It _looks like_ the AI overview does not have this problem (yet...).
Anecdotally, I find Google's AI Search results to be genuinely helpful much of the time, and the box isn't so intrusive that it prevents me from digging into literal search results if I am not satisfied with the AI summary.
The new business model is misinformation sold by the highest bidder. Ask it what's the most reliable kind of computer, it answers with whoever paid the most this week.
That's a pretty sinister system when the dam builders are using the work of those downstream to build their dam. What happens when everyone downstream has been starved to death?
The very people calling the shots have so far been the most removed from the consequences of their actions. They have no incentive to be responsible or considerate of others.
But google won’t give you the recipe. It’ll give you a pretty piece of text that resembles a recipe. You’ll only find out it’s not a recipe when it fails to produce a cake.
But then, the sites it's training on are starting to do the same thing, so maybe it won’t matter. Just last night, I pulled up four sites with “gluten free almond cake” recipes. One specified less than 1/4 the flour it would have needed, and another didn’t have any butter in the ingredients list. I had to eyeball the median and tweak from experience to actually get a bakable cake out.
I'm not going to drive to the bookstore only to find out the recipe I need twice a year is not there because the local cuisine doesn't even know the ingredients.
Oh, maybe I should drive to the major city and check several bigger bookstores?
Or order it online? But that would be from the third country because this one is weirdly blind to variety. That will be €20 for a book and some €10 postage.
You're sure I can just get my recipe from the store?
Completely unrelated to AI, I buy, and have always bought, recipe books in second-hand bookstores. It has this vintage vibe, and not everything has to be kale. The downside is that some ingredients aren't easy to get hold of, but that doesn't happen often.
I kinda think everyone should have an old-school "Joy of Cooking", with its instructions on skinning rabbits and squirrels, just as a reminder that things weren't always like this, not even that long ago.
The online grocery shopping service I use has added recipes to their website. Not obvious slop at first reading, but then you see stuff like "add 600g of carrots and 100ml of water" to make what, according to the picture, is a quite watery soup.
The only solution is to find recipe books that were printed in previous decades.
Which is ironic, given that Google's entire value proposition (to users) was extracting signal from noise...
... and now it's come full A/B-advertising-optimization to being useless at that, when the need is greater than ever.
Imho, Google's greatest failing was missing how its own incentives warped creation of new web content, and failing to account for that strategically -- it turned the web into something it can't itself usefully parse.
We have a distinct poverty when it comes to secondary considerations and long term ramifications - which used to be manageable when progress was slower. Now, we're on a very steep acceleration/progress curve, and any shortsighted mistakes cause extremely large ramifications. Which are then compounded by both more short sighted non-fixes and our rapid acceleration/progress curve layering in additional confusion, misunderstandings, omissions of critical information.
What kind of insanity are you all talking about? There are plenty of cookbook authors out there with good, tested recipes, big personal brands, etc. No need to go to prior decades for good books.
You need to compensate for not adding it though. The recipe without was literally the same ingredient list as the site it copied, just missing a line.
And almond flour does its thing by caramelizing in combination with butter and sugar, turning your whole cake into a sort of giant macaron. You can’t pull it off without any one of those things.
> But ultimately that strategy is good for the consumer right?
No. It may be good for the consumer right now, but not ultimately (and on top of that, I would argue that reducing everyone to a consumer is already the wrong framing -- you need to ask what's good for the citizens). Having ten competing supermarkets with various interweaving supply lines is ultimately much better than having one giant supermarket, because that one monopolist is able to squeeze both consumers and suppliers to the detriment of both.
Do you think Walmart’s prices are higher than you’d pay if there were just ten competing stores with the same product mix, each of which did not have the leverage to squeeze suppliers?
It doesn't matter what Walmart's prices would be -- because there'd be nine alternatives offering their products at different price points. Also, you seem to be implying that suppliers are inhuman and deserve to be squeezed (but Walmart somehow deserves all the wealth it extracts as a middle-man, go figure), which is exactly why I said that focusing on consumers only is a myopic view of the world.
Without some way to generate revenue, people aren't going to publish recipes (for Google to scrape into their AI.) Maybe we could live without more recipes being fed into the machine, but there are many other types of content that will suffer the same fate.
It would be nice to find something better than an ad-revenue driven web, but I'm not sure this is it. We'll find out I guess...
Those who won’t were doing it for the money. Those who continue are those who do it for passion, or those whose recipe is just a way to attract people to their business (e.g. kitchenware company). I don’t think it is necessarily bad, the quantity will decrease but the quality may even improve
I built a passion reference site. A large part of that passion came from knowing and talking to the people I was helping. One person emailing or saying thanks would later help power me through to create more useful articles. Enriching openai/claude/ms/google and no thanks from an individual, has disincentivized me from writing more.
Same here. People knew the website and it was immensely flattering to meet users in the wild. It motivated me to really sweat the small stuff, because people noticed. Now I'm just feeding the slop machine, and it feels pointless.
We just won't get countless recipe websites where you have to scroll, scroll, scroll through slop about someone's day to read a scraped recipe that every other website has.
Do the current Google search results indicate that they will be any different?
It’s also not disruption if your product relies on the output of the industry it kills. What will AI train on when it destroys the economics of sharing information with others?
From the HN guidelines: "Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith."
In this instance, the internet is more than top-ranking recipe websites.
> But ultimately that strategy is good for the consumer right?
No because it's killing competition and becoming an even more obvious monopoly. Then at any occasion they have to choose between consumers and profits, they'll do what shareholders want and increase profits.
It's not uncommon for free things to be higher quality than cheap things, especially when we're not talking about physical goods. Think hobbyist vs hack. Selective pro bono vs quantity over quality. The former describes old internet while the latter describes much ad-supported internet. I'm not saying cheap is better than expensive, and I'm not saying everything works this way, but I do think many things do, especially for pure information that doesn't have a major capital cost associated.
Not really good for consumers in the long term.
Creates a monopsony (monopoly on the buyer side instead of seller side) in the supply chain, with all the same negative externalities to competition.
Same reason it's not good to have a "company town" where one company is the major employer of 50% of the workers.
Not even the summary of the recipe; I think a lot of the "value" AI provides is decluttering the interface. All these perverse incentives conspire to make it undesirable for web publishers to simply show you a cake recipe. They need to meet SEO metrics to rank higher in Google's index, they need to monetize with ads and chum boxes. They need to distract and dazzle your senses. In the end, AI strips all these anti-patterns out and shows you the meat. Or the "summary" of what should have been there all along. The "AI" isn't learning about what the words mean, it's stripping out cruft and filler that was deliberately added!
I'm not saying LLMs are worthless, but I'm saying if you had a magic browser add-on that simply stripped the BS out and showed you the relevant content, it would meet the use cases of the majority of people using AI (regarding search).
Said differently, Google are bringing you a solution to a problem they (largely) created.
except this isn't a solution. taking away sources just creates another problem down the road. there's no information to crawl from if no one is building websites to crawl from.
>History repeats itself here, with the difference that instead of paying for the data, the ai crawlers simply take it for free.
This discussion was broached originally when discussing whether or not search engines and aggregators had any compensation obligation in respect of news articles. This was a hot topic in the IP and policy circles for a few years.
When the Canadian government attempted to create a mechanism to compensate content creators for the scraped content, there was widespread outrage from tech circles, despite the same community agreeing, across extensive policy discussions, that action had to be taken to prevent this universal man-in-the-middle value capture by search engines.
I've had fairly extensive discussions with the individuals involved in the academic, policy and internal industry analysis of the issue. Watching industry agree to address the issue, then aggressively spend to shape the public narrative, was eye-opening.
The recent shift into "AI is obviously going to hoover up all your data and there's nothing you can do now that the theft is laundered through an LLM" is just the latest example of the same trend of short sighted capital-over-everything decision making we've become used to in jurisdictions that have dysfunctional legislatures.
That doesn't feel like a repetition at all? You said that the first time there was "traffic loss but without the extra google money", but that this time there's no extra google money either way.
The fact is that the internet is already the tech giants' own realm: their power is way beyond public imagination and affects all of us in real life on a daily basis, but there are still people thinking they are not the "evil one" here.
It's a catch-22. Without google crawling your site, you don't get any new traffic. But with google crawling your site, you also might not get any traffic.
AI summarization has already caused issues for sites like rtings, where people are no longer visiting the site but are still making use of the data presented there, leading to rtings not getting enough traffic to continue to post their data.
It is an existential crisis for websites and when they go away it'll be an existential crisis for AI.
> Without google crawling your site, you don't get any new traffic. But with google crawling your site, you also might not get any traffic.
I may be strange and unusual, but I just have never cared about my Google ranking. I know this makes me out of the ordinary among site owners but I have been humming along fine.
This certainly will disrupt traffic but for some of my sites I honestly think this is a good thing. I want you to want to be there, not just stumble upon my site because you happen to hit the right search keyword. Plus if it gets bad, this does create a new opportunity for others with cross linking and search.
Only issue is what happens when the company that owns the search and has a dominant share of the browser market flags your site with the good old "warning: potential risk ahead" when people try to reach it directly? And buries the "I know the risk let me through" deep in the browser settings. Advocate for different browsers? Google is pushing web attestation in one form or the other. I wish the future would look bleak, because right now it's looking blue, red, yellow and green and it's worse.
> Only issue is what happens when the company that owns the search and has a dominant share of the browser market flags your site with the good old "warning: potential risk ahead" when people try to reach it directly?
My target market is more technical than that, so likely nothing would change for me. Again, I recognize the impact of Google's dominance for some, but if the "attestation" isn't helpful and only hinders using services that people have come to rely on, there will be pushback.
I also have been advocating for years for everyone in my circle to avoid using Chrome. A homogenized browser market is a risk, and Chrome is the new IE. I hope you are also a part of the effort to advocate for browser diversity.
Don't forget that other browsers also just use Google's web good boy list, and if you report false positives they point you towards Google and cover their ears.
Yes! However most of my users were established through my network, not search.
I know that sites relying on ad income will and are being hurt tremendously by this effort on Google's part. However, if you are in the startup space and make money on services you offer, search should be one of several strategies you are deploying for user growth.
Step 1, Google serves info directly and consumers rejoice
Step 2, Google extinguishes the web and nobody has a reason to publish content, consumers lament but are trapped, Google has created a platform to serve content instead of links
Step 3 (or maybe 2a), Google is now monetizing their content machine
Step 4, Google offers people a way to contribute to the content machine, make some $$ per N views, whatever. People create content within the ecosystem
Step 5, Google is now the internet, more content is created overall, quality is lower overall perhaps, algorithmic echo chambers flourish even more than today, old heads on HN lament, everyone else just goes on living
Since "making money with free content" essentially means manipulating your users into spending more money elsewhere from which you then get some indirect kickbacks, I don't think we need to lament there being fewer opportunities for that particular business model. That's not to say that video centralization doesn't have other actually bad effects.
With Google search you don't get money for creating content, as people rely on the summary provided by the search.
Imagine if people went to YouTube and only watched previews of your videos; or the preview is your video, but condensed and given preferential treatment.
> With google search you don't get money for creating content as people rely on summary provided by the search.
I'm aware. And I'm curious how that will play out. Because same as with Youtube, historically Google Search gave the means and the discoverability to monetize producing valuable free content.
YouTube is directly dependent on people producing free content. If YouTube didn't pay its creators as well as it does, it would simply die.
Same with Google Search. Good content and good SEO gives the means for websites with free content to survive. Google usually takes a cut on ad placement on those pages with AdSense.
If Google Search now doesn't pay content creators as well as it used to, what will happen to free content on the web? It's bad for Google and it's bad for the creators.
As far as I know, TikTok pays significantly less than YouTube. But yes, TikTok is a thing, as well as Twitch.
Still, those are all very centralized platforms, which historically isn't the case for all the monetized free content you usually get from Google search (reviews, recipes, travel guides, converter sites etc. etc.)
Stack Exchange committed suicide by closing all the questions. It was already in a steep decline before LLMs, after it got bought by private equity and did things like firing the moderators.
Related but not related: I wonder if, on a YT video, clicking on "Ask AI" and generating a summary of the video counts as seeing the video in its entirety.
> Without google crawling your site, you don't get any new traffic
What about the stories of marketing managers who learned months after the fact that their credit card had expired and their Google ad spend had ceased with no effect on traffic? Google isn't always an effective promotional vehicle.
Sounds like a pretty ineffective manager: wasn’t buying the correct ad placement in the first place, used a personal card to sign up for an ostensibly corporate service, didn’t keep track of expiration dates for the card, and was also ignoring email notifications from Google about the expired card. Let me know if I’m missing any other reasons why this manager should be fired instantly.
Nah. Pretty sure he was using a personal card - corporate finance dept would likely better track where each card is being used and their expiration dates to avoid this happening. Also this better tracks with the rest of his sloppy behavior.
Well in addition to what you wrote, the marketing manager ALSO wasn't tracking any ad-related marketing performance indicator (CTR, CR, etc.) in any measurable way for very long periods of time... or they would have caught it almost immediately ("wow ad spend, CTR and CR have all suddenly gone down to 0/0% and have been staying there for days on all our campaigns! What's up with that?").
The internet is more and more becoming a commercialization platform. If you are selling something on your website, you still want Google (or ChatGPT for that matter) to expose customers to your product. The gate is that the actual delivery of the product is behind a purchase/signup.
Google and others want to control the entire customer journey, to the point that your website is simply a way to pass metadata to them. They are actually achieving this!
this kills the entire internet vibe of the 90s, early 2k
> is more and more becoming a commercialization platform
FTFY: "couple of decades since has become". The vibes of the passion-driven 1990s started to be overwhelmed by the din of money right when the Internet became a major commerce venue, some time in the early 2000s.
Maybe it's time to think about alternative ways to market your products, if search engine ranking and SEO got broken. I have no idea how, I don't need or do that, but it seems we're past breaking point.
You're allowed to exist on the web. The alternative is you are pushed out, your site is not indexed and google / chrome labels it as a security risk when people are trying to reach it directly. The mandate is clear: give up the data or give up the spot.
If your site is all about disseminating information (like Wikipedia), then Google would provide a free mirror of sorts.
If your site is about your product, Google won't be able to serve the sign-up page from AI; the traffic would come your way. Same for a site that sells something: the traffic you're interested in would arrive at your checkout page.
Paid-content sites and ad-supported sites are screwed though, on top of their being screwed by archive.is and ad blockers.
The really confusing part about the ad-supported sites is that most of them are supported by Google's ad products. So Google is eating their own lunch here.
Search Engine Result Page (SERP) ads shown on Google itself are far more valuable than display ads that get shown on random websites. Google has been slashing payouts to those sites for over a decade. More recently, they've been slashing search impressions to those sites as well. With search engine ads + YouTube ads + Play Store ads, they can probably cut out the third-party site ads business altogether and not miss a beat.
It was exactly my opinion after thinking about this for a while. This is essentially Google making their search engine into yet another website. Sure, there is certain inertia - people used to using Google - but it will fade out with new generations. They've shot themselves in the foot, they just don't see it yet, and it will take some time for it to become obvious.
You search for a term and then you watch videos until you're satisfied. I don't think it's a very good way to learn about a topic, but millions of people seem to disagree with me. It makes more sense when you remember that 21% of the American population is functionally illiterate.
Sites pay good money to appear on top search results. Looks like the future is going to be sponsored AI sources. It's going to be even more difficult to figure out if google is presenting you with actual information instead of just an ad
I write things on the internet because I want to share ideas. If someone reads my post and tells a friend, that's great. If an AI crawls my posts and passes along the ideas that's great too.
(It doesn't work for ad-funded writing, but while I have substantial sympathy there this has historically been an unpopular argument on HN)
Sure but this means that you’re no longer eligible to make a living from your ideas, which can be fine by you, but it eliminates an entire class of people who used to make a living from intellectual work.
This also could have been fine, it can bring back authenticity however for this to happen no one should be making money from it. Instead, only megacorps make money and they can just ignore your ideas and generate theirs. They control the distribution and the supply now.
Not making a living from ads specifically, sure, but many have things like Substack which actually directly incentivizes them to make good content rather than serving ads.
Sharing ideas with people is nice, the actual problem is your ideas in this case are just a vehicle for generative AI companies to monopolize access to information and control our own cognitive processes, which is not entirely something new but it feels like we are now moving backwards: from free access back to ministry of truth days
Setting aside ad-driven revenue - the ideas, when spat out by an llm, are disconnected from the author. If people like your ideas, they aren’t becoming fans/followers/long-term-readers. That means good luck leveraging some interesting writing into a book, a speaking tour, a podcast, or even any kind of consistent readership. The llm slurps up your content and monetizes it while you get nothing.
I'm not interested in a book, speaking tour, or podcast. I've never had consistent readership because I write about too many unrelated things. I blog because I have ideas I want to share; I don't feel at all ripped off.
People are certainly welcome to feel a lot of different ways, not trying to be prescriptive here. My parent asked: "what exactly do I gain by allowing Googlebot to crawl my sites?" and I was describing what I get out of it, in the hope that others might feel similarly.
Speaking as someone who is a bit more familiar with your site, the variety of content you post is really valuable. I know multiple people, myself included, who have either gone from EA to Contra or Contra to EA thanks to both being on your blog.
More broadly, I love it when an author I trust in one area writes about other topics.
Fair enough, sounds like you won’t be impacted. But the vast majority of people I read online are able to write the content I enjoy because there are paths to earn a living off it. I expect the future of LLM search will leave only hobbyists and slop producers standing.
There are other ways to block robots from crawling our sites. I have a robots.txt but place no faith in it, it’s just there because it’s cheap and does stop some of the crawlers.
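For anyone wondering what a robots.txt rule actually does for a crawler that chooses to honor it, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules, the `example.com` URL, and the choice of user-agent tokens are illustrative assumptions, not a recommended configuration, and nothing here forces a crawler to comply.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: disallow two AI-related user-agent tokens,
# allow everything else. The tokens are examples; check each vendor's
# documentation for the names it actually uses.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def allowed(user_agent: str, url: str = "https://example.com/post") -> bool:
    """Return True if a well-behaved crawler with this token may fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "Google-Extended", "Googlebot"):
        print(agent, allowed(agent))
```

This only models the polite case: the file is advisory, so a crawler that ignores it sees no barrier at all, which is why it is cheap but not something to rely on.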
It should have been a clear extension of the intent of existing copyright/licensing that training would be disallowed without consent, but "move fast and break things"/"possession is nine-tenths of the law" win out
Copyright already exists; the issue is that these companies are doing it legally anyway. For me it is the same issue as with privacy: I'm deeply uncomfortable with the current situation, but there is no political fight for me to fight, because the law is already how I want it to be. It's the public perception that needs to change, and that is hard to influence without being rich.
Depends on what you're trying to do. Sell ads over your content? Probably not great. Sell goods? Still good for you. Become influential and spread ideas? Good for you.
Honestly, a majority of news media are just AI generated content (echo chambers). I personally feel Dead Internet Theory is true and has arrived already.
Free speculation: I could see a future where Google populates a footer on results with the website logos of the sources. Presumably, the new web will require users to memorize websites/brands and go directly to those sites if they see a lot of their results are being provided by one source.
Websites may go back to being simply labors of love.
> Websites may go back to being simply labors of love.
The situation may be even worse. Back in the labor of love era, at least webmasters could get feedback from readers. In the LLM era, readers may not even know that the site exists. Without feedback/community, the overall quality of those sites will decrease over time.
I've been mulling over your comment for a few days now.
On the old internet, "no one knows that you're a dog."
On the new internet, "no one knows (for sure) that you're an AI."
In a mixed discussion consisting of both AI and human participants, AI could generate interesting discussions and recite novel-sounding statements for ongoing ones. It doesn't sound plausible to me that an avalanche of profit-driven LLMs would even post -- I would think they'd be the lurkers. My take would be that LLMs actively participating, depending on the content they post, are also labors of love by some nerd out there trying to make the best LLM.
Drain all the profit motives and rent-seeking from the web, and even something like AI joining discussions doesn't sound scary.
> ChatGPT/Claude does this today. I barely click or care for the source when they already gave me the info I wanted.
Maybe I'm just #builtdifferent, but I click these a lot. Especially if I'm trying to research or make a decision on something, I want the actual source and not the potentially-fudged summary.
It seems like they should have a model similar to YouTube. If I watch a video on YouTube made by someone, they get a little cash, and it adds up.
Similarly, if Gemini uses a website for an answer, it should pay something to that site for the information it gathered. Sites would need to sign up to earn via Google, and I'd imagine there would be a certain threshold to cross to make it worth cutting checks... but that would make all these AI search tools feel much less scummy while providing site owners an incentive to keep sharing information on the internet.
Where a model like this would get messy is with sites like reddit. It's a very popular source for AI search, but the value comes from the users, not the platform itself.
Actually, it cannot work this way: content creators make far more money from ads in the video itself than from what YouTube pays them. If it were for YouTube money alone, we would basically still be in 2010 YouTube: folks doing it just for fun.
The problem with all this AI/LLM stuff is that end users don't even know your tiny site with a lot of useful information exists at all.
> The problem with all this AI/LLM stuff is that end users don't even know your tiny site with a lot of useful information exists at all.
This depends on implementation. I primarily use Kagi for any LLM stuff. It cites pretty much everything and links out to the source. I regularly use this for search. The normal search results may not have what I need, but a line in the AI results sounds better and I click through to the source to get more context.
I find clicking through to the source is important, as I've often seen the AI get it wrong. The page has what I need on it, but the AI grabbed the wrong thing and got it backward. I'm probably in the minority, I'm guessing most people don't use LLMs like this.
Maybe there are some exceptions, but my[0] behaviour changed a lot in the last 3 years:
- in the past Google was the entry point for everything: I was opening every single site at the top of my search and navigating through it. E.g., if the site was a Reddit or HN post I was reading comments, following links, etc.
- today I'm using Google 1 in 3 times and I mostly read the "AI Overview" section. The other 2 out of 3 times I go directly to ChatGPT, Claude or Gemini and rarely follow any links.
[0] But I see the same pattern basically in all of my colleagues, friends and relatives.
Google's AI summaries already do this. I occasionally click through to see the underlying source the AI summary leaned on to generate the response, but probably only ~20% of the time.
It's kind of like any extinction scenario though: Yeah, there will still be a big ball of rock after the nuclear apocalypse, there will still be life on the rock, and probably even still humans, but that only makes it slightly less tragic.
As far as I know, you don't have a choice. They have no obligation to respect your wishes, and LLMs are legally allowed to scrape & republish your content.
Sending them a zip bomb didn't cause harm. It was their choice to unzip it. Is jwz liable if a child sees his testicle eggcup macro when visiting via HN?
The expected purpose of websites is to spread information, so whether users get it by making a request to your website or to Google is irrelevant. In fact, if they get it from Google it's better because it reduces website load.
If instead the purpose of your website is to manipulate users for financial gain (for instance by showing media attempting to manipulate their purchasing decisions, after receiving a bribe from a vendor), and the information is just a way to lure users, then maybe this malicious business model will finally be no longer possible.
On the one hand, this is an interesting observation. The internet as it exists today is filled with product placement slop and real information is a rare commodity. The loss of these kinds of sites is a blessing.
On the other hand, Google played a big role in creating this problem in the first place. The search results have trended downwards towards this kind of SEO slop for the last 10 years and Google has been unable (or unwilling) to fix it. Plus, the AI results Google shows are not free from commercial influence and will probably only get worse in this regard. Except now this money will flow to Google instead of $random_internet_spammer. I don't know if that's any better.
The idea that Google won't eventually "manipulate users for financial gain" with Gemini is comically naive. That's how they're going to make money from this thing.
> then what exactly do I gain by allowing Googlebot to crawl my sites?
The counterargument is that sites are increasingly AI slop, or may intentionally serve poisoned content, neither of which the AI companies want to train on. There may be a cutoff date after which training data must be carefully curated, and the main body of data has already been collected.
Sites may still get traffic from agents searching for current information. Maybe even the resurgence of RSS? One can dream.
Mechanisms might exist to make you think you have one, the same way copyright should prevent millions of books being gobbled up by TheZuck, but ultimately do you really have a choice?
Yes, Google advertises its crawler IP ranges and it is quite easy to keep track of this and block them. But only if you control the infrastructure that your site runs on of course.
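To make the "keep track of this and block them" part concrete, here is a rough sketch of the matching step using Python's standard `ipaddress` module. The JSON shape (a `prefixes` list with `ipv4Prefix` / `ipv6Prefix` keys) is my assumption about how such a published range file might look, and the ranges below are reserved documentation addresses, not real crawler ranges. `load_networks` and `is_listed` are hypothetical helper names.

```python
import ipaddress
import json

# Stand-in for a published range file. The key names are an assumption and
# the CIDR blocks are documentation-only ranges (RFC 5737 / RFC 3849).
SAMPLE = json.loads("""
{"prefixes": [
  {"ipv4Prefix": "192.0.2.0/24"},
  {"ipv6Prefix": "2001:db8::/32"}
]}
""")


def load_networks(doc):
    """Turn the prefix list into ip_network objects, skipping unknown entries."""
    networks = []
    for entry in doc.get("prefixes", []):
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr:
            networks.append(ipaddress.ip_network(cidr))
    return networks


def is_listed(ip, networks):
    """Return True if ip falls inside any of the listed networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr.version == net.version and addr in net for net in networks)


NETWORKS = load_networks(SAMPLE)

if __name__ == "__main__":
    for ip in ("192.0.2.17", "198.51.100.5", "2001:db8::1"):
        print(ip, is_listed(ip, NETWORKS))
```

In practice the check would sit in whatever layer you control (firewall, reverse proxy, or application middleware), and the range list would need to be refreshed periodically since published ranges change.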
Maybe you want your ideas to spread? If your site's purpose is getting ad impressions then yea, no point. But if your purpose is to spread ideas then it is still useful.
I have spent nine years putting out free information. Surely you realise that I have to pay rent and buy food while I do it?
My income isn’t ads, just getting a cut of the sale on the complex products I help you buy. Even that sort of curation takes time and effort.
Even for all the things I do for free without any revenue whatsoever - most of it, really - I do want to feel some recognition. I don’t want the interaction to be mediated by an advertising company.
It's worse than that. They train their models preferentially on what they consider to be high-quality data. But if you look at the usual "references" on search queries, they're often just a post-hoc BS justification that links to spam blogs or TikTok videos.