To be fair, if my search engine is anything to go on, about 0.5-1% of the requests I get are from human sources. The rest are from bots, and not like people who haven't found I have an API, but bots that are attempting to poison Google or Bing's query suggestions (even though I'm not backed by either). From what I've heard from other people running search engines, it looks the same everywhere.
I don't know what Google's ratio of human to botspam is, but given how much of a payday it would be if anyone were to succeed, I can imagine they're serving their fair number of automated requests.
Requiring a headless browser to automate the traffic makes the abuse significantly more expensive.
If it's such a common issue, I would've thought Google already ignored searches from clients that do not enable JavaScript when computing results?
Besides, you already got auto-blocked when using it in a slightly unusual way. Google hasn't worked on Tor since forever, and recently I also got blocked a few times just for using it through my text browser that uses libcurl for its network stack. So I imagine a botnet using curl wouldn't last very long either.
My guess is it had more to do with squeezing out more profit from that supposed 0.1% of users.
Given that curl-impersonate[1] exists and that a major player in this space is also looking for experience with this library, I'm pretty sure forcing the execution of JS using DOM stuff would be a much more effective deterrent to prevent scraping.
"Why didn't they do it earlier?" is a fallacious argument.
If we accepted it, there would basically only be a single point in time where a change like this could be legitimately made. If the change is made before there is a large enough problem, you'll argue the change was unnecessary. If it's made after, you'll argue the change should have been made sooner.
"They've already done something else" isn't quite as logically fallacious, but shows that you don't have experience dealing with adversarial application domains.
Adversarial problems, which scraping is, are dynamic and iterative games. The attacker and defender are stuck in an endless loop of game and counterplay, unless one side gives up. There's no point in defending against attacks that aren't happening -- it's not just useless, but probably harmful, because every defense has some cost in friction to legitimate users.
> My guess is it had more to do with squeezing out more profit from that supposed 0.1% of users.
Yes, that kind of thing is very easy to just assert. But just think about it for like two seconds. How much more revenue are you going to make per user? None. Users without JS are still shown ads. JS is not necessary for ad targeting either.
It seems just as plausible that this is losing them some revenue, because some proportion of the people using the site without JS will stop using it rather than enable JS.
> "Why didn't they do it earlier?" is a fallacious argument.
I never said that, but admittedly I could have worded my argument
better: "In my opinion, shadow banning non-JS clients from result
computation would be similarly (if not more) effective at preventing SEO
bots from poisoning results, and I would be surprised if they hadn't
already done that."
Naturally, this doesn't fix the problem of having to spend resources
on serving unsuccessful SEO bots that the existing blocking mechanisms
(which I think are based on IP-address rate limiting and the UA's HTTPS
fingerprint) failed to filter out.
> Yes, that kind of thing is very easy to just assert. But just think
about it for like two seconds. How much more revenue are you going to
make per user? None. Users without JS are still shown ads. JS is not
necessary for ad targeting either.
Is JS necessary for ads? No. Does JS make it easier to control what
the user is seeing? Sure it does.
If you've been following the developments on YouTube concerning
ad-blockers, you should understand my suspicion that Search is going in
a similar direction. Of course, it's all speculation; maybe they really
just want to make sure we all get to experience the JS-based
enhancements they have been working on :)
JS is somewhat necessary for ads: it's not in any way needed for displaying them, but it is instrumental in verifying that they are actually being displayed to human beings. Ad fraud is an enormous business.
I run a semi-popular website hosting user-generated content, although it's not a search engine; the attacks on it have surprised me, and I've eventually had to put in the same kinds of restrictions on it.
I was initially very hesitant to restrict any kind of traffic, relying on rate limiting IPs on critical endpoints that needed low friction, and captchas on higher-friction, higher-intent pages such as signup and password reset.
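For readers unfamiliar with per-IP rate limiting, one common way to do it is a token bucket. The following is only a minimal sketch of that general technique, not the setup described above: the class name, the `rate` and `capacity` parameters, and the in-memory dictionary are all assumptions made for illustration.

```python
import time


class TokenBucketLimiter:
    """Per-IP token bucket (hypothetical helper, in-memory only).

    Each IP may burst up to `capacity` requests; tokens refill at
    `rate` tokens per second.
    """

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        # ip -> (tokens remaining, timestamp of last update)
        self.buckets = {}

    def allow(self, ip):
        now = self.clock()
        tokens, last = self.buckets.get(ip, (self.capacity, now))
        # Refill for the elapsed time, capped at the bucket capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.buckets[ip] = (tokens - 1, now)
            return True
        self.buckets[ip] = (tokens, now)
        return False
```

A request handler would call `allow(client_ip)` and return an HTTP 429 when it comes back `False`. A rotating proxy pool such as Tor defeats this kind of limit because each exit node gets its own bucket.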
Other than that, I was very liberal with most traffic, making sure that Tor was unblocked, and even ending up migrating off Cloudflare's free tier to a paid CDN due to inexplicable errors that users were facing over Tor that were ultimately related to how they blocked some specific requests over Tor with 403, even though the MVPs on their community forums would never acknowledge such a thing.
Unfortunately, given that Tor is a free rotating proxy, my website got attacked on one of these critical, compute heavy endpoints through multiple exit nodes totaling ~20,000 RPS. I've reluctantly had to block Tor, and a few other paid proxy services discovered through my own research since then.
Another time, a set of human spammers distributed all over the world started sending a large volume of spam to my website, something like 1,000,000 spam messages every day (I still feel this was an attack coordinated by a "competitor" of some sort, especially given a small percentage of messages entitled "I want to get paid for posting" or something along those lines).
There was no meaningful differentiator between the spammers and legitimate users: they were using real Gmail accounts to sign up, analysis of their behaviours showed they were real users as opposed to simple or even browser-based automation, and the spammers were based out of the same residential IPs as legitimate users.
I, again, had to reluctantly introduce a spam filter on some common keywords, and although some legitimate users do get trapped from time to time, this was the only way I could get a handle on that problem.
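A keyword filter of this kind can be very small. The sketch below is an illustration only: the blocklist entries and the function name are hypothetical, and a real list would be built from the spam actually observed.

```python
import re

# Hypothetical blocklist; not the actual keywords used on any real site.
SPAM_KEYWORDS = ["get paid for posting", "work from home"]

# One case-insensitive pattern matching any keyword on word boundaries.
_PATTERN = re.compile(
    "|".join(r"\b" + re.escape(k) + r"\b" for k in SPAM_KEYWORDS),
    re.IGNORECASE,
)


def looks_like_spam(message):
    """Return True if the message contains any blocklisted phrase."""
    return _PATTERN.search(message) is not None
```

The false-positive problem is visible even here: an honest question such as "Is it okay to work from home with this tool?" trips the same filter as the spam does.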
I'm appalled by some of the discussions here. Was I "enshittifying" my website out of unbridled "greed"? I don't think so. But every time I come here, I find these accusations, which makes me think that as a website with technical users, we can definitely do better.
The problem is accountability.
Imagine starting a trade show business in the physical world as an example.
One day you start getting a bunch of people come in to mess with the place. You can identify them and their organization, then promptly remove them. If they continue, there are legal ramifications.
On the web, these people can be robots that look just like real people until you spend a while studying their behavior. Worse if they’re real people being paid for sabotage.
In the real world, you arrest them and find the source. Online they can remain anonymous and protected. What recourse do we have beyond splitting the web into a “verified ID” web, and a pseudonymous analog? We can’t keep treating potential computer engagement the same as human forever. As AI agents inevitably get cheaper and harder to detect, what choice will we have?
To be honest, I don't like initiatives towards a "verified web" either, and am very scared of the effects on anonymity of stuff like Apple's PAT, Chrome's now deprecated WEI, or Cloudflare's similar efforts to that end.
Not to mention that these would just cement the position of Google and Microsoft and block off the rest of us from building alternatives to their products.
I feel that the current state of things is fine; I was eventually able to restrict most abuse in an acceptable way with few false positives. However, what I wished for was that more people would understand these tradeoffs instead of jumping to uncharitable conclusions not backed by real-world experience.
> I'm appalled by some of the discussions here. Was I "enshittifying" my website out of unbridled "greed"? I don't think so. But every time I come here, I find these accusations, which makes me think that as a website with technical users, we can definitely do better.
It's if nothing else very evident most people fundamentally don't understand what an adversarial shit show running a public web service is.
There's a certain relatively tiny audience that has congregated on HN for whom hating ads is a kind of religion and google is the great satan.
Threads like this are where they come to affirm their beliefs with fellow adherents.
Comments like yours, those that imply there might be some valid reason for a move like this (even with degrees of separation) are simply heretical. I think these people cling to an internet circa 2002ish and the solution to all problems with the modern internet is to make the internet go back to 2002.
The problem isn’t the necessary fluff that must be added, it’s how easy it becomes to keep on adding it after the necessity subsides.
Google was a more honorable company when the ads were on the right hand side only instead of tricking you in the main results. This is the enshittification people talk about: decisions with no reason other than pure profit at user expense. They were horrendously profitable when they made this dark pattern switch.
Profits today can’t be distinguished accurately between users who know it’s an ad and those who were tricked into thinking it was organic.
> Please don't post insinuations about astroturfing, shilling, brigading, foreign agents, and the like. It degrades discussion and is usually mistaken. If you're worried about abuse, email hn@ycombinator.com and we'll look at the data.
thanks, captain google - the post I’m also responding to breaks a variety of site rules, weird how these types of people always jump in to post this exact thing and not take issue with the post being replied to at all. The fact this site is in denial about this problem is irrelevant to me - luckily that’s why there’s a downvoting/upvoting system. Note that I didn’t call him a shill, I am pointing out that the way the post was written looks like how a shill would post. Hope that is helpful to you.
Such allegations are unfalsifiable and make no meaningful contribution to the discussion; they are similar to alleging someone is a "Nazi" just because they have an alternative opinion.
I simply observed how these threads always go. The faithful turn up to proudly proclaim how much they don't use google, how they cripple their browsing experience so they may remain pure, to experience the internet as it was intended before the original sin (ads) corrupted it.
20000 RPS is very little: a web app / database running on an ordinary desktop computer can process up to 10000 RPS on a bare-metal configuration after some basic optimization. If that is half of your total average load, a single co-located server should be enough to eat the entire "attack" without flinching. If you have "competitors", and I assume that this is some kind of commercial product (including running a profitable advertising-based business), you should probably have multiple geographically distributed servers and some kind of BGP-based DDoS protection.
Regarding Tor nodes — there is nothing wrong with locking them out, especially if your website isn't geo-blocked by any governments and there are no privacy concerns related to accessing it.
If, like Google, you lock out EVERYONE, even your logged in users, whose identities and payment details you have already confirmed, then... yes you are "enshittifying" or have ulterior motives.
> they were using real Gmail accounts to sign up
Using Gmail should be a red flag on its own. Google accounts can be purchased by the millions, and immediately get resold after being blocked by the target website. Same for phones. Only your own accounts / captchas / site rep can be treated as a basis of trust. A confirmation e-mail is a mere formality to have some way of contacting your human users. By the time Reddit was created it was already useless as a security measure.
RPS is a bad measure. 20k RPS is little if you're serving static files; a Raspberry Pi could probably do that. It's a lot if you're mutating a large database table with each request, which depending on the service, isn't unheard of.
> you can’t imagine what’s a compute heavy endpoint
Indeed, I can't. Because "compute heavy" isn't a meaningful description. Is it written in C++? Are results persisted anywhere? Is it behind a queue? What is the caching strategy?
Given that the original post mentions the free Cloudflare tier, there is a good chance that "compute" might mean something like "ordinary Python application, making several hundred database requests". This is also a kind of high load, but not the worst one by far.
I won't say exactly what it is, to maintain my privacy, but the compute heavy part of it is not your run-of-the-mill web traffic; rather, it performs some heavy processing of input files. This part is written in Go.
This function of the website is different from the user-generated content part of the website, where the traffic resembles that of regular dynamic websites with database reads and writes.
Maybe you could require hashcash, so that people who wanted to do automated searches could do it at an expense comparable to the expense of a human doing a search manually. Or a cryptocurrency micropayment, though tooling around that is currently poor.
The only issue with hashcash is there’s no way to know whether the user’s browser is the one that computed said proof of work, or has delegated it to a different system and is simply relaying its results. At scale, you’d end up with a large botnet that receives proof of work tokens to solve for the scraping network to use.
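For reference, the hashcash idea discussed above boils down to "find a nonce whose hash has N leading zero bits". The sketch below is hashcash-style only: it uses SHA-256 and a made-up challenge format rather than the real hashcash stamp format, and the function names and `difficulty` parameter are assumptions for illustration.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest):
    """Count the leading zero bits of a bytes digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(challenge, difficulty):
    """Client side: brute-force a nonce (expected ~2**difficulty hashes)."""
    for nonce in count():
        digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce


def verify(challenge, nonce, difficulty):
    """Server side: a single hash, regardless of difficulty."""
    digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
    return leading_zero_bits(digest) >= difficulty
```

Note that `verify` only sees the challenge and the nonce. Nothing in it identifies which machine did the work, which is exactly the delegation problem raised above.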
My impression is that there's less effort for them to go directly to headless browsers. There are several foot guns in using a raw HTML parsing lib and dispatching HTTP requests. People don't care about resource usage, spammers even less and many of them lack the skills.
Most black hat spammers use botnets, especially against bigger targets which have enough traffic to build statistics to fingerprint clients and map out bad ASNs and so on, and most botnets are low powered. You're not running chrome on a smart fridge or an enterprise router.
True, but the bad actor's code doesn't typically run directly on the infected device. Typically the infected router or camera is just acting as a proxy.
Chrome is probably the worst browser possible to run for these things, so it's not the basis for comparison.
We have many smaller browsers, that run javascript, that work on low powered devices as well.
Starting from webkit and stripping down the rendering parts just to execute JavaScript and process the DOM, the RAM usage would be significantly lower.
A major player in this space is apparently looking for people experienced in scraping without using browser automation. My guess is that not running a browser results in using far fewer resources, thus reducing their costs heavily.
Running a headless browser also means that any differences in the headless environment vs. a "headed" one can be discovered, as well as any of your JavaScript executing within the page, which makes it significantly more difficult to scale your operation.
My experience is that headless browsers use about 100x more RAM, and at least 10x more bandwidth and 10x more processing power, and page loads take about 10x as long to finish (vs curl). These numbers may even be a bit low; there are instances where you need to add another zero to one or more of them.
There's also considerably more jank with headless browsers, since you typically want to re-use instances to avoid incurring the cost of spawning a new browser for each retrieval.
Your comment is interesting and there are some people doing work on this although not specific to browser automation, e.g. AWS Lambda SnapStart is just them trying to boot your Java Lambda code and freeze the Firecracker MicroVM's snapshot and then starting other Lambda functions from there.
However, even with a VM approach, you tend to lose out on the fact that you can make 100s or 1000s of requests on a small box (~512 MB) every second if it's just restricted to HTTP(s). Once you're booting up a headless browser, you're probably restricted to loading no more than 3-4 pages per second.
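To put those figures side by side, here is a back-of-the-envelope calculation. The target rate of 1000 requests per second and the choice of 500 requests per second as a mid-range value for "100s or 1000s" are assumptions for illustration; only the 3-4 pages per second figure comes from the comment above.

```python
import math


def boxes_needed(target_rps, per_box_rps):
    """Number of small boxes required to sustain target_rps."""
    return math.ceil(target_rps / per_box_rps)


TARGET_RPS = 1000  # hypothetical scraping workload

# Assumed mid-range throughput for plain HTTP(S) clients on one box.
plain_http_boxes = boxes_needed(TARGET_RPS, 500)

# Upper end of the "3-4 pages per second" headless browser estimate.
headless_boxes = boxes_needed(TARGET_RPS, 4)

print(plain_http_boxes, headless_boxes)
```

Under these assumptions the headless fleet is two orders of magnitude larger than the plain HTTP one, which is the cost argument in a nutshell.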
On the other hand you need to be able to do basics like match the headers, sometimes request irrelevant resources, handle malformed documents, catch changing form parameters, and other gotchas. Many would just copy the request from the browser console.
I recently discovered how great the ChatGPT web search feature is. Returns live (!) results from the web and usually finds things that Google doesn't - mostly niche searches in natural language that G simply doesn't get.
Of course, it uses JavaScript, which doesn't help with the problem discussed here.
But I do think that Google is internally seeing a huge drop in usage which is why they're currently running for the money. We're going to see this all across their products soon enough (I'm thinking Gmail).
I've been experimenting with creating single-site browsers[1] for all websites I routinely visit, effectively removing navigational queries from search engines; between that and Claude being able to answer technical questions, it's remarkable how rarely I even use browsers for day-to-day tasks anymore (as in web views with tabs and url bars).
We've been using the web (as in documents interconnected with links between servers) for a great number of tasks it was never quite designed to solve, and the result has always been awkward. It's been very refreshing to move away from the web browser-search engine duo for these things.
For one, and it took me a while to notice what was off, but there are like no ads anymore, anywhere. Not because I use adblockers, but because I simply don't end up directed to places where there are ads. And let me tell you, if you've been away from that stuff for a while, and then come back, holy crap what a dumpster fire.
The web browser has been center stage for a long while, coasting on momentum and old habits, but it turns out it doesn't need to be, and if you work to get rid of it, you get a better and more enjoyable computing experience. Given how much better this feels, I can't help but feel we're in for a big shift in how computers are used.
[1] You can just launch 'chrome --app=url' to make one. Or use Electron if you want to customize the UI yourself.
While I am glad that you seem to have found a new workflow that you like, your description strikes me as a personal experience.
I am aware that a lot of people use searches as a form of navigation, but it’s also very common that people use bookmarks, speed dial, history, pinned tabs, and other browser features instead of searching.
My Firefox is configured to not do online searches when I type into the address bar, instead I get only history suggestions. This setup allows for quick navigation, and does not require any steps to set up new pages that I need to visit.
What I want to say is that while you seem to imply that you found a different pattern of use that many people will soon migrate to, I think these patterns have always been popular. People discover and make use of them as needed.
It’s also strange that you put such a negative sentiment on interconnected documents. Do you not realize how important these connections were for you to be able to reach the point you are at now? How else would you have found the things that are useful to you? By watching ads?
Search engines are also not really a good example of the strengths of the interconnected web, as they are mostly a one way thing. Consider instead a Hacker News discussion about a blog, and some other blog linking to that discussion, creating these interconnected but still separate communities and documents.
> It’s also strange that you put such a negative sentiment on interconnected documents. Do you not realize how important these connections were for you to be able to reach the point you are at now? How else would you have found the things that are useful to you? By watching ads?
This is specifically in the context of getting things done, not e.g. reading an interesting article for the enjoyment, but as an indirect means to accomplish a task.
> I've been experimenting with creating single-site browsers[1] for all websites I routinely visit, effectively removing navigational queries from search engines
The bookmark interface on modern browsers is pretty awkward to access. It's a bigger upfront effort to set up an SSB, but they significantly streamline the user experience once they're set up in a way that aligns with what you want to do.
Web Browsers have a sort of inner platform tendency where they roll their own window management, and it just gets very messy and integrates incredibly poorly with the window management of the operating system.
You can open CI in your browser to see how your build is progressing, and in the same window, with a few keypresses, check your private email and then go buy new tires for your car, file your taxes, and after that go watch some porn.
Web browsers are streamlining an undesirable type of context switching: These are all tasks from separate domains, and I don't understand why it would be desirable that all of these things are easily accessible from the same window at the same time.
Having dedicated launchers opening specialized windows allows for a sort of workspace mise-en-place that makes interacting with the computer much more focused and deliberate. Each tool has its place and function.
While I understand the utility of separating contexts and making “distractions” from the current context harder to access, doesn’t better integration into your system window management kinda defeat this separation again? Is there a significant difference between having a porn tab open and a porn window open?
It’s great if this separation works for you and your current setup, but what does prevent future you from building muscle memory to quickly switch back to porn when you want to procrastinate your taxes?
You can add friction when switching contexts using the desktop environment. This is largely impossible with browsers since they largely aren't meaningfully customizable. Opening a tab and navigating to a website is generally speaking something like 4-6 keypresses. On a desktop you can for example add more clicks by putting all your launchers in a folder structure grouped by task.
Though I actually set up different user accounts for different tasks, then only add shortcuts for the tools that are in any way relevant for the given context. This creates deliberate friction when context switching, and requires upfront intent when selecting what I do. It's not that anything is off limits per se, but all undesirable state changes are made awkward. I simply can't check my email from my programming account, or check the build status on my social media account.
If I want to go from monitoring a build on CI to e.g. paying the bills, I'd have to log out from the work account and shut everything down, then log into the business account, and open the bank SSB. This makes doing these particular tasks as easy as ever, but directionless task switching a serious pain in the ass.
As a serious computer user getting on for 25 years using text based
search tools I've long made various "single-site" tools. A big
inspiration way back was Surfraw [1], originally created by Julian
Assange. Reality is, most of us use a small number of websites
regularly. Nearly all the info I want to touch is three keystrokes
away on the command-line or from within emacs.
When search died, a few years ago practically now, I was still
teaching a level-7 Research Methods course. The universities literally
did not notice that all of the advice we gave students was totally
obsolete and that it was not really possible to conduct academic
research that way.
Research today is very much more like it was in the pre-internet era.
You need to curate and keep in mind a set of reliable sources and
personal, private collections.
Had the misfortune of needing to spend a week using a standard browser
and sites like Google. It was beyond shocking. What I found I can only
describe as a wastescape, a war zone, a bombed-out favela with burned
out cars, overflowing sewers, piles of rubble and dead dogs lying in
gutters.
My first thought was kinda, "Oh sweet Jesus Christ, what happened to my
Internet?", and the very next one was "How does anyone get anything
done now?" How does the economy still function? And of course the
answers are "They don't" and "It doesn't".
I think this is a really serious situation. There's simply no way that
as "knowledge workers", scientists, or whatever people call us now, we
can be as competitive as we were 10 or 20 years ago given the colossal
degradation of our tools. We have to stop this foolish self-deception
that things are "getting better". Google were a company that created
free search. Well done. But that was then. We remain stuck in this
strange mythology that advertising companies like Google and other
enshittified BigTech are a net asset to the economy. Surely they're a
vast parasitical drain and need digging into the ground so the rest of
us can get on with something resembling progress?
Can it find OLD articles? I generally don't like the idea of a search engine which requires me to be logged in to track my search history (and I do mostly use Google in incognito/private browser windows), but I might ignore that if it allows me to do the one thing that Google refuses to do on phones anymore (which might be a sign that they're gonna phase that out from desktop interfaces soon).
I believe the main intent is to block SERP analysers, which track result positions by keywords. Not that it would help a lot with bot abuse, but will make regular SEO agency life harder and more expensive.
Last month Google also tightened YouTube's policies, which IMHO is a sign that they are not reaching specific milestones, and that would definitely be reflected in Alphabet's stock.
They are going to make Google search even more broken than it is already? Be my guest! Since they are an ads business, I guess they don't really care about their search any longer, or they have sniffed some potential to gather even more information on users using Google, if they require running JS for it to work. Who knows. But anyone valuing their privacy has long left anyway.
> Everyone I know under 25 has stopped using Google search altogether.
completely unhinged take. Everyone I know under 25, as someone under 25, uses Google search at least an order of magnitude more than they use AI querying.
There's absolutely no need for JavaScript on a page that has a text input and two buttons and that has worked without JS for three decades. Given Google's reputation for privacy and the constant attempts at selling their users out, it's fair to assume that the reason they're requiring JavaScript is not noble.
I don't disagree with you. I use NoScript which lets me selectively enable every JS source a site has ever since marketers and advertisers have weaponized it, and you'd be surprised what you find and what works with minimal JS. If anything, it's very educational.
You miss the point. non-necessity =/= evil, but it does require a non-evil reason. JavaScript could be used on a site for some neat rendering or game where it’s necessary to do that neat thing. Without such a need, the person is inferring the change now is likely nefarious based on other actions from the same company and their motives.
I’m not necessarily agreeing with the OP, but I can understand their point without naively misconstruing it.
It's a well intentioned bolt-on for adding reactivity without reloading the page, but it's been hijacked by the ad industrial complex to keep tabs on your behavior for people who do not have your best interests in mind. That usage of it, I would say, qualifies for a weak definition of evil.
But as usual, nobody really cares because it’s also useful and convenient, even if there’s a bunch of ad crap and fingerprinting and tracking and other stuff, basically taking away more and more control over how you want to consume the contents of a site, same as DRM.
Contrast that to a static site (or a server side rendered one to a lesser degree) which is more like a newspaper - if you have it, you can read it, cut out bits that you’re interested in, stash them away for safe keeping etc.
The more nuanced answer is that most technologies aren’t inherently evil or good but it depends on how they’re used. Even then the answer still leans towards “yes”.
Whether it's evil or not is a difficult question. I'd say it's at least as bad as satan, considering we can actually confirm its existence. But that it arose naturally from this grotesque universe means it is a valid part of things. Maybe it is we who are evil and it that punishes us.
Just fyi the entire browsing and checkout process of Amazon.com works fine without JavaScript; discovering that radicalized me against so called web apps. It just takes actually reading the HTML spec and maintaining state in the querystring or via session cookie. Latency can be lower than the monstrosities people build with React in the right circumstances.
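As an illustration of keeping state in the query string, here is a minimal server-side sketch. It is not a description of how Amazon does it: the cart representation, the `/cart` path and the function names are all hypothetical. Each "add to cart" control is then just a plain link or form whose URL carries the updated state, so no client-side script is needed.

```python
from urllib.parse import parse_qs, urlencode


def cart_to_query(cart):
    """Serialize a {sku: quantity} dict; sorted so URLs are stable."""
    return urlencode(sorted(cart.items()))


def query_to_cart(query):
    """Parse the query string back into a {sku: quantity} dict."""
    return {sku: int(values[0]) for sku, values in parse_qs(query).items()}


def add_item_link(base_path, cart, sku):
    """Build the href for an 'add one more of sku' link."""
    updated = dict(cart)
    updated[sku] = updated.get(sku, 0) + 1
    return base_path + "?" + cart_to_query(updated)
```

For example, `add_item_link("/cart", {"sku42": 1}, "sku7")` yields `/cart?sku42=1&sku7=1`, and the server rebuilds the cart from that URL on the next request. A session cookie does the same job when the state is too large or too sensitive to put in a URL.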
You could probably get it working with declarative shadow dom, streaming in the AI generated content at the end of the html document and slotting it into place. There are no doubt a lot of gotchas but at first glance it seems feasible. Here’s a demo I found of something like that: https://github.com/dgp1130/out-of-order-streaming
The example repo is a little confusing to me, since it seems to use client-side JS to demonstrate that it doesn't need client-side JS: "It bootstraps a service worker and [...] No client-side JavaScript!"
But I guess the point is that the code in the service worker could have been on the server instead?
The trick seems to be using a template element with a slot and then slotting in the streamed content at the end. But you could probably also do it using just CSS to reposition the content from the bottom to the top, similarly to how many websites handle navigation menus, assuming that the client supports CSS.
Well, I read the HN headline and said to myself, I bet this requirement is pitched as "...to enhance the user experience...", and, yep, it's there.
That's akin to a response to some incident where companies "Take [user security etc.] seriously", when the immediate thought is, yeah, but if you did, that [thing] probably wouldn't have happened.
Dunno why I wrote all that - I don't use Google search, because I wanted to enhance (aka unenshitten) my search experience.
Honestly I wouldn't be surprised if Google required some proof-of-work done on the browser host's CPU/GPU before serving search results, thereby making it infeasible for bots.
That brings up an interesting conundrum. If PoW were implemented, could known-valid (i.e. goodstanding for over a decade) accounts be switched over to PoS instead? Or paying accounts?
PoW could be written into infrequent pages such as the registration page and reset password page. It could run while the user fills in the form. I might implement this on some sites that get attacked.
This gives me an idea: thanks to anti-spam mechanisms, residential proxies + headless browsers provide a better experience than regular browsing on real devices.
Instead of PoW, maybe just make the clients prove they are capable of proxying browser sessions?
To be fair, if my search engine is anything to go on, about 0.5-1% of the requests I get are from human sources. The rest are from bots, and not like people who haven't found I have an API, but bots that are attempting to poison Google or Bing's query suggestions (even though I'm not backed by either). From what I've heard from other people running search engines, it looks the same everywhere.
I don't know what Google's ratio of human to botspam is, but given how much of a payday it would be if anyone were to succeed, I can imagine they're serving their fair number of automated requests.
Requiring a headless browser to automate the traffic makes the abuse significantly more expensive.
If it's such a common issue, I would've thought Google already ignored searches from clients that do not enable JavaScript when computing results?
Besides, you already got auto-blocked when using it in a slightly unusual way. Google hasn't worked on Tor since forever, and recently I also got blocked a few times just for using it through my text browser that uses libcurl for its network stack. So I imagine a botnet using curl wouldn't last very long either.
My guess is it had more to do with squeezing out more profit from that supposed 0.1% of users.
Given that curl-impersonate[1] exists and that a major player in this space is also looking for experience with this library, I'm pretty sure forcing the execution of JS using DOM stuff would be a much more effective deterrent to prevent scraping.
[1] https://github.com/lwthiker/curl-impersonate
"Why didn't they do it earlier?" is a fallacious argument.
If we accepted it, there would basically only be a single point in time where a change like this could be legitimately made. If the change is made before there is a large enough problem, you'll argue the change was unnecessary. If it's made after, you'll argue the change should have been made sooner.
"They've already done something else" isn't quite as logically fallacious, but shows that you don't have experience dealing with adversarial application domains.
Adversarial problems, which scraping is, are dynamic and iterative games. The attacker and defender are stuck in an endless loop of play and counterplay, unless one side gives up. There's no point in defending against attacks that aren't happening -- it's not just useless, but probably harmful, because every defense has some cost in friction to legitimate users.
> My guess is it had more to do with squeezing out more profit from that supposed 0.1% of users.
Yes, that kind of thing is very easy to just assert. But just think about it for like two seconds. How much more revenue are you going to make per user? None. Users without JS are still shown ads. JS is not necessary for ad targeting either.
It seems just as plausible that this is losing them some revenue, because some proportion of the people using the site without JS will stop using it rather than enable JS.
> "Why didn't they do it earlier?" is a fallacious argument.
I never said that, but admittedly I could have worded my argument better: "In my opinion, shadow banning non-JS clients from result computation would be similarly (if not more) effective at preventing SEO bots from poisoning results, and I would be surprised if they hadn't already done that."
Naturally, this doesn't fix the problem of having to spend resources on serving unsuccessful SEO bots that the existing blocking mechanisms (which I think are based on IP-address rate limiting and the UA's HTTPS fingerprint) failed to filter out.
> Yes, that kind of thing is very easy to just assert. But just think about it for like two seconds. How much more revenue are you going to make per user? None. Users without JS are still shown ads. JS is not necessary for ad targeting either.
Is JS necessary for ads? No. Does JS make it easier to control what the user is seeing? Sure it does.
If you've been following the developments on YouTube concerning ad-blockers, you should understand my suspicion that Search is going in a similar direction. Of course, it's all speculation; maybe they really just want to make sure we all get to experience the JS-based enhancements they have been working on :)
> Is JS necessary for ads? No.
JS is somewhat necessary for ads: it's not in any way needed for displaying them, but it is instrumental in verifying that they are actually being displayed to human beings. Ad fraud is an enormous business.
I run a semi-popular website hosting user-generated content, although it's not a search engine; the attacks on it have surprised me, and I've eventually had to put in the same kinds of restrictions on it.
I was initially very hesitant to restrict any kind of traffic, relying on rate limiting IPs on critical endpoints that needed low friction, and captchas on the higher-friction, higher-intent pages such as signup and password reset.
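To make the per-IP rate limiting concrete, here is a minimal token-bucket sketch. It is not the poster's implementation; the class name, parameters, and in-memory dictionary are assumptions, and a real service would keep this state in something shared across processes.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class TokenBucketLimiter:
    rate: float   # tokens refilled per second, per IP
    burst: float  # maximum bucket size, i.e. allowed burst of requests
    _buckets: Dict[str, Tuple[float, float]] = field(default_factory=dict)

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        """Return True if this request from `ip` may proceed."""
        now = time.monotonic() if now is None else now
        # Unknown IPs start with a full bucket.
        tokens, last = self._buckets.get(ip, (self.burst, now))
        # Refill for the time elapsed since the last request, capped at burst.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._buckets[ip] = (tokens - 1.0, now)
            return True
        self._buckets[ip] = (tokens, now)
        return False
```

A limiter like this is keyed on the source address, so traffic spread across many exit nodes or proxies gets a fresh bucket per address.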
Other than that, I was very liberal with most traffic, making sure that Tor was unblocked, and even ending up migrating off Cloudflare's free tier to a paid CDN due to inexplicable errors that users were facing over Tor that were ultimately related to how they blocked some specific requests over Tor with 403, even though the MVPs on their community forums would never acknowledge such a thing.
Unfortunately, given that Tor is a free rotating proxy, my website got attacked on one of these critical, compute heavy endpoints through multiple exit nodes totaling ~20,000 RPS. I've reluctantly had to block Tor, and a few other paid proxy services discovered through my own research since then.
Another time, a set of human spammers distributed all over the world started sending a large volume of spam to my website, something like 1,000,000 spam messages every day (I still feel this was an attack coordinated by a "competitor" of some sort, especially given a small percentage of messages entitled "I want to get paid for posting" or along those lines).
There was no meaningful differentiator between the spammers and legitimate users, they were using real Gmail accounts to sign up, analysis of their behaviours showed they were real users as opposed to simple or even browser-based automation, and the spammers were based out of the same residential IPs as legitimate users.
I, again, had to reluctantly introduce a spam filter on some common keywords, and although some legitimate users do get trapped from time to time, this was the only way I could get a handle on that problem.
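A keyword filter of that kind can be as small as the sketch below. The keyword list is hypothetical (only the "get paid for posting" phrase comes from the message titles mentioned above), and the helper names are made up.

```python
import re
from typing import List


def build_filter(keywords: List[str]) -> "re.Pattern[str]":
    """Compile a case-insensitive, whole-word matcher for the given phrases."""
    if not keywords:
        raise ValueError("at least one keyword is required")
    escaped = (re.escape(k) for k in keywords)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def is_spam(message: str, pattern: "re.Pattern[str]") -> bool:
    """Flag a message if any keyword appears anywhere in it."""
    return pattern.search(message) is not None
```

The false positives are easy to see: with `"free crypto"` in the list, a legitimate question like "Is this 'free crypto' offer a scam?" gets flagged too.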
I'm appalled by some of the discussions here. Was I "enshittifying" my website out of unbridled "greed"? I don't think so. But every time I come here, I find these accusations, which makes me think that as a website with technical users, we can definitely do better.
The problem is accountability. Imagine starting a trade show business in the physical world as an example.
One day you start getting a bunch of people come in to mess with the place. You can identify them and their organization, then promptly remove them. If they continue, there are legal ramifications.
On the web, these people can be robots that look just like real people until you spend a while studying their behavior. Worse if they’re real people being paid for sabotage.
In the real world, you arrest them and find the source. Online they can remain anonymous and protected. What recourse do we have beyond splitting the web into a “verified ID” web, and a pseudonymous analog? We can’t keep treating potential computer engagement the same as human forever. As AI agents inevitably get cheaper and harder to detect, what choice will we have?
To be honest, I don't like initiatives towards a "verified web" either, and am very scared of the effects on anonymity that stuff like Apple's PAT, Chrome's now deprecated WEI or Cloudflare's similar efforts to that end are aimed at.
Not to mention that these would just cement the position of Google and Microsoft and block off the rest of us from building alternatives to their products.
I feel that the current state of things is fine; I was eventually able to restrict most abuse in an acceptable way with few false positives. However, I wish more people understood these tradeoffs instead of jumping to uncharitable conclusions not backed by real-world experience.
> I'm appalled by some of the discussions here. Was I "enshittifying" my website out of unbridled "greed"? I don't think so. But every time I come here, I find these accusations, which makes me think that as a website with technical users, we can definitely do better.
It's if nothing else very evident most people fundamentally don't understand what an adversarial shit show running a public web service is.
There's a certain relatively tiny audience that has congregated on HN for whom hating ads is a kind of religion and google is the great satan.
Threads like this are where they come to affirm their beliefs with fellow adherents.
Comments like yours, those that imply there might be some valid reason for a move like this (even with degrees of separation) are simply heretical. I think these people cling to an internet circa 2002ish and the solution to all problems with the modern internet is to make the internet go back to 2002.
The problem isn’t the necessary fluff that must be added, it’s how easy it becomes to keep on adding it after the necessity subsides.
Google was a more honorable company when the ads were on the right hand side only instead of tricking you in the main results. This is the enshittification people talk about: decisions made for no reason other than pure profit at the user's expense. They were horrendously profitable when they made this dark pattern switch.
Profits today can’t be distinguished accurately between users who know it’s an ad and those who were tricked into thinking it was organic.
Not all enshittification is equal.
[dead]
[flagged]
https://news.ycombinator.com/newsguidelines.html
> Please don't post insinuations about astroturfing, shilling, brigading, foreign agents, and the like. It degrades discussion and is usually mistaken. If you're worried about abuse, email hn@ycombinator.com and we'll look at the data.
thanks, captain google - the post I’m also responding to breaks a variety of site rules, weird how these types of people always jump in to post this exact thing and not take issue with the post being replied to at all. The fact this site is in denial about this problem is irrelevant to me - luckily that’s why there’s a downvoting/upvoting system. Note that I didn’t call him a shill, I am pointing out that the way the post was written looks like how a shill would post. Hope that is helpful to you.
Such allegations are unfalsifiable and contribute nothing meaningful to the discussion; they are similar to alleging someone is a "Nazi" just because they have an alternative opinion.
Shilling for what?
I simply observed how these threads always go. The faithful turn up to proudly proclaim how much they don't use google, how they cripple their browsing experience so they may remain pure, to experience the internet as it was intended before the original sin (ads) corrupted it.
20000 RPS is very little: a web app / database running on an ordinary desktop computer can process up to 10000 RPS on a bare-metal configuration after some basic optimization. If that is half of your total average load, a single co-located server should be enough to eat the entire "attack" without flinching. If you have "competitors", and I assume that this is some kind of commercial product (including a profitable advertising-based business), you should probably have multiple geographically distributed servers and some kind of BGP-based DDoS protection.
Regarding Tor nodes — there is nothing wrong with locking them out, especially if your website isn't geo-blocked by any governments and there are no privacy concerns related to accessing it.
If, like Google, you lock out EVERYONE, even your logged in users, whose identities and payment details you have already confirmed, then... yes you are "enshittifying" or have ulterior motives.
> they were using real Gmail accounts to sign up
Using Gmail should be a red flag on its own. Google accounts can be purchased by the million, and immediately get resold after being blocked by the target website. Same for phones. Only your own accounts / captchas / site rep can be treated as a basis of trust. A confirmation e-mail is a mere formality to have some way of contacting your human users. By the time Reddit was created it was already useless as a security measure.
RPS is a bad measure. 20k RPS is a little if you're serving static files, a raspberry pi could probably do that. It's a lot if you're mutating a large database table with each request, which depending on the service, isn't unheard of.
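A back-of-the-envelope way to see why RPS alone says little: the CPU needed is roughly requests per second times CPU time per request. The per-request costs below are illustrative assumptions, not measurements from anyone's site.

```python
import math


def cores_needed(rps: int, cpu_microseconds_per_request: int) -> int:
    """Cores kept fully busy by `rps` requests that each burn the given CPU time."""
    busy_core_seconds_per_second = rps * cpu_microseconds_per_request / 1_000_000
    return math.ceil(busy_core_seconds_per_second)


# Assumed costs: 50 microseconds for a cached static file,
# 50 milliseconds for a request that does heavy processing.
static_cores = cores_needed(20_000, 50)
heavy_cores = cores_needed(20_000, 50_000)
```

Under those assumptions the same 20,000 RPS needs one core in the static case and a thousand in the heavy one.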
This comment is so out of touch I’m almost speechless.
> > critical, compute heavy endpoints through multiple exit nodes totaling ~20,000 RPS
> 20000 RPS is very little
If I had to guess you’ve never hosted non-static websites so you can’t imagine what’s a compute heavy endpoint.
> Using Gmail should be a red flag on its own.
Yes, ban users signing up with Gmail then.
And this is not an isolated case, discussions on DDoS, CAPTCHAs, etc. here always have these out of touch people coming out of the woodwork. Baffling.
> you can’t imagine what’s a compute heavy endpoint
Indeed, I can't. Because "compute heavy" isn't a meaningful description. Is it written in C++? Are results persisted anywhere? Is it behind a queue? What is the caching strategy?
Given that the original post mentions Cloudflare's free tier, there is a good chance that "compute" means something like "an ordinary Python application making several hundred database requests". This is also a kind of high load, but not the worst one by far.
I won't say exactly what it is, to maintain my privacy, but the compute-heavy part is not your run-of-the-mill web traffic; rather, it performs some heavy processing of input files. This part is written in Go.
This function of the website is different from the user-generated content part of the website, where the traffic resembles that of regular dynamic websites with database reads and writes.
I run a not-very-popular site -- at least 50% of the traffic is bots. I can only imagine how bad it would be if the site was a forum or search engine.
Maybe you could require hashcash, so that people who wanted to do automated searches could do it at an expense comparable to the expense of a human doing a search manually. Or a cryptocurrency micropayment, though tooling around that is currently poor.
The only issue with a hash cash is there’s no way to know whether the user’s browser is the one who computed said proof of work, or has delegated it to a different system and is simply relaying its results. At scale, you’d end up with a large botnet that receives proof of work tokens to solve for the scraping network to use.
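One partial mitigation, sketched under assumptions: make the PoW challenge a signed, short-lived token tied to the requesting address, so solved tokens can't be stockpiled or resold for long. This does not solve the delegation problem described above (a relay can still farm out the work and submit from the same address); it only shrinks the window. `SECRET`, the token layout, and the function names are hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET = b"replace-with-a-real-server-secret"  # hypothetical placeholder


def _sign(payload: str) -> str:
    return hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()


def issue(client_ip: str, now: Optional[float] = None) -> str:
    """Create a challenge token of the form ip|timestamp|signature."""
    ts = int(time.time() if now is None else now)
    payload = f"{client_ip}|{ts}"
    return f"{payload}|{_sign(payload)}"


def check(token: str, client_ip: str, max_age: int = 120,
          now: Optional[float] = None) -> bool:
    """Accept only untampered, fresh tokens presented from the issuing IP."""
    try:
        ip, ts, sig = token.rsplit("|", 2)
        issued = int(ts)
    except ValueError:
        return False
    now = time.time() if now is None else now
    signature_ok = hmac.compare_digest(sig.encode(), _sign(f"{ip}|{ts}").encode())
    return signature_ok and ip == client_ip and 0 <= now - issued <= max_age
```

The server would run `check` before bothering to verify the proof of work itself, and would still need a used-token store to stop replays inside the validity window.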
My impression is that there's less effort for them to go directly to headless browsers. There are several foot guns in using a raw HTML parsing lib and dispatching HTTP requests. People don't care about resource usage, spammers even less and many of them lack the skills.
Most black hat spammers use botnets, especially against bigger targets which have enough traffic to build statistics to fingerprint clients and map out bad ASNs and so on, and most botnets are low powered. You're not running chrome on a smart fridge or an enterprise router.
True, but the bad actor's code doesn't typically run directly on the infected device. Typically the infected router or camera is just acting as a proxy.
There are ways to detect that and it will still require a lot of CPU and ram behind the proxies.
Chrome is probably the worst browser possible to run for these things, so it's not the basis for comparison.
We have many smaller browsers, that run javascript, that work on low powered devices as well.
Starting from webkit and stripping down the rendering parts just to execute JavaScript and process the DOM, the RAM usage would be significantly lower.
A major player in this space is apparently looking for people experienced in scraping without using browser automation. My guess is that not running a browser results in using far fewer resources, thus reducing their costs heavily.
Running a headless browser also means that any differences between the headless environment and a "headed" one can be discovered, as can any of your JavaScript executing within the page, which makes it significantly more difficult to scale your operation.
My experience is that headless browsers use about 100x more RAM, and at least 10x more bandwidth and 10x more processing power, and page loads take about 10x as long to finish (vs curl). These numbers may even be a bit low; there are instances where you need to add another zero to one or more of them.
There's also considerably more jank with headless browsers, since you typically want to re-use instances to avoid incurring the cost of spawning a new browser for each retrieval.
Is it possible to pause a VM just after the browser has started up? Then map it as copy-on-write memory and spin up many VMs from that "image".
Your comment is interesting and there are some people doing work on this although not specific to browser automation, e.g. AWS Lambda SnapStart is just them trying to boot your Java Lambda code and freeze the Firecracker MicroVM's snapshot and then starting other Lambda functions from there.
However, even with a VM approach, you tend to lose out on the fact that you can make 100s or 1000s of requests on a small box (~512 MB) every second if it's just restricted to HTTP(s). However, once you're booting up a headless browser, you're probably restricted to no more than loading 3-4 pages per second.
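Plugging those rough figures into a quick calculation shows the gap. The target throughput and the per-box rates are assumptions picked from the ranges mentioned above (500 plain HTTP requests per second per small box, 4 headless page loads per second).

```python
import math


def boxes_needed(target_pages_per_second: int, pages_per_second_per_box: int) -> int:
    """Number of identical boxes required to sustain the target throughput."""
    return math.ceil(target_pages_per_second / pages_per_second_per_box)


TARGET = 1_000  # assumed scraping throughput, pages per second

plain_http_boxes = boxes_needed(TARGET, 500)
headless_boxes = boxes_needed(TARGET, 4)
```

With these inputs that is 2 boxes versus 250, before accounting for the much larger memory footprint each headless box would need.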
... but then you have even larger overhead, as well as the added layer of complexity from managing VMs on top of headless browsers.
On the other hand you need to be able to do basics like match the headers, sometimes request irrelevant resources, handle malformed documents, catch changing form parameters, and other gotchas. Many would just copy the request from the browser console.
The change rate for Chromium is also so high that it's hard to spot the addition of code targeting whatever you are doing on the client side.
so much more expensive and slow vs just scraping the html. It is not hard to scrape raw html if the target is well-defined (like google).
> bots that are attempting to poison Google or Bing's query suggestions
This seems like yet another example of Google and friends inviting the problem they're objecting to.
Just tested (ignoring AI search engines, non-english, non-free):
Search engines which require JavaScript:
Google, Bing, Ecosia, Yandex, Qwant, Gibiru, Presearch, Seekr, Swisscows, Yep, Openverse, Dogpile, Waldo
Search engines which do not require JavaScript:
DuckDuckGo, Yahoo Search, Brave Search, Startpage, AOL Search, giveWater, Mojeek
Kagi.com works without JS
Have just updated my text: "ignoring non-free" :-)
I've put off learning JavaScript for over 20 years, now I'm not going to be able to search for anything
What's next? Not working for an adtech company?
You can use DuckDuckGo without Javascript.
What I find amusing is that this is Google. It's their bots, and now LLMs as well, that have hammered people's websites for years.
Have they hammered people's websites? I find that the Google bot makes as few requests as it can, and it respects robots.txt.
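For reference, this is roughly what "respecting robots.txt" means on the crawler side, using Python's standard `urllib.robotparser`. The rules and the `ExampleBot` / `SomeCrawler` names are invented for the example; this says nothing about how Google's crawler is implemented.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt: everyone is kept out of /private/,
# and one specific bot is kept out of the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: ExampleBot
Disallow: /
"""


def make_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = make_parser(ROBOTS_TXT)

# A polite crawler asks before every fetch.
ok_public = parser.can_fetch("SomeCrawler", "https://example.com/index.html")
ok_private = parser.can_fetch("SomeCrawler", "https://example.com/private/x")
ok_banned_bot = parser.can_fetch("ExampleBot", "https://example.com/index.html")
```

Nothing enforces any of this on the server; it only works for crawlers that choose to check.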
I recently discovered how great the ChatGPT web search feature is. Returns live (!) results from the web and usually finds things that Google doesn't - mostly niche searches in natural language that G simply doesn't get.
Of course, it uses JavaScript, which doesn't help with the problem discussed here.
But I do think that Google is internally seeing a huge drop in usage which is why they're currently running for the money. We're going to see this all across their products soon enough (I'm thinking Gmail).
I've been experimenting with creating single-site browsers[1] for all websites I routinely visit, effectively removing navigational queries from search engines; between that and Claude being able to answer technical questions, it's remarkable how rarely I even use browsers for day-to-day tasks anymore (as in web views with tabs and url bars).
We've been using the web (as in documents interconnected with links between servers) for a great number of tasks it was never quite designed to solve, and the result has always been awkward. It's been very refreshing to move away from the web browser-search engine duo for these things.
For one, and it took me a while to notice what was off, but there are like no ads anymore, anywhere. Not because I use adblockers, but because I simply don't end up directed to places where there are ads. And let me tell you, if you've been away from that stuff for a while, and then come back, holy crap what a dumpster fire.
The web browser has been center stage for a long while, coasting on momentum and old habits, but it turns out it doesn't need to be, and if you work to get rid of it, you get a better and more enjoyable computing experience. Given how much better this feels, I can't help but feel we're in for a big shift in how computers are used.
[1] You can just launch 'chrome --app=url' to make one. Or use Electron if you want to customize the UI yourself.
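A small launcher sketch for that footnote. The `--app=` flag is the one mentioned above; giving each site its own `--user-data-dir` profile is an extra assumption on my part, and the executable name (`chrome` here) varies by platform and install.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def ssb_command(url: str, profile_dir: Path, browser: str = "chrome") -> List[str]:
    """Build the argv for a single-site browser window with its own profile."""
    return [browser, f"--app={url}", f"--user-data-dir={profile_dir}"]


def launch(url: str, profile_dir: Path, browser: str = "chrome") -> subprocess.Popen:
    """Start the window; fails early if the browser binary is not on PATH."""
    if shutil.which(browser) is None:
        raise FileNotFoundError(f"{browser!r} not found on PATH")
    return subprocess.Popen(ssb_command(url, profile_dir, browser))
```

For example, `launch("https://ci.example.com", Path.home() / ".ssb" / "ci")` would open a hypothetical CI dashboard in its own chromeless window.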
While I am glad that you seem to have found a new workflow that you like, your description strikes me as a personal experience.
I am aware that a lot of people use searches as a form of navigation, but it’s also very common that people use bookmarks, speed dial, history, pinned tabs, and other browser features instead of searching. My Firefox is configured to not do online searches when I type into the address bar, instead I get only history suggestions. This setup allows for quick navigation, and does not require any steps to set up new pages that I need to visit.
What I want to say is that, while you seem to imply that you found a different pattern of use that many people will soon migrate to, I think these patterns have always been popular. People discover and make use of them as needed.
It’s also strange that you put such a negative sentiment on interconnected documents. Do you not realize how important these connections were for you to be able to reach the point you are at now? How else would you have found the things that are useful to you? By watching ads?
Search engines are also … not really a good example of the strengths of the interconnected web, as they are mostly a one way thing. Consider instead a Hacker News discussion about a blog, and some other blog linking to that discussion, creating these interconnected but still separate communities and documents.
> It’s also strange that you put such a negative sentiment on interconnected documents. Do you not realize how important these connections were for you to be able to reach the point you are at now? How else would you have found the things that are useful to you? By watching ads?
This is specifically in the context of getting things done, not e.g. reading an interesting article for the enjoyment, but as an indirect means to accomplish a task.
> I've been experimenting with creating single-site browsers[1] for all websites I routinely visit, effectively removing navigational queries from search engines
Surely it would make more sense to use bookmarks?
The bookmark interface on modern browsers is pretty awkward to access. It's a bigger upfront effort to set up an SSB, but they significantly streamline the user experience once they're set up in a way that aligns with what you want to do.
Web Browsers have a sort of inner platform tendency where they roll their own window management, and it just gets very messy and integrates incredibly poorly with the window management of the operating system.
You can open CI in your browser to see how your build is progressing, and in the same window, with a few keypresses, check your private email and then go buy new tires for your car, file your taxes, and after that go watch some porn.
Web browsers are streamlining an undesirable type of context switching: These are all tasks from separate domains, and I don't understand why it would be desirable that all of these things are easily accessible from the same window at the same time.
Having dedicated launchers opening specialized windows allows for a sort of workspace mise-en-place that makes interacting with the computer much more focused and deliberate. Each tool has its place and function.
While I understand the utility of separating contexts and making “distractions” from the current context harder to access, doesn’t better integration into your system's window management kinda defeat this separation again? Is there a significant difference between having a porn tab open and a porn window open?
It’s great if this separation works for you and your current setup, but what does prevent future you from building muscle memory to quickly switch back to porn when you want to procrastinate your taxes?
You can add friction when switching contexts using the desktop environment. This is largely impossible with browsers since they largely aren't meaningfully customizable. Opening a tab and navigating to a website is generally speaking something like 4-6 keypresses. On a desktop you can for example add more clicks by putting all your launchers in a folder structure grouped by task.
Though I actually set up different user accounts for different tasks, then only add shortcuts for the tools that are in any way relevant for the given context. This creates deliberate friction when context switching, and requires upfront intent when selecting what I do. It's not that anything is off limits per se, but all undesirable state changes are made awkward. I simply can't check my email from my programming account, or check the build status on my social media account.
If I want to go from monitoring a build on CI to e.g. paying the bills, I'd have to log out from the work account and shut everything down, then log into the business account, and open the bank SSB. This makes doing these particular tasks as easy as ever, but directionless task switching a serious pain in the ass.
As a serious computer user getting on for 25 years using text based search tools I've long made various "single-site" tools. A big inspiration way back was Surfraw [1], originally created by Julian Assange. Reality is, most of us use a small number of websites regularly. Nearly all the info I want to touch is three keystrokes away on the command-line or from within emacs.
When search died, a few years ago practically now, I was still teaching a level-7 Research Methods course. The universities literally did not notice that all of the advice we gave students was totally obsolete and that it was not really possible to conduct academic research that way.
Research today is very much more like it was in the pre-internet era. You need to curate and keep in mind a set of reliable sources and personal, private collections.
Had the misfortune of needing to spend a week using a standard browser and sites like Google. It was beyond shocking. What I found I can only describe as a wastescape, a war zone, a bombed-out favela with burned out cars, overflowing sewers, piles of rubble and dead dogs lying in gutters.
My first thought was kinda, "Oh sweet Jesus Christ, what happened to my Internet?", and the very next one was "How does anyone get anything done now?" How does the economy still function? And of course the answers are "They don't" and "It doesn't".
I think this is a really serious situation. There's simply no way that as "knowledge workers", scientists, or whatever people call us now, we can be as competitive as we were 10 or 20 years ago given the colossal degradation of our tools. We have to stop this foolish self-deception that things are "getting better". Google were a company that created free search. Well done. But that was then. We remain stuck in this strange mythology that advertising companies like Google and other enshitified BigTech are a net asset to the economy. Surely they're a vast parasitical drain and need digging into the ground so the rest of us can get on with something resembling progress?
[1] http://surfraw.org/
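In the spirit of those command-line "single-site" tools, here is a tiny sketch of the idea: a short keyword picks a site, and the remaining words become the query. This is not Surfraw itself; the shortcut names and the `main` wrapper are assumptions, and the URL patterns are the public search endpoints of those sites as I understand them.

```python
import webbrowser
from typing import Dict, List
from urllib.parse import quote_plus

# Shortcut -> search URL template (the {} receives the encoded query).
ENGINES: Dict[str, str] = {
    "w": "https://en.wikipedia.org/w/index.php?search={}",
    "hn": "https://hn.algolia.com/?q={}",
    "ddg": "https://duckduckgo.com/html/?q={}",
}


def search_url(engine: str, terms: List[str]) -> str:
    """Build the search URL for a shortcut and a list of query words."""
    return ENGINES[engine].format(quote_plus(" ".join(terms)))


def main(argv: List[str]) -> None:
    """Usage: main(["w", "declarative", "shadow", "DOM"]) opens the result page."""
    webbrowser.open(search_url(argv[0], argv[1:]))
```

Wired up to `sys.argv[1:]` and a two-letter shell alias, each site is a few keystrokes away without going through a general search engine.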
can it find OLD articles? I generally don't like the idea of a search engine which requires me to be logged in to track my search history (and I do mostly use Google in incognito/private browser windows), but I might ignore that if it allows me to do the one thing that Google refuses to do on phones anymore (which might be a sign that they're gonna phase that out from desktop interfaces soon)..
I believe the main intent is to block SERP analysers, which track result positions by keywords. Not that it would help a lot with bot abuse, but will make regular SEO agency life harder and more expensive.
Last month Google also tightened YouTube policies, which IMHO is a sign that they are not reaching specific milestones, and that would definitely be reflected in Alphabet's stock.
Previous discussion: Google.com search now refusing to search for FF esr 128 without JavaScript (2025-01-16, 92 points), https://news.ycombinator.com/item?id=42719865
They are going to make Google search even more broken than it is already? Be my guest! Since they are an ads business, I guess they don't really care about their search any longer, or they have sniffed some potential to gather even more information on users using Google, if they require running JS for it to work. Who knows. But anyone valuing their privacy has long left anyway.
Almost everyone I know has moved a lot of their searching onto ChatGPT or WhatsApp AI querying.
Everyone I know under 25 has stopped using Google search altogether.
I think the only people disabling JavaScript must be GenX graybeards such as myself or security experts.
> Everyone I know under 25 has stopped using Google search altogether.
completely unhinged take. Everyone I know under 25, as someone under 25, uses Google search at least an order of magnitude more than they use AI querying.
Everyone I know under 25 hasn’t heard of chatgpt.
How is that possible?
Related discussion as linked in article: ("users on social media")
Google.com search now refusing to search for FF esr 128 without JavaScript
https://news.ycombinator.com/item?id=42719865
It does not even work with javascript enabled! Always asking for some cookies permissions, captcha, Gmail login...
…and all the results are ads and seo blogspam.
Don't be evil.
Is JavaScript now evil?
There's absolutely no need for JavaScript on a page that has a text input and two buttons and that has worked without JS for three decades. Given Google's reputation for privacy and the constant attempts at selling their users out, it's fair to assume that the reason they're requiring JavaScript is not noble.
> There's absolutely no need for JavaScript on a page that has a text input and two buttons
The whole web is evil then. Hacker news has JavaScript for simple upvote buttons, is it also evil?
HN is usable w/o JavaScript. It doesn't block my access because I choose not to allow it to execute arbitrary code on my computer.
* execute arbitrary code in one of the best studied sandboxes on the planet, which happens to be running on your computer.
… and is an extension of the largest surveillance apparatus ever built by mankind.
> which happens to be running on your computer.
not if you turn it off
Voting works for HN without js. Just forces a page refresh.
I don't disagree with you. I use NoScript which lets me selectively enable every JS source a site has ever since marketers and advertisers have weaponized it, and you'd be surprised what you find and what works with minimal JS. If anything, it's very educational.
You miss the point. non-necessity =/= evil, but it does require a non-evil reason. JavaScript could be used on a site for some neat rendering or game where it’s necessary to do that neat thing. Without such a need, the person is inferring the change now is likely nefarious based on other actions from the same company and their motives.
I’m not necessarily agreeing with the OP, but I can understand their point without naively misconstruing it.
Okay. But is it evil?
It's a well-intentioned bolt-on for adding reactivity without reloading the page, but it's been hijacked by the ad-industrial complex to keep tabs on your behavior for people who do not have your best interests in mind. That usage of it, I would say, qualifies for a weak definition of evil.
For argument's sake, yes, unironically: https://www.gnu.org/philosophy/javascript-trap.html
In contrast, this is less evil: https://www.gnu.org/software/librejs/
But as usual, nobody really cares because it’s also useful and convenient, even if there’s a bunch of ad crap and fingerprinting and tracking and other stuff, basically taking away more and more control over how you want to consume the contents of a site, same as DRM.
Contrast that to a static site (or a server side rendered one to a lesser degree) which is more like a newspaper - if you have it, you can read it, cut out bits that you’re interested in, stash them away for safe keeping etc.
The more nuanced answer is that most technologies aren’t inherently evil or good but it depends on how they’re used. Even then the answer still leans towards “yes”.
Yes
Whether it's evil or not is a difficult question. I'd say it's at least as bad as satan, considering we can actually confirm its existence. But that it arose naturally from this grotesque universe means it is a valid part of things. Maybe it is we who are evil and it that punishes us.
Javascript is like Flash-lite. Is it evil? No. It's great, even.
What every last commercial site uses it for IS evil, without a doubt.
It's literally almost an anagram of Satan's Armpits.
It uses a lot of data, it is a security risk, it is a privacy risk, and it forces you to throw away your old devices.
how much extra data does JS on google use vs without JS? We must be talking about kb that are probably also cached.
On the average site JS will add anywhere from hundreds of KB to over a MB of data to download: https://httparchive.org/reports/state-of-javascript#bytesJs
Especially when it’s foundational to how the site runs, e.g. the typical way how Vue, React or Angular are used.
I personally use a free (and small) mobile data plan, so that several MB/search would make me pay infinite times more. I switched to DDG.
No, limiting user freedom as to whether or not to use it is.
It is in Google’s hands
Javascript has always been evil.
Yes.
Yeah, roughly since 1996.
Always has been.
If you had to estimate, what percentage of the web could exist without JavaScript and without losing functionality?
Just FYI, the entire browsing and checkout process of Amazon.com works fine without JavaScript. Discovering that radicalized me against so-called web apps. It just takes actually reading the HTML spec and maintaining state in the query string or via a session cookie. Latency can be lower than the monstrosities people build with React in the right circumstances.
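To make the "state in the querystring" idea concrete, here is a minimal Python sketch using only the standard library. It is not how Amazon does it; the catalog, the `/cart` path, and the function names are all made up for illustration. The whole cart lives in repeated `item=` parameters, and "add to cart" is an ordinary link, so no client-side script is involved.

```python
from html import escape
from urllib.parse import parse_qs, urlencode

# Hypothetical catalog, purely for illustration.
CATALOG = {"book": "Paperback book", "lamp": "Desk lamp"}


def parse_cart(query_string: str) -> list[str]:
    """Read the cart from repeated `item=` parameters, dropping unknown ids."""
    return [i for i in parse_qs(query_string).get("item", []) if i in CATALOG]


def add_link(cart: list[str], item_id: str) -> str:
    """Build a plain link whose query string is the current cart plus one item."""
    return "/cart?" + urlencode([("item", i) for i in cart + [item_id]])


def render_cart_page(query_string: str) -> str:
    """Render the cart and the 'add' links as static HTML, no script needed."""
    cart = parse_cart(query_string)
    rows = "".join(f"<li>{escape(CATALOG[i])}</li>" for i in cart)
    links = "".join(
        f'<a href="{escape(add_link(cart, i))}">Add {escape(name)}</a> '
        for i, name in CATALOG.items()
    )
    return f"<ul>{rows}</ul><p>{links}</p>"
```

Every click is a full page load, but the server stays stateless. A session cookie would do the same job if the state got too big for a URL.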
My phones have had JavaScript off by default for years. I'm amazed by how many sites work fine and are pleasurable to use.
If you had to estimate the percentage of the malicious web without JavaScript.
I only know of the JavaScript/Spectre combo. Are there tales of other evils lurking in the great wild?
I browse with JavaScript off by default on my phone. Guess I'm going to DuckDuckGo now.
You can also ditch Chrome by switching to DuckDuckGo browser.
> quality of search results
A.k.a ads
How else are you going to load a hideously incorrect AI summary block without your initial page latency being through the roof?
You could probably get it working with declarative shadow dom, streaming in the AI generated content at the end of the html document and slotting it into place. There are no doubt a lot of gotchas but at first glance it seems feasible. Here’s a demo I found of something like that: https://github.com/dgp1130/out-of-order-streaming
The example repo is a little confusing to me, since it seems to use client-side JS to demonstrate that it doesn't need client-side JS: "It bootstraps a service worker and [...] No client-side JavaScript!"
But I guess the point is that the code in the service worker could have been on the server instead?
The trick seems to be using a template element with a slot and then slotting in the streamed content at the end. But you could probably also do it using just CSS to reposition the content from the bottom to the top, similarly to how many websites handle navigation menus, assuming that the client supports CSS.
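For what it's worth, the template-plus-slot trick can be sketched server-side in a few lines of Python. This is only my reading of the technique, not code from the linked repo: the function name, element ids, and placeholder text are assumptions. The shadow tree is sent first and fixes the visual order, the fast results follow, and the slow block is sent last but lands in the named slot at the top.

```python
from html import escape
from typing import Iterator


def stream_page(results: list[str], slow_summary: str) -> Iterator[str]:
    """Yield HTML chunks in the order they would be flushed to the client."""
    # 1. Shadow tree first: the named slot sits above the default slot,
    #    so the summary will render above the results.
    yield (
        '<!doctype html><div id="page">'
        '<template shadowrootmode="open">'
        '<slot name="summary">Loading summary...</slot>'
        "<slot></slot>"
        "</template>"
    )
    # 2. Fast content goes out right away and fills the default slot.
    yield "<ol>" + "".join(f"<li>{escape(r)}</li>" for r in results) + "</ol>"
    # 3. The slow block arrives last in the byte stream but is slotted on top.
    #    (A real server would wait for the slow backend before this yield.)
    yield f'<p slot="summary">{escape(slow_summary)}</p>'
    yield "</div>"
```

Any server that can flush a chunked response could send these pieces as they become ready. Browsers without declarative shadow DOM support would just show the content in source order.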
Iframes lazy
Object content as lazy
Embed lazy
Image lazy
Link rel=import (not supported that widely though)
Heck, if you wanted to get REALLY cute you could use multipart/x-mixed-replace headers.
Or SSE
now that i think about it you could do it quite nicely with svg's and foreignObject
iframes?
this is the only time in 15 years i've wished hn had a lol react. :D
Why? Iframes are very much underappreciated.
Kagi it is, then
$10/mo is way too expensive
It saves me more than that most months. This month, a single superior query result saved me four hours of driving and $100.
For you?
For me, it's worth every cent.
Is kagi that good?
You can find the endless reviews:
https://hn.algolia.com/?query=kagi&sort=byPopularity
Possibly one of the most reviewed SaaS companies here.
And yes, it's pretty great. But it's just Search (with some AI tossed in).
Incidentally, Kagi works with JavaScript disabled.
I tried it, it’s fine but I prefer the mix of uBlacklist and Bing with a duck branding and ChatGPT.
Yes
Well, I read the HN headline and said to myself, I bet this requirement is pitched as "...to enhance the user experience...", and, yep, it's there.
That's akin to a response to some incident where companies "Take [user security etc.] seriously", when the immediate thought is, yeah, but if you did, that [thing] probably wouldn't have happened.
Dunno why I wrote all that - I don't use Google search, because I wanted to enhance (aka unenshitten) my search experience.
Honestly, I wouldn't be surprised if Google required some proof-of-work done on the browser host's CPU/GPU to validate search results, making it infeasible for bots.
That brings up an interesting conundrum. If PoW were implemented, could known-valid (i.e. in good standing for over a decade) accounts be switched over to PoS instead? Or paying accounts?
PoW could be written into infrequent pages such as the registration page and reset password page. It could run while the user fills in the form. I might implement this on some sites that get attacked.
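A hashcash-style version of that is small enough to sketch. This is a generic scheme, not anything Google or a particular site is known to use, and the function names and the `challenge:nonce` format are made up. The server hands out a random challenge with the form, the client burns CPU finding a nonce, and the server checks it with a single hash. In a browser the `solve` part would be JavaScript running while the form is filled in; Python here just shows the logic.

```python
import hashlib
import secrets


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def new_challenge() -> str:
    """Server side: a random, single-use challenge embedded in the form."""
    return secrets.token_hex(16)


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one hash is enough to check the client's work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce; expected cost is about 2**difficulty hashes."""
    nonce = 0
    while not verify(challenge, nonce, difficulty):
        nonce += 1
    return nonce
```

Each extra bit of difficulty doubles the expected client work while verification stays a single hash. The server would also need to expire challenges and reject reused ones, which is left out here.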
This gives me an idea: thanks to anti-spam mechanisms, residential proxies + headless browsers provide a better experience than regular browsing on real devices.
Instead of PoW, maybe just make the clients prove they are capable of proxying browser sessions?
Last days of Rome.
Yet another reason to stop supporting Google with your clicks. Remember when their motto was “Don’t be evil”?