

Breaking an NDA (allegedly) is civil, not criminal
If you’re deliberately belittling me I won’t engage. Goodbye.
“You criticize society yet you participate in it. Curious.”
To be clear, I am not minimizing the problems of scrapers. I am merely pointing out that this strategy of proof-of-work has nasty side effects and we need something better.
These issues are not short term. PoW means you are entering into an arms race against an adversary with bottomless pockets that inherently requires a ton of useless computations in the browser.
Moving towards something based on heuristics, which is what the developer was talking about there, would be much better. But that is basically what many others are already doing (like the “I am not a robot” checkmark), and it is fundamentally different from the PoW that I argue against.
Go do heuristics, not PoW.
It depends on the website’s setting. I have the same phone and there was one website where it took more than 20 seconds.
The power consumption is significant, because it needs to be. That is the entire point of this design. If solving a challenge doesn’t take a significant number of CPU cycles, scrapers will just power through it. This may not be significant for an individual user, but it does add up when this reaches widespread adoption and everyone’s devices have to solve those challenges.
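To make the “useless computations” point concrete, this is roughly the shape of such a hashcash-style challenge. A minimal sketch, not Anubis’s actual code; the challenge string and difficulty are made up:

```python
# Minimal hashcash-style proof of work: find a nonce whose SHA-256 digest
# has `difficulty_bits` leading zero bits. Expected work doubles with every
# extra bit, and every legitimate visitor has to burn these cycles too.
import hashlib

def solve(challenge: bytes, difficulty_bits: int) -> int:
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce
        nonce += 1

# ~2^20 (about a million) hash attempts on average at 20 bits of difficulty.
print(solve(b"example-challenge", 20))
```

Raising the difficulty by one bit doubles the average solve time for everyone, which is why the slow-phone experiences above track whatever the admin sets.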
It is basically instantaneous on my 12-year-old Kepler GPU Linux box.
It depends on what the website admin sets, but I’ve had checks take more than 20 seconds on my reasonably modern phone. And as scrapers get more ruthless, that difficulty setting will have to go up.
The cryptography involved is something almost all browsers from the last 10 years can do natively, while scrapers have to be individually programmed to do it. That makes it several orders of magnitude beyond impractical for every single corporate bot to be repurposed.
At best these browsers are going to have some efficient CPU implementation. Scrapers can send these challenges off to dedicated GPU farms or even FPGAs, which are an order of magnitude faster and more efficient. This is also not complex: a team of engineers could set it up in a few days.
Only to then be rendered moot, because it’s an open-source project that someone will just update the cryptographic algorithm for.
There might be something in changing to a better, GPU-resistant algorithm like Argon2, but browsers don’t support those natively, so you would rely on an even less efficient implementation in JS or WASM. Quickly changing details of the algorithm in a game of whack-a-mole could work to an extent, but that would turn this into an arms race, and the scrapers can afford far more development time than the maintainers of Anubis.
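For illustration, a memory-hard variant of the same challenge could look like this. A sketch assuming the argon2-cffi package; the parameters are made up and nothing like this ships in Anubis today:

```python
# Same hashcash idea, but each attempt runs Argon2id with a 64 MiB memory
# cost, which is what blunts the GPU/FPGA advantage compared to SHA-256.
from argon2.low_level import hash_secret_raw, Type

def solve_memory_hard(challenge: bytes, difficulty_bits: int) -> int:
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        digest = hash_secret_raw(
            secret=challenge + nonce.to_bytes(8, "big"),
            salt=b"public-fixed-salt",  # fine here: this is PoW, not password storage
            time_cost=1,
            memory_cost=64 * 1024,      # 64 MiB per attempt
            parallelism=1,
            hash_len=32,
            type=Type.ID,
        )
        if int.from_bytes(digest, "big") < target:
            return nonce
        nonce += 1
```

The catch is the one already mentioned: a browser would have to run this in JS or WASM, so honest visitors get the least efficient implementation while a determined scraper still links the native library.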
These posts contain links to articles, if you read them you might answer some of your own questions and have more to contribute to the conversation.
This is very condescending. I would prefer if you would just engage with my arguments.
On the contrary, I’m hoping for a solution that is better than this.
Do you disagree with any part of my assessment? How do you think Anubis will work long term?
I get that website admins are desperate for a solution, but Anubis is fundamentally flawed.
It is hostile to the user, because it is very slow on older hardware and forces you to use JavaScript.
It is bad for the environment, because it wastes energy on useless computations similar to mining crypto. If more websites start using this, that really adds up.
But most importantly, it won’t work in the end. These scraping tech companies have much deeper pockets and can use specialized hardware that is much more efficient at solving these challenges than a normal web browser.
As long as it’s not an exit node, nobody will be able to tell what the traffic is. It’s all encrypted including the metadata.
I think they mean UPnP
How does that increase the risk compared to something like JBOD or overlayfs? In both cases you will lose data if a drive fails. Keep in mind that this is btrfs raid0, not regular raid. If anything that decreases the chance of corruption because the metadata is redundantly stored on both drives.
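For reference, that layout (striped data, mirrored metadata) is what you get by setting the profiles explicitly at mkfs time. A sketch with placeholder device paths, wrapped in a small Python script; it will of course wipe whatever is on those devices:

```python
# Sketch: create a two-device btrfs filesystem with data striped (raid0)
# and metadata mirrored (raid1). Device paths are placeholders.
import subprocess

devices = ["/dev/sdb", "/dev/sdc"]  # adjust to your actual drives
subprocess.run(
    ["mkfs.btrfs", "-d", "raid0", "-m", "raid1", *devices],
    check=True,
)
```

A single failed drive still takes the file data with it, as noted elsewhere in this thread, but the filesystem structure itself survives on the remaining copy of the metadata.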
No mention of systemd? This is unacceptable.
A disk failure will cause you to lose data, yes. But that’s also the case in all the other solutions discussed here. Backups should be handled separately and are not part of the original question.
Have you considered simply setting btrfs to RAID 0?
Even if your computer is not exposed to the internet: are you certain that every other device on the network is safe (even on public Wi-Fi)? Would you immediately raise the alarm if you saw a second printer in the list with the same name, or something like “Print to file”? I think I personally could fall for that under the right circumstances.
Is this a threat?
“Safe” being defined in a user-hostile manner, i.e. with unmodified Google components and not rooted.
“Google-controlled” would be a better word.
With this approach you would lose the subvolume structure and deduplication if I’m not mistaken.
No, you got downvoted because you were insulting and incorrect.
MS Paint isn’t marketed or treated as a source of truth. LLMs are.