

Breaking an NDA (allegedly) is civil, not criminal
If you’re deliberately belittling me I won’t engage. Goodbye.
“You criticize society yet you participate in it. Curious.”
To be clear, I am not minimizing the problems of scrapers. I am merely pointing out that this strategy of proof-of-work has nasty side effects and we need something better.
These issues are not short term. PoW means you are entering into an arms race against an adversary with bottomless pockets that inherently requires a ton of useless computations in the browser.
Moving towards something based on heuristics, which is what the developer was talking about there, would be much better. But that is basically what many others are already doing (like the “I am not a robot” checkmark), and it is fundamentally different from the PoW that I argue against.
Go do heuristics, not PoW.
It depends on the website’s setting. I have the same phone and there was one website where it took more than 20 seconds.
The power consumption is significant, because it needs to be. That is the entire point of this design. If solving a challenge doesn’t take a significant number of CPU cycles, scrapers will just power through it. This may not be significant for an individual user, but it does add up when this reaches widespread adoption and everyone’s devices have to solve those challenges.
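To make the “useless computations” point concrete, this is roughly the shape of such a hashcash-style challenge. A minimal sketch, not Anubis’s actual code; the challenge string and difficulty are made up:

```python
# Minimal hashcash-style proof of work: find a nonce whose SHA-256 digest
# has `difficulty_bits` leading zero bits. Expected work doubles with every
# extra bit, and every legitimate visitor has to burn these cycles too.
import hashlib

def solve(challenge: bytes, difficulty_bits: int) -> int:
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce
        nonce += 1

# ~2^20 (about a million) hash attempts on average at 20 bits of difficulty.
print(solve(b"example-challenge", 20))
```

Raising the difficulty by one bit doubles the average solve time for everyone, which is why the slow-phone experiences above track whatever the admin sets.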
It is basically instantaneous on my 12-year-old Kepler GPU Linux box.
It depends on what the website admin sets, but I’ve had checks take more than 20 seconds on my reasonably modern phone. And as scrapers get more ruthless, that difficulty setting will have to go up.
The cryptography involved is something almost all browsers from the last 10 years can do natively, while scrapers have to be individually programmed to do it. That makes it several orders of magnitude beyond impractical for every single corporate bot to be repurposed.
At best these browsers are going to have some efficient CPU implementation. Scrapers can send these challenges off to dedicated GPU farms or even FPGAs, which are an order of magnitude faster and more efficient. This is also not complex: a team of engineers could set it up in a few days.
Only to then be rendered moot, because it’s an open-source project that someone will just update the cryptographic algorithm for.
There might be something in changing to a better, GPU-resistant algorithm like Argon2, but browsers don’t support those natively, so you would rely on an even less efficient implementation in JS or WASM. Quickly changing details of the algorithm in a game of whack-a-mole could work to an extent, but that would turn this into an arms race, and the scrapers can afford far more development time than the maintainers of Anubis.
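For illustration, a memory-hard variant of the same challenge could look like this. A sketch assuming the argon2-cffi package; the parameters are made up and nothing like this ships in Anubis today:

```python
# Same hashcash idea, but each attempt runs Argon2id with a 64 MiB memory
# cost, which is what blunts the GPU/FPGA advantage compared to SHA-256.
from argon2.low_level import hash_secret_raw, Type

def solve_memory_hard(challenge: bytes, difficulty_bits: int) -> int:
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        digest = hash_secret_raw(
            secret=challenge + nonce.to_bytes(8, "big"),
            salt=b"public-fixed-salt",  # fine here: this is PoW, not password storage
            time_cost=1,
            memory_cost=64 * 1024,      # 64 MiB per attempt
            parallelism=1,
            hash_len=32,
            type=Type.ID,
        )
        if int.from_bytes(digest, "big") < target:
            return nonce
        nonce += 1
```

The catch is the one already mentioned: a browser would have to run this in JS or WASM, so honest visitors get the least efficient implementation while a determined scraper still links the native library.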
These posts contain links to articles, if you read them you might answer some of your own questions and have more to contribute to the conversation.
This is very condescending. I would prefer if you would just engage with my arguments.
On the contrary, I’m hoping for a solution that is better than this.
Do you disagree with any part of my assessment? How do you think Anubis will work long term?
I get that website admins are desperate for a solution, but Anubis is fundamentally flawed.
It is hostile to the user, because it is very slow on older hardware and forces you to use JavaScript.
It is bad for the environment, because it wastes energy on useless computations similar to mining crypto. If more websites start using this, that really adds up.
But most importantly, it won’t work in the end. These scraping tech companies have much deeper pockets and can use specialized hardware that is much more efficient at solving these challenges than a normal web browser.
As long as it’s not an exit node, nobody will be able to tell what the traffic is. It’s all encrypted including the metadata.
I think they mean UPnP
How does that increase the risk compared to something like JBOD or overlayfs? In both cases you will lose data if a drive fails. Keep in mind that this is btrfs raid0, not regular raid. If anything that decreases the chance of corruption because the metadata is redundantly stored on both drives.
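For reference, that layout (striped data, mirrored metadata) is what you get by setting the profiles explicitly at mkfs time. A sketch with placeholder device paths, wrapped in a small Python script; it will of course wipe whatever is on those devices:

```python
# Sketch: create a two-device btrfs filesystem with data striped (raid0)
# and metadata mirrored (raid1). Device paths are placeholders.
import subprocess

devices = ["/dev/sdb", "/dev/sdc"]  # adjust to your actual drives
subprocess.run(
    ["mkfs.btrfs", "-d", "raid0", "-m", "raid1", *devices],
    check=True,
)
```

A single failed drive still takes the file data with it, as noted elsewhere in this thread, but the filesystem structure itself survives on the remaining copy of the metadata.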
No mention of systemd? This is unacceptable.
A disk failure will cause you to lose data, yes. But that’s also the case in all the other solutions discussed here. Backups should be handled separately and are not part of the original question.
Have you considered simply setting btrfs to RAID 0?
Even if your computer is not exposed to the internet: are you certain that every other device on the network is safe (even on public Wi-Fi)? Would you immediately raise the alarm if you saw a second printer in the list with the same name, or something like “Print to file”? I think I personally could fall for that under the right circumstances.
Is this a threat?
“Safe” being defined in a user-hostile manner, i.e. with unmodified Google components and not rooted.
“Google-controlled” would be a better word.
With this approach you would lose the subvolume structure and deduplication if I’m not mistaken.
No, you got downvoted because you were insulting and incorrect.
MS Paint isn’t marketed or treated as a source of truth. LLMs are.