Lvxferre [he/him]

I have two chimps within, called Laziness and Hyperactivity. They smoke cigs, drink yerba, fling shit at each other, and devour the faces of anyone who comes close to them.

They also devour my dreams.

  • 2 Posts
  • 290 Comments
Joined 2 years ago
Cake day: January 12th, 2024


  • I don’t see what the problem is with using AI for translations. If the translations are good enough and cheap enough, they should be used.

    Because machine translations for any large chunk of text are consistently awful: they don’t get references right, they often miss the point of the original utterance, they ignore cultural context, and so on. It’s like wiping your arse with an old sock - sure, you could do it in a pinch, but you definitely don’t want to do it regularly!

    Verbose example, using Portuguese to English

    I’ll give you an example, using PT→EN because I don’t speak JP. Let’s say Alice tells Bob “ma’ tu é uma nota de três pila, né?” (literally: “bu[t] you’re a three bucks bill, isn’t it?”). A human translator will immediately notice a few things:

    • It’s an informal and regional register. If Alice typically uses this register, it’s part of her characterisation; else, the register shift is noteworthy. Either way, it’s meaningful.
    • There’s an idiom there; “nota de três pila” (three bucks bill). It conveys some[thing/one] is blatantly false.
    • There’s a rhetorical question, worded like an accusation. The scene dictates how it should be interpreted.

    So depending on the context, the translator might translate this as “ain’t ya full of shit…”, or perhaps “wow, you’re as fake as Monopoly money, arentcha?”. Now, check how chatbots do it:

    • GPT-4o mini: “But you’re a three-buck note, right?”
    • Llama 4 Scout: “But you are a three-dollar bill, aren’t you?”; or “You’re a three-dollar bill, right?” (it offers both alternatives)

    Both miss the mark. If you talk about three dollar bills in English, lots of people associate it with gay people, creating an association that simply does not exist in the original. The extremely informal and regional register is gone, as well as the accusatory tone.

    Meanwhile, Claude shat out this pile of idiocy, which I had to screenshot because otherwise people wouldn’t believe me:


    [This is wrong on so many levels I don’t… I don’t even…]

    This is what you get for AI translations between two IE languages in the same Sprachbund, that’ll often do things in a similar way. It gets way worse for Japanese → English - because they’re languages from different families, different cultures, that didn’t historically interact that much. It’s like the dumb shit above, multiplied by ten.

    If they’re not good enough, another business can offer better translations as a differentiator.

    That “business” is called watching pirated anime with fan subs, made by people who genuinely enjoy anime and want others to enjoy it too.




  • I was going to explain stuff, but given I’m verbose as fuck, it’s simply easier to link Wikipedia. A few highlights:

    sees 10 distinct colors looking at a rainbow, whereas the rest of us see only five.

    The number of distinct colours you see in the rainbow isn’t just dependent on your colour vision. I have an in-depth explanation here (up to the traffic light), but to keep it short: what you consider “distinct colours” or “hues of the same colour” is largely culture-dependent.

    Plus it depends on the rainbow itself; example here

    You’re likely to distinguish way more colours for the inner rainbow than the outer one. (For me it’s six vs. three)

    “A true tetrachromat has another type of cone in between the red and green — somewhere in the orange range — and its 100 shades theoretically would allow her to see 100 million different colors.”

    Emphasis mine. While tetrachromats are expected to have a fourth type of cone between the red and green, people with cones elsewhere wouldn’t magically become “false” tetrachromats.

    Unfortunately, in this day and age it would likely be very frustrating, especially since most tetrachromats are likely unaware of their unique abilities.

    This was written in 2001. Say hello to 2025. LEDs make this trivial - because they allow you to reliably produce light in narrow wavelengths. For example, a mix of 620nm (red) and 530nm (green) lights would be completely different from 570nm (yellow) light, even if for trichromats they’re the same type of yellow.
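    To make the LED point concrete, here is a rough sketch of why the mix and the pure yellow can match for three cone types yet differ for a fourth. Everything numeric in it is an assumption for illustration: the cones are modelled as crude Gaussians rather than measured sensitivity curves, the peak wavelengths and bandwidth are made up to be roughly plausible, the blue cone is ignored since it barely responds at these wavelengths, and the fourth cone "Q" is simply placed between the green and red ones.

```python
import math

# Assumed peak wavelengths in nm. "Q" is a hypothetical fourth cone
# placed between the green (M) and red (L) cones for illustration.
PEAKS = {"M": 535.0, "L": 565.0, "Q": 550.0}
SIGMA = 40.0  # assumed shared bandwidth in nm


def response(cone: str, wavelength: float) -> float:
    """Gaussian stand-in for how strongly a cone reacts to narrow-band light."""
    return math.exp(-((wavelength - PEAKS[cone]) ** 2) / (2 * SIGMA ** 2))


def match_weights(target: float, primary_a: float, primary_b: float):
    """Intensities (a, b) of two narrow-band lights whose mix produces the
    same L and M responses as a single light at `target` (Cramer's rule)."""
    la, lb, lt = (response("L", w) for w in (primary_a, primary_b, target))
    ma, mb, mt = (response("M", w) for w in (primary_a, primary_b, target))
    det = la * mb - lb * ma
    a = (lt * mb - lb * mt) / det
    b = (la * mt - lt * ma) / det
    return a, b


# Red (620nm) + green (530nm) LEDs tuned to imitate a 570nm yellow LED.
a, b = match_weights(570.0, 620.0, 530.0)


def mixed(cone: str) -> float:
    """Response of a cone to the red + green mix."""
    return a * response(cone, 620.0) + b * response(cone, 530.0)


for cone in ("L", "M", "Q"):
    print(cone, round(mixed(cone), 4), round(response(cone, 570.0), 4))
```

    In this toy model the L and M rows come out identical for the mix and the pure yellow, while the Q row does not, so only someone with the extra cone would see two different colours.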

    To a tetrachromat, television and photography would fail to reproduce colours correctly.

    I think a good equivalent would be a TV without one of the colour channels… say, if the TV is missing the green channel it shows purple, green and grey all the same. For tetrachromats all TVs would be like this, since they’d be missing the fourth colour channel.


    Further genetic info: humans encode colour vision into the chromosomes 7 (blue opsin) and X (red and green opsins). At least in theory you could have a mutation in one of those three genes, that makes the associated cone cells absorb light at a different wavelength; and, if the person has both the mutant and ancestral alleles of the gene, at the same time, they would be a tetrachromat.

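    In practice this means that tetrachromacy among men is possible (through the chromosome 7 gene, since men carry two copies of it but only one X), but you’re far more likely to find it among women, who carry two X chromosomes.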





  • I heavily recommend that people interested in bad faith argumentation (how to identify it, how to combat it) read this text. It’s didactic, because of how obviously the guy is twisting things to prove black is white.

    Nicotine contributes to the taste of cigarettes and the pleasures of smoking. The presence of nicotine, however, does not make cigarettes a drug or smoking addiction.

    Yeah, and gravity doesn’t work on Fridays. /s

    Coffee, Mr. Chairman, contains caffeine and few people seem to enjoy coffee that does not. Does that make coffee a drug?

    Interesting fallacy he uses here - it’s like a loaded question, but instead of building it around an assumption, he does it around the connotation of a word (drug), to create a false equivalence.

    Yes, caffeine is a drug. Yes, it’s addictive. And abstinence syndrome is a pain. The reason you don’t see it being policed like other drugs is that it’s relatively benign, but you can’t say the same about nicotine. (NB: this is coming from a smoker who drinks a buttload of coffee and yerba.)

    Are coffee drinkers drug addicts?

    Chaining another rhetorical question to amplify the appeal to emotion of the above.

    People can and do quit smoking

    Yeah, people can and do quit crack cocaine too. It doesn’t stop it being a drug.

    Smoking is not intoxicating; no one gets drunk from cigarettes and no one has said that smokers do not function normally. Smoking does not impair judgment.

    Unless something in the report suggests that, he’s building a straw man and beating it to death.

    Point five, Philip Morris research does not establish that smoking is addictive.

    Yeah, and my cat’s research does not establish that scratching furniture damages it. /s



  • We (people in general) are dealing with two sets of crazy people, when it comes to AI:

    1. A crowd who overestimates AI capabilities. They often believe AI is “intelligent”, AGI is coming “soon”, AI will replace our jobs, the future is AI, all that babble.
    2. A crowd who believes generative models are only flash and smoke, a bubble that’ll burst and leave nothing behind. A Ponzi scheme of sorts.

    Both are wrong. And they’re wrong in the same way: failure to see tech as tech. And you often see criticism towards #1 (it’s fair!), but I’m glad to see criticism towards #2 (also fair!) popping up once in a while, like the author does.

    …case in point: the best use case for LLMs is when

    • the task is tedious, repetitive, basic. The info equivalent of cleaning dishes.
    • the amount of errors in the output is OK for its purpose.


  • This sounds sensible, as long as the library is geared towards the local Māori community.

    And, really. The underlying idea of the Dewey Decimal System is solid: have only a few top level categories, but subdivide and number them recursively. However, you don’t need to stick to the exact same categories as Dewey did. It’s often good to deviate from them - because those categories depend a lot on the relevance and association between topics, and both things are situational and culturally dependent. Take the example - I see no connection between gardening and conflict resolution, but the Māori people clearly do, so if the library is for them it’s sensible it groups both things together.
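    As a rough sketch of that underlying idea (few top level categories, subdivided and numbered recursively), something like the following would do. The category names are invented for illustration and are not taken from Dewey or from the library in question; the one-digit-per-level numbering is likewise just an assumption that mimics Dewey’s style.

```python
from typing import Dict, List, Tuple

# A category maps to its subcategories; an empty dict means "no children".
Tree = Dict[str, "Tree"]

# Hypothetical catalogue: names are made up for the example.
catalogue: Tree = {
    "Land and people": {
        "Gardening": {},
        "Conflict resolution": {},
    },
    "Stories": {
        "Oral histories": {},
    },
}


def number_categories(tree: Tree, prefix: str = "") -> List[Tuple[str, str]]:
    """Give every category a code made of one digit per nesting level."""
    if len(tree) > 10:
        raise ValueError("one digit per level allows at most 10 siblings")
    numbered = []
    for digit, (name, children) in enumerate(tree.items()):
        code = f"{prefix}{digit}"
        numbered.append((code, name))
        # Recurse: children inherit the parent's code as their prefix.
        numbered.extend(number_categories(children, code))
    return numbered


for code, name in number_categories(catalogue):
    print(code, name)
```

    Swapping the categories around only means editing the tree; the numbering scheme itself stays untouched, which is why deviating from Dewey’s own categories is cheap.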


  • This is the sort of term begging for a false equivalence: that a social class is as worthy of existing as an ethnic group. It’s fucking dumb, especially when you use a definition like this:

    classicide has been used […] to describe the unique forms of genocide which pertain to the annihilation of a class through murder or displacement and the destruction of the bourgeoisie to form an equal proletariat

    Emphasis mine. So, basically: if you want a classless society, by demoting the borghesia to proletariat, that’s literally like genocide? *rolls eyes*

    [The fun part is when you realise that the borghesia is the main responsible for the proletarisation of itself. As such it would be committing auto-classicide.]

    I should stop talking politics in this account. And in case anyone wonders why I spelled “borghesia” this way, it’s because I’m fucking tired of doing it in English, I never get it right the first 2~3 times, so might as well spell it as in Italian.


  • To the bots. Roboti ite domum!

    It would be funnier if he said “robotes eunt domus”, like in Life of Brian. But no, he had to use correct Latin!

    Serious now. If I had a website I’d probably try Nepenthes or Iocaine; poisoning those fuckers seems to be way more fun than just restricting their access. I’m glad the guy found a good solution for his problem with the tools at hand.

    Now, before checking HN comments, let me guess: at least one will defend those big businesses DDoS-ing the internet for the sake of their models.

    You don’t have to fend off anything, you just have to fix your server to support this modest amount of traffic. // Everyone else is visiting your site for entirely self-serving purposes, too. // I don’t understand why people are ok with Google scraping their site (when it is called indexing), fine with users scraping their site (when it is called RSS reading), but suddenly not ok with AI startups scraping their site. // If you publish data to the public, expect the public to access it. If you don’t want the public (this includes AI startups) to access it, don’t publish it.

    BINGO!