• 4 Posts
  • 1.05K Comments
Joined 1 year ago
Cake day: June 15th, 2023


  • “It’s popular so it must be good/true” is not a compelling argument. I certainly wouldn’t take it on faith just because it has remained largely unquestioned by marketers.

    The closest research I’m familiar with showed the opposite, but it was specifically related to the real estate market so I wouldn’t assume it applies broadly to, say, groceries or consumer goods. I couldn’t find anything supporting this idea from a quick search of papers. Again, if there’s supporting research on this (particularly recent research), I would really like to see it.







  • We find that the MTEs are biased, significantly favoring White-associated names in 85.1% of cases and female-associated names in only 11.1% of cases.

    If you’re planning to use LLMs for anything along these lines, you should filter out irrelevant details like names before any evaluation step. Honestly, humans should do the same, but it’s impractical. This is, ironically, something LLMs are very well suited for.

    Of course, that doesn’t mean off-the-shelf tools are actually doing that, and there are other potential issues as well, such as biases around cities, schools, or any non-personal info on a resume that might correlate with race/gender/etc.

    I think there’s great potential for LLMs to reduce bias compared to humans, but half-assed implementations are currently the norm, so be careful.
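
    As a rough illustration of the "filter out names before evaluation" step, here is a minimal sketch. It assumes the candidate's name is already known (say, from a form field), and `redact` and the `[CANDIDATE]` placeholder are made-up names for this example. A real pipeline would need proper entity detection, and this does nothing about cities, schools, or other proxies.

```python
import re

def redact(resume_text: str, names: list[str], placeholder: str = "[CANDIDATE]") -> str:
    """Replace each known name (whole words, case-insensitive) with a placeholder.

    `names` is assumed to be supplied by the caller; this sketch does not
    try to detect names on its own.
    """
    redacted = resume_text
    # Handle longer names first so "Jane Doe" is replaced as a unit
    # before the shorter "Jane" and "Doe" are considered.
    for name in sorted(names, key=len, reverse=True):
        if not name:
            continue
        pattern = re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
        redacted = pattern.sub(placeholder, redacted)
    return redacted

resume = "Jane Doe\nEmail: jane.doe@example.com\nJane led a team of five engineers."
print(redact(resume, ["Jane Doe", "Jane", "Doe"]))
```

    Only the redacted text would then be passed to whatever does the scoring. Note that the name parts inside the email address get replaced too, which is probably what you want here.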








  • I think there are two problems that make this hard to answer:

    1. Not all sentences that can be parsed grammatically can also be parsed logically.

    2. Human-language sentences do not contain all the information needed to evaluate them.

    It is impossible to fully separate context from human language in general. The sentence “it is cold” is perfectly valid, and logically coherent, but in order to evaluate it you’d need to draw external information from the context. What is “it”? Maybe we can assume “it” refers to the weather, as that is common usage, but that information does not come from the sentence itself. And since the context here is the Internet, where readers share no common location, we can’t really evaluate it that way.

    It’s hot somewhere, and it’s cold somewhere. Does that mean the statement “it is cold” is both true and false, or does that mean there is insufficient information to evaluate it in the first place? I think this is largely a matter of convention. I have no doubt that you could construct a coherent system that would classify such statements as being in a superposition of truth and falsehood. Whether that would be useful is another matter. You might also need a probabilistic model instead of a simple three-state evaluation of true/false/both. I mean, if we’re talking about human language, we’re talking about things that are at least a little subjective.

    So I don’t think the question can be evaluated properly without defining a more restrictive category of “sentences”. It seems to me like the question uses “sentence” to mean “logical statements”, but without a clearer definition I don’t know how to approach that. Sentences are not the same as logical statements. If they were, we wouldn’t need programming languages :)
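
    To make the "insufficient information" reading concrete, here is a small sketch using Kleene's strong three-valued logic, where the third value means "unknown" rather than "both". The `is_cold` function and its 10 °C threshold are assumptions invented for the example, and the arbitrary threshold is itself a reminder of how subjective "cold" is.

```python
from enum import Enum
from typing import Optional

class Truth(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"

def is_cold(temperature_c: Optional[float], threshold_c: float = 10.0) -> Truth:
    """Evaluate "it is cold" given context; with no context the result is UNKNOWN."""
    if temperature_c is None:
        return Truth.UNKNOWN
    return Truth.TRUE if temperature_c < threshold_c else Truth.FALSE

def kleene_not(a: Truth) -> Truth:
    """Negation: UNKNOWN stays UNKNOWN."""
    if a is Truth.UNKNOWN:
        return Truth.UNKNOWN
    return Truth.FALSE if a is Truth.TRUE else Truth.TRUE

def kleene_and(a: Truth, b: Truth) -> Truth:
    """Conjunction: FALSE dominates, then UNKNOWN, then TRUE."""
    if a is Truth.FALSE or b is Truth.FALSE:
        return Truth.FALSE
    if a is Truth.TRUE and b is Truth.TRUE:
        return Truth.TRUE
    return Truth.UNKNOWN

print(is_cold(None))    # no location or reading supplied
print(is_cold(-5.0))    # context supplied
print(kleene_and(is_cold(None), is_cold(25.0)))
```

    The sentence only gets a definite value once the context (here, a temperature reading) is passed in explicitly, which is exactly the information the sentence itself does not carry.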

    Apologies for the half-baked ideas. I think it would take a lifetime to fully bake this.





  • Yeah, they were able (and thus legally required) to hand over the user’s recovery email address, which is what got the user caught. You don’t need to enter a recovery email address, and you can of course choose to use an equally secure service for recovery.

    One big technical issue to note is that Proton doesn’t use end-to-end encryption for email headers, which include recipients and subject lines, among other things. So that’s potentially exposed to law enforcement as well. I believe Tuta does encrypt headers.
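
    For anyone unsure what counts as a "header", here is a minimal sketch using Python's standard `email` library. The addresses and subject are made up, and this only shows the generic structure of an email message, not how Proton or Tuta store or encrypt anything.

```python
from email.message import EmailMessage

# Build a plain message; all values here are invented for illustration.
msg = EmailMessage()
msg["From"] = "alice@example.com"
msg["To"] = "bob@example.com"
msg["Subject"] = "Meeting on Friday"
msg.set_content("This is the body of the message.")

# Headers are metadata that travels alongside the body.
headers = dict(msg.items())
body = msg.get_content()

for name, value in headers.items():
    print(f"{name}: {value}")
print("---")
print(body)
```

    Sender, recipient, and subject all live in the headers, separate from the body, so encrypting only the body leaves that metadata readable by whoever holds the message.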