lordmauve

joined 1 year ago
[–] [email protected] 3 points 1 month ago

Yeah, just decorate it with a tremendous amount of dark red paint, spattered away from the fan and heaviest in the fan corner.

[–] [email protected] 3 points 1 month ago (1 children)

I don't deny that this kind of thing is useful for understanding the capabilities and limitations of LLMs, but I don't agree that "the best match of a next phrase given his question, and not because it can actually consider the situation" is an accurate description of an LLM's capabilities.

While they are dumb and unworldly, they can consider the situation: they evaluate a learned model of concepts in the world to decide whether the first word of the correct answer is more likely to be yes or no. They can solve unseen problems that require this kind of cognition.

But they are only book-learned, so they are kind of stupid about common-sense things like frying pans and ovens.

[–] [email protected] 3 points 2 months ago

I think it's fair to colour seasons and episodes with different scales because they are measuring different things.

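Averaging shrinks the spread: the average of 20+ episodes will have a smaller standard deviation than individual episodes, roughly σ/√n for n independent episodes. That is the same scaling that shows up in the Central Limit Theorem.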

For example, an individual episode with a score of 6 you'll probably watch. A whole season with a score of 6, maybe not.
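To put rough numbers on that, here is a minimal simulation sketch. It assumes episode scores are independent and normally distributed around a mean of 7 with a standard deviation of 1; those figures, and the `simulate` function itself, are made up for illustration and not taken from any real ratings data. Real episodes within a season are probably correlated, so the actual shrinkage would be smaller than this.

```python
import random
import statistics


def simulate(n_seasons=2000, episodes_per_season=20, mean=7.0, sd=1.0, seed=0):
    """Return (spread of episode scores, spread of season averages).

    Assumption: every episode score is an independent draw from a normal
    distribution. The parameters are hypothetical, not real ratings.
    """
    rng = random.Random(seed)
    episode_scores = []
    season_scores = []
    for _ in range(n_seasons):
        season = [rng.gauss(mean, sd) for _ in range(episodes_per_season)]
        episode_scores.extend(season)
        # A season's score is the plain average of its episode scores.
        season_scores.append(statistics.fmean(season))
    return statistics.pstdev(episode_scores), statistics.pstdev(season_scores)


episode_sd, season_sd = simulate()
print(f"episode standard deviation: {episode_sd:.2f}")
print(f"season standard deviation:  {season_sd:.2f}")
print(f"sd / sqrt(20) prediction:   {episode_sd / 20 ** 0.5:.2f}")
```

Under these assumptions the season scores are packed into a band about 4.5 times narrower than the episode scores, so the same number sits at a very different point on each scale.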