eating3645

joined 1 year ago
[–] [email protected] 9 points 1 month ago

I'm almost certain that it is, though the fact that there's even a shred of doubt says it all.

[–] [email protected] 1 points 1 month ago

It's the abbreviation for New York; the country would be USA.

[–] [email protected] 8 points 1 month ago (2 children)

If you pay for Boost, it does not share any info with third parties.

[–] [email protected] 37 points 1 month ago (6 children)

Shout-out for Boost

[–] [email protected] 150 points 1 month ago (2 children)

Lol, Telegram calling Signal insecure is too funny.

[–] [email protected] 2 points 2 months ago

It changes with the season, but lately I have been loving "Boots of Spanish Leather."

[–] [email protected] 0 points 3 months ago (1 children)

Looks like a joke to me

[–] [email protected] 35 points 10 months ago

ಠ⁠︵⁠ಠ

[–] [email protected] 6 points 10 months ago* (last edited 10 months ago)

Let me expand a little bit.

Ultimately the models come down to predicting the next token in a sequence. Tokens for a language model can be words, characters, or, more frequently, subword pieces (character combinations). For example, the word "Lemmy" might be split into "lem" + "my".
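To make that concrete, here's a minimal tokenization sketch. It assumes the GPT-2 BPE tokenizer from Hugging Face's `transformers` library (the comment doesn't name any particular toolkit), and the actual split it produces may differ from the "lem" + "my" illustration.

```python
# Minimal subword tokenization sketch (assumes Hugging Face `transformers` and
# the GPT-2 BPE vocabulary; the real split may differ from the "lem" + "my" example).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

pieces = tokenizer.tokenize("My favorite website is Lemmy")
ids = tokenizer.convert_tokens_to_ids(pieces)
print(pieces)  # the subword pieces the model actually sees
print(ids)     # each piece maps to an integer ID in the vocabulary
```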

So let's give our model the prompt "my favorite website is"

It will then predict the most likely token and append it to the input, building up a cohesive answer one token at a time. This is where the T (Transformer) in GPT comes in: the model outputs a vector of probabilities over every token in its vocabulary, and the next token is chosen from that distribution (a rough code sketch of this loop follows the example below).

"My favorite website is"

"My favorite website is "

"My favorite website is lem"

"My favorite website is lemmy"

"My favorite website is lemmy."

"My favorite website is lemmy.org"

Woah, what happened there? That's not (currently) a real website. Finding out exactly why the last token was "org", which resulted in hallucinating a fictitious website, is basically impossible. The model might not have been trained long enough, the model might have been trained too long, there might be insufficient data in that particular token space, there might be polluted training data, etc. These models are massive, so determining why it's incorrect in this case is tough.
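You can at least peek at the probability vector behind a step like that, even if explaining why it looks the way it does is the hard part. Same hypothetical GPT-2 setup as the sketches above:

```python
# Show the top candidate next tokens after "My favorite website is lemmy."
# (same GPT-2 / Hugging Face `transformers` assumption as the earlier sketches).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

input_ids = tokenizer("My favorite website is lemmy.", return_tensors="pt").input_ids
with torch.no_grad():
    probs = torch.softmax(model(input_ids).logits[0, -1], dim=-1)

top = torch.topk(probs, k=5)
for p, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode([idx.item()])!r}\t{p.item():.3f}")  # candidate token, probability
```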

But fundamentally, it made up the first half too; we just happen to like that output. Tomorrow someone might register lemmy.org, and then it's not a hallucination anymore.

[–] [email protected] 14 points 10 months ago* (last edited 10 months ago) (3 children)

Very difficult; it's one of those "it's a feature, not a bug" things.

By design, our current LLMs hallucinate everything. The secret sauce these big companies add is getting them to hallucinate correct information.

When the models get it right, it's intelligence; when they get it wrong, it's a hallucination.

In order to fix the problem, someone needs to discover an entirely new architecture. That's entirely conceivable, but the timing is unpredictable, since it requires a fundamentally different approach.

[–] [email protected] 1 points 10 months ago

Maybe, it depends on how serious this is.

Total agreement: small-scale LLMs are super cool, they just don't have as high-quality output. If they're good enough for the job, they're perfect.
