@hoshikarakitaridia

hoshikarakitaridia@sh.itjust.works · 1 year ago

Based?

hoshikarakitaridia@sh.itjust.works · 1 year ago

Inspired in the traditional sense or inspired on a basis of datasets with concrete numbers? Huge difference.

hoshikarakitaridia@sh.itjust.works · edit-2 1 year ago

This is literally the bottleneck of all of fediverse imo.

With ease of use integrated into the fediverse, half of social media could become irrelevant.

hoshikarakitaridia@sh.itjust.works · 1 year ago

Yes.

And let’s also pin down that this is the exact issue we need more laws on. What makes an image copyrightable? When can a copyright get violated? And more specifically: whatever the AI model encompasses, can that inhibit fully copyrighted material? Can a copyrighted image be assumed by noting down all of its features?

This is the exact corner that we are fighting over currently.

hoshikarakitaridia@sh.itjust.works · 1 year ago

Because the training, and therefore the datasets are an important part of the work with AI. A lot of ppl are arguing that therefore, the ppl who provided the data (e.g. artists) should get a cut of the revenue or a static fee or something similar for compensation. Because looking at a picture is deemed fine in our society, but copying it and using it for something else is seen more critically.

Btw. I am totally with you regarding the need to not hinder progress, but at the end of the day, we need to think about both the future prospects and the morality.

There was something about labels being forced to pay a cut of the revenue to all bigger artists for every CD they’d sell. I can’t remember what it was exactly, but something like that could be of use here as well maybe.

hoshikarakitaridia@sh.itjust.works · edit-2 1 year ago

“scam bot operators will just use stolen credits cards -”

And that’s not true. Yes, there will be a small portion that do it, but this is where this idea is pretty smart.

Taking your credit card information is a functional hurdle, but also a legal risk.

There’s a bunch of companies and people who will stop using bots just because they can’t implement it, don’t want to implement it, or don’t have the time. Also, don’t forget if there’s one person who provides 10.000 active bots, that means providing credit card information 10.000x times, but also 120.000$ per year. If you wanna do it legally, this shit is expensive, and probably not worth it for a lot of ppl.

And there’s also a bunch of ppl who are weighting the risk of being exposed for fake credit cards, and they stop using bots because they are not willing to commit fraud.

I get that this will turn off even more users and it’s obviously a bad pr move, but you can’t understate that it is quite effective for the things he says he wants to achieve.

hoshikarakitaridia@sh.itjust.works · 1 year ago

Can someone help me understand the use of this? I genuinely can not figure out a practical scenario of when you need this.

hoshikarakitaridia@sh.itjust.works · 1 year ago

I mean that’s obviously a turn off for a lot of ppl but I don’t think the inherent idea is stupid if they wanna get rid of all of the bots

hoshikarakitaridia@sh.itjust.works · 1 year ago

As a music producer, you notice 192k MP3. The next jumps you probably don’t notice. I’m still a flac snob because I have to work a lot with original quality files, but for the average users there’s probably not even a difference between MP3 192k+ and flac or wav or opus or whatever.

hoshikarakitaridia@sh.itjust.works · 1 year ago

sampling a fraction of another person’s imagery or written work.

So citing is a copyright violation? A scientific discussion on a specific text is a copyright violation? This makes no sense. It would mean your work couldn’t build on anything else, and that’s plain stupid.

Also to your first point about reasoning and advanced collage process: you are right and wrong. Yes an LLM doesn’t have the ability to use all the information a human has or be as precise, therefore it can’t reason the same way a human can. BUT, and that is a huge caveat, the inherit goal of AI and in its simplest form neural networks was to replicate human thinking. If you look at the brain and then at AIs, you will see how close the process is. It’s usually giving the AI an input, the AI tries to give the desired output, them the AI gets told what it should have looked like, and then it backpropagates to reinforce it’s process. This already pretty advanced and human-like (even look at how the brain is made up and then how AI models are made up, it’s basically the same concept).

Now you would be right to say “well in it’s simplest form LLMs like GPT are just predicting which character or word comes next” and you would be partially right. But in that process it incorporates all of the “knowledge” it got from it’s training sessions and a few valuable tricks to improve. The truth is, differences between a human brain and an AI are marginal, and it mostly boils down to efficiency and training time.

And to say that LLMs are just “an advanced collage process” is like saying “a car is just an advanced horse”. You’re not technically wrong but the description is really misleading if you look into the details.

And for details sake, this is what the paper for Llama2 looks like; the latest big LLM from Facebook that is said to be the current standard for LLM development:

https://arxiv.org/pdf/2307.09288.pdf

hoshikarakitaridia@sh.itjust.works · edit-2 1 year ago

Oh god I hate you so much for this. It’s beautiful that it’s possible but I also want you to know you’re instigating cybercrimes.

hoshikarakitaridia@sh.itjust.works · 1 year ago

And that is the point.

It sounds stupidly simple, but AIs in itself was the idea to do the learning and solving problems more like a human would. By learning how to solve similar problems, and transfer the knowledge to a new problem.

Technically there’s an argument that our brain is nothing more than an AI with some special features (chemicals for feelings, reflexes, etc). But it’s good to remind ourselves we are nothing inherently special. Although all of us are free to feel special of course.

hoshikarakitaridia@sh.itjust.works · edit-2 1 year ago

Well what an interesting question.

Let’s look at the definitions in Wikipedia:

Sentience is the ability to experience feelings and sensations.

Experience refers to conscious events in general […].

Feelings are subjective self-contained phenomenal experiences.

Alright, let’s do a thought experiment under the assumptions that:

experience refers to the ability to retain information and apply it in some regard
phenomenal experiences can be described by a combination of sensoric data in some fashion
performance is not relevant, as for the theoretical possibility, we only need to assume that with infinite time and infinite resources the simulation of sentience through AI needs to be possible

AI works by telling it what information goes in and what goes out, and it therefore infers the same for new patterns of information and it adjusts to “how wrong it was” to approximate the correction. Every feeling in our body is either chemical or physical, so it can be measured / simulated through data input for simplicity sake.

Let’s also say for our experiment that the appropriate output it is to describe the feeling.

Now I think, knowing this, and knowing how good different AIs can already comment on, summarize or do any other transformative task on bigger texts that exposes them to interpretation of data, that it should be able to “express” what it feels. Let’s also conclude that based on the fact that everything needed to simulate feeling or sensation it can be described using different inputs of data points.

This brings me to the logical second conclusion that there’s nothing scientifically speaking of sentience that we wouldn’t be able to simulate already (in light of our assumptions).

Bonus: while my little experiment is only designed for theoretical possibility and we’d need some proper statistical calculations to know if this is practical in a realistic timeframe already and with a limited amount of resources, there’s nothing saying it can’t. I guess we have to wait for someone to try it to be sure.

hoshikarakitaridia@sh.itjust.works · 1 year ago

Kind of. Iirc it’s a very controversial practice and whenever the police pulls it out in a public case it gets protested again (for good reason). Also, even if the practice is legal right now, there’s a lot of limitations to it. Obviously it’s nudging the ethical boundaries of police work either way.

hoshikarakitaridia@sh.itjust.works · 1 year ago

Just to further your opinion:

Only ~30% of sexual assaults get reported according to this: https://www.rainn.org/statistics/criminal-justice-system

And psychologists will probably tell you that after the assault there is shame, embarrassment, fear of repercussions due to power dynamics and anxiety about what will happen. Of course this doesn’t mean our society is dysfunctional, as we already managed to get a lot of systems in place to drive up this figure, but they’re all far from perfect, and we need more studies and more improvements to do victims of SA better justice.

hoshikarakitaridia@sh.itjust.works · edit-2 1 year ago

Understatement of the fucking century. I mean I know they got some data wrong but this is Blizzard level of workplace harassment.

With the assault and shit that’s straight up criminally illegal.

I’m so sorry for her; she probably gets the love she deserves now but it shouldn’t have taken so long and it’s time to start from scratch with this company.

hoshikarakitaridia@sh.itjust.works · edit-2 1 year ago

I mean I would say maybe “regurgitating their training data” is putting it a bit too simple. But it’s true, we’re currently at the point where the AI can mimic real text. But that’s it - no one tells it not to lie rn, the programmatic goal of the AI is to get indistinguishable from real text with no bearing on the truthfulness of the information whatsoever.

Basically we train our AIs to pretend to know, not to know. And sometimes it’s good at pretending, sometimes it isn’t.

The “right” way to handle what the CEOs are doing would be to let go of a chunk of the staff, then let the rest write their articles with the help of chatgpt. But most CEOs are a bit too gullible when it comes to the abilities of AI.

hoshikarakitaridia@sh.itjust.works · 1 year ago

Huh that’s the first time I’ve heard that.

Not disagreeing but does someone have some sweet and juicy sources? Would love to look into that.

hoshikarakitaridia@sh.itjust.works · 1 year ago

Yeah the there’s probably nothing to infringe upon with Twitter code. The only basis it theoretically had was if business secrets had been shared, and from what I’ve heard there’s basically no evidence of any business secrets bring leaked to Threads; it’s even unclear if there’s any non-disclosure or stuff that could be violated if developers are talking about code and structure.

Looks like the cease and desist wasn’t worth the paper it was written on.