• onlinepersona@programming.dev
    2 days ago

    I should probably write a blog post about it. Basically, it’s there to possibly get commercial LLMs in trouble for scraping licensed material. LLMs have been tricked into revealing their training data and have gotten in trouble for that. There are also ongoing lawsuits due to those revelations. Maybe the most notable is the one against Microsoft’s GitHub Copilot for spitting out licensed code (GPL, and also copyrighted code from private repos).

    Whether the lawsuits will be successful is yet to be determined (Japan already considers nearly everything fair game for training AIs and machine learning). Whether they will have an impact if they are successful is also unknown. It just costs me a keystroke (and the occasional response to a friendly question like yours), so I do it 🤷 Once all my hope is lost, I might stop.

    Anti Commercial-AI license

    • Sturgist@lemmy.ca
      2 days ago

      Thanks for the informative and concise explanation!

      Japan already considers nearly everything fair game

      This… THIS blows my mind, the Japanese system being what it is, with copyrights and patents enforced so viciously by companies. My understanding is that if companies don’t enforce them through lawsuits, their patents and copyrights are weakened a lot faster than in, say, the US. Hence Nintendo’s penchant for being litigious.

      • onlinepersona@programming.dev
        2 days ago

        Yeah, Japan’s decision took me by surprise and at the same time it didn’t. They have been into AI for a long time, but as you say, copyright is quite important to them. This is one of the articles I remember. After that, I did question adding the signature, but lawsuits in the US are still ongoing, and Facebook being in litigation right now for torrenting Anna’s Archive gives me… some hope (despite the current political situation).

        Anti Commercial-AI license

        • Sturgist@lemmy.ca
          2 days ago

          despite the current political situation

          Yeesh. Yeah, kind of a surreal time we’re in right now.

          Thanks for the article, I’ll have to read it later on. Mind if I get back to you after?

            • Sturgist@lemmy.ca
              1 day ago

              The policy allows AI to use any data “regardless of whether it is for non-profit or commercial purposes, whether it is an act other than reproduction, or whether it is content obtained from illegal sites or otherwise.”

              What. The. Actual. Fuck.

              I’ve been to Japan quite a lot. My wife’s half Japanese and was born there, so we go to visit family often.
              After all these years, and everything I’ve learned about them and their culture, I’m still blown away all the time by the crazy dichotomies. Japan is often a leader in technology. But if you needed a Koseki (family register, fills a similar niche as a birth certificate but more nuanced), then up until really not that long ago, as far as my wife knew, you could go in person, send a request/application by registered mail, or send it by fax.
              You can buy so many different things from vending machines, and there’s amazing cell service almost everywhere, even in the middle of nowhere… but almost all transactions are cash only.

              Despite having the world’s third-largest economy, Japan’s economic growth has been sluggish since the 1990s. Japan has the lowest per-capita income in the G-7. With the effective implementation of AI, it could potentially boost the nation’s GDP by 50% or more in a short time. For Japan, which has been experiencing years of low growth, this is an exciting prospect.

              Personally, I doubt it. My opinion is that we’re seeing a bubble, kinda like tulips. ML algorithms are definitely useful for a lot of (very specific) things, but it’s like NFTs: slap AI-ish buzzwords in there, add a couple of zeros to the end of the figure on the bill, and laugh all the way to the bank.

              Western data access is also key to Japan’s AI ambitions. The more high-quality training data available, the better the AI model. While Japan boasts a long-standing literary tradition, the amount of Japanese language training data is significantly less than the English language resources available in the West. However, Japan is home to a wealth of anime content, which is popular globally. It seems Japan’s stance is clear – if the West uses Japanese culture for AI training, Western literary resources should also be available for Japanese AI.

              AHEM …HAHAHAHAHAHAHAHAHAHSHSHSHSHAHHAHAHSHHSGDBDJXUEBRJEUEBDJDU!

              YARRRRRR! AVAST YE SCURVY SEADOG-CHAN!

              Fuckin love it. So very Japanese.

              Buddy. Thanks so much for linking that article. Informative and hilarious. It was a really good read. I think asking about the Anti Commercial-AI License was the best decision I made today. Hooooooo. That second-to-last section (the last one I quoted) had me in stitches. If you remember any other articles, or ever decide to do that write-up (can’t remember, blog? 🤔) one day, hit me up. I get that the sample size is minuscule, but I’d be interested in reading stuff you find interesting.