Tag Archives: AI

Bluesky Says It Has “No Intention” Of Taking User Content To Train AI Tools



Social network Bluesky said in a post on Friday that it has “no intention” of taking user content to train generative AI tools. It made the statement the same day that X’s new terms of service, which spell out how X can analyze user text and other information to train its generative AI tools, went into effect, The Verge reported.

“A number of artists and creators have made their home on Bluesky, and we hear their concerns with other platforms training on their data,” Bluesky says in a post. “We do not use any of your content to train generative AI, and have no intention of doing so.”

Other companies could still potentially scrape your Bluesky posts for training. Bluesky’s robots.txt doesn’t exclude crawlers from Google, OpenAI, or others, meaning those companies may crawl Bluesky data. 

“Bluesky is an open and public social network, much like websites on the Internet itself,” spokesperson Emily Liu tells The Verge. “Just as robots.txt files don’t always prevent outside companies from crawling those sites, the same applies here. That said, we’d like to do our part to ensure that outside orgs respect user consent and are actively discussing within the team on how to achieve this.”
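For context: robots.txt is a plain-text file that well-behaved crawlers are expected to check before fetching pages, but honoring it is entirely voluntary. Below is a minimal sketch, using only Python’s standard library, of the check a polite crawler performs. The profile URL is illustrative, and the output depends on whatever bsky.app’s live robots.txt allows at the time.

```python
# Minimal sketch of how a well-behaved crawler consults robots.txt
# before fetching a page. The profile URL below is illustrative.
from urllib.robotparser import RobotFileParser

robots = RobotFileParser()
robots.set_url("https://bsky.app/robots.txt")
robots.read()  # download and parse the live robots.txt

# "GPTBot" is OpenAI's crawler user agent; "Googlebot" is Google's.
for agent in ("GPTBot", "Googlebot"):
    allowed = robots.can_fetch(agent, "https://bsky.app/profile/example.bsky.social")
    print(f"{agent} allowed: {allowed}")
```

Because a scraper can simply skip this check, changing robots.txt alone would not enforce user consent, which is why Bluesky says it is still discussing how to get outside organizations to respect it.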

TechCrunch reported that Bluesky, a social network experiencing a surge this week as users abandon X, says it has “no intention” of using user content to train generative AI tools. The social network made the announcement on the same day that X (formerly Twitter) is implementing its new terms of service that allow the platform to use public posts to train AI.

“A number of artists and creators have made their home on Bluesky, and we hear their concerns with other platforms training on their data,” Bluesky said in a post on its app. “We do not use any of your content to train generative AI, and have no intention of doing so.”

The company went on to note that it uses AI internally to help with moderation and that it also uses the technology in its “Discover” algorithmic feed. However, Bluesky says “none of these are Gen AI systems trained on user content.”

Bluesky has seen an increase in users following the U.S. presidential election as X takes on a more right-wing bent, especially after Musk used the platform to campaign for President-elect Donald Trump.

Engadget reported that Bluesky, which has surged in the days following the US election, said on Friday that it won’t train on its users’ posts for generative AI.

The declaration stands in stark contrast to the AI training policies of X (Twitter) and Meta’s Threads. Probably not coincidentally, Bluesky’s announcement came the same day X’s new terms of service, allowing third-party partners to train on user posts, went into effect.

Although Bluesky is still the underdog in a race against X and Threads, the platform has picked up steam after the U.S. election. It passed the 15 million user threshold on Wednesday after adding more than a million users in the past week.

In my opinion, Bluesky is doing the right thing by encouraging users to ditch X (formerly Twitter) and make an account on Bluesky.


OpenAI Launches ChatGPT Search Competing With Google Search



OpenAI on Thursday launched a search feature within ChatGPT, its viral chatbot, that positions the high-powered artificial intelligence startup to better compete with search engines like Google, Microsoft’s Bing and Perplexity, CNBC reported.

ChatGPT search offers up-to-the-minute sports scores, stock quotes, news, weather and more, powered by real-time web search and partnerships with news and data providers, according to the company. It began beta-testing the search engine, called SearchGPT, in July.

The release could have implications for Google as the dominant search engine. Since the launch of ChatGPT in November 2022, Alphabet investors have been concerned that OpenAI could take market share from Google in search by giving consumers new ways to seek information online.

Shares of Alphabet were down about 1% following the news.

The move also positions OpenAI as more of a competitor to Microsoft and its businesses. Microsoft has invested close to $14 billion in OpenAI, yet OpenAI’s products directly compete with Microsoft’s AI and search tools, such as Copilot and Bing.

The Verge reported that ChatGPT is officially an AI-powered web search engine. OpenAI is enabling real-time information in conversations for paid subscribers today (along with SearchGPT waitlist users), with free, enterprise, and education users gaining access in the coming weeks.

Rather than launching a separate product, web search will be integrated into ChatGPT’s existing interface. The feature determines when to tap into web results based on queries, though users can also manually trigger web searches. ChatGPT’s web search integration finally closes a key competitive gap with rivals like Microsoft Copilot and Google Gemini, which have long offered real-time internet access in their AI conversations.

The new search functionality will be available across all ChatGPT platforms: iOS, Android, and desktop apps for macOS and Windows. The search functionality was built with “a mix of search technologies,” including Microsoft’s Bing. The company wrote in a blog post on Thursday that the underlying search model is a fine-tuned version of GPT-4o.

ArsTechnica reported: One of the biggest bummers about the modern internet has been the decline of Google Search. Once an essential part of using the web, it’s now a shadow of its former self, full of SEO-fueled junk and AI-generated spam.

On Thursday, OpenAI announced a new feature of ChatGPT that could potentially replace Google Search for some people: an upgraded web search capability for its AI assistant that provides answers with source attribution during conversations. The feature, officially called “ChatGPT with Search,” makes web search automatic based on user questions, with an option to manually trigger searches through a new web search icon.

In my opinion, OpenAI is enabling users to more easily connect with ChatGPT to find the information they are looking for. It sounds like a direct competitor to Google Search.


The AI Bill Driving A Wedge Through Silicon Valley



California’s push to regulate artificial intelligence has riven Silicon Valley, as opponents warn the legal framework could undermine competition and the US’s position as the world leader in technology, Financial Times reported.

Having waged a fierce battle to amend or water down the bill as it passed through California’s legislature, executives at companies including OpenAI and Meta are waiting anxiously to see if Gavin Newsom, the state’s Democratic governor, will sign it into law. He has until September 30 to decide.

California is the heart of the burgeoning AI industry, and with no federal law to regulate the technology across the US — let alone a uniform global standard — the ramifications would extend far beyond the state.

Why does California want to regulate AI?

The rapid development of AI tools that can generate humanlike responses to questions has magnified perceived risks around the technology, ranging from legal disputes such as copyright infringement to misinformation and a proliferation of deepfakes. Some even think it could pose a threat to humanity.

The Verge reported that artificial intelligence is moving quickly. It’s now able to mimic humans convincingly enough to fuel massive phone scams or spin up nonconsensual deepfake imagery of celebrities to be used in harassment campaigns. The urgency to regulate this technology has never been more critical — so that’s what California, home to many of AI’s biggest players, is trying to do with a bill known as SB 1047.

SB 1047, which passed the California State Assembly and Senate in late August, is now on the desk of California Governor Gavin Newsom — who will determine the fate of the bill. While the EU and some other governments have been hammering out AI regulation for years now, SB 1047 would be the strictest framework in the US so far.

CCN reported a California bill that intends to promote the “safe and secure” development of frontier AI models has exposed a major rift in Silicon Valley.

Senior tech executives, prominent investors, and politicians on both sides of the aisle are among the bill’s critics. Meanwhile, supporters of SB-1047 include Elon Musk, Vitalik Buterin, and, most recently, an alliance of current and former employees of AI companies, including OpenAI, Google, Anthropic, Meta, and xAI.

Having made its way through the California legislature, SB-1047 now needs only Governor Gavin Newsom’s signature to become law. As the deadline for him to decide approaches, Newsom has come under pressure from both sides.

In my opinion, there are going to be people who are all for Governor Newsom signing SB-1047, and people who don’t want the Governor to do that. Californians will have to wait and see what the outcome will be.


AMD Buys AI Equipment Maker For Nearly $5 Billion



Advanced Micro Devices agreed to pay nearly $5 billion for ZT Systems, a designer of data-center equipment for cloud computing and artificial intelligence, bolstering the chip maker’s attack on Nvidia’s dominance in AI computation, The Wall Street Journal reported.

The deal, among AMD’s largest, is part of a push to offer a broader menu of chips, software and system designs to big data-center customers such as Microsoft and Facebook owner Meta Platforms, promising better performance through tight linkages between those products.

Secaucus, N.J.-based ZT Systems, which isn’t publicly traded, was founded in 1994. It designs and makes servers, server racks and other infrastructure that house and connect chips in the giant data centers that power artificial-intelligence systems such as ChatGPT.

AMD posted a press release titled “AMD to Significantly Expand Data Center AI Systems Capabilities with Acquisition of Hyperscale Solutions Provider ZT Systems.” Highlights from the release:

  • Strategic acquisition to provide AMD with industry-leading systems expertise to accelerate deployment of optimized rack-scale solutions addressing the $400 billion data center AI accelerator opportunity in 2027.
  • ZT Systems, a leading provider of AI and general purpose compute infrastructure for the world’s largest hyperscale providers, brings extensive AI systems expertise that complements AMD silicon and software capabilities.
  • Addition of world-class design and customer enablement teams to accelerate deployment of AMD AI rack-scale systems with cloud and enterprise customers.
  • AMD to seek strategic partner to acquire ZT Systems’ industry-leading manufacturing business.
  • Transaction expected to be accretive on a non-GAAP basis by the end of 2025…

Reuters reported AMD said on Monday it plans to acquire server maker ZT Systems for $4.9 billion as the company seeks to expand its portfolio of artificial intelligence chips and hardware and battle Nvidia.

AMD plans to pay for 75% of the ZT Systems acquisition with cash and the remainder in stock. The company had $5.34 billion in cash and short-term investments as of the second quarter.
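To make that cash-and-stock split concrete, here is a quick back-of-the-envelope calculation based on the figures above; a sketch only, since the final amounts depend on deal terms beyond what the report includes.

```python
# Back-of-the-envelope split of the reported $4.9B purchase price.
deal_value = 4.9e9     # reported purchase price, in dollars
cash_share = 0.75      # AMD plans to pay 75% in cash

cash_portion = deal_value * cash_share        # $3.675 billion in cash
stock_portion = deal_value - cash_portion     # $1.225 billion in stock
cash_on_hand = 5.34e9  # AMD's Q2 cash and short-term investments

print(f"Cash: ${cash_portion / 1e9:.3f}B, stock: ${stock_portion / 1e9:.3f}B")
print(f"Cash portion vs. cash on hand: {cash_portion / cash_on_hand:.0%}")
```

In other words, the cash leg of the deal would consume roughly 69% of the cash AMD reported having on hand.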

The computing requirements for AI have dictated that tech companies string together thousands of chips in clusters to achieve the necessary amount of data crunching horsepower. Stringing together the vast numbers of chips has meant the makeup of whole server systems has become increasingly important, which is why AMD is acquiring ZT Systems.

The addition of ZT Systems engineers will allow AMD to more quickly test and roll out its latest AI graphics processing units (GPUs) at the scale cloud computing giants such as Microsoft require, AMD CEO Lisa Su said in an interview with Reuters.

In my opinion, it looks like AMD is ready to see if it can overtake Nvidia. It will be interesting to see if AMD can do that.


Mark Zuckerberg Argues That ‘Open Source AI’ Is The Path Forward



Meta posted “Expanding our open source large language models responsibly”. From the Meta blog:

Takeaways:

  • Meta is committed to openly accessible AI. Read Mark Zuckerberg’s letter detailing why open source is good for developers, good for Meta, and good for the world.
  • Open source has multiple benefits: It helps ensure that more people around the world can access the opportunities that AI provides, guards against concentrating power in the hands of a small few, and deploys technology more equitably. And we believe it will lead to more safe AI outcomes across society. That’s why we continue to advocate for making open access to the AI industry standard.
  • We’re bringing open intelligence to all by introducing the Llama 3.1 collection of models, which expand context length to 128K, add support across eight languages, and include Llama 3.1 405B — the first frontier-level open source AI model.
  • As we improve the capabilities of our models, we’re also scaling our evaluations, red teaming, and mitigations, including for catastrophic risks.
  • We’re bolstering our system-level safety approach with new security and safety tools, which include Llama Guard 3 (an input and output multilingual moderation tool), Prompt Guard (a tool to protect against prompt injections), and CyberSecEval 3 (evaluations that help AI model and product developers understand and reduce generative AI cybersecurity risk). We’re also continuing to work with a global set of partners to create industry-wide standards that benefit the open source community.
  • We prioritize responsible AI development, and want to empower others to do the same. As part of our responsible release efforts, we’re giving developers new tools and resources to implement the best practices in our Responsible Use Guide.

ArsTechnica reported: In the AI world, there’s a buzz in the air about a new AI language model released Tuesday by Meta: Llama 3.1 405B. The reason? It’s potentially the first time anyone can download a GPT-4-class large language model (LLM) for free and run it on their own hardware.  

You’ll still need some beefy hardware: Meta says it can run on a “single server node,” which isn’t desktop PC-grade equipment. But it’s a provocative shot across the bow of “closed” AI model vendors such as OpenAI and Anthropic.
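As a rough illustration of what running an open-weights Llama model locally looks like, here is a minimal sketch using the Hugging Face transformers library (with torch and accelerate installed). It loads the far smaller 8B sibling rather than the 405B model; the repository ID and prompt are my assumptions, and Meta’s weights are gated, so you would first need to accept the license on Hugging Face.

```python
# Minimal sketch: running a Llama 3.1 model locally with Hugging Face
# transformers. Uses the 8B variant; the 405B model needs server-class
# hardware. The repo ID is an assumption, and the weights are gated
# behind Meta's license on Hugging Face.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.1-8B-Instruct"  # assumed repository ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # halves memory use vs. float32
    device_map="auto",           # spread layers across available devices
)

prompt = "Explain what an open-weights model is in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```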

“Llama 3.1 405B is the first openly available model that rivals the top AI models when it comes to state-of-the-art capabilities in general knowledge, steerability, math, tool use, and multilingual translation,” says Meta. Company CEO Mark Zuckerberg calls 405B “the first frontier-level open source AI model.”

The Register reported: First teased alongside the launch of its smaller eight- and 70-billion parameter siblings earlier this spring, Meta’s Llama 3.1 405B was trained on more than 15 trillion tokens — think of these as fragments of words, phrases, figures and punctuation — using 16,000 Nvidia H100 GPUs.

According to The Register, in total, the Facebook giant says training the 405-billion-parameter model required the equivalent of 30.84 million GPU hours and produced the equivalent of 11,390 tons of CO2 emissions.
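Those two figures imply a rough wall-clock training time. A quick sanity check (assuming, as a simplification, that all 16,000 GPUs ran continuously):

```python
# Sanity-checking the reported training figures.
gpu_hours = 30.84e6  # reported total GPU hours
num_gpus = 16_000    # reported H100 count

hours_per_gpu = gpu_hours / num_gpus  # ~1,928 hours per GPU
days = hours_per_gpu / 24             # ~80 days of wall-clock time
print(f"About {days:.0f} days if all {num_gpus:,} GPUs ran continuously")
```

That works out to roughly 80 days of continuous training, which fits the gap between the model being teased in the spring and released in July.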

In my opinion, large corporations should not be using up resources that humans need in order to feed an AI. Training like this consumes a huge amount of water and adds CO2 emissions to the air.


Apple, Nvidia, Anthropic Used Thousands Of Swiped YouTube Videos To Train AI



Tech companies are turning to controversial tactics to feed their data-hungry artificial intelligence models, vacuuming up books, websites, photos, and social media posts often unbeknownst to the creators, WIRED reported.

AI companies are generally secretive about their sources of training data, but an investigation by Proof News found some of the wealthiest AI companies in the world have used material from thousands of YouTube videos to train AI. Companies did so despite YouTube’s rules against harvesting materials from the platform without permission.

The investigation found that subtitles from 173,536 YouTube videos, siphoned from more than 48,000 channels, were used by Silicon Valley heavyweights, including Anthropic, Nvidia, Apple, and Salesforce.

The dataset, called YouTube Subtitles, contains video transcripts from educational and online learning channels like Khan Academy, MIT, and Harvard. The Wall Street Journal, NPR, and the BBC also had their videos used to train AI, as did The Late Show with Stephen Colbert, Last Week Tonight With John Oliver, and Jimmy Kimmel Live.

9to5Mac reported that a number of tech giants, including Apple, trained AI models on YouTube videos without the consent of the creators.

They did this by using subtitle files downloaded by a third party from more than 170,000 videos. Creators affected include tech reviewer Marques Brownlee (MKBHD), MrBeast, PewDiePie, Stephen Colbert, John Oliver, and Jimmy Kimmel.

The subtitle files are effectively transcripts of the video content.

The downloads were reportedly performed by a non-profit called EleutherAI, which says it helps developers train AI models. While the aim appears to have been to provide training materials to small developers and academics, the dataset has also been used by several tech giants, including Apple.

According to 9to5Mac, it’s important to emphasize here that Apple didn’t download the data itself; this was instead performed by EleutherAI. It is this organization which appears to have broken YouTube’s terms and conditions.

The Verge reported that, as part of its investigation, Proof News also released an interactive lookup tool. You can use its search engine feature to see if your content — or your favorite YouTuber’s — appears in the dataset.
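For those with a local copy of such a dataset, the same kind of membership check is straightforward to script. Below is a generic sketch; the file name and the “channel”/“text” field names are illustrative assumptions, not the actual schema of the Pile’s YouTube Subtitles data.

```python
# Generic sketch: scan a local JSONL dump of subtitle records for a
# channel name. The file name and field names ("channel", "text") are
# illustrative assumptions, not the actual Pile schema.
import json

def find_channel(path: str, channel: str) -> list[dict]:
    matches = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            if channel.lower() in record.get("channel", "").lower():
                matches.append(record)
    return matches

hits = find_channel("youtube_subtitles.jsonl", "Khan Academy")
print(f"{len(hits)} matching records")
```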

The subtitles dataset is part of a larger collection of material from the nonprofit EleutherAI called The Pile, an open-source collection that also contains datasets of books, Wikipedia articles, and more. Last year, an analysis of one dataset called Books3 revealed which authors’ works had been used to train AI systems, and the dataset has been cited in lawsuits by authors against the companies that used it to train AI.

In my opinion, scraping content creators’ works – even if it’s only the subtitles of a YouTube video – should be illegal. Work made by humans should not be fed to AI systems without the creators’ consent.


‘Little Tech’ Brings A Big Flex To Sacramento



One of Silicon Valley’s heaviest hitters is wading into the fight over California’s AI regulations, Politico reported.

Y Combinator — the venture capital firm that brought us Airbnb, Dropbox, and DoorDash — today issued its opening salvo against a bill by state Sen. Scott Wiener that would require large AI models to undergo safety testing.

Wiener, a San Francisco Democrat whose district includes YC, says he’s proposing reasonable precautions for a powerful technology. But the tech leaders at Y Combinator disagree, and are joining a chorus of other companies and groups that say it will stifle California’s emerging marquee industry.

“This bill, as it stands, could gravely harm California’s ability to retain its AI talent and remain the location of choice for AI companies,” read the letter, which was signed by more than 140 AI startup founders.

It’s the first time the startup incubator, led by prominent SF tech denizen Garry Tan, has publicly weighed in on the bill. They argue it could hurt the many fledgling companies Y Combinator supports — about half of which are now AI-related.

Adam Thierer posted a “Coalition Letter on California SB-1047, the Safe and Secure Innovation for Frontier Artificial Intelligence Models Act” on R Street:

Dear Senator Wiener and members of the California State Legislature,

We, the undersigned organizations and individuals, are writing to express our serious concerns about SB 1047, the Safe and Secure Innovation for Frontier Artificial Intelligence Models Act. We believe that the bill, as currently written, would have severe unintended consequences that could stifle innovation, harm California’s economy, and undermine America’s global leadership on AI.

Our main concerns with SB 1047 are as follows: 

The application of the precautionary principle, codified as a “limited duty exemption,” would require developers to guarantee that their models cannot be misused for various harmful purposes, even before training begins. Given the general-purpose nature of AI technology, this is an unreasonable and impractical standard that could expose developers to criminal and civil liability for actions beyond their control.

The bill’s compliance requirements, including implementing safety guidance from multiple sources and paying fees to fund the Frontier Model Division, would be expensive and time-consuming for many AI companies. This could drive businesses out of California and discourage new startups from forming. Given California’s current budget deficit and the state’s reliance upon capital gains taxation, even a marginal shift of AI startups to other states could be deleterious to the state government’s fiscal position…

Y Combinator also posted a separate letter to Senator Wiener and two people who are on important committees. Here is a small piece from that letter:

Liability and regulation that is unusual in its burdens: The responsibility for the misuse of LLMs should rest with those who abuse these tools, not with the developers who create them. Developers cannot predict all possible applications of their models, and holding them liable for unintended misuse could stifle innovation and discourage investment in AI research. Furthermore, creating a penalty of perjury would mean that AI software developers could go to jail simply for failing to anticipate misuse of their software – a standard of product liability no other product in the world suffers from.

In my opinion, it appears that Y Combinator has concerns about California’s rules regarding safety in AI. I’m not sure why the firm is so upset about the state requiring safety protocols for AI.