Technology

Garbage In, Garbage Out: Officials Face Seemingly Impossible Task of Regulating AI

Image courtesy of Pixabay

As the use of artificial intelligence through platforms like ChatGPT skyrockets, U.S. lawmakers are facing some new questions. Just as officials had to consider accountability for social networks, where most of the content is posted by the general public, they’re now grappling with accountability for AI platforms.

Who’s accountable for ensuring that AIs put out correct, non-toxic information? No one knows, at least not yet, and it’s easy to see why.

The Problem with Artificial Intelligence

AIs become more capable as they train on larger and larger data sets, and the easiest way to find enormous amounts of training data is to look online. The problem is that not everything posted online is factual, especially when you’re dealing with social media.

Some of the content posted on social networks or elsewhere is merely opinion rather than fact. Some of it is just plain wrong: misinformation such as rumors, or, even worse, disinformation posted deliberately with malicious intent.

Unfortunately, AIs can’t tell the difference between true and false information unless a human tells them which is which. Additionally, many studies of AI assistants like Siri and Alexa have demonstrated how human biases can creep into technology that is supposed to be unbiased.
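Purely as an illustration (the data, labels and function names below are invented, not from the article), here is a minimal Python sketch of what “a human informing the AI” looks like in practice: reviewers label scraped posts, and the training pipeline keeps only the ones marked factual.

```python
# Minimal sketch with hypothetical data and field names: the model has no
# built-in sense of truth, so a human-assigned label is what decides whether
# a post makes it into the training corpus at all.

from dataclasses import dataclass


@dataclass
class Post:
    text: str
    human_label: str  # "factual", "opinion", or "misinformation", assigned by reviewers


def build_training_corpus(posts: list[Post]) -> list[str]:
    """Keep only posts a human reviewer marked as factual."""
    return [p.text for p in posts if p.human_label == "factual"]


if __name__ == "__main__":
    scraped = [
        Post("The Earth orbits the Sun.", "factual"),
        Post("I think pineapple belongs on pizza.", "opinion"),
        Post("Vaccines contain tracking chips.", "misinformation"),
    ]
    print(build_training_corpus(scraped))  # only the human-verified post survives
```

The catch, of course, is that this filter is only as good as the humans doing the labeling, which is exactly where bias and disagreement enter the picture.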

U.S. lawmakers are also concerned about the potential impact of artificial intelligence on national security and education. In particular, officials are concerned about ChatGPT, an AI program capable of quickly writing answers to a wide variety of questions. It became the fastest-growing consumer application on record, attracting over 100 million monthly active users within months of its launch.

Calls for Accountability in AI

All these factors and more raise many questions about accountability for artificial intelligence. In April, the National Telecommunications and Information Administration, which is part of the Commerce Department, called for public input on potential accountability measures. The agency cited “growing regulatory interest” in an “accountability mechanism” for AI.

Specifically, officials want to know whether they could put any measures in place to ensure “that AI systems are legal, effective, ethical, safe and otherwise trustworthy.” NTIA Administrator Alan Davidson told Reuters that “responsible” artificial intelligence systems may offer “enormous benefits…,” but “companies and consumers need to be able to trust them.”

President Joe Biden had said previously that it’s unclear if AI is dangerous, adding that tech companies “have a responsibility… to make sure their products are safe before making them public.” 

How AI Models are Trained

Of course, an artificial intelligence can only be as good as the data used to train it. Twitter CEO Elon Musk threatened to sue Microsoft after accusing it of illegally using the social network’s data to train its AI model. Musk’s threat is indicative of Big Tech’s claim of ownership over the data it has gathered, data usually provided by users for free. These tech behemoths make a mint by charging other companies to use the data they collect, and that is presumably what Musk had in mind for Microsoft if it did use Twitter’s data.

According to CNBC, AI experts see social networks as valuable sources of training data because they capture back-and-forth conversations in an informal setting. AIs must be fed terabytes of data for training, and much of that data is scraped from sites like Twitter, Reddit and StackOverflow.
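For illustration only, here is a rough Python sketch (no real site APIs are used, and the thread data is invented) of how scraped conversational posts might be cleaned and joined into the kind of raw text corpus a language model trains on. Nothing in this step checks whether any post is actually true.

```python
# Illustrative sketch only: collected posts are lightly cleaned (HTML entities,
# links, stray whitespace) and concatenated, preserving the back-and-forth
# structure that makes social data attractive for training.

import html
import re


def clean(post: str) -> str:
    """Strip HTML entities, URLs, and extra whitespace from a scraped post."""
    post = html.unescape(post)
    post = re.sub(r"https?://\S+", "", post)  # drop links
    return re.sub(r"\s+", " ", post).strip()


def build_corpus(threads: list[list[str]]) -> str:
    """Join each thread's replies so the model sees conversational structure."""
    cleaned_threads = ["\n".join(clean(p) for p in thread) for thread in threads]
    return "\n\n".join(cleaned_threads)


if __name__ == "__main__":
    threads = [
        ["How do I reverse a list in Python?", "Use reversed() or slicing: my_list[::-1]"],
        ["Is this rumor about the merger true?", "No idea, but I saw it on another site &amp; shared it."],
    ]
    print(build_corpus(threads))
```

The point of the sketch is that the unverified rumor in the second thread lands in the corpus just as readily as the correct answer in the first: garbage in, garbage out.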

Many of the first AI models were developed in universities and research labs, usually without any expectation of profit. However, as Big Tech companies like Microsoft pour large amounts of capital into these models, the groups behind them are starting to look for profits.

As a result, the owners of the data on which these AIs are trained are starting to demand payment for access. For example, Reddit said in April that it would begin charging companies that want access to its data for AI training purposes. Other companies, including Universal Music Group and Getty Images, are demanding payment when their data is used to train artificial intelligence models.

A Critical Question for Artificial Intelligence Models

Setting aside the need for AI models to train on vast amounts of data, one thing that’s not being discussed much is whether social networks are really the best sources on which to train AI models. It’s no secret that social networks are dens of disinformation and misinformation.

Humans are not infallible, so they might accidentally post incorrect information or share rumors, neither of which is suitable for training AI models because neither represents factual information. Additionally, we return to the issue of human bias, because social networks are typically filled with biased posts.

What’s worse is that some studies have indicated that Facebook and other social networks are actively silencing conservative voices. If that continues, AI models trained on social networks will carry an inherent liberal bias, simply because of the data they were trained on.

AIs Shown to Spread False Information

Even setting aside the issue of politics and liberal versus conservative, there’s no way to verify that the social media posts being used to train an AI model are sharing factual information. Social networks are a place to express opinions, but what AIs need are facts so that they can learn to identify true and false information.

For example, a study conducted at Stanford University revealed that AIs can’t always accurately identify hate speech. Even humans often can’t agree on what qualifies, so an artificial intelligence model is inherently limited by the biases of the person or people who told it what constitutes hate speech.
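As a hypothetical illustration (the rater votes below are invented), here is a short Python sketch of how human judgments typically become training labels by majority vote, and how rarely raters agree unanimously. Whatever its particular raters decided is all the model can ever learn.

```python
# Hypothetical annotation example: three human raters label the same posts as
# hate speech (1) or not (0). The majority vote becomes the training label,
# so the raters' disagreements and biases become the model's limits.

from collections import Counter


def majority_label(votes: list[int]) -> int:
    """Training label = the most common vote among raters."""
    return Counter(votes).most_common(1)[0][0]


def unanimous_rate(all_votes: list[list[int]]) -> float:
    """Fraction of posts on which every rater agreed."""
    unanimous = sum(1 for votes in all_votes if len(set(votes)) == 1)
    return unanimous / len(all_votes)


if __name__ == "__main__":
    # Each inner list is one post's labels from raters A, B, C (invented data).
    ratings = [[1, 1, 1], [1, 0, 0], [0, 1, 0], [1, 1, 0]]
    labels = [majority_label(v) for v in ratings]
    print("training labels:", labels)                 # [1, 0, 0, 1]
    print("unanimous agreement:", unanimous_rate(ratings))  # 0.25
```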

Misinformation and disinformation may be an even bigger problem. For example, one study found that ChatGPT tends to make up phony anonymous sources when tasked with writing a news article about former New York City Mayor Michael Bloomberg. In fact, those so-called “anonymous sources” appeared to “skewer” Bloomberg for “using his wealth to influence public policy,” according to NBC New York.

A growing number of studies demonstrate that ChatGPT and its successors, like GPT-4, will spread false information if given the chance. As things stand now, the sudden popularity of this AI highlights the need for greater awareness of artificial intelligence’s shortcomings and further study of how to train it and potentially regulate it.

About the author

Brian Wallace

Brian Wallace is the Founder and President of NowSourcing, an industry-leading infographic design agency based in Louisville, KY and Cincinnati, OH, which works with companies ranging from startups to Fortune 500s. Brian also runs #LinkedInLocal events nationwide and hosts the Next Action Podcast. Brian has been named a Google Small Business Advisor for 2016-present and joined the SXSW Advisory Board in 2019.