If you’ve been following artificial intelligence (AI) lately – and you should be – then you may have started thinking about how it’s going to change the world. In terms of its potential impact on society, it’s been compared to the introduction of the Internet, the invention of the printing press, even the first use of the wheel. Maybe you’ve played with it, maybe you know enough to worry about what it might mean for your job, but one thing you shouldn’t ignore: like any technology, it can be used for both good and ill.
If you thought cyberattacks/cybercrimes were bad when done by humans or simple bots, just wait until you see what AI can do. And, as Ryan Heath wrote
in Axios, “AI can also weaponize modern
medicine against the same people it sets out to cure.”
We may need DarkBERT, and the Dark Web,
to help protect us.
Credit: Help Net Security
A new study showed how AI can create much more effective, cheaper spear phishing campaigns, and the author notes that the campaigns can also use “convincing voice clones of individuals.” He adds: “By engaging in natural language dialog with targets, AI agents can lull victims into a false sense of trust and familiarity prior to launching attacks.”
It’s worse than that. A recent article in The Washington Post warned:
That is just the beginning, experts, executives and government officials fear, as attackers use artificial intelligence to write software that can break into corporate networks in novel ways, change appearance and functionality to beat detection, and smuggle data back out through processes that appear normal.
The outdated architecture of the internet’s main protocols, the ceaseless layering of flawed programs on top of one another, and decades of economic and regulatory failures pit armies of criminals with nothing to fear against businesses that do not even know how many machines they have, let alone which are running out-of-date programs.
Credit: Reuters/Kacper Pempel illustration
Health care should be worried too. The World Health Organization (WHO) just called for caution in the use of AI in health care, noting that, among other
things, AI could “generate responses that can
appear authoritative and plausible to an end user; however, these responses may
be completely incorrect or contain serious errors…generate and disseminate
highly convincing disinformation in the form of text, audio or video content
that is difficult for the public to differentiate from reliable health content.”
It's going to get worse before it gets
better; the WaPo article warns: “AI will
give far more juice to the attackers for the foreseeable future.” This may be where solutions like DarkBERT come
in.
Now, I don’t know much about the Dark
Web. I know vaguely that it exists, and that people often (though not exclusively) use it for bad things. I’ve never used
Tor, the software often used to keep activity on the Dark Web anonymous. But some clever researchers in South Korea
decided to create a Large Language Model (LLM) trained on data from the Dark
Web – fighting fire with fire, as it were. This is what they call DarkBERT.
The researchers went this route because:
“Recent
research has suggested that there are
clear differences in the language used in the Dark Web
compared to that of the Surface Web.”
LLMs trained on data from the Surface Web were going to miss or misunderstand much of what was happening on the Dark Web, which is exactly what some Dark Web users are counting on.
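For the technically curious: the paper describes DarkBERT as a RoBERTa-based model that was further pretrained on a large corpus crawled from the Dark Web. A minimal sketch of that kind of domain-adaptive pretraining, using the Hugging Face transformers library, might look like the following; the corpus file and training settings here are stand-ins of mine, not the authors’ actual pipeline.

```python
# A minimal sketch of domain-adaptive pretraining: take a general-purpose
# RoBERTa model and continue masked-language-model training on a domain corpus.
# "darkweb_corpus.txt" is a hypothetical stand-in; DarkBERT's actual training
# data has not been publicly released.
from datasets import load_dataset
from transformers import (
    DataCollatorForLanguageModeling,
    RobertaForMaskedLM,
    RobertaTokenizerFast,
    Trainer,
    TrainingArguments,
)

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = RobertaForMaskedLM.from_pretrained("roberta-base")

# One document per line in the text file.
corpus = load_dataset("text", data_files={"train": "darkweb_corpus.txt"})["train"]
corpus = corpus.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

# Randomly mask 15% of tokens: the standard BERT/RoBERTa pretraining objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="darkbert-sketch", per_device_train_batch_size=8),
    train_dataset=corpus,
    data_collator=collator,
)
trainer.train()
trainer.save_model("darkbert-sketch")  # reused in the fine-tuning sketch below
```

The idea is simply that the masked-word objective forces the model to absorb the domain’s vocabulary and phrasing, which is exactly what the researchers found Surface Web models lack.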
Credit: Jin et al.
They demonstrated DarkBERT’s effectiveness against
three potential Dark Web problems:
- Ransomware Leak Site Detection: identifying “the selling or publishing of private, confidential data of organizations leaked by ransomware groups.”
- Noteworthy Thread Detection: “automating the detection of potentially malicious threads.”
- Threat Keyword Inference: deriving “a set of keywords that are semantically related to threats and drug sales in the Dark Web.”
On each task, DarkBERT was more effective than
comparison models.
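To give a sense of how such a model gets put to work on a task like the first one above, here is a similarly hedged sketch: fine-tune the pretrained encoder as a binary classifier that flags pages as ransomware leak sites or not. The dataset file, column names, and hyperparameters are illustrative assumptions of mine, not details from the paper.

```python
# A hedged sketch of fine-tuning the domain-pretrained model for one task:
# classifying Dark Web pages as ransomware leak sites or not.
# "leak_pages.csv" (columns: text, label) is hypothetical example data.
from datasets import load_dataset
from transformers import (
    RobertaForSequenceClassification,
    RobertaTokenizerFast,
    Trainer,
    TrainingArguments,
)

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
# Loads the encoder saved by the pretraining sketch; a fresh two-way
# classification head is initialized on top of it.
model = RobertaForSequenceClassification.from_pretrained("darkbert-sketch", num_labels=2)

data = load_dataset("csv", data_files={"train": "leak_pages.csv"})["train"]
data = data.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="leak-detector", num_train_epochs=3),
    train_dataset=data,
    tokenizer=tokenizer,  # enables dynamic padding when batching
)
trainer.train()
```

The same recipe, with different labels, covers the thread-detection task; the keyword-inference task, as I understand it, leans on the model’s masked-word predictions rather than a classifier.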
The researchers aren’t releasing DarkBERT more broadly
yet, and the paper has not yet been peer reviewed. They know they still have more to do: “In the
future, we also plan to improve the performance of Dark Web domain specific
pretrained language models using more recent architectures and
crawl additional data to allow the construction of a multilingual
language model.”
Still, what they demonstrated was impressive. GeeksforGeeks raved:
DarkBERT emerges as a beacon of hope in the relentless battle against online malevolence. By harnessing the power of natural language processing and delving into the enigmatic world of the dark web, this formidable AI model offers unprecedented insights, empowering cybersecurity professionals to counteract cybercrime with increased efficacy.
It can’t come soon enough. The New York Times reports
there is already a wave of entrepreneurs offering solutions to try to identify
AI-generated content – text, audio, images, or videos – that can be used for
deepfakes or other nefarious purposes. But
the article notes that it’s like antivirus protection; as AI defenses get
better, the AI generating the content gets better too. “Content
authenticity is going to become a major problem for society as a whole,”
one such entrepreneur admitted.
When even Sam
Altman and other
AI leaders are calling for AI oversight, you know this is something we all
should worry about. As the WHO warned, “there is
concern that caution that would normally be exercised for any new technology is
not being exercised consistently with LLMs.” Our enthusiasm for AI’s potential is outstripping our wisdom in using it.
Credit: Natalie Peeples/Axios
Some experts have recently called for an Intergovernmental Panel on Information Technology – including but not limited to AI – to “consolidate and summarize the state of knowledge on the potential societal impacts of digital communications technologies,” but this seems like a necessary but hardly sufficient step.
Similarly, the WHO has proposed its own guidance for
Ethics and
Governance of Artificial Intelligence for Health. Whatever oversight bodies, legislative requirements,
or other safeguards we plan to put in place, they’re already late.
In any event, AI from the Dark Web is likely to ignore
and try to bypass any laws, regulations, or ethical guidelines that society
might be able to agree to, whenever that might be. So I’m cheering for solutions like DarkBERT
that can fight it out with whatever AI emerges from there.