By Christopher Dela Cruz and Megan Carr
Staff Writers
The Department of Homeland Security is paying Rutgers $3 million to oversee development of computing methods that could monitor suspicious social networks and opinions found in news stories, Web blogs and other Web information to identify indicators of potential terrorist activity.
The software and algorithms could rapidly detect social networks among groups by identifying who is talking to whom on public blogs and message boards, researchers said. Computers could ideally pick out entities trying to conceal themselves under different aliases.
The software would also be able to sift through massive amounts of text and decipher opinions - such as anti-American sentiment - which would otherwise be difficult to do manually.
The program is designed to sift rapidly through huge amounts of data. It has also been described as a sort of "Super Google" by researchers such as Eduard Hovy at the University of Southern California, to explain the scope and speed of the technology.
One of the ideal results would be for Homeland Security officials to be able to "find a suspicious group based on its pre-event communication activity before they act," according to a PowerPoint presentation used by researchers to explain the project.
Nicholas Belkin, a University professor who studied in the field of Information Retrieval Systems, said these techniques have the potential for abuse if they fall into the wrong hands.
"If you can identify relations between terrorists, it could be used for other groups who may not want to be identified," Belkin said. "It could be used to identify members of groups who want to form a demonstration or oppose a particular event or government policy."
"Congress has enacted many laws designed to protect personal privacy in the digital environment," said Greg Lastowka, an assistant professor of law at the Rutgers School of Law-Camden. "Whether current laws are sufficient is contested."
But Professor Paul Kantor, one of the 30 researchers involved with the new DHS University Affiliate Center at Rutgers, said there is research specifically devoted to privacy protection in data analysis, and this can "both help us protect our citizens' privacy and also help us develop techniques that will protect the privacy of our data from our adversaries."
"If you don't develop it, then your enemy could be developing it," said University of Illinois Professor Jiawei Han, another researcher working on the project. "You cannot base it on the technology itself whether to do it. If you see the potential to save people's lives, you should be able to develop it."
The Rutgers Center for Discrete Mathematics and Theoretical Computer Science will lead the team, which is made up of researchers from the University of Southern California, the University of Illinois at Urbana-Champaign and the University of Pittsburgh. The group also includes researchers from AT&T Laboratories, Bell Labs/Lucent Technologies, Princeton University, Rensselaer Polytechnic Institute and Texas Southern University.
Rutgers will get $1 million per year for three years. DHS will provide the entire team with $10.2 million over three years.
For one of the preliminary tests, Hovy compiled articles from both the news and opinion sections of The Wall Street Journal. The software counted all the words and phrases used, and determined which were used most frequently, especially in the editorials. The software could then remember those certain phrases and look out for them when searching other data.
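The counting step described above can be illustrated with a short sketch. This is not Hovy's software. It is a minimal example that assumes a simple scoring rule (how many more times a word or two-word phrase appears in opinion text than in news text), and the function names and the sample sentences are made up for illustration.

```python
import re
from collections import Counter


def ngrams(text, n):
    """Split text into lowercase words and return every run of n words."""
    words = re.findall(r"[a-z']+", text.lower())
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def phrase_counts(docs, max_n=2):
    """Count all words and phrases (up to max_n words long) across docs."""
    counts = Counter()
    for doc in docs:
        for n in range(1, max_n + 1):
            counts.update(ngrams(doc, n))
    return counts


def distinctive_phrases(opinion_docs, news_docs, top=5, max_n=2):
    """Return phrases used more often in opinion text than in news text.

    The score (opinion count minus news count) is an assumption made
    for this sketch, not a rule taken from the research project.
    """
    opinion = phrase_counts(opinion_docs, max_n)
    news = phrase_counts(news_docs, max_n)
    scored = {p: c - news.get(p, 0) for p, c in opinion.items()}
    # Highest score first; ties are broken alphabetically.
    ranked = sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))
    return [p for p, score in ranked[:top] if score > 0]


# Placeholder sentences, invented for this example only.
opinion_docs = ["We should reject this plan", "We should act now"]
news_docs = ["The council approved the plan", "The plan takes effect now"]

print(distinctive_phrases(opinion_docs, news_docs, top=3))
```

A list produced this way could then serve as the set of phrases to look for in other text, which is the "remember and look out for" step the researchers described.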
The research may include "non-textual mediums such as speech, video and geo-spatial data," said Christopher Kelly, a spokesperson for DHS.
"Ultimately, they are going to look to use some of these things in the agencies," said Fred Roberts, the director of DIMACS. "Maybe some of these things will be useful for the Coast Guard, maybe they will be useful for FEMA, etc."
Acclaimed security and technology expert Bruce Schneier, the chief technology officer of BT Counterpane, said the technology would eventually be able to summarize books, figure out cultural trends on blogs, and analyze newspapers and determine what they're reporting on.
"These techniques will never find terrorists," Schneier said. "Most of the value of the research lies outside terrorism."
According to Schneier, these data-mining techniques work best when there's a well-defined profile you're searching for, a reasonable number of attacks per year and a low cost of false alarms. Schneier contends there are trillions of connections between people and events - things the data mining system will have to "look at" - and very few actual plots, resulting in too many false alarms, a waste of valuable resources and a breach of civil liberties.
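The arithmetic behind that argument can be worked through in a few lines. Every figure below is an assumption chosen for illustration, not a number given by Schneier or by DHS: one trillion items screened, 10 real plots, a system that catches 99 percent of plots and wrongly flags 0.1 percent of innocent items.

```python
def expected_alarms(items_screened, real_plots, detection_rate, false_positive_rate):
    """Estimate true alarms, false alarms and the share of alarms that are real."""
    innocent_items = items_screened - real_plots
    true_alarms = real_plots * detection_rate
    false_alarms = innocent_items * false_positive_rate
    total = true_alarms + false_alarms
    # Share of all alarms that point at a real plot (0 if nothing is flagged).
    share_real = true_alarms / total if total else 0.0
    return true_alarms, false_alarms, share_real


# All four inputs are illustrative assumptions, not reported figures.
true_alarms, false_alarms, share_real = expected_alarms(
    items_screened=1_000_000_000_000,
    real_plots=10,
    detection_rate=0.99,
    false_positive_rate=0.001,
)

print(f"true alarms:  {true_alarms:,.1f}")
print(f"false alarms: {false_alarms:,.0f}")
print(f"share of alarms that are real: {share_real:.2e}")
```

Under these assumed inputs, false alarms outnumber real ones by a factor of roughly 100 million, which is the kind of imbalance the argument turns on. Different assumptions would change the size of the gap.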
"I can say that cost of false alarms is always an issue in homeland security," Roberts said. "The cost of letting a weapon of mass destruction into the United States, however, is massive compared to the cost of having to unnecessarily open a container to inspect what is inside."
Jay Stanley, director of the Technology and Liberty Project at the American Civil Liberties Union, said Homeland Security fishing for social networks on the Internet to try to find signs of suspicious activity is "way beyond the role that Americans ever wanted their law enforcement to play."
"The government should be focusing on traditional law enforcement, tracking down known leads, known suspects, and not trying to sift the entire population of innocent people to try to get a terrorist out of it," Stanley said.
A problem could arise if an agency somehow obtained data from an Internet service provider, Belkin said, because it would be easy to use the technology to monitor e-mails.
"[These government agencies] have restrictions and they have lawyers and they are always checking what they are allowed to do," Roberts said. "And people know blogs are public, so people shouldn't put things they want private in them."
Ideally, new methods would be able to take writing samples from various forms of Internet media and determine who wrote what, Roberts said. The methods would - for example - be able to verify whether letters were actually written by Osama bin Laden or whether the same person wrote all of the plays credited to William Shakespeare.
Businesses could track opinions on their products, Hovy said. Banks could track credit card fraud and determine deviations from regular patterns in card usage, giving them rapid warning to contact the card user to see if someone might have stolen their card or card number.
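The idea of spotting a deviation from a cardholder's regular pattern can be sketched with a basic statistical check. This is an assumption-laden illustration, not a description of how banks or the researchers do it: it flags a purchase when the amount is more than three standard deviations from the cardholder's past average, and the function name, the threshold and the sample amounts are all hypothetical.

```python
from statistics import mean, stdev


def is_unusual(history, amount, threshold=3.0):
    """Return True if amount deviates sharply from past purchase amounts.

    threshold is the number of standard deviations treated as unusual;
    3.0 is an arbitrary choice made for this sketch.
    """
    if len(history) < 2:
        return False  # too little history to define a regular pattern
    average = mean(history)
    spread = stdev(history)
    if spread == 0:
        return amount != average
    return abs(amount - average) / spread > threshold


# Invented purchase amounts for one hypothetical cardholder.
past_purchases = [20.0, 25.0, 22.0, 30.0, 18.0]

print(is_unusual(past_purchases, 24.0))
print(is_unusual(past_purchases, 500.0))
```

A flagged purchase would only be a prompt to contact the card user, as described above, not proof of fraud.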
Roberts described a scenario in which the Centers for Disease Control and Prevention would be looking for outbreaks of certain diseases. Investigators would be able to sort through doctors' reports to see if there was an unusual number of patients reporting the same symptoms, determine if pharmacies were selling a large number of the same kind of drugs, find out if www.askdoctor.com was getting a large number of hits about a particular disease and do all of this quickly to see if the pattern fit one of the diseases they were monitoring.
"You need to be able to do this rapidly to get the job done, whether it is a disease, terrorist plot or a credit card fraud plot," Roberts said.
The universities were selected with guidance from Congress, Homeland Security Presidential Directives and experts both within and outside government, Kelly said.
The project will also create educational programs, such as new courses, certificate programs and visits to national labs and corporate partner locations.
"Right after Sept. 11, none of us had ever done anything like this before, and we decided to do something useful as citizens," Roberts said.



