When film director James Cameron succeeded in a record-breaking dive to the deepest point of the world’s ocean, he tweeted: “Hitting bottom never felt so good. Can’t wait to share what I’m seeing w/ you.” But to get to the bottom of the Mariana Trench, Cameron relied on a custom-built submersible vehicle. By the same token, to explore the Deep Web and Darknet, we need some special tools and techniques. Some of them are similar or closely related to those we use to explore the Surface Web. Depending on one’s overall goals, different tools and techniques will help reach different depths. For most users, there are generally two different but related approaches to accessing the Deep Web and Darknet:
Use special search engines accessed from regular browsers such as Internet Explorer, Firefox, Chrome, and Safari.
Use special search engines that can be accessed only from a TOR browser.
The research community and those familiar with technology can go even deeper by developing a custom-built crawling program using link-crawling techniques and API programming skills.
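As a rough illustration of the link-crawling idea, the following minimal Python sketch performs a breadth-first crawl from a seed URL. It is not a production crawler: the names LinkExtractor, extract_links, and crawl are invented for this example, and the fetch function is an assumption supplied by the caller (it could wrap an HTTP client, or, as here, a small in-memory set of pages).

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(base_url, html):
    """Return absolute URLs (fragments removed) linked from one page."""
    parser = LinkExtractor()
    parser.feed(html)
    return [urldefrag(urljoin(base_url, href)).url for href in parser.links]


def crawl(seed_url, fetch, max_pages=50):
    """Breadth-first crawl starting at seed_url.

    fetch is a caller-supplied function (an assumption of this sketch) that
    takes a URL and returns its HTML as a string, or None on failure.
    Returns a dict mapping each fetched URL to the links found on it.
    """
    queue = deque([seed_url])
    seen = {seed_url}
    graph = {}
    while queue and len(graph) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # unreachable page: skip it and keep crawling
        links = extract_links(url, html)
        graph[url] = links
        for link in links:
            if link.startswith(("http://", "https://")) and link not in seen:
                seen.add(link)
                queue.append(link)
    return graph


# Usage with two hypothetical in-memory pages instead of real network access.
pages = {
    "http://example.org/": '<a href="/a">A</a> <a href="b.html#top">B</a>',
    "http://example.org/a": '<a href="/">home</a>',
}
graph = crawl("http://example.org/", pages.get)
```

Passing the fetch function in as a parameter keeps the crawling logic separate from the transport, so the same loop could in principle be pointed at a regular HTTP client or at a client configured to go through a proxy.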
One easy way to gain access to the Deep Web is to use alternative search engines designed specifically for the purpose. These search engines are built to reach different parts of the Deep Web (see Table 1), but the challenge is that every search engine developed so far crawls or indexes only a small part of it. Therefore, it is still necessary to visit the right online directory or hidden website listings (e.g. https://sites.google.com/site/howtoaccessthedeepnet/working-links-to-the-deep-web). Since these websites are not indexed, they will not be found using normal search tools. However, their URLs can be found by other means and, once a URL is known, one can access some of these Deep Web sites using regular browsers.
Some public databases are considered part of the Deep Web because most of their content cannot be crawled or indexed by the usual search engines. Most users may be interacting with part of the Deep Web regularly without being aware of it. For example, the directory of the U.S. Library of Congress (www.loc.gov) is an online database that resides on the Deep Web. Other sites utilizing the Deep Web include the economic data site FreeLunch.com, Census.gov, Copyright.gov, PubMed, Web of Science, WWW Virtual Library, Directory of Open Access Journals, FindLaw, and Wolfram Alpha. In addition to these publicly available databases, there are plenty of pay-to-use databases (such as Westlaw and LexisNexis) and subscription-only services (found at most academic libraries) that utilize the Deep Web. These databases are accessible only to subscribers. There is also a vast amount of private, password-protected information (such as credit card and PayPal accounts) located on the Deep Web. Access to this part of the Deep Web is technologically restricted and legally protected.
With the prevalence of Web 2.0 and smartphones, a plethora of information is stored in various social networks that are generally not accessible through regular search engines. Many of them require users to be authorized (via registration or by becoming friends with other people) to access the data. Some of these services, like Twitter and Facebook, provide a public application programming interface (API) so that users can acquire information from the network on a vast scale. But many others, like Yik Yak and WeChat (both of which require a log-in ID), limit access to their massive databases for reasons of security and privacy.
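Collecting data through such an API usually means requesting one page of results at a time and following a pointer to the next page. The sketch below shows that loop in Python under stated assumptions: fetch_page is a hypothetical caller-supplied function, and the response shape ("items" plus "next_cursor") is invented for illustration. Real services differ in authentication, rate limits, and response format, so this is an outline of the pattern rather than any particular provider's API.

```python
def collect_all(fetch_page, max_items=1000):
    """Gather items from a cursor-paginated API.

    fetch_page(cursor) is a hypothetical caller-supplied function returning
    a dict shaped like {"items": [...], "next_cursor": <str or None>}.
    The first call receives cursor=None.
    """
    items = []
    cursor = None
    while len(items) < max_items:
        page = fetch_page(cursor)
        items.extend(page["items"])
        cursor = page.get("next_cursor")
        # Stop when the service reports no further pages, or returns an
        # empty page (guards against looping forever on a bad cursor).
        if cursor is None or not page["items"]:
            break
    return items[:max_items]


# Usage with a fake two-page "service" held in memory.
fake_pages = {
    None: {"items": ["post-1", "post-2"], "next_cursor": "p2"},
    "p2": {"items": ["post-3"], "next_cursor": None},
}
posts = collect_all(fake_pages.__getitem__)
```

The max_items cap is a deliberate design choice: it keeps a research script from pulling far more data than intended if a service returns an unexpectedly long result set.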
Instant messaging (IM) is another reservoir of information in the Deep Web. Having previously taken the form of online chatrooms, IM services provide a private and convenient space for people to exchange information, which is usually person-to-person and not archived. IM is widely used in online chatting and technical support. Nowadays, some mobile applications allow users to save their messaging history locally so that it can be accessed later if necessary. In addition, instant messaging is becoming more multimedia based, making it harder to archive the messaging history. For this part of the Deep Web, the best way to capture the information is to record the conversation while it is taking place, via screenshots or video recording.
The Darknet has been increasingly used for trades, conversations, and information/file sharing and transfer in recent years because users are able to maintain anonymity and keep their online activities private. To access the anonymous sites of the Deep Web, visitors must use a TOR (The Onion Router) browser to reach websites with the “.onion” domain. Unlike Surface Web browsers, the TOR browser allows users to connect to web pages anonymously, making it extremely difficult for anyone to track one’s online activities, provided that one follows all the protocols required by TOR. Unlike Surface Web pages, Darknet pages on the TOR network tend to be unreliable, often going down for hours or days or sometimes disappearing permanently. They can also be very slow to load because TOR routes the connection through randomly selected servers to protect anonymity. While TOR browsers exist for Android and iOS, these are not secure and are not recommended. Similarly, TOR add-ons for other browsers are not secure and are usually not supported by the TOR organization, and thus are not recommended either.
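When working with link lists, it helps to separate addresses that resolve only inside the TOR network from ordinary URLs. The following Python sketch shows one way to do so. The function name is_onion_url is invented for this example, and the check rests on the assumption that onion service names are base32 strings of 16 characters (the older format, used by the addresses quoted in this article) or 56 characters (the newer format). It only inspects the hostname; it does not test whether a site is reachable.

```python
import re
from urllib.parse import urlsplit

# Onion service names use the base32 alphabet (a-z, 2-7): 16 characters
# in the older format, 56 characters in the newer one.
_ONION_LABEL = re.compile(r"[a-z2-7]{16}|[a-z2-7]{56}")


def is_onion_url(url):
    """Return True if the URL's host is an onion service address."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    # The service name sits directly before the "onion" top-level label;
    # optional subdomains such as "www" may come before it.
    return (
        len(labels) >= 2
        and labels[-1] == "onion"
        and _ONION_LABEL.fullmatch(labels[-2]) is not None
    )


# Sample addresses taken from this article.
samples = [
    "http://www.zw3crggtadila2sg.onion/imageboard/",
    "https://xfmro77i3lixucja.onion.lt/",
    "http://www.loc.gov",
]
onion_only = [u for u in samples if is_onion_url(u)]
```

Only the first sample is classified as an onion address. The second ends in the regular .lt top-level domain, so it is treated as an ordinary hostname that a regular browser can resolve.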
Since the arrest of Ulbricht in 2013, dozens of Silk Road replacements have sprung up Medusa-like as hidden services deployed on the TOR network. A new version of Silk Road, called Silk Road 2.0, launched in November 2013 and was itself shut down by law enforcement agencies in November 2014. Figure 4 shows a sample listing under the “Drugs” category on the Agora Darknet marketplace. Among the thousands of listings under this category are advertisements for MDMA, cocaine, Oxycodone, and heroin, among others. Just as on eBay and Amazon, sellers receive feedback scores from their customers, including detailed comments about the quality of the product, delivery time, and other related e-commerce metrics. Indeed, just as the growth of the web “flattened” informational flows, these Darknet marketplaces represent a fundamental shift in the illicit underground economy towards enabling worldwide access and distribution of products and services that have historically required significant investments in the “last mile” of the supply chain. This disruption creates the potential for massive shifts in the international supply chain of goods and services, particularly those that are illegal or subject to taxation or other forms of regulation.
While the Darknet gained notoriety for illegal activities, there are myriad legitimate and benign uses for law-abiding citizens as well. Some are based on familiar concepts, like image sharing (e.g., http://www.zw3crggtadila2sg.onion/imageboard/), which take advantage of the increased security provided by the Deep Web. Others are more specific to Deep Web culture, such as secure whistleblowing sites and eBook collections focused on subversive works (e.g., https://xfmro77i3lixucja.onion.lt/). Journalists have used SecureDrop or GlobaLeaks to share files via the TOR network. Public accounts indicate that Chelsea Manning, Julian Assange, and Edward Snowden all used the TOR network in one way or another to share massive troves of classified U.S. government files before the files were leaked online.
To combat illegal activity on the Darknet, many law enforcement groups have adopted the practices and techniques of online criminals. Many “network investigative techniques,” as law enforcement agencies call them, are similar or identical to routine hacking techniques (Ablon et al., 2014; Mckinnon, 2015). To pierce the dense layers of anonymity offered by TOR, the Federal Bureau of Investigation (FBI) used a powerful app called Metasploit in “Operation Torpedo,” a 2012 sting against the users of three Darknet child pornography websites.
The FBI also participated in an international law enforcement effort codenamed “Operation Onymous” in November 2014, using similar hacking techniques and malware. Using these hacking techniques to study the Deep Web and Darknet raises perplexing legal and ethical questions for researchers because of privacy concerns and the possible violation of well-established Institutional Review Board (IRB) research protocols. Researchers run the risk of doing the wrong thing even as they pursue legitimate research projects.