The MIT Technology Review Emerging Technologies conference opened today with a keynote by Tim Berners-Lee, inventor of the World Wide Web. Promising “a one-hour talk in 30 minutes,” Berners-Lee gave an animated, rapid-fire presentation -- more like a 90-minute talk in 30 minutes -- about the Semantic Web, his latest initiative.
Berners-Lee’s early remarks focused on his development of the Web. “Making the Web was really simple because there was already this morass of things being developed on the Internet,” including protocols such as TCP/IP and other standards. “All I had to do on top of that to create the Web was to create a single global space, which some people said was rather arrogant…. HTTP was a new scheme for the Web… and the idea was that it would [be] minimally constraining.” And HTML, the language he created to drive the Web, would be “the cloth on which a tapestry would be made – the jewels, the colors…”
Based on this fast-growing morass of websites and the interactions between them, what’s come out of it? Dot-com companies that have come and gone, new ways of thinking – and more recently, wikis and blogs. “The original thing I wanted to do was make it a collaborative medium, a place where we can all meet and read and write…. Collaborative things are exciting, and the fact people are doing wikis and blogs shows they’re [embracing] its creative side.”
From the very beginning of the Web, Berners-Lee had hoped to incorporate descriptive information into its fundamental design, but for various reasons it didn’t make the cut. “One thing I wanted to put in the original design was the ‘typing’ of links,” he said. For example, let’s say you link your website to another site. At the moment, the hyperlink connecting them contains very little information: just an address to get to the other website’s content. But Berners-Lee’s idea was to include “metadata” with each hyperlink to describe the relationship between the two sites. For example: do the people linking their two websites know each other personally, professionally, or not at all? If they’re colleagues, how are they working together, and in what fields? Where are they working?
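To make the idea concrete, here is a minimal sketch in Python of what a “typed” link might carry. The `TypedLink` class, the relationship labels and the example.org addresses are all hypothetical illustrations, not part of any Web standard or of Berners-Lee’s original proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedLink:
    source: str    # page the link appears on
    target: str    # page the link points to
    relation: str  # hypothetical label describing how the two are related


# Made-up links between made-up personal sites.
links = [
    TypedLink("http://example.org/alice", "http://example.org/bob", "colleague"),
    TypedLink("http://example.org/alice", "http://example.org/carol", "friend"),
    TypedLink("http://example.org/bob", "http://example.org/carol", "colleague"),
]


def targets_with_relation(links, source, relation):
    """Return the pages that `source` links to with the given relation."""
    return [
        link.target
        for link in links
        if link.source == source and link.relation == relation
    ]


print(targets_with_relation(links, "http://example.org/alice", "colleague"))
```

With only an address, a program could say that one page points to another. With the extra label, it can also answer a question such as “which of these links are to colleagues?”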
“When we put one link to another, a human being knows what that link may mean, but a machine doesn’t,” he said. But this idea of embedding large amounts of machine-readable metadata into HTML didn’t make it into the original Web standard. Now, he’s trying to change that, with an initiative called the Semantic Web.
“The Semantic Web looks at integrating data across the Web,” Berners-Lee said. As the World Wide Web Consortium explains, “The Web can reach its full potential only if it becomes a place where data can be shared and processed by automated tools as well as by people. For the Web to scale, tomorrow's programs must be able to share and process data even when these programs have been designed totally independently. The Semantic Web is a vision: the idea of having data on the web defined and linked in a way that it can be used by machines not just for display purposes, but for automation, integration and reuse of data across various applications.”
For the Semantic Web to function properly, websites would be designed in ways fundamentally different from traditional HTML. For example, in traditional HTML, if I wanted to assign a page a particular color, I would simply include a bit of code stating exactly what that color should be. Color=Red, basically. But with the Semantic Web, you wouldn’t do this. Rather, you’d tell the website to go to a URL that defines a universal standard of what that color looks like. So instead of coding a webpage to say “Color=Red,” you’d say something like “Color=http://internationalcolorstandardsite.org/colors/red/v2” and your website would know to connect to this site to identify the color. This would hold true for all data you include in your website: color, people, zip codes, images, etc. Data would all be connected to URLs containing descriptive information about that data. Information would not be static or absolute; instead it’s “an abstract concept” that gets sucked up from another website explaining exactly how to define it.
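The color example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary below stands in for the Web itself, the `resolve` function is a hypothetical helper, and the definition attached to the URL is made up. A real client would fetch the definition over the network.

```python
# Stand-in for the Web: maps a URL to the definition published there.
# The URL is the article's illustrative one; the contents are invented.
PUBLISHED_DEFINITIONS = {
    "http://internationalcolorstandardsite.org/colors/red/v2": {
        "label": "red",
        "rgb": (255, 0, 0),
    },
}


def resolve(value):
    """Look up the definition that a URL-valued property points to.

    Returns None when nothing is published at that URL.
    """
    return PUBLISHED_DEFINITIONS.get(value)


# The page stores a pointer to the definition, not the color itself.
page = {"color": "http://internationalcolorstandardsite.org/colors/red/v2"}

definition = resolve(page["color"])
print(definition["label"], definition["rgb"])
```

The point of the indirection is that every page pointing at the same URL shares one definition, so independently written programs can agree on what “red” means.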
An early example of the Semantic Web in action is the Creative Commons initiative, which gives content publishers a simple way of clarifying how their content may be used by others. The Creative Commons team has created a collection of copyright licenses, each stating whether a person’s content can be used for commercial or noncommercial purposes, can be redistributed or edited, with or without the owner’s permission, etc. The system is very flexible, so a person may personalize their license with different combinations of these elements. When a content publisher, like a blogger, places a Creative Commons license on their website, they do so by adding a piece of code to their site’s HTML that refers to their personalized license. This code is made of a collection of URLs, each of which defines a particular element of the license, such as the content’s redistribution policy. So when search engines and other automated tools pick up that blogger’s website, they’ll access these URLs and “understand” the blogger’s copyright policy as it was intended.
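A rough sketch of how an automated tool might read such a license follows. The three URLs and the `is_allowed` helper are hypothetical placeholders chosen for illustration; they are not the actual Creative Commons vocabulary.

```python
# Hypothetical URLs, each standing for one element of a license.
ALLOWS_REDISTRIBUTION = "http://example.org/license/allows-redistribution"
ALLOWS_EDITING = "http://example.org/license/allows-editing"
ALLOWS_COMMERCIAL_USE = "http://example.org/license/allows-commercial-use"

# The license a blogger might attach to a site: a set of those URLs.
blog_license = frozenset({ALLOWS_REDISTRIBUTION, ALLOWS_EDITING})


def is_allowed(license_terms, needed):
    """Return True if every needed permission URL appears in the license."""
    return set(needed) <= set(license_terms)


# A tool that wants to repost the content checks what the license grants.
print(is_allowed(blog_license, [ALLOWS_REDISTRIBUTION]))
print(is_allowed(blog_license, [ALLOWS_REDISTRIBUTION, ALLOWS_COMMERCIAL_USE]))
```

Because each element is identified by a URL rather than by free text, two tools written independently can compare licenses without having to interpret the wording.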
Easy? Maybe not. But Berners-Lee is confident in his vision. “The Web is a tangle, your life is a tangle – get used to it.”
Berners-Lee sees the Semantic Web having a range of uses. Pieces of online information will connect seamlessly because of the common concepts they share. “That’s what it’s all about – connecting things,” he said. The Semantic Web will help artificial intelligence projects, online translators and other technologies that require access to large amounts of descriptive data to work properly. Berners-Lee also offered a real-world example. “Sometimes, in an emergency, like when a virus breaks out, you need to correlate data between a number of databases,” he said. The Semantic Web, he explained, will make this much easier.
It’s also helping build powerful social networking tools -- friend-of-a-friend networks in which people write a little bit about themselves as metadata, and connections get formed based on this information. “Who knows what sort of Google will be built on top of this stuff,” Berners-Lee wondered. Computers will be able to browse the Web and find what we’re looking for based on what they know about our needs and the descriptive metadata they find on relevant websites. “A human being browse the Web? That will be a little old fashioned,” he joked.
Berners-Lee noted that the success of the Semantic Web will depend on royalty-free technical standards. “Standards must be royalty free” to foster innovation and encourage the growth of new markets. “It is very important that we make sure we are not tripped up” by proprietary standards, he said. “With so many ridiculous patents out there, there’s always the threat” that an “underwater patent will torpedo innovation.”
Following his speech, Berners-Lee took questions from the audience, moderated by Ethernet inventor and 3Com co-founder Bob Metcalfe. Berners-Lee said the Web was originally a “play project” that his bosses at Switzerland’s CERN laboratory let him explore in his spare time. The structure of CERN, with its many groups of researchers working independently, influenced the structure of the Web. “Because it was a lab, it acted more like a web in itself,” so coming up with a virtual web for CERN staff to share information with each other made a lot of sense.
Once he developed the idea, he started to promote it through Internet discussion groups, though not necessarily the groups frequented by fellow scientists. “Hypertext wasn’t considered ‘real’ computing, so I sent it out to alternative news groups,” he said. Some people, like the University of Illinois’ Marc Andreessen, embraced the idea and ran with it; Andreessen went on to co-found Netscape.
Others were less supportive because they didn’t like the technical structure behind it. “Why do I have to use your horrible angle brackets?” they would say to him.
“Do you remember the names of these people?” Metcalfe asked rather mischievously. Berners-Lee laughed and waved off the question.
Despite being the inventor of the Web, Berners-Lee didn’t patent the standard, allowing others to build upon it -- and profit from it. “Some people have said, ‘Isn’t it a shame all these commercial things came about?’” he noted. “But most people wanted a commercial browser.” The private sector helped spread the Web beyond the confines of research and academia. The Marc Andreessens of the world contributed a lot to the adoption of the Web, making it commercially viable, he noted. Berners-Lee added that he still uses Netscape, despite its fall in popularity, on a Mac with the OS X operating system, and has started playing with Mozilla’s new open source Firefox browser as well.
Berners-Lee also described how his work on the Web has changed over the years from being a solo endeavor to a distributed effort with lots of contributors. He waxed nostalgic about the days when he could make all the decisions himself, acknowledging the challenges of achieving consensus in distributed group projects. “If you take little groups, they form their own little cultures. And when you get these groups together, they don’t share their ideas, and have different values towards how things should be built…. This takes a lot more energy than figuring out how to do it yourself…. Making consensus, communicating with other people is hard work.”
“I had the luxury to do this myself… with nobody there to object,” he continued. “But now we’re doing things … where there are [a] lot of people interested in getting involved. … If you want to do something, do it yourself.”
As a final question, Metcalfe asked Berners-Lee about his thoughts on the Web as an educational tool. “I’d like to see lots of curricula like the MIT OpenCourseWare initiative being picked up by K-12,” he said. “The tricky thing is that when you try to put down things like encyclopedia articles, like Wikipedia [which he earlier referred to as ‘The Font of All Knowledge’]. You really need to keep education materials sewn together. So I’d love to see a student be able to fly through this courseware, maybe in 3-D, following his or her interests. I know it takes a huge amount of effort to keep these things up to date, but I’d [even] like to see teachers help contribute to it.”
“Students can work together [on the Web] when they can interact with simulations, with teachers, but particularly with each other,” he concluded. “And for that we need lots of tools, lots of standards, lots of technology… There’s lots of work to do out there.”