conceptual information from the wider Web to create a more knowledge-driven approach to
search. They defined three key elements to creating a true web of concepts:
Information extraction: pulling structured data (addresses, phone numbers, prices, stock
numbers and such) out of Web documents and associating it with an entity
Linking: mapping the relationships between entities (connecting an actor to films he's
starred in and to other actors he has worked with)
Analysis: discovering categorizing information about an entity from the content (such as
the type of food a restaurant serves) or from sentiment data (such as whether the restaurant has
positive reviews).
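To make the three elements concrete, here is a minimal Python sketch of how they might map onto a data structure. It is an illustration only, not how Knowledge Graph or Satori is built: the `Entity` and `ConceptWeb` classes, their method names, and the sample data are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """One node in a hypothetical web of concepts."""
    name: str
    attributes: dict = field(default_factory=dict)  # extraction: structured facts
    categories: set = field(default_factory=set)    # analysis: inferred labels


class ConceptWeb:
    """Toy in-memory store; the names here are illustrative assumptions."""

    def __init__(self):
        self.entities = {}  # entity name -> Entity
        self.links = {}     # (source name, relation) -> set of target names

    def _get(self, name):
        # Create the entity on first mention so every operation can share it.
        return self.entities.setdefault(name, Entity(name))

    def add_fact(self, name, key, value):
        """Information extraction: attach a structured value to an entity."""
        self._get(name).attributes[key] = value

    def link(self, source, relation, target):
        """Linking: record a named relationship between two entities."""
        self._get(source)
        self._get(target)
        self.links.setdefault((source, relation), set()).add(target)

    def categorize(self, name, category):
        """Analysis: store a label inferred from content or sentiment."""
        self._get(name).categories.add(category)

    def related(self, source, relation):
        """Return the linked entity names in sorted order."""
        return sorted(self.links.get((source, relation), ()))


# Made-up sample data.
web = ConceptWeb()
web.add_fact("Example Bistro", "phone", "555-0100")
web.categorize("Example Bistro", "Italian")
web.link("Actor A", "starred_in", "Film X")
web.link("Actor A", "starred_in", "Film Y")
print(web.related("Actor A", "starred_in"))
```

Keeping relationships in a separate map keyed by entity and relation type means new kinds of links can be added without changing the entity records themselves.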
Google and Microsoft have just begun to tap into the power of that kind of knowledge. And their
respective entity databases remain in their infancy. As of June 1, Satori had mapped over 400
million entities and Knowledge Graph had reached half a billion, a tiny fraction of the potential
index of entities that the two search tools could amass.
In interviews with Ars, members of the teams at both Google and Microsoft walked us through
the inner workings of Knowledge Graph and Satori. Additionally, we dug through the components
of both search technologies to understand how they work, how they differ from the "old school"
search, and what projects like these mean to the future of the Web.
high-performance graph processing tool that Google developed to handle many of its Web
indexing tasks, though Thakur declined to discuss those sorts of details.
The schema for Google's Knowledge Graph is based on the same principles, but with some
significant changes to make it scale to Google's needs. Thakur said that when Google purchased
Metaweb, Freebase's database had 12 million entities; Knowledge Graph now tracks 500 million
entities and over 3.5 billion relationships between those entities. To ensure that the entities
themselves didn't become bloated with underused data and hinder the scaling-up of the
Knowledge Graph, Google's team threw out the user-defined schema from Freebase and turned
to their most reliable gauge of the data users wanted: Google's search query stream.
"We have the luxury of having access to searches, which are like the zeitgeist," Thakur said.
"The search stream gives us a window into what people care about and what properties they look
for." The Knowledge Graph team processed Google's stream of search data to prioritize the
properties assigned to entities based on what users were most interested in: how tall buildings
are, what movies an actor starred in, how many times a celebrity went to rehab.
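As a rough illustration of the idea, the sketch below ranks candidate properties by how many queries in a log mention them. Thakur did not describe Google's actual pipeline, so everything here is an assumption: the `rank_properties` function, the plain substring matching, and the sample queries are hypothetical.

```python
from collections import Counter


def rank_properties(queries, property_phrases):
    """Order properties by how many queries mention one of their phrases.

    queries: iterable of raw query strings (a stand-in for a query stream).
    property_phrases: dict mapping a property name to lowercase trigger phrases.
    Properties that never appear in any query are left out of the result.
    """
    counts = Counter()
    for query in queries:
        text = query.lower()
        for prop, phrases in property_phrases.items():
            # Count each query at most once per property.
            if any(phrase in text for phrase in phrases):
                counts[prop] += 1
    return [prop for prop, _ in counts.most_common()]


# Made-up sample log.
sample_queries = [
    "how tall is the empire state building",
    "eiffel tower height",
    "how tall is burj khalifa",
    "empire state building architect",
]
sample_phrases = {
    "height": ["how tall", "height"],
    "architect": ["architect", "who designed"],
}
print(rank_properties(sample_queries, sample_phrases))
```

A real system would need to resolve entities and normalize phrasing far more carefully; the sketch only shows how query frequency can decide which properties earn a place in the schema.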
Abstract
One of the methods includes the actions of determining that a first search query
includes a respective text reference to each of one or more predetermined
attributes, wherein each attribute is associated with a first entity type;
for each of a plurality of entities of the first entity type, generating a combined
search query that includes the first search query and a name of the entity;
obtaining search results for each of the plurality of entities using the combined
search query for each respective entity; and using the obtained search results to
generate combined search results to include in a response to the first search
query.
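The steps in the abstract can be sketched in a few lines of Python. This is one simplified reading of the claim language, not the patented implementation: the function name, the substring test for attribute references, the `search` callable, and the `per_entity` limit are all assumptions made for illustration.

```python
def answer_attribute_query(query, attribute_types, entities_by_type, search,
                           per_entity=1):
    """Fan a query out across every entity of the type its attributes imply.

    attribute_types: dict mapping an attribute phrase to an entity type.
    entities_by_type: dict mapping an entity type to a list of entity names.
    search: any callable that takes a query string and returns a result list.
    Returns (entity name, result) pairs, or [] if no single type matches.
    """
    text = query.lower()
    # Step 1: find predetermined attributes referenced in the query text.
    matched_types = {etype for attr, etype in attribute_types.items()
                     if attr in text}
    if len(matched_types) != 1:
        return []  # simplification: handle only an unambiguous entity type
    entity_type = next(iter(matched_types))

    combined_results = []
    for name in entities_by_type.get(entity_type, []):
        # Step 2: build the combined query from the original query and the name.
        combined_query = f"{query} {name}"
        # Step 3: obtain search results for that entity.
        for result in search(combined_query)[:per_entity]:
            # Step 4: merge the per-entity results into a single response.
            combined_results.append((name, result))
    return combined_results


def fake_search(q):
    """Stand-in search backend for the example; returns one fake hit."""
    return [f"result for {q}"]


print(answer_attribute_query(
    "best battery life",
    {"battery life": "phone", "screen size": "phone"},
    {"phone": ["Phone A", "Phone B"]},
    fake_search,
))
```

Passing the search backend in as a callable keeps the fan-out logic separate from however results are actually retrieved.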